Comments (76)
Hi @jawz101,
I'm writing an improvement, but we have to admit that it's impossible to avoid false positives.
That's why whitelisting is more important than blocking.
Now we have to decide between two ways:
- We extract and test all possible domains, or
- We extract all domains which are (or may be) relevant.
For now, I'm implementing the second way, but we may think about the other way in the future or as an extra option.
from pyfunceble.
Caught another bug:
This section of the original list contains rules that only remove elements from legit sites:
After PyFunceble filters the list, the end result in domain.list is:
Notice how legit sites are being blocked?
from pyfunceble.
Okay, you have to explain AdBlock to me then, @dnmTX 😸 I'm not a big fan of it, as its syntax is confusing.
So how do I tell a legit site from a bad one in AdBlock? I thought that AdBlock was only about blocking, not whitelisting 🤔
from pyfunceble.
OK......
I'll cover only the basics, to keep it clear:
If you want to block a domain, you need to add || in front and ^ at the end (it will catch the subdomains as well).
If you want to block just an element on that website, you need to find it (Chrome DevTools helps a lot with that) and add ## after the domain name, followed by the element's selector. Example:
Open yahoo.com
(I removed soooo many elements from that page you wouldn't recognize it).
Now... look at your yahoo page and compare it to mine:
Much cleaner, no videos, no annoyances.
Rules examples:
yahoo.com###applet_p_50000278
yahoo.com###applet_p_32209491
yahoo.com###applet_p_50000277
yahoo.com###applet_p_63802
yahoo.com###applet_p_63796
yahoo.com###sticky-lrec2-footer
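To make the two rule shapes above concrete, here is a minimal Python sketch that tells them apart. The function name `classify_rule` and the labels it returns are made up for illustration; this is not how PyFunceble decodes lists, only a sketch of the distinction described above.

```python
import re

# "||domain^" with nothing else: the whole-domain blocking form described above.
BLOCK_DOMAIN = re.compile(r"^\|\|([A-Za-z0-9.-]+)\^$")
# "domains##selector": element hiding. [^#]* keeps "#@#" exception rules out.
HIDE_ELEMENT = re.compile(r"^([^#]*)##(.+)$")


def classify_rule(line: str) -> str:
    """Return 'block-domain', 'hide-element' or 'other' (hypothetical labels)."""
    line = line.strip()
    if line.startswith("!"):
        # Comment line, never an active rule.
        return "other"
    if BLOCK_DOMAIN.match(line):
        return "block-domain"
    if HIDE_ELEMENT.match(line):
        return "hide-element"
    return "other"
```

Only the first kind names a domain that could sensibly end up in a hosts file; the second kind names a site where something is hidden, which is exactly the false-positive source discussed in this thread.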
from pyfunceble.
Okay, so what about these formats? Which of the following mark the domain as a bad or a good boy?
||google.com$script,image
||api.google.com/papi/action$popup
facebook.com###player-above-2
~github.com,hello.world##
@@||cnn.com/*ad.xml
!||world.hello/*ad.xml
!@@||funceble.world/js
yahoo.com,msn.com,api.hello.world#@#awesomeWorld
!funilrys.com##body
hubgit.com|oohay.com|ipa.elloh.dlorw#@#awesomeWorld
I know you will not find them in the real world, but they are part of the tests for the decoder.
from pyfunceble.
The ##[href^=... rule is different: the link is embedded in an iframe, and this is how you block those domains.
from pyfunceble.
Okay, I'm working on that implementation.
So in this rule:
hubgit.com|oohay.com|ipa.elloh.dlorw#@#awesomeWorld
they are all legit, right?
from pyfunceble.
||google.com$script,image - this rule will not allow any scripts or images from that domain to be loaded or executed
||api.google.com/papi/action$popup - this rule will stop the popup coming from that link
facebook.com###player-above-2 - this one will hide an element (looks like a video player) on that page
~github.com,hello.world## - hmmmm, haven't seen this one
@@||cnn.com/*ad.xml - this rule will whitelist that link on the webpage (@@ in front means whitelisting)
!||world.hello/*ad.xml - this would block it, but the ! in front makes the line a comment
!@@||funceble.world/js - this would whitelist that JS script, but the ! in front makes the line a comment
yahoo.com,msn.com,api.hello.world#@#awesomeWorld - don't know
!funilrys.com##body - this would hide an element, but the ! in front makes the line a comment
hubgit.com|oohay.com|ipa.elloh.dlorw#@#awesomeWorld - don't know
from pyfunceble.
Stay put, let me do some research on the #@# rule, because I'm using AdGuard and haven't seen such a rule there.
from pyfunceble.
Ok... in the above example, the #@# rule allows (whitelists) that particular element on the listed domains, so yes, all those domains are legit.
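As an illustration of how such an element hiding exception can be taken apart, here is a small Python sketch. The helper name `parse_hiding_exception` is hypothetical. The handling of a leading `~` (the rule does not apply on that domain) follows the Adblock Plus syntax and is an assumption added here, since it is not covered in the explanation above.

```python
def parse_hiding_exception(rule: str):
    """Split 'dom1,~dom2#@#selector' into (included, excluded, selector)."""
    domains_part, sep, selector = rule.partition("#@#")
    if not sep:
        raise ValueError("not an element hiding exception rule")
    included, excluded = [], []
    for domain in (d.strip() for d in domains_part.split(",")):
        if not domain:
            continue
        if domain.startswith("~"):
            # "~domain" means the rule is NOT applied on that domain.
            excluded.append(domain[1:])
        else:
            included.append(domain)
    return included, excluded, selector
```

None of the returned domains is a candidate for blocking: the rule only says that an element stays visible on them.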
from pyfunceble.
Okay, let me implement this issue first with the current format; I will then review all the tests with you, as those need some hotfixes. Never thought about whitelisting 😹
from pyfunceble.
I know, it's Java, more complex. Took me a while to get around it, but I'm getting there.
from pyfunceble.
@funilrys make it simple. Everything that has || in front and ^ at the end stays.
Everything that has href in it stays (filtered, of course, to leave the domain only). The rest should be removed, as those are rules that don't really concern any of us who will use the lists in hosts format.
from pyfunceble.
Yeah, but if I do that, I'll invalidate AdBlock/filter lists like https://github.com/MajkiIT/polish-ads-filter 😸
from pyfunceble.
Only need to take some time to understand how it works properly then will clean the mess I created!
from pyfunceble.
Look at this one, for example. It contains only legit domains, with rules to block certain elements only.
from pyfunceble.
Ok, I know it will take time, but meanwhile, for everyone who uses the lists with dnsmasq etc. and not adblockers: can you PLEASE add https://raw.githubusercontent.com/Dawsey21/Lists/master/main-blacklist.txt to be filtered properly?
from pyfunceble.
Also, you can start here; it's very well explained and will help you understand the basics:
https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters
from pyfunceble.
@dnmTX ,
PyFunceble is fixed, please look at the tests for details.
As you mentioned, there really was an issue with my way of handling adblock lists. Therefore, here is the erratum:
Read self.expected as the list of domains extracted from the given input (self.lines).
self.lines = [
    "||funilrys.github.io$script,image",
    "||google.com^$script,image",
    "||twitter.com^helloworld.com",
    "||api.google.com/papi/action$popup",
    "facebook.com###player-above-2",
    "~github.com,hello.world##.wrapper",
    "@@||cnn.com/*ad.xml",
    "!||world.hello/*ad.xml",
    "bing.com,bingo.com#@##adBanner",
    "!@@||funceble.world/js",
    "yahoo.com,~msn.com,api.hello.world#@#awesomeWorld",
    "!funilrys.com##body",
    "hello#@#badads",
    "hubgit.com|oohay.com|ipa.elloh.dlorw#@#awesomeWorld",
    '##[href^="https://funceble.funilrys.com/"]',
    "[AdBlock Plus 2.0]",
    '##div[href^="http://funilrys.com/"]',
    'com##[href^="ftp://funceble.funilrys-funceble.com/"]',
    "/banner/*/img^",
    "|github.io|",
    "||api.funilrys.com/widget/$",
]
self.expected = [
    "funilrys.github.io",
    "google.com",
    "twitter.com",
    "api.google.com",
    "funceble.funilrys.com",
    "funilrys.com",
    "github.io",
    "api.funilrys.com",
]
As the tests passed without any issue, I can attest that the current development version and the next release no longer produce these false positives.
Please let me know if there is something else.
This issue will be closed on next release!
Cheers,
Nissar
from pyfunceble.
- Also my tests should now be compliant with https://adblockplus.org/filter-cheatsheet
from pyfunceble.
@funilrys from what I can tell, self.lines is an example of domains that should not be added for filtering because they are legit? Am I close?
What about anything with ##div[href^=... , those are usually bad ones that need blocking?
Another thing (just to make sure). Example:
||api.funilrys.com/widget/$
This is a partial link that could match api.funilrys.com/widget/bla/bla/bla/ad.js, and the adblocker will catch it. But the fact that the domain hosts some ad or telemetry script (Google usually does that) doesn't mean that the domain itself is bad. The question is whether there is a rule for how that domain will be considered: as bad or as good?
from pyfunceble.
@dnmTX self.lines contains random lines that can be found in a regular AdBlock list. The objective of the code I wrote is to get the list self.expected as output, which is, in practical terms, what we are going to test (the bad ones).
So from your point of view, self.expected represents the bad ones we have to test.
About ##div[href^=... : it's there because you usually have ##[href^=... , but these variants also exist:
##div[href^=...
com##div[href^=...
com##[href^=...
With my review, the domain in the href attribute is extracted and formatted (protocol and "decorators" removed) 😸
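The href extraction described above can be sketched roughly as follows. This is an illustrative Python sketch, not PyFunceble's actual code: the regular expression, the function name `extract_href_domain` and the use of `urllib.parse.urlsplit` to drop the protocol and path are all assumptions.

```python
import re
from urllib.parse import urlsplit

# "##", an optional tag name, then [href^="..."]; the capture group is the link prefix.
HREF_RULE = re.compile(r"""##[^\[]*\[href\^=["']?([^"'\]]+)["']?\]""")


def extract_href_domain(rule: str):
    """Return the host name referenced by an href rule, or None."""
    match = HREF_RULE.search(rule)
    if not match:
        return None
    target = match.group(1)
    if "://" not in target:
        # No protocol given: let urlsplit treat the value as a network location.
        target = "//" + target
    return urlsplit(target).hostname
```

For example, `extract_href_domain('##div[href^="http://funilrys.com/"]')` yields `funilrys.com`, while a plain element hiding rule such as `facebook.com###player-above-2` yields `None`.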
from pyfunceble.
Actually, from my point of view, self.expected should be considered the good ones, with the exception of everything that has href in it.
Bad ones should start with || and end with ^, plus all the href variations.
from pyfunceble.
Wow, you lost me 😹
For clarification: those are examples of formats. Do not consider the domains themselves; we are only talking about the domain extracted from a matched format 😸
| Expected/Extracted/Tested by PyFunceble | Line (example) |
|------------------------------------------ |-------------------------------------------- |
| funilrys.github.io | ||funilrys.github.io$script,image |
| google.com | ||google.com^$script,image |
| twitter.com | ||twitter.com^helloworld.com |
| api.google.com | ||api.google.com/papi/action$popup |
| funceble.funilrys.com | ##[href^="https://funceble.funilrys.com/"] |
| funilrys.com | ##div[href^="http://funilrys.com/"] |
| github.io | |github.io| |
| api.funilrys.com | ||api.funilrys.com/widget/$ |
Also, if we match for example hello.world##ad-selector, we do not extract hello.world as a bad one.
from pyfunceble.
Maybe I misunderstood something 🤔
from pyfunceble.
funilrys : Also if we match for example hello.world##ad-selector we do not extract hello.world as a bad one.
Ok, that's good, that's how it's supposed to be, but......
If we match ||api.google.com/papi/action$popup, do we extract api.google.com as a bad one or not?
This is where the tricky part is, because in this example api.google.com is a very legit domain that hosts ad scripts and so on, but also hosts things without which the web page will be broken.
from pyfunceble.
If we have ||api.google.com/papi/action$popup or, for example, ||api.example.org/pap/hello$popup, the system will extract, test and produce results for api.google.com and api.example.org respectively.
from pyfunceble.
Basically, you're saying that api.google.com will be blocked?
What if you have, let's say, ||yahoo.com/papi/action$popup?
Then the end result will be:
0.0.0.0 yahoo.com in the ACTIVE folder.
from pyfunceble.
You're thinking about what happens afterwards.
PyFunceble works with the data you provide. Which means that if you decide to have api.google.com in your list, PyFunceble will test it. If you decide to have api.google.com in the hosts file to test, PyFunceble will test it. If you decide to put ||api.google.com/papi/action$popup into your adblock list, PyFunceble will extract and test it. PyFunceble is a global tool which does what it is told to: check the availability of a given domain, IPv4 or URL.
What you do with the results and data is up to you. That's why there is a whitelist in projects like Ultimate: we all know that false positives will always be there in such a big compilation. And we are not even talking about maintainers who block, for example, google.com.
It's their list, our tests, our compilation, but we still have to deal with whitelisting because the upstream maintainer may not want to whitelist x or y even if they are legit and not harmful.
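As a rough picture of that post-processing step, the following Python sketch drops whitelisted domains from a list of tested domains and writes the rest in hosts format. It is not the Ultimate whitelisting script; the function names are hypothetical, and treating subdomains of a whitelisted entry as whitelisted too is a design assumption made here.

```python
def apply_whitelist(domains, whitelist):
    """Return the domains that are not covered by the whitelist."""
    allowed = {entry.lower() for entry in whitelist}
    kept = []
    for domain in domains:
        name = domain.lower()
        # Assumption: a whitelisted entry also covers its subdomains.
        if any(name == entry or name.endswith("." + entry) for entry in allowed):
            continue
        kept.append(name)
    return kept


def to_hosts_lines(domains, sink="0.0.0.0"):
    """Format domains as hosts-file lines pointing at the sink address."""
    return [f"{sink} {domain}" for domain in domains]
```

With `google.com` whitelisted, `api.google.com` is dropped before anything reaches the hosts output, whatever the upstream list says about it.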
from pyfunceble.
By the way if you're looking for a whitelisting script we have one at https://github.com/Ultimate-Hosts-Blacklist/dev-center/tree/whitelisting 😸
from pyfunceble.
Ok, I see now. You were more concerned about how to properly extract the domains; I was more worried about false positives (understandably, since I'm the end user). So that I don't have to worry about that, because I'm really trying to automate everything and not even think about it, can you PLEASE (again) add the other LISTS and let's be done with it?
P.S. I can't use Python scripts on my router.
@funilrys how hard is it to add one more list? Why so stubborn?
from pyfunceble.
fyi - I have a couple of things to bring up. 1st, I'm going through the process of checking both the EasyList and EasyPrivacy blocklists (which are Adblock+ formatted) and they've been running all day so it may be a little while after but I can attach the outputs of each to this to see where there might be issues.
2nd- if we're talking about using PyFunceble to process an adblockplus formatted list for use as a hosts file, you'd only want to go after a subset of the domains. I assume PyFunceble currently tries to parse out all of the domains referenced in an ABP list for validation. But this would also capture what would then be false-positives if they were to apply to a hosts file.
pfBlockerNG (a pretty awesome package for pfSense firewall) also includes a feature where it parses out the hosts from both the EasyList & EasyPrivacy lists and adds them to a traditional DNS blocklist. I don't know if it might help to see the logic behind it- even though it's PHP and all.
I think I found the part of pfBlockerNG that gives an idea of what they extract from these sorts of lists to only capture domain names:
The short of it, to me, was to only process lines in ABP-formatted lists that match this shape from beginning to end:
||example.com^$third-party
Looking through an Adblock Plus-syntax list, for something like PyFunceble, it seems like you'd want to ignore any lines with this junk:
starts with:
- commented: !
- whitelisted: @@
or contains:
- block specific parts of pages (the ## div stuff), aka element hiding
- specific web technologies ($xmlhttprequest, object, popup, fonts, redirect, popunder, generichide, etc.)
- or certain domains only if it's seen or not seen on another certain domain: domain= (which sounds like domains you wouldn't want to put into a hosts file)
If it's a line that is just ||something.domain.tld^$third-party, no more, no less, process those.
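The filtering rule just described could look like this in Python. It is a sketch under assumptions, not pfBlockerNG's or PyFunceble's logic: the name `hosts_safe_domain` is invented, and the bare `||domain^` form is accepted as well, since that form is added to the proposal later in the thread.

```python
import re

# Exactly "||domain^" or "||domain^$third-party", no more, no less.
STRICT_RULE = re.compile(
    r"^\|\|([a-z0-9-]+(?:\.[a-z0-9-]+)+)\^(?:\$third-party)?$",
    re.IGNORECASE,
)


def hosts_safe_domain(line: str):
    """Return the domain if the line is safe to turn into a hosts entry, else None."""
    line = line.strip()
    # Comments and exception (whitelist) rules are skipped outright.
    if not line or line.startswith(("!", "@@")):
        return None
    match = STRICT_RULE.match(line)
    return match.group(1).lower() if match else None
```

Everything carrying an element hiding marker, extra options or a path falls through to `None`, which is the point: the output is smaller but should not break legit sites when used as a DNS or hosts blocklist.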
from pyfunceble.
I'll be interested in your output, @jawz101.
In the meantime:
- If it starts with ! we already ignore it.
- If it starts with @ we already ignore it.
- If it starts with [ we already ignore it.
- If it starts with / we already ignore it.
- If href^=funilrys.github.io is present, we extract funilrys.github.io for testing.
- If it is in the format ||funilrys.github.io$script,image we extract funilrys.github.io for testing.
- If it is in the format |github.io| we extract github.io for testing.
- If it is in the format ||twitter.com^helloworld.com we extract twitter.com for testing.
I'm conscious that it may be aggressive, but I tried hard to comply with https://adblockplus.org/filter-cheatsheet along with the needs of "our field".
Indeed, for the case of href^=funilrys.github.io: if we write that, I consider that we implicitly treat the referenced href as a bad one, so we extract and test it.
What's your input on this short statement? 😸
Have a good night.
Cheers,
Nissar
from pyfunceble.
I drove to go get some gas and I was still thinking about it and felt I didn't have a complete idea. I was today, coincidentally, trying to do by hand with regex and Notepad++ what we're talking about so I figured I'd chime in :/
... to add to the ||example.com^$third-party rule, I'd also want to block ||example.com^
What you said makes sense, but is the goal to validate all domains in ABP rules, or to also process them with the end result of a blocklist?
The reason I ask is I wouldn't want to block funilrys.github.io in this example, because I wouldn't want to block the whole domain if an ABP rule was just trying to block certain bits of its content:
||funilrys.github.io$script,image
It just depends on whether the goal is to validate domains or to also take them and then, say, make a pi-hole blocklist out of them. As is, it sounds like it would have a lot of false positives if I put the output into a blocklist file.
from pyfunceble.
... right now I'm going through https://easylist.to/easylist/easylist.txt by hand and finding examples of the lines I try to exclude and then see what I'm left with...
from pyfunceble.
attached is easyprivacy's list (I zipped up the cached files as well in case you want to set it up on a schedule like some others.)
ran with PyFunceble --adblock --link https://easylist.to/easylist/easyprivacy.txt
The EasyList run is still on the K's... it's about a 10x larger list than EasyPrivacy.
If the output looks good to you, @funilrys I want to send it on to the list maintainers. The EasyList one looks pretty red so I'm curious how it will turn out.
from pyfunceble.
Thanks @jawz101, I will look into that when I have a bit of time.
||example.com^ is already extracted as expected in the test:
Line 361 in 4c36832
I get your point; I did not think about that little third-party option. Will implement 👍
from pyfunceble.
What about the other options @jawz101 ?
from https://adblockplus.org/filter-cheatsheet#options:
| Option | Description |
|---|---|
| script, ~script | Include or exclude JavaScript files |
| image, ~image | Include or exclude image files |
| stylesheet, ~stylesheet | Include or exclude stylesheets (CSS files) |
| object, ~object | Include or exclude content handled by browser plugins like Flash or Java |
| object-subrequest, ~object-subrequest | Include or exclude files loaded by browser plugins |
| subdocument, ~subdocument | Include or exclude pages loaded within pages (frames) |
| Exceptions | |
| document | Used to whitelist the page itself (e.g. @@||example.com^$document) |
| elemhide | Used to prevent element rules from applying on a page (e.g. @@||example.com^$elemhide) |
| Domains | |
| domain= | Specify a list of domains, separated by bar lines (|), on which a filter should be active. A filter may be prevented from being activated on a domain by preceding the domain name with a tilde (~). |
| third-party, ~third-party | Specify whether a filter should be active on third-party or first-party domains |
| Misc | |
| rewrite= | Specify a rewrite rule for the URL to be performed before downloading. If the filter is a regular expression, use $n to insert submatches into the rewritten URL. See JavaScript's own String.prototype.replace(). |
Is extracting only third-party sufficient?
from pyfunceble.
Well, ||example.com^$third-party and ||example.com^ are what I ended up with, I think.
As for the rest of them, advanced syntax looks like it comes into play if you have conditions. That's where I see things that would cause breakage.
fancy conditions:
- ||example.com^$image - only block images from example.com
- ||example.com^$third-party,script,object - only block its scripts and objects, and only when they are loaded as 3rd party. Like, I might still need some of example.com's 1st party stuff. In fact, if someone tried to process a uBlock list, gorhill actually made tons of additional things to block
- ||example.com^...elemhide - make the network connection but just hide the resource (say, you may need to establish a connection to that subdomain to get some parts of the webpage but remove some of the banners and stuff it also wants to show)
- ||example.com^... domain=somesite.com - only block example.com if it's on somesite.com
fewer false positives:
- ||example.com^ - block example.com. Basically, use ABP rules as if it were a DNS/hosts-file-styled blocker
- ||example.com^$third-party - block example.com if it is third party. Even though it is a condition, these rules don't seem to block legit sites you'd visit.
Adding the EasyList thing because it finished sometime last night. It's probably more valuable to you than the EasyPrivacy report because it includes a bunch of element hiding junk. Because of this, pfBlocker doesn't actually use the famous EasyList itself in its processing and instead uses its little brother, the "EasyList no elem hiding" list found on this page, since it removes a lot of the fancy conditional stuff and is more suited for strict rules that block the actual connections from occurring.
One thing I noticed with EasyList is "if I was using PyFunceble to validate any domain it found in an ABP+ rule, it would be fine. If I was using this to process a list for use as a blocklist, I'd be screwed."
If you search the list of active hosts for google.com or github.com, you will see that it checks those domains because they were somewhere in a rule. If I was to throw this into a blocklist, it would not be great.
from pyfunceble.
funilrys : but we may think about the other way in the future or as an extra option.
Yes, it would be the most useful, especially for ad-block filter list maintainers, to get rid of all dead domains.
funilrys : For now, I'm implementing the second way
The second way is also not bad to begin with; it will still extract many domains. Basically I would agree with this: #13 (comment), but to summarize further, these are the bad domains which can be converted to hosts:
- All adblockers:
  ||domain.com
  ||domain.com^
  ||domain.com$third-party
  ||domain.com^$third-party
  the href ones: #13 (comment)
- uBO specific:
  ||domain.com^$3p (shortened version of the above: $3p = $third-party)
  ||domain.com^$all (all-in-1 combination of all options, excluding $important)
  ||domain.com^$document
  ||domain.com^$important (overrides whitelist filters)
  and various combinations:
  ||domain.com^$3p,important
  ||domain.com^$important,3p
  ||domain.com^$all,important
  ||domain.com^$important,all
  and variations without ^ as well
- AdGuard specific:
  I don't use AdGuard (still, it's a good adblocker; I've just stuck with uBO + ND for years)
Notice: variations without ^ are very rare; they're mostly typos, but they are still valid filters.
All other filters having anything in addition to the above should not be extracted, examples:
||twitter.com^helloworld.com from: #13 (comment)
||funilrys.github.io$script,image from: #13 (comment)
||domain.com^$domain=domain1.com|domain2.com
||domain.com^$third-party,image
As they don't block the whole domain (neither twitter.com nor funilrys.github.io nor domain1.com, as they can still be visited), which means I agree with ( #13 (comment) ) :
jawz101 : The reason I ask is I wouldn't want to block funilrys.github.io in this example because I wouldn't want to block the whole domain if an ABP rule was just trying to block certain bits of its content.
Of course ||domain.com^$3p / ||domain.com^$third-party can still be visited as well, but they are mostly just ad and tracking servers.
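One way to express the list above is as an allow-list of options: a `||domain` rule is converted only if every option it carries is one of the whole-domain options named in this comment. The Python sketch below illustrates that idea under assumptions; the names `WHOLE_DOMAIN_OPTIONS` and `convertible_domain` are invented, and the href rules are left out for brevity.

```python
import re

# Options which, per the list above, still mean "block the whole domain".
WHOLE_DOMAIN_OPTIONS = {"third-party", "3p", "all", "document", "important"}

# "||domain", optional "^", optional "$options", and nothing else.
RULE = re.compile(r"^\|\|([a-z0-9.-]+)\^?(?:\$(.*))?$", re.IGNORECASE)


def convertible_domain(rule: str):
    """Return the domain if the rule can be converted to a hosts entry, else None."""
    match = RULE.match(rule.strip())
    if not match:
        return None
    domain, options = match.group(1), match.group(2)
    if options is not None:
        names = {opt.strip().lower() for opt in options.split(",") if opt.strip()}
        # Any option outside the allow-list narrows the rule, so skip it.
        if not names <= WHOLE_DOMAIN_OPTIONS:
            return None
    return domain.lower()
```

With this check, `||domain.com^$important,3p` is converted, while `||domain.com^$third-party,image` and `||twitter.com^helloworld.com` are not.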
from pyfunceble.
Interesting @kulfoon,
Thanks for your feedback! I still chose to extract twitter.com, funilrys.github.io and domain.com because of the use case described in #42.
I still added your examples to the tests, and the current code passes them!
Thanks again for your feedback.
Stay safe and healthy!
from pyfunceble.
Thanks, you too. However, it's getting more and more confusing.
funilrys: #13 (comment)
Since you decided to implement the second way, I think you should stop at extracting only the domains which should be completely (or almost completely) blocked, as in my previous comment, which would also cover:
jawz101 : #13 (comment) : just depends on if the goal is to validate domains or also take them and then, say, make a pi-hole blocklist out of them.
By extracting anything more, like:
funilrys : I still chose to extract twitter.com, funilrys.github.io and domain.com because of the use case described in #42.
+ facebook.com##.search from #42 (comment)
you will not cover jawz101's case above, because it causes false positives, just as he said:
jawz101 : #13 (comment) : As is, it sounds like it would have a lot of false positives if I put the output into a blocklist file.
Also, you are going beyond what the second way should be, and you will end up with neither the first way nor the second way, but rather a strange mix of both. So why not implement the first way separately, by simply extracting all domains, to cover all extraordinary domains, instead of partially extracting extraordinary domains in the second way? What sense is there in extracting just a part of the extraordinary domains? Alternatively, you could put all of your extraordinary domains behind the --aggressive switch #42 (comment).
funilrys : I still added your examples to the tests and the current code passes it!
I appreciate it, but perhaps there is no need to add at least these:
PyFunceble/tests/test_converter_adblock.py Line 120 in 083a4cc
PyFunceble/tests/test_converter_adblock.py Line 122 in 083a4cc
because such (or at least similar) examples are already present in the tests:
PyFunceble/tests/test_converter_adblock.py Line 118 in 083a4cc
Greets.
from pyfunceble.
Some more failures:
| Test filter | Extraction result |
|---|---|
| ||site1.com | site1.com |
| ||site2.com^ | site2.com |
| ||site3.com$ | site3.com |
| ||site4.com/ | site4.com |
| ||site5.com* | site5.com* failure (artefact) |
| ||site6.com$third-party | site6.com |
| ||site7.com^$third-party | site7.com |
| ||site8.com^$3p | failure |
| ||site9.com^$all | site9.com |
| ||site10.com^$document | site10.com |
| ||site11.com^$important | failure |
| ||site12.com^$3p,important | site12.com |
| ||site13.com^$important,3p | failure |
| ||site14.com^$all,important | site14.com |
| ||site15.com^$important,all | site15.com |
| ||site16.com^$doc | failure |
| ||site17.com^$document | site17.com |
| ||site18.com^$domain=site19.com | site18.com, site19.com |
| ^adv^$domain=site20.com | failure |
| adv$domain=site21.com | failure |
| adv^$domain=site22.com | failure |
As for the last 3 failures, many such failures can be found in https://easylist-downloads.adblockplus.org/easylistpolish.txt
The list contains about 2961 domains, but only 2459 are found by Adblock Decoder (with the --aggressive option), which gives 83% efficiency.
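In the last three failing filters, the only domains present sit in the `$domain=` option. A minimal Python sketch of pulling them out is shown here, purely as an illustration: the function name is hypothetical, this is not the Adblock Decoder's code, and skipping `~`-negated entries is an assumption. The efficiency figure quoted above is recomputed at the end.

```python
def domains_from_domain_option(rule: str):
    """Return the domains listed in a rule's $domain= option (may be empty)."""
    _, sep, options = rule.rpartition("$")
    if not sep:
        # No "$" at all, so the rule carries no options.
        return []
    for option in options.split(","):
        if option.startswith("domain="):
            entries = option[len("domain="):].split("|")
            # Assumption: "~domain" entries are exclusions and are skipped.
            return [entry for entry in entries if entry and not entry.startswith("~")]
    return []


# Share of domains found, using the counts reported above.
found, total = 2459, 2961
efficiency = found / total
```

For `^300^$image,domain=meczenazywo.tv|meczlive.pl` this returns both listed domains. Such domains are the sites a filter applies on, so they are candidates for availability testing (dead-domain cleanup), not for a blocklist.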
failures
^banery-reklamowe^
^reklamy^
^sidebar_reklama^
^1000x390-$domain=tygodnik.szczytno.pl
^1200x1200_$domain=dziennikopolski.pl|dziennikwarszawy.pl|gazetawalbrzych.pl|gazetawielkopolska.pl|glosczestochowy.pl|gloskatowic.pl|gloskrakowa.pl|gloslodzi.pl|glosrybnika.pl|glosrzeszowa.pl|glostorunia.pl|glostrojmiasta.pl|gloswroclawia.pl
^1450x370-$domain=tygodnik.szczytno.pl
^2020bannery^$domain=wysokomazowiecki24.pl
^300^$domain=betglob.pl|betgol.pl|epilka.pl|estadios.pl|kuszotv.pl|mecze24.pl|meczelive.tv|oddslivesport.com|worldofbookmakers.com
^300^$image,domain=meczenazywo.tv|meczlive.pl
^468^$domain=betgol.pl|worldofbookmakers.com
^700x999-$domain=tygodnik.szczytno.pl
^728^$domain=betgol.pl|epilka.pl|extragoals.com|kuszotv.pl|mecze.com|mecze24.pl|meczelive.tv|oddslivesport.com
^728^$image,domain=meczenazywo.tv|meczlive.pl
^750^$image,domain=meczenazywo.tv|meczlive.pl
^750x200.$domain=tygodnik-rolniczy.pl
^750x200_$domain=geekweek.pl|polsatsport.pl
^787fe504fcc37d9fd342c3584614fb9d_$domain=zd24.pl
^a-posters^$domain=minskmaz.com
^AAAreklamy^$domain=zulawytv.pl
^ad^$domain=pieniny24.pl|sportowepodhale.pl
^admaster^$domain=suwalki24.pl
^ads^$domain=oswiecimskie24.pl|polski-tenis.pl
^adsrotator^$domain=korex.net.pl
^adv^$domain=alltube.tv|goryizerskie.pl|naszglospoznanski.pl|radio.kielce.pl|ringpolska.pl|sportsiedlce.pl|zyciegrajewa.pl
^adv_$domain=kcynia24.pl|koscian.net|naklo24.pl|powiat24.pl|sadki24.pl
^advertiseButtons^$domain=smanager.pl
^adwokat.gif?$domain=swarzedz24.pl
^as^$image,domain=xtech.pl
^ban^$domain=ang24.pl|dlagimnazjalisty.info|dlakierowcy.info|dlamaturzysty.info|dlaprzedszkolaka.info|dlaucznia.info|doctoralstudy.eu|edubaza.pl|gramatyka-angielska.info|kampusy.info|kierunki-studiow.info|kierunki-zamawiane.pl|kontastudenckie.pl|kursyjezykowe.biz|kursyletnie.pl|kursyonline.info|kursysemestralne.pl|kwalifikacjezawodowe.info|mba-studies.eu|organizacjestudenckie.pl|postgraduatestudy.eu|pracaikariera.pl|rektorzy.pl|studenckamarka.pl|studenckamoda.pl|studentka.pl|studentmapa.pl|studentnews.eu|studentnews.pl|studiadoktoranckie.info|studiainzynierskie.info|studialicencjackie.info|studiamagisterskie.info|studiamba.info|studianiestacjonarne.info|studiaonline.info|studiapodyplomowe.info|studiaweuropie.info|studies-in-english.pl|studies-in-europe.eu|studies-in-poland.pl|szkolyjezykowe.info|undergraduatestudy.eu|wameryce.info|weuropie.info|wincomparator.com
^banbron.$domain=oczamiostrego.pl
^baner%$domain=magazynregionalny.pl|podlasie24.pl|wdrawskupomorskim.pl
^baner-$domain=allebiznes.pl|automotivesuppliers.pl|bstok.pl|centrumdruku3d.pl|cksport.pl|cowkrakowie.pl|cowwilanowie.pl|crypto-trader.pl|echokamienia.pl|genetyczne.pl|gloskoninski.pl|gorydlaciebie.pl|hi-fi.com.pl|iloveslubice.pl|infokatowice.pl|infokonin.pl|infomalopolska.pl|jazdaprawna.pl|kk24.pl|limanowianin.in|luzyce.info|magazynvip.pl|materialybudowlane.info.pl|mazopolska.pl|miastokobiet.pl|mysnet.pl|niezaleznemediapodlasia.pl|ostrodanews.pl|oswiecim112.pl|oto-samochody.pl|polskacanada.com|portalkujawski.pl|powrotroberta.pl|przewodnicywedkarscy.pl|regionalna.pl|rynekinwestycji.pl|sochaczewianin.pl|starosadeckie.info|szczecinek.com|tarnobrzeskie.eu|tv28.pl|tvbraniewo24.pl|twojaslupca.pl|tygodnik-rolniczy.pl|tygodnikkrag.pl|tygodnikpiski.pl|tylkotorun.pl|tyna.info.pl|wiadomosci.com|wiecbork112.pl|wirtualnelegionowo.pl|wirtualnemazowsze24.pl|wirtualnynowydwor.pl|wolanie.info|zw.pl|zwrotnikraka.pl|zyciezamoscia.pl
^baner.$domain=magazynfakty.pl|ostrodanews.pl|przemyslkosmetyczny.pl|sokolka.tv
^baner2.$domain=edebno.pl
^baner^$domain=bielsko.info|bukowinatatrzanska.pl|chojnice.com|info.stargard.pl|ipon.pl|lubliniec.info|mojreprap.pl|nowiny.rybnik.pl|pfm.pl|progressforpoland.com|progressforpoland.org|pszczyna.tv|sennik-mistyczny.pl|sportslaski.pl|tarnowskiegory24.info|tygodnik-krapkowicki.info|zw.pl
^baner_$domain=7dni.pila.pl|asystentbhp.pl|bloog.pl|chcemybycrodzicami.pl|chemiaibiznes.com.pl|chorzowianin.pl|cksport.pl|czestochowskie24.pl|dzierzgon-twojemiasto.pl|e-pingpong.pl|extra.info.pl|forumfajerwerki.pl|fresh-market.pl|futbolfejs.pl|gazeta-mlawska.pl|gazetacz.com.pl|gazetagazeta.com|gielda-koni.com.pl|glogow-info.pl|grapplerinfo.pl|info.stargard.pl|infofordon.pl|infokatowice.pl|jaw.pl|jazdaprawna.pl|kurier-kolski.pl|kurierostrowski.pl|kurierzamojski.pl|legitymizm.org|lexus-forum.pl|magazynregionalny.pl|materialybudowlane.info.pl|meteoprog.pl|mojeniemcy.de|nowawrzesnia.pl|podgorze.pl|podlaskieagro.pl|portalkosmiczny.pl|przemyskie.info|przewodnicywedkarscy.pl|radiopiekary.pl|radomsport.pl|rdn.pl|regionalna.pl|sporteuro.pl|sweetwedding.pl|swiatoze.pl|swidnik.pl|tc.ciechanow.pl|thinkapple.pl|tvostrow.pl|tygodnik-krapkowicki.info|tygodnik-rolniczy.pl|tygodnikmakowski.pl|tygodnikprzasnyski.com.pl|urodaizdrowie.pl|vwgolf.pl|wdrawskupomorskim.pl|wiadomosci.rii.pl|wiatriwoda.pl|widzewtomy.net|wirtualnelegionowo.pl|wirtualnynowydwor.pl|zapytam.com|zkaszub.info|zpazurem.pl|zyciezamoscia.pl
^Baner_$domain=noweinfo.pl
^baner_$domain=sad24.pl|tc.ciechanow.pl|tygodnikmakowski.pl|tygodnikprzasnyski.com.pl
^banerek-$domain=my3miasto.pl
^banerek.$domain=futbolfejs.pl
^banerki^$domain=olkuski.pl
^banerkiIFRAME^$domain=info.elblag.pl
^banerkiJPG^$domain=info.elblag.pl
^banerm^$domain=co-slychac.pl
^banerpromocja.$domain=zapytam.com
^baners^$domain=konieimy.pl|nowaruda.info|podlasie24.pl|portalpszczelarski.pl|q4.pl|zyciekalisza.pl
^banery/8^$domain=sportowememy.pl
^banery2019^$domain=wysokomazowiecki24.pl
^banery^$domain=24tp.pl|alexjones.pl|archeton.pl|asta24.pl|belchatow.bai.pl|bialogardzianin.pl|bielsko.biala.pl|brodnica.net|bukowinatatrzanska.pl|cieszyninfo.pl|co-slychac.pl|dachy.info.pl|doba.pl|e-dobrydom.pl|e-konkursy.info|ebarlinek.pl|eglos.pl|emysliborz.pl|epszczyna.pl|fachowydekarz.pl|glowny-mechanik.pl|goryonline.com|haczyk.pl|haloursynow.pl|hotelinfo24.pl|igorzow.pl|igostyn.pl|ijarocin.pl|ikalisz.pl|ikamien.pl|ikrotoszyn.pl|infotydzien.info|iostrowwlkp.pl|ipleszew.pl|ipolska.info|ipyrzyce.pl|iswinoujscie.pl|jagiellonia.net|karpacz.net|kibice.net|kozienice24.pl|kuriergarwolinski.pl|kutno.net.pl|lipsko24.pl|lokalna24.pl|maritime.com.pl|mazury24.eu|mebleinfo.pl|miedzyrzec.info|moja-ostroleka.pl|mojaleczyca.pl|mragowo24.info|mymma.pl|nadmorze.pl|naszeopoczno.pl|nasztomaszow.pl|nowaruda.info|nowoczesnywarsztat.pl|opakowania.biz|ostrowmaz24.pl|paliwa.pl|piekarnie24.pl|pionki24.pl|pojezierze24.pl|polskicaravaning.pl|pomyslnadom.pl|portalwrc.pl|ppw.fishing|prudnicka.pl|przemyskie.info|pulskosmosu.pl|pultuszczak.pl|radomsport.pl|rynekpapierniczy.pl|skionline.pl|sluzby-ur.pl|sportgniezno.pl|staleo.pl|strzelecopolski.pl|swiatogloszen.net.pl|swiatopon.info|twojeradio.fm|tworzymyhistorie.pl|tygodnikpodhalanski.pl|tygodnikprudnicki.pl|tygodniksiedlecki.com|typersi.pl|wcf.org.pl|wiadomoscihandlowe.pl|wloclawek.info.pl|wpr24.pl|zpazurem.pl|zrobotyzowany.pl|zwolen24.pl|zywiecsupernowa.pl
^banery_$domain=instalacjebudowlane.pl|iszczecinek.pl|kuchenny.com.pl|maritime.com.pl|ogrodinfo.pl|portalplock.pl|sokolka.tv|tygodnik.pl|zegarkiipasja.pl|zyciesiedleckie.pl
^banery_foto^$domain=expresskaszubski.pl|gle24.pl|gostyn24.pl|gwe24.pl|ki24.info|pulsciechanowa.pl|roztocze.net|rzeczkrotoszynska.pl|terazlipno.pl|tusochaczew.pl
^banery_reklamowe^$domain=rtw.org.pl
^baneryhtml5^$domain=mazury24.eu
^baneryJPG^$domain=pomorskifutbol.pl
^banner%$domain=sanok112.pl
^banner-$domain=24firma.pl|420polska.pl|4pm.pl|businesswomanlife.pl|chinskiraport.pl|cksport.pl|czasnadmorze.pl|dzisiajwgliwicach.pl|gorydlaciebie.pl|juliarozumek.pl|ktoto.info|kurierzamojski.pl|meska-kuchnia.pl|morzaioceany.pl|nadarzyn.tv|nowaruda.info|pielegniarki.info.pl|polacywewloszech.com|przewodnicywedkarscy.pl|psychatog.pl|racingforum.pl|swiatoze.pl|transg.pl|transportchorego.pl|wirlandii.pl
^banner.$domain=archispace.pl|cksport.pl|diablo.phx.pl|porady.mobi|transportchorego.pl|tripybiznesekipy.pl
^banner1.$domain=infoskierniewice.pl
^banner^$domain=4lomza.pl|autoklub.pl|calisia.pl|cashless.pl|centrumdruku3d.pl|chords.pl|dlalejdis.pl|enduhub.com|eurobuildcee.com|exspace.pl|extremehobby.eu|flashscore.pl|geoforum.pl|gospodarz.pl|konin24.info|miastociechocinek.com|moje-dzialdowo.pl|movie-box.pl|opinieouczelniach.pl|palukimogilno.pl|polonika.at|polskieradio.com|probasket.pl|strazacki.pl|tawernaskipperow.pl|tygodnik-krapkowicki.info|tygodnikzamojski.pl|zawodykonne.com
^banner_$domain=7dni.pila.pl|angielskieespresso.pl|chemiaibiznes.com.pl|conamokotowie.pl|coreblog.pl|e-play.pl|exspace.pl|fcinter.pl|forumfajerwerki.pl|kotdoskonaly.pl|kucharze.pl|kurier-bielski.cnm.pl|legitymizm.org|lexus-forum.pl|meblarskapolska.pl|meczenazywo.pl|my3miasto.pl|podlaskieagro.pl|portalmedialny.pl|prestizgliwice.pl|probasket.pl|rdn.pl|sadnowoczesny.pl|slupca.pl|super-warez.eu|sweetwedding.pl|tygodnikpiski.pl|urodaizdrowie.pl|wrc.net.pl|zielonanews.pl
^bannerImages^$domain=40ton.net
^banners^$domain=4trucks.pl|agribiztvonline.com|alarmy.org|anonse-krosno.pl|augustowskireporter.pl|barlinek24.pl|beskidlive.pl|bieganie.pl|bitcoin-online.pl|bitcoin.pl|blacha.biz|budowa.org|budownictwo.org|bytomski.pl|casting4nick.pl|cech-lipno.pl|chlodnictwo.biz|cjo.edu.pl|conadrogach.pl|czasostrzeszowski.pl|cztery-lapy.pl|damagier.pl|dami.walbrzych.pl|daminfo.pl|darkplanet.pl|ddwlkp.pl|dj24.pl|domnaobcasach.com|dzieciaki-testuja.pl|dziennikmazowiecki.pl|e-kg.pl|e-konkursy.info|e-morag.pl|e-sadownictwo.pl|e-szamotuly.pl|ebelchatow.pl|edzieci.pl|elektryka.org|elka.pl|elpcmaniak.pl|em.kielce.pl|eoborniki.pl|erp-view.pl|estrzelce.pl|ezamosc.pl|fakty.nl|filme.pl|fit-online.pl|forumfajerwerki.pl|garsoniera.com.pl|gazetacz.com.pl|gazwoda.pl|gisplay.pl|glogow-info.pl|glogow.info.pl|glucholazyonline.com.pl|gostynin24.pl|gramwzielone.pl|independenttrader.pl|infolinia.com|infoprzasnysz.com|informacjelokalne.pl|ironmangdynia.pl|jakzapamietac.pl|jastrzebieonline.pl|jedzenie.info.pl|jemywlodzi.pl|juniorowo.pl|katowicedzis.pl|kierowcyhgv.uk|killuminati.pl|klimatyzacja.biz|kominy.biz|konecki24.pl|konin24.info|krasnik24.pl|kroplaarganu.com|krosnocity.pl|ksiegowosc.org|l24.lt|leonclub.pl|ligowiec.net|logistyka.net.pl|lowiczanin.info|lsi-lublin.pl|lubiehrubie.pl|luvpop.pl|magazyngitarzysta.pl|masarnieonline.pl|matematykazpasja.pl|materialybudowlane.info.pl|metale.org|mgsm.pl|miastokolobrzeg.pl|miastons.pl|miedziowe.pl|mleczarnieonline.pl|mleczarstwo.com|mojarodzina.org|moje-gniezno.pl|mojejaworzno.pl|moneyafterhours.blogspot.com|moszczenica.info|motofaktor.pl|motonews.pl|motoocena.pl|muzyczneradio.com.pl|narzedziownia.org|naszaostroda.pl|naziemna.info|nczas.com|niedziela.nl|norwegofil.pl|nowawrzesnia.pl|nowemiasto.com.pl|nysa.fm|nysainfo.pl|obiektyw.info|obiektywne.pl|odkrywamyzakryte.com|odpylanie.info|ofg.pl|ogrodnikleszek.pl|ogrzewanie.info.pl|okiemmaleny.pl|opakowania.biz|opoczno.info|otososnowiec.pl|partyzantka.com.pl|pcmusic.pl|petronews.pl|piekarnieonline.pl|piotrkowski24.pl|
pnt24.info|podlaskieagro.pl|polskaniepodlegla.pl|poradniki24h.pl|portal-hale.pl|portalmorski.pl|powiatsuski24.pl|pozeramstrony.pl|przeclaw24.pl|przeglad-ogrodniczy.pl|przegladkoninski.pl|przekrojfinansowy.pl|pzl24.info|qulturaslowa.pl|radioplus.com.pl|radioq.fm|radomiak.pl|ratownik-med.pl|receptynadom.pl|ringpolska.pl|ringpolska24.pl|rtw.org.pl|rzeszowairport.pl|sacro.com.pl|samoloty.pl|sanatoria.com.pl|siewie.tv|skarzyski.eu|slaskiesiemianowice.pl|slowosportowe.pl|slubice24.pl|sokolka.tv|stalowka.eu|stolarstwo.org|stonerchef.pl|styropian.biz|sucha24.pl|swiat-kamienia.pl|tanie-loty.com.pl|targowek.info|team29er.pl|telewizjapolska24.pl|telix.pl|terazjaslo.pl|tsk24.pl|tustolica.pl|tv28.pl|tvmazovia.pl|tvostrow.pl|twojejaslo.pl|twojezaglebie.pl|tworzywa.org|tygodnikdzialdowski.pl|wadowiceonline.pl|warsztatpodrozy.com|waszaturystyka.pl|wentylacja.biz|wiadomoscirudzkie.pl|wirtualnygarwolin.pl|wojsko.com.pl|wojskonews.pl|wolna-polska.pl|wroclawskiejedzenie.pl|wschodnik.pl|zegarkiclub.pl|zegarkiipasja.pl|zory24.pl|zppa.org|zycie.me|zywiecinfo.pl
^banners_$domain=lebork24.info
^bannery2020^$domain=przemyslkosmetyczny.pl
^bannery^$domain=e-grajewo.pl|em.kielce.pl|legionisci.com|najlepsze-porady.pl|nbi.com.pl|nocnygdansk.pl|obk.pl|ogrodinfo.pl|olsztyn24.com|pomorskifutbol.pl|supernowosci24.pl|transportchorego.pl|wagaciezka.com|walbrzych24.com|wiz.pl|wroclife.pl
^bannery_$domain=olsztyn24.com|osowa24.pl|walbrzych24.com
^BanneryElka^$domain=elka.pl
^Bez%25C2%25A0tytu%25C5%2582u.$domain=oczamiostrego.pl
^bgz_$domain=oczamiostrego.pl
^boxy_po_prawej^$domain=ang24.pl|dlagimnazjalisty.info|dlamaturzysty.info|dlaprzedszkolaka.info|dlaucznia.info|doctoralstudy.eu|edubaza.pl|gramatyka-angielska.info|kampusy.info|kierunki-studiow.info|kierunki-zamawiane.pl|kontastudenckie.pl|kursyjezykowe.biz|kursyletnie.pl|kursyonline.info|kursysemestralne.pl|kwalifikacjezawodowe.info|mba-studies.eu|organizacjestudenckie.pl|postgraduatestudy.eu|pracaikariera.pl|rektorzy.pl|studenckamarka.pl|studenckamoda.pl|studentka.pl|studentmapa.pl|studentnews.eu|studentnews.pl|studiadoktoranckie.info|studiainzynierskie.info|studialicencjackie.info|studiamagisterskie.info|studiamba.info|studianiestacjonarne.info|studiaonline.info|studiapodyplomowe.info|studiaweuropie.info|studies-in-english.pl|studies-in-europe.eu|studies-in-poland.pl|szkolyjezykowe.info|undergraduatestudy.eu|wameryce.info|weuropie.info
^campaigns^$domain=grodzisknews.pl|radomsko24.pl
^canvas^$image,domain=naszrybnik.com|naszwodzislaw.com
^cene0^$domain=baxu.pl
^com_reklamy^$domain=radiofest.pl
^commerto.$domain=oczamiostrego.pl
^Company-Ads^$domain=informatorpolonijny.se
^d77b170a03da3f5858bf6cfa299f37fa.$domain=otozawiercie.pl
^dekoran-fototapety-$image,domain=ckinfo.pl
^dronmedia.$domain=7dni.pila.pl
^edytom.$domain=legitymizm.org
^gfx^$domain=hrubieszow.info
^img_bannery^$domain=piszanin.pl
^inline__$domain=sejny.net
^inline__1_$domain=24jgora.pl|24wroclaw.pl|ciechanowinaczej.pl|egarwolin.pl|eprzasnysz.pl|gpr24.pl|ibielany.pl|iochota.pl|iotwock.info|nwloclawek.pl|ototorun.pl|tudeblin.pl|turyki.pl|zycie.pila.pl|zyciekalisza.pl
^inline_images/114^$domain=mylomza.pl
^kaczynski_click^$subdocument,domain=wylecz.to
^kampania_reklamowa^$domain=lubliniec.info
^kurkuma185.$domain=wikirose.pl
^logo-$domain=legitymizm.org
^Oklejanie-$domain=dziennikopolski.pl|dziennikwarszawy.pl|gazetawalbrzych.pl|gazetawielkopolska.pl|glosczestochowy.pl|gloskatowic.pl|gloskrakowa.pl|gloslodzi.pl|glosrybnika.pl|glosrzeszowa.pl|glostorunia.pl|glostrojmiasta.pl|gloswroclawia.pl
^partner^$domain=podajdalej.info.pl
^partnerzy^$domain=abilet.pl|nasionakonopi.pl|szkolasen.com|widzewtomy.net
^pasek_reklam^$domain=proszkow.eu
^PKPP-$domain=linia.com.pl
^polecamy^$domain=aktyw14.net|almanak.pl|mysl24.pl
^promocja-$domain=edebno.pl
^py-img^$domain=igrit.pl
^r_-$domain=sloworegionu.pl
^rek^$domain=barlinek24.pl|forummleczarskie.pl|glogow-info.pl|inwestycje-rzeszow.pl|ivrozbiorpolski.pl|lpg-forum.pl|nowy-sacz.pl|tujastrzebie.pl|tuwodzislaw.pl|tuzory.pl
^rekl^$domain=pgt.pl|prostozopolskiego.pl
^reklama-$domain=4programmers.net|bilgorajska.pl|ciekawe.org|debica.tv|fizjoterapeuty.pl|fleschmazowsza.com.pl|icelandnews.is|infofordon.pl|ipulawy.pl|miedzyzdrojskie.info|moja24.pl|naszwybir.pl|ok24.tv|oswiecim112.pl|polacywewloszech.com|sportsiedlce.pl|sweetwedding.pl|tvdg.pl|warka24.pl|wodnyrelax.pl
^Reklama-$domain=noworudzianin.pl
^reklama-$domain=ostrodanews.pl|ptvr.pl|starosadeckie.info|transg.pl|tv-pelplin.pl|tvciechanow.pl
^reklama0_$domain=24ikp.pl
^reklama_$domain=24ikp.pl|eswinoujscie.pl|greencanoe.pl|jaslo4u.pl|kazimierzdolny24.pl|legitymizm.org|midrasz.home.pl|miedzyzdrojskie.info|rdn.pl|schadzka.com|tarnowiak.pl|tg.net.pl|tylkotorun.pl|wrc.net.pl
^reklama_poziom.$domain=biblia.iq24.pl
^reklamowe^$domain=msfera.pl
^reklamy-$domain=ipulawy.pl
^reklamy^$domain=cashless.pl|e-legnickie.pl|expresskaszubski.pl|gniezno24.pl|ino-online.pl|ino.online|limanowa.in|mojaolesnica.pl|mysl24.pl|nasztomaszow.pl|tucholainfo.pl|twojejaslo.pl
^reklamy_$domain=cenyrolnicze.pl|galeriafirm.eu|gazetka.be|pilkanozna.pl|plonskwsieci.pl|tugazeta.pl
^reklamy_baner^$domain=wirtualnygarwolin.pl
^rekrek^$domain=moj-sen.net
^rkm^$domain=skiny.pl
^rotator^$domain=gorzowianin.com|wcipy.pl|zarzadca.pl
^ruchchorzowBAN.$domain=ruchchorzow.com.pl
^rusztowania123.$domain=bmw-klub.pl
^sam-images^$domain=nowinyzabrzanskie.pl
^show-baners.php?$script,domain=24gliwice.pl
^sponsors^$domain=wiruspc.pl
^statystyk_box.$domain=bajery.pl
^STRONKA-1.jpg?$domain=swarzedz24.pl
^Szkolenia-BHP-$domain=dziennikopolski.pl|dziennikwarszawy.pl|gazetawalbrzych.pl|gazetawielkopolska.pl|glosczestochowy.pl|gloskatowic.pl|gloskrakowa.pl|gloslodzi.pl|glosrybnika.pl|glosrzeszowa.pl|glostorunia.pl|glostrojmiasta.pl|gloswroclawia.pl
^szumilo-$domain=radiobonton.pl
^Termotopw_$domain=radiobonton.pl
^therapy.$domain=tvklodzka.pl
^top%2B10%2Biherb.$domain=wikirose.pl
^Trevda/132.$domain=polska-zbrojna.pl
^view.php?key=$image,domain=gpcodziennie.pl
^widgets^$domain=karpacz24.pl
^wordwwwpunkty.$domain=7dni.pila.pl
^wpjslib-sgap.$script,domain=poczta.wp.pl
^banners^$domain=automotivesuppliers.pl
^banners^$domain=nowypm.pl
^baner-$domain=portalwolow.pl
^inline__1_$domain=nowagazeta.pl
^baner-$domain=naszsrem.pl
^banerkiJPG^$domain=zulawy.com
^baner-$domain=linia.com.pl
^homepage-banners^$domain=otomoto.pl
^reklama2-$domain=wirtualnyregion.pl
^300^$image,domain=emecze.pl
^728^$image,domain=emecze.pl
^baner_$domain=nanarty.info
^banner-$domain=nanarty.info
^baner-$domain=naszosie.pl
^inline__1_$domain=sucha24.pl
^cuprum-$domain=retailnet.pl
^grafika2^$domain=tsk24.pl
^bans^$domain=plastech.pl
^areklamy^$domain=e-lokalne.pl
^banners^$domain=polskatradycja.pl
^Betx^$domain=sportowebeskidy.pl
^banery^$domain=trentino.pl
^banery^$domain=reszel.pl
^artykul%20kierunek^$domain=otouczelnie.pl
^banner_$domain=tu.swinoujscie.pl
^materialy_budowlane_kleje_$domain=plytkiceramiczne.info.pl
^emisje^$domain=stalowemiasto.pl
^reklama-$domain=mojgdow.pl
funilrys : #13 (comment) : I still chose to extract twitter.com, funilrys.github.io and domain.com because of the use case described in #42 (comment).
funilrys : #13 (comment)
Now we have to decide between 2 way:
- We extract and test all possible domains.
or - We extract all domains which are (or may be) relevant.
- Could you clarify both ways more specifically? As far as I can see, almost all domains are relevant from the point of view of an adblock filter list, as described in #42 (comment), so almost all domains should be extracted from all filters: almost any filter tied to a dead domain should be removed from an adblocker list (or, if the domain is attached to domain=, that domain should be removed from the option). I mean that the case mentioned in #42 (comment) applies to almost all domains and filters.
- So the best approach would be to limit the 1st way of extraction to domains related to / useful for hosts files only, while the second way should extract all domains from all filters that are related to / useful for cleaning adblocker lists. Currently I see a weird mix of both...
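To make the domain= case above concrete, here is a minimal Python sketch. It is not PyFunceble's decoder: the function names are hypothetical, the "dead" set is made up for illustration, and the parsing is simplified (it assumes the options follow the last $ and that option values contain no commas).

```python
from typing import List, Optional, Set


def domains_from_option(rule: str) -> List[str]:
    """Return the domains listed in a rule's domain= option, in order."""
    # Assumption: options follow the last "$" of a network filter.
    _, sep, options = rule.rpartition("$")
    if not sep:
        return []
    for option in options.split(","):
        if option.startswith("domain="):
            # Domains are separated by "|"; a leading "~" marks an exception.
            values = option[len("domain="):].split("|")
            return [value.lstrip("~") for value in values if value]
    return []


def drop_dead_domains(rule: str, dead: Set[str]) -> Optional[str]:
    """Remove dead domains from domain=; return None if none is left."""
    head, sep, options = rule.rpartition("$")
    if not sep:
        return rule  # no options at all, nothing to prune
    kept = []
    for option in options.split(","):
        if option.startswith("domain="):
            values = option[len("domain="):].split("|")
            alive = [v for v in values if v and v.lstrip("~") not in dead]
            if not alive:
                return None  # every listed domain is dead: drop the filter
            option = "domain=" + "|".join(alive)
        kept.append(option)
    return head + "$" + ",".join(kept)


if __name__ == "__main__":
    print(domains_from_option("^canvas^$image,domain=naszrybnik.com|naszwodzislaw.com"))
    # "dead.example" is a placeholder, not a statement about any real domain.
    print(drop_dead_domains("^ads^$domain=alive.example|dead.example", {"dead.example"}))
```

The second function covers both outcomes mentioned above: a dead domain is removed from the option, and the filter as a whole is dropped only when no listed domain survives.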
from pyfunceble.
I don't have time yet, @keczuppp, but let me reopen this so that I can answer you when I get a bit of time.
from pyfunceble.
For others who stumble on this thread and wonder how to solve this, there is a usage example here: #219
from pyfunceble.
I've edited my comment and added more failures.
from pyfunceble.
Hmm what did you change @keczuppp ??
from pyfunceble.
I didn't delete the history of changes, doesn't it work for you?
Anyway, I added the last 4 rows of the table and the description + spoiler.
from pyfunceble.
OT response to keczuppp
I didn't delete the history of changes, doesn't it work for you?
Yes, it does; however, it is not obvious what was changed 😉 GH could do this better, or just do as they always have done: steal others' ideas 😏
As an example from https://www.mypdns.org/T2368
As you can see, it highlights the changes (red for deleted)
Nonetheless, thanks for your reply
from pyfunceble.
OT response to spirillen
I don't see the problem: GH does highlight the changes, just like in your screenshot, in green:
from pyfunceble.
Mhh, it's true. It was confusing to introduce the --aggressive argument. But at the time it was first introduced, I wasn't even sure that the thing I engineered would actually be used.
My objective was to be as accurate as possible and to reduce false positives when people feed the output of PyFunceble directly into their workflow ...
The AdBlock decoder itself is self-engineered, so the more input I get, the better it will be. I have literally no way to imagine every case.
I'm currently working on v4.0.0bN and, in my opinion and after analysis, there is no real reason to split everything anymore. This tool should decode as much as possible.
Therefore, I'm willing to change the direction: what about PyFunceble trying to decode as much as possible, if not everything?
Inputs from users are highly welcome because I'm not actively writing blocklists.
@keczuppp thank you for your table which I will use for the tests.
Is this new direction fair enough (for everyone)?
cc @kulfoon @spirillen @dnmTX @
from pyfunceble.
Moved my answer to #227, as it's OT to OP's post, and I hope there will be more activity in replies to this topic: "Therefore, I'm willing to change the direction"
from pyfunceble.
(My reply is also a reply to #227 (comment) at the same time.)
- Yeah, creating a good-quality Adblock Decoder is not an easy task. It's tricky, and it's a good programming challenge to extract the right domains while avoiding false positives. If you don't feel up to it, or you think it's too complicated and not worth continuing the development, you can replace it with an easier and simpler "decode everything" Adblock Decoder mode. But that is not a magic potion for the Adblock Decoder problem: in the end such a mode will not be better, because it will produce too many useless false positives that clutter the output list. There seems to be no easy way to go (no "magic bullet")...
- The default "HOSTS" mode should stay, because it's limited to parsing only a few kinds of HOSTS-compatible filters, and that should not cause any problems. It's easy to implement and very useful for hosts users as a great supplement to their HOSTS lists. If anything causes problems in this mode, just limit extraction to what is listed in #13 (comment). But it seems you extract way too much in this mode on your own, and that might cause trouble...
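As a rough illustration of what a strict "HOSTS" mode could accept, here is a small Python sketch. It assumes that "HOSTS compatible" means only plain ||domain^ rules with no path and no options, following the ||...^ blocking form described earlier in the thread; the exact list from #13 (comment) is not reproduced here, and the function name is hypothetical.

```python
import re
from typing import Optional

# Assumption: a hosts-compatible rule is exactly "||" + hostname + "^",
# with nothing after the "^" (no "$options", no path, no cosmetic part).
_HOSTS_RULE = re.compile(
    r"^\|\|([a-z0-9](?:[a-z0-9.-]*[a-z0-9])?)\^$", re.IGNORECASE
)


def hosts_domain(rule: str) -> Optional[str]:
    """Return the domain of a plain ||domain^ rule, or None otherwise."""
    match = _HOSTS_RULE.match(rule.strip())
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    samples = [
        "||example.com^",
        "||google.com$script,image",
        "yahoo.com###sticky-lrec2-footer",
        "@@||cnn.com/*ad.xml",
        "^banery^$domain=reszel.pl",
    ]
    for sample in samples:
        print(sample, "->", hosts_domain(sample))
```

With this definition only the first sample yields a domain; element-hiding rules, exception rules and domain= options are ignored, which keeps legitimate sites out of a hosts-style output.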
from pyfunceble.
Please take my commit and the underlying tests as the response. Is it still too much @keczuppp ?
Let's discuss the future of that specific decoder. I'll inject any future report about missing decoding into the tests. So the more reports, the better that decoder will be 😄
As I wrote, I'm not one of those who write a filter list... So help or directions are welcome!
from pyfunceble.
Hello, I was already trying to test the new version of Adblock Decoder (4.0.0b35) but:
- the standalone Adblock Decoder https://github.com/PyFunceble/adblock-decoder hasn't been updated (I was using it for testing before)
- when I tried to run it from PyFunceble I was unable to, because I received some errors in Python (3.7.6):
error
Finished processing dependencies for PyFunceble-dev==4.0.0b35
D:\download_big_temp\_koding\PyFunceble-dev>pyfunceble
Traceback (most recent call last):
File "D:\download_big_temp\_koding\Python37\Scripts\pyfunceble-script.py", line 33, in <module>
sys.exit(load_entry_point('PyFunceble-dev==4.0.0b35', 'console_scripts', 'pyfunceble')())
File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\cli\entry_points\pyfunceble\cli.py", line 1022, in tool
File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\config\loader.py", line 370, in start
File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\config\loader.py", line 331, in get_config_file_content
File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\helpers\dict.py", line 290, in from_yaml_file
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\user\\AppData\\Local\\Temp\\tmp_oox9mr2'
- also, even if you somehow help me fix the Python errors, I still don't know how to use it from PyFunceble without doing any DNS queries etc. (that is, how to use it just like the standalone decoder: put in an input file and get an output file). I want to feed it a filter list to test the decoding, but I'm not interested in spending hours waiting for it to finish DNS queries, which are useless garbage when testing the Adblock Decoder. I was unable to find any info in the documentation about skipping DNS queries in Adblock Decoder mode.
from pyfunceble.
@keczuppp Thanks for the notice. I'll update the AdBlock decoder project as soon as possible.
The simple way is to use the pyfunceble --syntax --adblock --aggressive -f [file] arguments. 😄
Note to self: Cleanup documentation.
from pyfunceble.
So I've tried the newest version, v4.0.0b36, and:
- PyFunceble finally worked; it no longer throws the error I mentioned in my previous comment (FileNotFoundError: [Errno 2] No such file or directory:), so at least I was able to run the whole PyFunceble
- however, when I ran the Adblock Decoder with the command you provided in your comment, on a few lists (EasyList and the two main Polish ad filter lists), it breaks and throws other errors:
  - with https://easylist.to/easylist/easylist.txt - see Errors 1 log spoiler - it analysed nothing, crashed at the beginning
  - with https://easylist-downloads.adblockplus.org/easylistpolish.txt - see Errors 2 log spoiler - it analysed some domains and then crashed
  - with https://raw.githubusercontent.com/MajkiIT/polish-ads-filter/master/polish-adblock-filters/adblock_ublock.txt - see Errors 3 log spoiler - it analysed nothing, crashed at the beginning
Errors 1 log (EasyList)
D:\download_big_temp\_koding>pyfunceble --syntax --adblock --aggressive -f easylist.txt
######## ## ## ######## ## ## ## ## ###### ######## ######## ## ########
## ## ## ## ## ## ## ### ## ## ## ## ## ## ## ##
## ## #### ## ## ## #### ## ## ## ## ## ## ##
######## ## ###### ## ## ## ## ## ## ###### ######## ## ######
## ## ## ## ## ## #### ## ## ## ## ## ##
## ## ## ## ## ## ### ## ## ## ## ## ## ##
## ## ## ####### ## ## ###### ######## ######## ######## ########
You are using the Beta version of PyFunceble 4.0.0!
Please take the time to communicate with us when you notice
something unusual.
Fatal Error: 'bool' object has no attribute 'replace'
Traceback (most recent call last):
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 864, in start
self.fill_to_test_queue_from_protocol()
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 593, in fill_to_test_queue_from_protocol
handle_file(protocol)
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 533, in handle_file
cidr2subject=self.cidr2subject,
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\utils\testing.py", line 228, in get_subjects_from_line
.set_data_to_convert(line)
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 429, in get_converted
result.update(self._decode_v5(self.data_to_convert))
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 382, in _decode_v5
result.update(self._decode_options(options.split(",")))
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 211, in _decode_options
result.add(self.extract_base(matched))
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 156, in extract_base
subject = subject.replace("*", "").replace("~", "")
AttributeError: 'bool' object has no attribute 'replace'
Process pyfunceble_tester_worker_2:
Process pyfunceble_tester_worker_1:
Traceback (most recent call last):
File "d:\download_big_temp\_koding\python37\lib\multiprocessing\connection.py", line 312, in _recv_bytes
nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] Potok został zakończony
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:\download_big_temp\_koding\python37\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
self.run()
D:\download_big_temp\_koding>
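For readers following the traceback above: the failing statement is a str.replace() chain applied to a value that turned out to be a bool. The Python sketch below reproduces that error class in isolation and shows one possible guard. It is only an illustration under the assumption that some earlier step handed over a boolean instead of a string; the function names are hypothetical and this is not PyFunceble's code or its actual fix.

```python
from typing import Optional


def strip_markers_unsafe(subject):
    # Same statement as the failing line in the traceback; it raises
    # AttributeError when subject is not a string (for example False).
    return subject.replace("*", "").replace("~", "")


def strip_markers_guarded(subject: object) -> Optional[str]:
    # Hypothetical guard: skip anything that is not a string.
    if not isinstance(subject, str):
        return None
    cleaned = subject.replace("*", "").replace("~", "")
    return cleaned or None  # treat an empty result as "nothing extracted"


def describe_failure(value: object) -> str:
    """Return the error message raised by the unsafe variant, if any."""
    try:
        strip_markers_unsafe(value)
    except AttributeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(describe_failure(False))
    print(strip_markers_guarded(False))
    print(strip_markers_guarded("~example.com"))
```

A guard like this would only hide the symptom; finding which filter line produces the non-string value would still be needed to decode it properly.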
Errors 2 log (EasyList Polish)
D:\download_big_temp\_koding>pyfunceble --syntax --adblock --aggressive -f easylistpolish.txt
######## ## ## ######## ## ## ## ## ###### ######## ######## ## ########
## ## ## ## ## ## ## ### ## ## ## ## ## ## ## ##
## ## #### ## ## ## #### ## ## ## ## ## ## ##
######## ## ###### ## ## ## ## ## ## ###### ######## ## ######
## ## ## ## ## ## #### ## ## ## ## ## ##
## ## ## ## ## ## ### ## ## ## ## ## ## ##
## ## ## ####### ## ## ###### ######## ######## ######## ########
You are using the Beta version of PyFunceble 4.0.0!
Please take the time to communicate with us when you notice
something unusual.
Subject
Status Source
----------------------------------------------------------------------------------------------------
----------- ----------
←[30m←[42mczasdzieci.pl
VALID SYNTAX
←[30m←[42mwlbetathome.adsrv.eacdn.com
VALID SYNTAX
←[30m←[42mmbank.pl
VALID SYNTAX
←[30m←[42mrtb.4finance.com
VALID SYNTAX
←[30m←[42mapp.freshmail.com
VALID SYNTAX
←[30m←[42mmandarinodesign.eu
VALID SYNTAX
←[30m←[42mad-work.pl
VALID SYNTAX
←[30m←[42mwirtualnyregion.pl
VALID SYNTAX
←[30m←[42madsearch.pl
VALID SYNTAX
←[30m←[42maffiliates-solutions.com
VALID SYNTAX
←[30m←[42madsnet.pl
VALID SYNTAX
←[30m←[42mbanmax.com
VALID SYNTAX
←[30m←[42mconverti.se
VALID SYNTAX
←[30m←[42mconvertiser.com
VALID SYNTAX
←[30m←[42mhub.com.pl
VALID SYNTAX
←[30m←[42mincontext.pl
VALID SYNTAX
←[30m←[42mleadstar.pl
VALID SYNTAX
←[30m←[42mmedia.stsaff.pl
VALID SYNTAX
←[30m←[42mnetsalesmedia.pl
VALID SYNTAX
←[30m←[42mpocketads.pl
VALID SYNTAX
←[30m←[42mreklamawadowice24.pl
VALID SYNTAX
←[30m←[42mrewords.pl
VALID SYNTAX
←[30m←[42mrenormaliseras.xyz
VALID SYNTAX
←[30m←[42mshopeneo.network
VALID SYNTAX
←[30m←[42mspacead.pl
VALID SYNTAX
←[30m←[42mtvtoss.com
VALID SYNTAX
←[30m←[42mwaytogrow.pl
VALID SYNTAX
←[30m←[42mad.admitad.com
VALID SYNTAX
←[30m←[42mad.e-lider.pl
VALID SYNTAX
←[30m←[42mad.eko-7.com.pl
VALID SYNTAX
←[30m←[42maff.bstatic.com
VALID SYNTAX
←[30m←[42mbusinessclick.biz.pl
VALID SYNTAX
←[30m←[42mavanti.fashion
VALID SYNTAX
←[30m←[42mcdn.leadbit.com
VALID SYNTAX
←[30m←[42mcomperialead.pl
VALID SYNTAX
←[30m←[42mcontexthub.net
VALID SYNTAX
←[30m←[42mczasdzieci.home.pl
VALID SYNTAX
←[30m←[42mec.hub2.com.pl
VALID SYNTAX
←[30m←[42meuphoniserent.xyz
VALID SYNTAX
←[30m←[42mlnaff.pl
VALID SYNTAX
←[30m←[42moffersprovider.widget.onet.pl
VALID SYNTAX
←[30m←[42mpartnerzyapi.ceneo.pl
VALID SYNTAX
←[30m←[42mppwidget.skapiec.pl
VALID SYNTAX
←[30m←[42mqwerty1.co.pl
VALID SYNTAX
←[30m←[42mr.pless.nazwa.pl
VALID SYNTAX
←[30m←[42mreklamy.hostings.pl
VALID SYNTAX
←[30m←[42msmartclick.pl
VALID SYNTAX
←[30m←[42msolutions4ad.com
VALID SYNTAX
←[30m←[42mstatic.travelist.pl
VALID SYNTAX
←[30m←[42msystem.mondeos.pl
VALID SYNTAX
←[30m←[42mthc-thc.com
VALID SYNTAX
←[30m←[42mtmefekt.pl
VALID SYNTAX
←[30m←[42mwydawca.lead.network
VALID SYNTAX
←[30m←[42mpopups.afftrack001.com
VALID SYNTAX
←[30m←[42m24gliwice.pl
VALID SYNTAX
←[30m←[42m24opole.pl
VALID SYNTAX
←[30m←[42mfilmweb.com
VALID SYNTAX
←[30m←[42m40ton.net
VALID SYNTAX
←[30m←[42m7dni.com.pl
VALID SYNTAX
←[30m←[42mad.polskiprzemysl.com.pl
VALID SYNTAX
←[30m←[42mad.prv.pl
VALID SYNTAX
←[30m←[42m300polityka.pl
VALID SYNTAX
←[30m←[42madform.net
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maferyprawa.eu
VALID SYNTAX
←[30m←[42maktyw14.net
VALID SYNTAX
←[30m←[42maktyw14.net
VALID SYNTAX
←[30m←[42maktyw14.net
VALID SYNTAX
←[30m←[42malicdn.com
VALID SYNTAX
←[30m←[42mtelchina.pl
VALID SYNTAX
←[30m←[42mallebiznes.pl
VALID SYNTAX
←[30m←[42mwlodawa.net
VALID SYNTAX
←[30m←[42mandroidpolska.pl
VALID SYNTAX
←[30m←[42mangielskieespresso.pl
VALID SYNTAX
←[30m←[42mbelekaj.eu
VALID SYNTAX
←[30m←[42mapp.travellead.pl
VALID SYNTAX
←[30m←[42marcheton.pl
VALID SYNTAX
←[30m←[42marpass.nazwa.pl
VALID SYNTAX
←[30m←[42matthost.pl
VALID SYNTAX
←[30m←[42mwarownie.pl
VALID SYNTAX
←[30m←[42maudiostereo.pl
VALID SYNTAX
←[30m←[42mautoline.com.pl
VALID SYNTAX
←[30m←[42mautomotivesuppliers.pl
VALID SYNTAX
←[30m←[42mautomotivesuppliers.pl
VALID SYNTAX
←[30m←[42mautorak.com.pl
VALID SYNTAX
←[30m←[42mb24tv.pl
VALID SYNTAX
←[30m←[42mnadwisla24.pl
VALID SYNTAX
←[30m←[42mb24tv.pl
VALID SYNTAX
←[30m←[42mbaby-shower.pl
VALID SYNTAX
←[30m←[42mbankier.pl
VALID SYNTAX
←[30m←[42mziemiakepinska.pl
VALID SYNTAX
←[30m←[42mbatuu.pl
VALID SYNTAX
←[30m←[42mwizaz.pl
VALID SYNTAX
←[30m←[42mbaxu.pl
VALID SYNTAX
←[30m←[42mbeerpubs.pl
VALID SYNTAX
←[30m←[42mbetonline.net.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbezale.pl
VALID SYNTAX
←[30m←[42mbielskiedrogi.pl
VALID SYNTAX
←[30m←[42mbielskiedrogi.pl
VALID SYNTAX
←[30m←[42mbielskiedrogi.pl
VALID SYNTAX
←[30m←[42mbiotechnologia.pl
VALID SYNTAX
←[30m←[42mbitcoin-online.pl
VALID SYNTAX
←[30m←[42mbithub.pl
VALID SYNTAX
←[30m←[42mblendy.pl
VALID SYNTAX
←[30m←[42mblogomaniak.pl
VALID SYNTAX
←[30m←[42mblogprezesa.pl
VALID SYNTAX
←[30m←[42mblogspot.com
VALID SYNTAX
←[30m←[42mmistrzbranzy.pl
VALID SYNTAX
←[30m←[42mbobrowniki.tv
VALID SYNTAX
←[30m←[42mbobrowniki.tv
VALID SYNTAX
←[30m←[42mbokser.org
VALID SYNTAX
←[30m←[42mbolec.info
VALID SYNTAX
←[30m←[42mbooking.com
VALID SYNTAX
←[30m←[42mkazimierzdolny24.pl
VALID SYNTAX
←[30m←[42mbronradom.pl
VALID SYNTAX
←[30m←[42mbronradom.pl
VALID SYNTAX
←[30m←[42mbronradom.pl
VALID SYNTAX
←[30m←[42mbronradom.pl
VALID SYNTAX
←[30m←[42mbronradom.pl
VALID SYNTAX
←[30m←[42mbstok.pl
VALID SYNTAX
←[30m←[42mburdadigital.pl
VALID SYNTAX
←[30m←[42mfocus.pl
VALID SYNTAX
←[30m←[42mbusiarze.com.pl
VALID SYNTAX
←[30m←[42mbusiarze.com.pl
VALID SYNTAX
←[30m←[42mbytomski.pl
VALID SYNTAX
←[30m←[42mc.spolecznosci.net
VALID SYNTAX
←[30m←[42mdlastudenta.pl
VALID SYNTAX
←[30m←[42mcba.pl
VALID SYNTAX
←[30m←[42mcdn-lubimyczytac.pl
VALID SYNTAX
←[30m←[42mlubimyczytac.pl
VALID SYNTAX
←[30m←[42mcdn.dcsaas.net
VALID SYNTAX
←[30m←[42msklepbazant.pl
VALID SYNTAX
←[30m←[42mswiatmodeli.eu
VALID SYNTAX
←[30m←[42marchigame.pl
VALID SYNTAX
←[30m←[42mszczecinek.com
VALID SYNTAX
←[30m←[42mceneo.pl
VALID SYNTAX
←[30m←[42mexerim.pl
VALID SYNTAX
←[30m←[42mkomputery-pc.info
VALID SYNTAX
←[30m←[42mpmi24.info
VALID SYNTAX
←[30m←[42mpch24.info
VALID SYNTAX
←[30m←[42mnowinylokalne.pl
VALID SYNTAX
←[30m←[42mpgo24.pl
VALID SYNTAX
←[30m←[42mppw.fishing
VALID SYNTAX
←[30m←[42mtromil.pl
VALID SYNTAX
←[30m←[42mgrojec24.net
VALID SYNTAX
←[30m←[42mcentrumkultury.eu
VALID SYNTAX
←[30m←[42mcentrumkultury.eu
VALID SYNTAX
←[30m←[42mceny-zlomu.pl
VALID SYNTAX
←[30m←[42mchemiabudowlana.info
VALID SYNTAX
←[30m←[42mceny-zlomu.pl
VALID SYNTAX
←[30m←[42mchemiaibiznes.com.pl
VALID SYNTAX
←[30m←[42mchlodnictwoiklimatyzacja.pl
VALID SYNTAX
←[30m←[42mciechanowinaczej.pl
VALID SYNTAX
←[30m←[42mciechanowinaczej.pl
VALID SYNTAX
←[30m←[42mcmas.pl
VALID SYNTAX
←[30m←[42mcn-tryton.pl
VALID SYNTAX
←[30m←[42mcodziennikmlawski.pl
VALID SYNTAX
←[30m←[42mcodziennikmlawski.pl
VALID SYNTAX
←[30m←[42mcodziennikmlawski.pl
VALID SYNTAX
←[30m←[42mcontentstream.pl
VALID SYNTAX
←[30m←[42mshareinfo.pl
VALID SYNTAX
←[30m←[42mcowwilanowie.pl
VALID SYNTAX
←[30m←[42mczestochowskie24.pl
VALID SYNTAX
←[30m←[42mcyfrowaekonomia.pl
VALID SYNTAX
←[30m←[42mczestochowskie24.pl
VALID SYNTAX
←[30m←[42mdentoforum.pl
VALID SYNTAX
←[30m←[42mdi.com.pl
VALID SYNTAX
←[30m←[42mdi.com.pl
VALID SYNTAX
←[30m←[42mdirect.money.pl
VALID SYNTAX
←[30m←[42mdobrewiadomosci.eu
VALID SYNTAX
←[30m←[42mdodajauto.pl
VALID SYNTAX
←[30m←[42mdodajauto.pl
VALID SYNTAX
←[30m←[42mdogosfera.pl
VALID SYNTAX
←[30m←[42mdogosfera.pl
VALID SYNTAX
←[30m←[42mdomenergo.com
VALID SYNTAX
←[30m←[42mdopilar.pl
VALID SYNTAX
←[30m←[42mdopilar.pl
VALID SYNTAX
←[30m←[42mdrzewkozabutelke.pl
VALID SYNTAX
←[30m←[42mdx-team.org
VALID SYNTAX
←[30m←[42mdynacrems.wp.pl
VALID SYNTAX
←[30m←[42mdz-ow.pl
VALID SYNTAX
←[30m←[42mdziennikpolski24.pl
VALID SYNTAX
←[30m←[42mdziennikzwiazkowy.com
VALID SYNTAX
←[30m←[42mdzierzgon-twojemiasto.pl
VALID SYNTAX
←[30m←[42mdzisiajwgliwicach.pl
VALID SYNTAX
←[30m←[42me-hotelarz.pl
VALID SYNTAX
←[30m←[42me-hotelarz.pl
VALID SYNTAX
←[30m←[42me-kg.pl
VALID SYNTAX
←[30m←[42me-kolo.pl
VALID SYNTAX
←[30m←[42me-kolo.pl
VALID SYNTAX
←[30m←[42me-petrol.pl
VALID SYNTAX
←[30m←[42me-pingpong.pl
VALID SYNTAX
←[30m←[42me-pingpong.pl
VALID SYNTAX
←[30m←[42me-play.eu
VALID SYNTAX
←[30m←[42me-pingpong.pl
VALID SYNTAX
←[30m←[42me-play.eu
VALID SYNTAX
←[30m←[42me-stargard.pl
VALID SYNTAX
←[30m←[42mebarlinek.pl
VALID SYNTAX
←[30m←[42mebookpoint.pl
VALID SYNTAX
←[30m←[42mswiatczytnikow.pl
VALID SYNTAX
←[30m←[42mautofanatyk.pl
VALID SYNTAX
←[30m←[42mebroker.pl
VALID SYNTAX
←[30m←[42micyfrowypolsat.pl
VALID SYNTAX
←[30m←[42mmojeanonse.pl
VALID SYNTAX
←[30m←[42mec.bankier.pl
VALID SYNTAX
←[30m←[42mmavelo.pl
VALID SYNTAX
←[30m←[42mmiedziak.info.pl
VALID SYNTAX
←[30m←[42mtutajglogow.pl
VALID SYNTAX
←[30m←[42mtutajlegnica.pl
VALID SYNTAX
←[30m←[42mtutajpolkowice.pl
VALID SYNTAX
←[30m←[42mec.bankier.pl
VALID SYNTAX
←[30m←[42micyfrowypolsat.pl
VALID SYNTAX
←[30m←[42mechogorzowa.pl
VALID SYNTAX
←[30m←[46mechogorzowa.pl^
INVALID SYNTAX
←[30m←[42medunews.pl
VALID SYNTAX
←[30m←[42meduson.pl
VALID SYNTAX
←[30m←[42mefilmy.tv
VALID SYNTAX
Fatal Error: 'bool' object has no attribute 'replace'
←[30m←[46meduson.pl^
INVALID SYNTAX
Traceback (most recent call last):
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 864, in start
self.fill_to_test_queue_from_protocol()
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 593, in fill_to_test_queue_from_protocol
handle_file(protocol)
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 533, in handle_file
cidr2subject=self.cidr2subject,
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\utils\testing.py", line 228, in get_subjects_from_line
.set_data_to_convert(line)
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 429, in get_converted
result.update(self._decode_v5(self.data_to_convert))
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 382, in _decode_v5
result.update(self._decode_options(options.split(",")))
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 211, in _decode_options
result.add(self.extract_base(matched))
File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 156, in extract_base
subject = subject.replace("*", "").replace("~", "")
AttributeError: 'bool' object has no attribute 'replace'
ekologia.guru
VALID SYNTAX
egorzow.pl
VALID SYNTAX
ekologia.guru
VALID SYNTAX
ekologia.pl
VALID SYNTAX
ekorodzice.pl
VALID SYNTAX
ekstrastats.pl
VALID SYNTAX
Process pyfunceble_tester_worker_2:
Traceback (most recent call last):
Process pyfunceble_producer_worker_1:
  File "d:\download_big_temp\_koding\python37\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
Process pyfunceble_tester_worker_1:
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\processes\workers\base.py", line 434, in run
raise exception
Traceback (most recent call last):
Errors 3 log (Official Polish Filters for AdBlock, uBlock Origin & AdGuard)
D:\download_big_temp\_koding>pyfunceble --syntax --adblock --aggressive -f adblock_ublock.txt
######## ## ## ######## ## ## ## ## ###### ######## ######## ## ########
## ## ## ## ## ## ## ### ## ## ## ## ## ## ## ##
## ## #### ## ## ## #### ## ## ## ## ## ## ##
######## ## ###### ## ## ## ## ## ## ###### ######## ## ######
## ## ## ## ## ## #### ## ## ## ## ## ##
## ## ## ## ## ## ### ## ## ## ## ## ## ##
## ## ## ####### ## ## ###### ######## ######## ######## ########
You are using the Beta version of PyFunceble 4.0.0!
Please take the time to communicate with us when you notice
something unusual.
Fatal Error: 'bool' object has no attribute 'replace'
Traceback (most recent call last):
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 864, in start
    self.fill_to_test_queue_from_protocol()
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 593, in fill_to_test_queue_from_protocol
    handle_file(protocol)
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\system\launcher.py", line 533, in handle_file
    cidr2subject=self.cidr2subject,
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\cli\utils\testing.py", line 228, in get_subjects_from_line
    .set_data_to_convert(line)
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 429, in get_converted
    result.update(self._decode_v5(self.data_to_convert))
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 382, in _decode_v5
    result.update(self._decode_options(options.split(",")))
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 211, in _decode_options
    result.add(self.extract_base(matched))
  File "d:\download_big_temp\_koding\python37\lib\site-packages\PyFunceble\converter\adblock_input_line2subject.py", line 156, in extract_base
    subject = subject.replace("*", "").replace("~", "")
AttributeError: 'bool' object has no attribute 'replace'
D:\download_big_temp\_koding>
from pyfunceble.
@keczuppp, b37 is available and it should fix the error you reported.
Thanks again for testing !
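For readers following the traceback: the crash happens because a string method is called on a value that turned out to be a `bool`. The sketch below only illustrates that failure mode and one possible guard. `match_domain` and `extract_base_safe` are hypothetical names made up for this example; this is not PyFunceble's code and not necessarily how b37 fixed it.

```python
import re
from typing import Optional, Union


def match_domain(option: str) -> Union[str, bool]:
    """Hypothetical helper: return the matched text, or False if nothing matched."""
    found = re.search(r"domain=([^,$]+)", option)
    return found.group(1) if found else False


def extract_base_safe(subject: Union[str, bool]) -> Optional[str]:
    """Strip wildcard and negation markers, skipping anything that is not a string.

    Calling False.replace("*", "") would raise
    AttributeError: 'bool' object has no attribute 'replace'.
    """
    if not isinstance(subject, str):
        return None
    return subject.replace("*", "").replace("~", "")
```

With this guard, an option without any match (for example `script`) yields `None` instead of aborting the whole run.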
from pyfunceble.
@keczuppp, the adblock-decoder is also upgraded to use the 4.0.0bX of PyFunceble.
from pyfunceble.
yep, good work:
- the bugs are fixed, and the Adblock Decoder embedded in PyFunceble is finally working; the standalone version is working as well
- the split introduced in the rewrite is good (as it should have been from the beginning)
- the new decoder fixes all failures from #13 (comment), except the last 3
- good enhanced manual for the standalone Adblock Decoder #13 (comment)
more tests later
from pyfunceble.
And don't laught at me, fvcktard.
from pyfunceble.
- strange, I tried the standalone Adblock Decoder today and it crashes and throws errors
- it wasn't crashing yesterday, and I haven't changed anything since then, but I'm not sure whether I was testing the same filter lists yesterday, so perhaps the bug was already there
spoiler errors log
D:\download_big_temp\_koding>adblock2plain --aggressive -o output2.txt easylistpolish.txt
Traceback (most recent call last):
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\download_big_temp\_koding\Python37\Scripts\adblock2plain.exe\__main__.py", line 7, in <module>
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\cli.py", line 104, in adblock2plain
    args.input_file, args.aggressive, output=args.output
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\core\adblock2plain.py", line 80, in process_conversion
    for line in self.input:
  File "d:\download_big_temp\_koding\python37\lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1350: character maps to <undefined>
D:\download_big_temp\_koding>adblock2plain --aggressive -o output2.txt easylist.txt
Traceback (most recent call last):
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\download_big_temp\_koding\Python37\Scripts\adblock2plain.exe\__main__.py", line 7, in <module>
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\cli.py", line 104, in adblock2plain
    args.input_file, args.aggressive, output=args.output
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\core\adblock2plain.py", line 80, in process_conversion
    for line in self.input:
  File "d:\download_big_temp\_koding\python37\lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x83 in position 5977: character maps to <undefined>
- the embedded Adblock Decoder doesn't crash when testing the same filter lists
- the filter lists were just downloaded in Firefox from the links given here: #13 (comment), and Notepad++ says they are UTF-8 (without BOM)
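The tracebacks show the file being decoded through `encodings\cp1250.py`, which suggests it was opened with the Windows locale default instead of UTF-8. A minimal sketch of reading a list with an explicit encoding follows; `read_filter_lines` is a hypothetical helper for illustration, not the adblock-decoder's own code.

```python
def read_filter_lines(path, encoding="utf-8"):
    """Read a filter list line by line with an explicit encoding.

    Without the encoding argument, open() falls back to the locale's
    preferred encoding (cp1250 on the machine in the traceback), and
    UTF-8 byte sequences containing e.g. 0x81 cannot be decoded.
    """
    with open(path, "r", encoding=encoding) as handle:
        return [line.rstrip("\n") for line in handle]
```

Passing the encoding explicitly makes the result independent of the machine's locale, which would also explain why the same lists could behave differently across systems.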
from pyfunceble.
And don't laught at me, fvcktard.
Who is laughing at you?
We are all here for some constructive work, enhancement, and discussion in our free time.
I personally take any input I can get regarding the decoder. I don't have time to laugh at someone when they are giving constructive inputs.
If it is because of my emoji, sorry if it offended you. It wasn't meant to harm.
Your last 3 cases are now into the source code that's going to be deployed next.
I'm going to look into the issue of the standalone decoder later.
from pyfunceble.
@keczuppp Please update and test the adblock-decoder
from pyfunceble.
funilrys :
I'm going to look into the issue of the standalone decoder later.
@keczuppp Please update and test the adblock-decoder
- it's fixed now in 1.2.0
funilrys :
Your last 3 cases are now into the source code that's going to be deployed next.
- it's fixed now in the standalone Adblock Decoder 1.2.0
- but in the embedded Adblock Decoder (PyFunceble 4.0.0b39) it's only 66% fixed, because site21.com is still not being extracted
Also:
- I know you have a reason not to do so in the embedded Adblock Decoder, but shouldn't at least the standalone Adblock Decoder remove duplicates and then sort the result in a-z order? If not internally, then at least in the output. A parameter like --clean could be added.
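What such a clean-up step could look like, as a rough sketch: `clean_subjects` is a hypothetical function, `--clean` is only the flag proposed in this comment (it is not an existing option), and lowercasing the entries is my own assumption, based on domain names being case-insensitive.

```python
def clean_subjects(lines):
    """Return the unique, non-empty entries in a-z order.

    Assumption: entries are lowercased before comparison, so
    "E-Kolo.pl" and "e-kolo.pl" count as duplicates.
    """
    unique = {line.strip().lower() for line in lines if line.strip()}
    return sorted(unique)
```

Applied to the decoder output, repeated entries such as `e-kolo.pl` or `e-pingpong.pl` in the log above would appear only once.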
==================================================
OFF-TOPIC
funilrys :
Who is laughing at you?
If it is because of my emoji,
Yes, it was about "simple way" + the emoji, your comment looks like you wanted to show how stupid I am just because I missed something which you describe as "simple" (a parameter + the fact it should be put in the conjunction with other parameters) in the (big) documentation, which might be not so obvious.
Was it more funny to you, than your friend being unable to view history of my comment #13 (comment) ?
Why didn't you laught at him the same like at me?
Oh, because he is your firend, so you can't laught at your friends, just like you did at me.
I could laught (by putting a laugh emoji) at him just like you at me,
furthermore, I could lught at you (by putting a laugh emoji), every time the Adblock Decoder or PyFunceble crashes making you looking like a fool (but your are a good developer and bugs are normal thing in programming, unavoidable by a human being).
And then say: "sorry if it offended you. It wasn't meant to harm." but I'm not a such person.
funilrys : sorry if it offended you. It wasn't meant to harm.
Really? Then why didn't you explain what was the purpose of the emoiji in your comment then.
Do you want to just tell me you put the emoji for no reason, or just because you were in a happy mood and just by accident it was looking like you were laughting at me...Saying "It wasn't meant to harm you" and at the same time avoiding to explain what was the purpose of the emojii then , seems not too clear. I don't believe your cheap explanation, lie to yourself, I spent much time analysing, whether your intention was to laught at me or not, and something said me very clear you were. Just don't do it again at me, or watch where you put laugh emojis, it's not Facebook, but since they implemented emojis in GitHub it seems it turned into FaceHub, abused by trolls, they abuse emojis to troll other people at every occasion., it's a plague and infection of GitHub. I consider you a positive person and great developer overall, but just don't do it again.
from pyfunceble.
As for the domain= filters (for ex. domain=page1.com|page2.com):
- the last 3 failures from my previous comment were just example failures; the point is that the decoder, instead of fixing particular domain= failures, should extract all domains regardless of what is on the left or right side of domain=, and this includes @@ filters (because in --aggressive mode we should extract all domains)
- but currently the decoder extracts only about half of the domains. To prove it, I copy-pasted all domain= lines from several popular / big adblock filter lists + two Polish lists: EasyList, EasyPrivacy, AdGuard Base, AdGuard Tracking Protection, Official Polish Filters for AdBlock, uBlock Origin & AdGuard, EasyList Polish, and put them into a single file / list (see domain=.zip), which contains about 11425 domains, but Adblock Decoder extracts only about half (5426 domains)
- by the way, the number of false positives is reasonably low, 39 out of 5426 domains:
false hits
&Type=Event.CPT&
-300x250.
-Background-1280x10241.
-spotify-com.akamaized.net
-tag.js
.
.cdn.digitaloceanspaces.com
.cdnjquery.com
.ch
.com
.criteo.com
.criteo.net
.digitaloceanspaces.com
.engageya.com
.filma24.
.gif
.html
.html|
.imagetwist.com
.impact-ad.jp
.jpg
.js
.js|
.m3u8
.min.js
.mp3
.mp4
.mp4.kakaoad.
.mp4|
.netdna-ssl.com
.php
.pl
.pornhub.com
.r.msn.com
.roofandfloor.com
.smithsonian.museum
.ssl-images-amazon.com
.ts
.xml
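To make the request concrete, here is a rough sketch of pulling every entry out of a `domain=` option and discarding fragments like the false hits listed above. `extract_option_domains` and both regular expressions are assumptions for illustration only; this is not the decoder's implementation, and the hostname check is purely syntactic (something like `page.js` would still pass, and punycode TLDs are not handled).

```python
import re

# "domain=" preceded by the start of the options ("$") or by another option (",").
DOMAIN_OPTION = re.compile(r"(?:^|[$,])domain=([^,\s]+)")

# Simplified hostname shape: dot-separated labels that start and end with
# a letter or digit, followed by an alphabetic TLD.
HOSTNAME = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def extract_option_domains(rule):
    """Return the domains listed in a rule's domain= option, in order."""
    match = DOMAIN_OPTION.search(rule)
    if not match:
        return []
    found = []
    for entry in match.group(1).split("|"):
        candidate = entry.lstrip("~").lower()  # "~" only negates the entry
        if HOSTNAME.match(candidate) and candidate not in found:
            found.append(candidate)
    return found
```

Because the function does not look at what precedes the options, an exception rule such as `@@||cdn.example^$domain=site21.com` is treated the same way as a blocking rule, which matches the `--aggressive` expectation described above.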
from pyfunceble.
Hey @keczuppp
.. the fact it should be put in the conjunction with other parameters) in the (big) documentation, which might be not so obvious.
What are you missing in the docs? Could you elaborate on it in an issue at https://github.com/spirillen/PyFunceble/issues (the repo I build the docs from)?
Was it more funny to you, than your friend being unable to view history of my comment #13 (comment) ?
Why didn't you laught at him the same like at me?
Oh, because he is your firend, so you can't laughs at your friends, just like you did at me
I'm sorry you have taken it this way; it was not meant in any evil way. Trust me, we laugh at each other's "Error 40" moments too, we just happen to do this mostly on Keybase 😏. Beyond that, I can promise you that @funilrys never laughs (evilly) at anyone; it is always with the best intentions and from a good heart.
From re-reading #13 (comment) I can assure you it was a happy laugh because something went the right way and you promised to do more tests.
from pyfunceble.
Hi,
spirillen : What are you missing in the docs?
Nothing, I think now. I wasn't familiar enough with PyFunceble, so I just overlooked it.
OFF-TOPIC
spirillen : I'm sorry you have taken it this way, it is not in any evil ways, and trust me we are laughing of each other "Error 40" we just happens to mostly do this on Keybase . Next to this I can promise you that @funilrys never laughs (evilly) of everyone, it is always with the best intention from a good haert.
This time it was not so, I felt uncomfortable, I was trying to take it the way you described but I could not.
spirillen : From re-reading #13 (comment) I can ensure it is a happy laugh for something went the right way, and you promised to do more tests.
Nope, the fact that I wrote back with complaints later does not change anything, I was just trying to ignore it and think about it like you described, but even 2 days later I still felt uncomfortable to the point I had to send it back to the sender, can't ignore my feelings. In the end, the recipient decides whether he felt offended or not, maybe you should be careful where you put emotes.
Best regards.
from pyfunceble.
...maybe you should be careful where you put emotes.
Best regards.
Well, from my own point of view, I can only say: I'm feeling there are missing some emojis for the "fast feedback", meaning some would have to overlap others.
from pyfunceble.
but in the embedded Adblock Decoder (PyFunceble 4.0.0b39) it's only 66% fixed, because site21.com is still not being extracted
I will look at it when I find time to get back to that module.
Yes, it was about "simple way" + the emoji, your comment looks like you wanted to show how stupid I am just because I missed something which you describe as "simple" (a parameter + the fact it should be put in the conjunction with other parameters) in the (big) documentation, which might be not so obvious.
Emojis are not necessarily meant to harm. My emoji was really not meant to harm, nor was it in an evil matter.
The usage of "The simple way" was just meant to introduce something which is not supposed to be hard and something that is supposed to also be easy to use and work as easy as possible. I do not expect everyone to know everything. Even I have to look into my own source code to find something or provide a better answer to a question.
It was not meant to "humiliate", "shame" or "offend" you. If you understood it that way. I'm sorry it was not my intention.
You have to understand that most of the time, the solutions to the problem around here are extremely hard and need a bit of thinking or hacking of me. So I'm extremely happy of myself to present, find or have a simple solution to a problem. I was just happy to have a simple solution. I'm sorry that it was misunderstood.
Was it more funny to you, than your friend being unable to view history of my comment #13 (comment) ?
Why didn't you laught at him the same like at me?
Oh, because he is your firend, so you can't laught at your friends, just like you did at me.
I don't laugh at others. Not that (discrete) way. I'm not like that and it's not my intention. And even when it is the case, it's not in a harmful way.
About the nonreaction to other answers: My time is limited, I don't have time to answer everything.
In fact, at the time I'm writing this, I still have around 200 GitHub notifications (Email excluded) to read, answer, sort, and/or put into my Open Source backlog.
So most of the time, I'm trying to focus on information that is bringing me more information about what needs to be done or what is actually asked (technically). Side discussions into a feed that are not relevant (at a time X) to what is actually my goal or the goal of the issue are not always my priority. In fact, you may have seen me in the past jumping around multiple comments in the past because what is said is becoming relevant after implementation or a few weeks of changes.
I could laught (by putting a laugh emoji) at him just like you at me,
furthermore, I could lught at you (by putting a laugh emoji), every time the Adblock Decoder or PyFunceble crashes making you looking like a fool (but your are a good developer and bugs are normal thing in programming, unavoidable by a human being).
Don't take it too personally, but I'm doing this in my free time. So a good laugh after a long day is not always bad.
And sometimes a laughing emoji can be the beginning of a good joke or friendship.
About crashes, PyFunceble 4.0.0 is actually in beta for a good reason. And you are invited to submit all the fatal errors that PyFunceble produces. PyFunceble is not only doing one thing and most of the time it depends on the inputted dataset. As I can't test all possible imaginable datasets - especially when writing a decoder. I'm somehow "bound" to the error report. That's what you indirectly did. And that's what led to a significant change in the source code.
In fact, there was so much feedback on the 4.0.0 version that it is the most tested version of PyFunceble ever, which probably makes it one of the least error-prone versions. Nothing is perfect, but there is hope that this new version brings fewer errors and more stability.
Really? Then why didn't you explain what was the purpose of the emoiji in your comment then.
Why should someone lose time to explain all single nontechnical decisions when it's not necessary? I didn't judge it necessary. Now I still took the time to explain...
But the fact is: I do have private, professional, familial, and public (through here) lives. So each minute I spend explaining an emoji or the choice of an X or Y word in a sentence is time I could use to answer more technical questions or simply help move forward.
An emoji shouldn't be given that much time and energy. It happened, I understand it offended you or made you uncomfortable. I can't promise that you won't be offended again as I can't speak for others who use this platform. But be sure that your message and feelings came "loud and clear".
Do you want to just tell me you put the emoji for no reason, or just because you were in a happy mood and just by accident it was looking like you were laughting at me...
I wasn't laughing at you nor at the situation. If I would, I would have used:
Or is it not appropriate enough?
I don't believe your cheap explanation, lie to yourself, I spent much time analysing, whether your intention was to laught at me or not, and something said me very clear you were.
Believe it or not, when I say such a thing, I mean it. My intention was and is not to laugh at you.
abused by trolls, they abuse emojis to troll other people at every occasion
Why would I literally troll others on "every occasion" (or not) on GitHub when I have other things to do. I just want to move forward, help, and code if I get the chance and the time for it.
It's all in my free time. Troll do "their thing" all day long, probably in their free time too but I believe that they are not as busy as some of us.
I consider you a positive person and great developer overall, but just don't do it again.
Thank you for the compliment. I'm indeed a positive person. I'm not looking to harm anyone. I was harmed enough in my life to know.
I didn't think that that sentence with an emoji will harm someone. People who know me, know that I'm not a troll or someone who constantly "humiliate", "shame" and/or "offend" someone because of a lack of knowledge. I know that not everyone has the same knowledge. That's why I'm always happy to provide some of my expertise, knowledge, and help across multiple projects.
By the way, the word "fvcktard" was not necessary. I'm polite enough, and even if an emoji offended you, it's not a reason to use such a word. That's something that offended me but I chose to ignore it at the time I read it. Please avoid such language in the future.
Sorry for the misunderstanding.
Cheers.
from pyfunceble.
OFF-TOPIC
Hi funilrys,
as for the incident: I appreciate your explanations, but I will not change my mind, my answer is the same as to spirillen: #13 (comment), there is too many people who tell sweet lies and manipulate, I trust my feelings. (even if you did it unintentionally, still you did it, and the other had right ot get pissed, but I still think you did it intentionally.)
as for:
funilrys : #13 (comment) : about crashes (...)
Sure, and that's why I mentioned the crashes only as an example where someone could dishonestly and groundlessly laugh at someone for no real reason, so why all the explanations...you see the point...: even despite that it was just an example, you write back as if you took it seriously, and what if I would write it seriously and laught emote it (because for. ex. I would be happy because I spot a bug) I bet it wouldn't be pleasant for you. 😃
as for:
funilrys : #13 (comment) :
keczuppp : Really? Then why didn't you explain what was the purpose of the emoiji in your comment then.
Why should someone lose time to explain all single nontechnical decisions when it's not necessary? I didn't judge it necessary. Now I still took the time to explain...
Nah, either you got it wrong, or it was me who worded it wrongly, the logic of my question was different:
it was not about "explain purpose of the emoji
in the comment
" - it was not about to explain in the comment
,
it was about to: explain purpose of
the emoji in the comment
- it was about to explain emoji in the comment
, (emoji which is in your comment
), the part in the comment
reffers to the actual placement of the emoji, not where to put the explanations. I think I would use a comma ,
if I meant to put the explanations in the comment, an example:
Really? Then why didn't you explain what was the purpose of the emoiji, in your comment then.
or perhaps I should use rather from than in: "of the emoji in from you comment". The whole point of the question was that you begun to explain in your comment: #13 (comment) what was not purpose of the emoji, but at the same time, you did not explain what was the purpose. I have no idea whose fault it is for this misunderstanding in grammar.
funilrys : #13 (comment) :
keczuppp : abused by trolls, they abuse emojis to troll other people at every occasion
Why would I literally troll others on "every occasion" (or not) on GitHub when I have other things to do. I just want to move forward, help, and code if I get the chance and the time for it. It's all in my free time. Troll do "their thing" all day long, probably in their free time too but I believe that they are not as busy as some of us.
I didn't mean to say you are a troll, but the division of people is not limited to the black / white case you provided: to those who always troll and to those who never troll, but also there are those who troll occasionally only.
By the way, the word "fvcktard" was not necessary. I'm polite enough, and even if an emoji offended you, it's not a reason to use such a word. That's something that offended me but I chose to ignore it at the time I read it. Please avoid such language in the future.
I am not a person who avoids defense against harassment, I prefer to defend myself and choose the methods of defense myself. If you feel offended, it's because I felt offended by you first and decided to pay back. Even if you did it unintentionally, still you did it, and the other has right to get pissed, be careful where you paste your pokemons, no one is clairvoyant to guess the intentions of other people's emotes, before you put an emoji, think twice whether it might look like laughting at someone or not, if you still want to put an emoji but at the same time you know it might look like laughting at someone, put some short explanations of purpose of the emoji, do not put your interests ahead of the interests of others, just because you are happy and have a need to put an emoji, doesn't mean you can put it anywhere, anytime, in a way it might offend others. Please be carefull with emojis at me in the future.
Also it would be better if you guys created a separate issue for the off-topic, you made the Adblock Decoder thread a garbage can. After I felt offended, I expressed it in a single short sentence: #13 (comment), if you don't agree with it and wanted to continue discussion and turn it into a longer off-topic thread, you could create a separate issue for the whole incident.
Now, is the off-topic over finally? Can we get back to the Adblock Decoder please? I would rather spend time on something more constructive / technical than long off-topic conversations which are very time consuming and can end in an unpleasant escalation. Why this whole show. Guys come on, lets end this show finally, no need to turn it into a drama festival.
Best regards.
from pyfunceble.
Please, can we park that emoji? Can we agree that you disagree about its explanation and its usage in this specific situation?
And let's stick to the Code of Conduct and get back to the actual topic, the error produced.
I hope so.
from pyfunceble.
funilrys, can we get some cleaning in this thread, could you put in the spoiler your OFF-TOPIC #13 (comment), just like I did with my OFF-TOPICS, thx
from pyfunceble.
OK, so I've just tested the newest PyFunceble dev and noticed that the issues reported in #13 (comment) and #13 (comment) have been fixed.
Summary:
keczuppp:
As for the last 3 failures, many such failures can be found in https://easylist-downloads.adblockplus.org/easylistpolish.txt
The list contains about 2961 domains, but only 2459 are found by Adblock Decoder (with the --aggressive option), which gives 83% efficiency.
- the current EasyList Polish contains 3190 domains and the newest PyFunceble has extracted 3106, which gives 97% efficiency compared to 83% previously
keczuppp:
currently the decoder extracts about a half of domains, to prove it I copy-pasted all domain= lines from several popular / big adblock filter lists + two polish lists: EasyList, EasyPrivacy, AdGuard Base, AdGuard Tracking Protection, Official Polish Filters for AdBlock, uBlock Origin & AdGuard, EasyList Polish and put into a single file / list (see domain=.zip), which contain about 11425 domains, but Adblock Decoder extracts only about a half (5426 domains)
- the newest PyFunceble has extracted 8298, which gives 73% compared to 48% previously
Good improvement.
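For anyone who wants to recheck the percentages, the arithmetic is just extracted / total. A small sketch using only the counts quoted in this thread (`extraction_rate` is a hypothetical helper name):

```python
def extraction_rate(extracted, total):
    """Share of domains found by the decoder, in percent, to one decimal."""
    return round(100 * extracted / total, 1)


# Counts as reported in the comments above.
rates = {
    "EasyList Polish, before": extraction_rate(2459, 2961),
    "EasyList Polish, now": extraction_rate(3106, 3190),
    "domain= sample, before": extraction_rate(5426, 11425),
    "domain= sample, now": extraction_rate(8298, 11425),
}

for label, rate in rates.items():
    print(f"{label}: {rate}%")
```

The results agree with the rounded figures in the summary; the only small difference is that 5426 of 11425 comes out nearer 47.5%, which the thread rounds up to 48%.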
from pyfunceble.