
pyfunceble's Introduction

The tool to check the availability or syntax of domain, IP or URL

PyFunceble aims to provide an accurate availability check by using multiple sources, for example (to list only a few):

  • the WHOIS record(s).
  • the DNS record(s).
  • the HTTP status code.

PyFunceble can be included in your existing project through:

  • its standard built-in CLI implementation.
  • its Python API.
  • the PyFunceble web-worker project that provides the core functionalities of PyFunceble behind a web API.

The PyFunceble CLI can test from a hosts file, a plain list of subjects, an AdBlock filter list or even an RPZ record.

As of today, PyFunceble runs actively, if not daily, on several servers, laptops, PCs, and Raspberry Pis. Thanks to our auto-continue mechanism, it is even used with CI engines like GitHub Actions, Travis CI, or GitLab CI.

Happy testing with PyFunceble!

Installation

pip

$ pip install --upgrade pyfunceble
$ pyfunceble --version

docker

$ docker pull pyfunceble/pyfunceble
$ docker run -it pyfunceble/pyfunceble --version

Documentation as the place to be!

Want to know more about PyFunceble? I invite you to read the documentation at https://pyfunceble.readthedocs.io/en/dev/!

Want a local copy? I've got you covered!

Simply run the following and enjoy the documentation!

$ pip install --user -r requirements.docs.txt # Install dependencies.
$ cd docs/
$ make clean html
$ palemoon _build/html/index.html # palemoon or whatever browser you use.

Note

You are also invited to submit changes and improvements to the documentation through a new Pull Request.

Supporting the project

PyFunceble, Dead-Hosts, and all other similar projects are powered by free time and a lot of coffee!

Does this project help you, or do you like it?

GitHub Sponsor

@funilrys is part of the GitHub Sponsor program!

Sponsor me!

Ko-Fi

Don't want to use the GitHub Sponsor program? Single donations are welcome too!

Buy me a coffee!

Contributors

Thanks to these awesome people for their awesome and crazy idea(s), contribution(s), and/or issue report(s), which made or make PyFunceble a better tool.

Special Thanks

Thanks to these awesome organization(s), tool(s), and/or people for

  • Their awesome documentation
  • Their awesome repository
  • Their awesome tool/software/source code
  • Their breaking reports
  • Their contributions
  • Their current work/purpose
  • Their promotion of PyFunceble
  • Their support
  • Their testing reports

which helped and/or still help me build, test, and/or make PyFunceble a better tool.

License

Copyright 2017, 2018, 2019, 2020, 2022, 2023, 2024 Nissar Chababy

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

pyfunceble's People

Contributors

funilrys, mitchellkrogza, porn-records, smed79, spirillen, veracioux, ybreza, zerodot1

pyfunceble's Issues

error when updating. 'pyfunceble' in section 'console_scripts' already exists

user@computer:~/repos/PyFunceble$ . venv/bin/activate && pip3 install -e .
Obtaining file:///home/user/repos/PyFunceble
Collecting colorama>=0.3.9 (from PyFunceble-dev==0.133.1)
  Downloading https://files.pythonhosted.org/packages/4f/a6/728666f39bfff1719fc94c481890b2106837da9318031f71a8424b662e12/colorama-0.4.1-py2.py3-none-any.whl
Collecting domain2idna>=1.6.1 (from PyFunceble-dev==0.133.1)
  Using cached https://files.pythonhosted.org/packages/4e/27/b7336824583e26d3e33f7b6917c00e51b7c8a94bc1d4b78d6aa1eb9c7e8b/domain2idna-1.6.1-py3-none-any.whl
Collecting PyYAML>=3.13 (from PyFunceble-dev==0.133.1)
Collecting requests>=2.19.1 (from PyFunceble-dev==0.133.1)
  Using cached https://files.pythonhosted.org/packages/ff/17/5cbb026005115301a8fb2f9b0e3e8d32313142fe8b617070e7baad20554f/requests-2.20.1-py2.py3-none-any.whl
Collecting setuptools>=40.4.3 (from PyFunceble-dev==0.133.1)
  Using cached https://files.pythonhosted.org/packages/e7/16/da8cb8046149d50940c6110310983abb359bbb8cbc3539e6bef95c29428a/setuptools-40.6.2-py2.py3-none-any.whl
Collecting urllib3>=1.23 (from PyFunceble-dev==0.133.1)
  Using cached https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests>=2.19.1->PyFunceble-dev==0.133.1)
  Downloading https://files.pythonhosted.org/packages/9f/e0/accfc1b56b57e9750eba272e24c4dddeac86852c2bebd1236674d7887e8a/certifi-2018.11.29-py2.py3-none-any.whl (154kB)
    100% |████████████████████████████████| 163kB 2.9MB/s 
Collecting chardet<3.1.0,>=3.0.2 (from requests>=2.19.1->PyFunceble-dev==0.133.1)
  Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting idna<2.8,>=2.5 (from requests>=2.19.1->PyFunceble-dev==0.133.1)
  Using cached https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl
Installing collected packages: colorama, setuptools, domain2idna, PyYAML, certifi, chardet, idna, urllib3, requests, PyFunceble-dev
  Found existing installation: PyFunceble-dev 0.127.5
Exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 360, in run
    prefix=options.prefix_path,
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 778, in install
    requirement.uninstall(auto_confirm=True)
  File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 734, in uninstall
    FakeFile(dist.get_metadata_lines('entry_points.txt'))
  File "/usr/lib/python3.6/configparser.py", line 763, in readfp
    self.read_file(fp, source=filename)
  File "/usr/lib/python3.6/configparser.py", line 718, in read_file
    self._read(f, source)
  File "/usr/lib/python3.6/configparser.py", line 1092, in _read
    fpname, lineno)
configparser.DuplicateOptionError: While reading from '<???>' [line  3]: option 'pyfunceble' in section 'console_scripts' already exists
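
The final error can be reproduced with the standard library alone. The following is a minimal sketch, assuming the previously installed version left an entry_points.txt in which two script names differ only by case; the file content and the target names below are made up for illustration and are not taken from the actual package. configparser lowercases option names and rejects duplicates by default, so the second entry triggers the same exception on line 3:

```python
import configparser

# Hypothetical entry_points.txt content: two script names that differ only by case.
ENTRY_POINTS = """\
[console_scripts]
PyFunceble = PyFunceble:command_line
pyfunceble = PyFunceble:command_line
"""


def find_duplicate_option(text):
    """Return the duplicated option name, or None when the text parses cleanly."""
    parser = configparser.ConfigParser()  # strict=True is the default
    try:
        parser.read_string(text)
    except configparser.DuplicateOptionError as error:
        return error.option
    return None


print(find_duplicate_option(ENTRY_POINTS))
```

If this is indeed the cause, removing the stale installation metadata by hand before reinstalling is one way to get past the failing uninstall step.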

Add the possibility to use DoT

Is your feature request related to a problem? Please describe.
It is not possible to use DoT.

Describe the solution you'd like
I would like to use DoT to protect my DNS requests.
Using the DNS of the host system with DoT often leads to problems because some applications simply don't work anymore.

This is why DoT should be implemented in PyFunceble.
I recommend providing a way to configure the DoT settings in the configuration file, so that you don't have to enter everything manually every time you need to use it.

https://developers.cloudflare.com/1.1.1.1/dns-over-tls/

Additional context
It should be possible to specify the URL and server for DoT.

Example:

#Configfile

DoTurl = Your_ID.dns.nextdns.io
Server 1= ip.ip.ip.ip
Server 2= ip.ip.ip.ip
Server 3= ipv6:ipv6:ipv6:ipv6
Server 4= ipv6:ipv6:ipv6:ipv6
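
For context, a DoT exchange is an ordinary DNS message sent over a TLS connection (port 853 by default), prefixed with its two-byte length as in DNS over TCP. The sketch below shows that framing with the standard library only. It is not how PyFunceble does or would implement it; the function names are hypothetical, and server_ip and server_hostname stand for the values a user would put in the configuration:

```python
import socket
import ssl
import struct


def build_query(name, query_id=0x1234):
    """Build a DNS query for the A record of `name` in wire format."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    # QTYPE 1 (A), QCLASS 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def frame(message):
    """Prefix the message with its two-byte length, as DNS over TCP/TLS requires."""
    return struct.pack("!H", len(message)) + message


def recv_exact(sock, count):
    """Read exactly `count` bytes or raise if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before the full response")
        data += chunk
    return data


def query_over_tls(name, server_ip, server_hostname, port=853, timeout=5.0):
    """Send one query over TLS and return the raw DNS response bytes."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((server_ip, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=server_hostname) as tls:
            tls.sendall(frame(build_query(name)))
            (length,) = struct.unpack("!H", recv_exact(tls, 2))
            return recv_exact(tls, length)
```

The hostname is passed separately from the IP address so that the certificate can be validated, which is why a configuration like the one above needs both.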

Using the raw Python installation method only installs 1.0.0

Describe the bug
I am unable to use PyFunceble, because trying to install or update PyFunceble, through the "pure Python" methods that are described in the installation and update guides, isn't going all that well. This is because it only installs 1.0.0 instead of 1.8.0, and then refuses to run due to a message that tells me to update PyFunceble, which I can't.

Modifications under .PyFunceble.yaml
No changes that I know about.

To Reproduce
Steps to reproduce the behavior:

  1. cd into a PyFunceble folder (Presumably one that has been set up with the "pure Python" installation method).
  2. Run git checkout master && git fetch origin && git merge origin/master
  3. See that it gets updated, or that it says that it's fully updated already.
  4. Run pyfunceble -v
  5. See that it says pyfunceble 1.0.0. (Blue Bontebok)

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
image

Versions (please complete the following information):

  • OS: Windows 10 October 2018 Update (Last installed update: KB4487044)
  • Python Version: 3.6.4
  • PyFunceble Version: 1.0.0 (in use), 1.8.0 (the one that I can't update to)

Additional context
I am unable to test with either of the pip3 methods, because of something of some sort:
image

I have absolutely no idea how to get the new GitHub compressed-archive download method to work either, with this incomprehensible error being shown:
image

Inclusion of WHOIS data

I am always thinking about how we can finally get this script accepted as THE go-to domain list curation script. Some of my ideas may seem completely impractical, and I realize that, but I just thought I'd dump my current thoughts as they are. Maybe they're useful or maybe not.

I have talked about using proxies and VPNs and such in the past to get around network vector and general connectivity issues causing erroneous results. However, independent of the inevitable technical failures and other mysteries that befall network-connected devices, inclusion of WHOIS data could also be factored in as a more stable resource to check if a domain is still a baddie or not. Obviously saving WHOIS data for every domain would take a long time, so maybe just a column with a hash of the WHOIS data for each domain. And when the hash changes, the domain should be marked to be manually rechecked. Some events that cause WHOIS data to change, such as transferring ownership, could signal a domain changing its ways and no longer following the Dark Side, so it should be manually checked for reconsideration of being on the list because its category could possibly change. This would also be independent of network connectivity issues and possible random outages or whatnot.

Other events, such as a simple domain renewal and the updating of years and contact information for an owner that is retaining ownership, shouldn't happen frequently enough to make this extra check too annoying, maybe once a year to once every few years per domain. And a quick manual check to verify the current state of the domain shouldn't take too much of the curator's time. To be more thorough, the WHOIS data could possibly be stored either to a file structure of some type or to a SQL/SQLite database to speed things up and only be updated when the hash changes to signify a change. Obviously then the first time the script is run it would take a long time to dump all the WHOIS data, but subsequent runs would be much faster as only WHOIS data that has changed would be updated. If the WHOIS data itself is saved, the curator could then just quickly compare the data and see if it's actually an ownership transfer or just a number here or there changing.
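
The hash-and-compare idea could look roughly like this. It is a minimal sketch under stated assumptions: the WHOIS text is fetched elsewhere and passed in as a string, SHA-256 is an arbitrary choice of hash, and the table and function names are hypothetical:

```python
import hashlib
import sqlite3


def whois_fingerprint(record):
    """Hash a WHOIS record after trimming whitespace noise on each line."""
    lines = [line.strip() for line in record.splitlines() if line.strip()]
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def needs_manual_review(connection, domain, record):
    """Store the new fingerprint and report whether it differs from the stored one."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS whois (domain TEXT PRIMARY KEY, digest TEXT)"
    )
    new_digest = whois_fingerprint(record)
    row = connection.execute(
        "SELECT digest FROM whois WHERE domain = ?", (domain,)
    ).fetchone()
    connection.execute(
        "INSERT OR REPLACE INTO whois (domain, digest) VALUES (?, ?)",
        (domain, new_digest),
    )
    connection.commit()
    # A first sighting has nothing to compare against, so it is not flagged.
    return row is not None and row[0] != new_digest
```

In practice, fields that change on every lookup (query timestamps, for example) would have to be stripped before hashing, otherwise every domain would be flagged on every run.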

I have also been thinking about the inclusion of possibly scraping key values off of websites to check if a website has changed enough to warrant manual intervention, such as a header or footer changing brand/company names or actually looking for particular malware scripts included in the page. These would be stored as a simple search string of what to look for and what the value should be, so a basic key-value pair database. If the value scraped doesn't match the value stored in the database, then the domain could then be marked for manually checking. Scraping, however, would still be prone to connectivity and network vector issues.

Is it possible to only display "INACTIVE" domains?

Is your feature request related to a problem? Please describe.

Since my use of PyFunceble (as of the time of writing) relies on me manually editing out domains from my lists that PyFunceble has declared to be inactive, I wonder if there's a way to make the terminal window only display red/inactive domains that it has parsed?

That way I won't have to scroll quite as much through walls of green/active domains as I've had to do so far, and that I won't have to stretch the terminal window to 1400~1500px height in order to spot all the inactive domains.

Describe the solution you'd like

A way to only have the invalid domains that PyFunceble has parsed show up in the terminal window, or some quick advice on how to turn on (or use a modifier to achieve) such a function if it already exists.

Describe alternatives you've considered

If the above is not possible to implement, then I suppose it could be possible to add an adblock subfolder to the output foldertree or something like that, or perhaps that I'd look into PyFunceble\output\splited\invalid much more often, but it'd feel like a detour in my eyes since it'd not be in PyFunceble's de-facto UI itself.

Additional context

None that I'm aware of.
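
Until such an option exists, the output can be filtered after the fact. This is a minimal sketch, assuming result lines are whitespace-separated with the subject in the first column and the status in the second; the helper name is hypothetical and is not a PyFunceble feature:

```python
def only_with_status(lines, wanted=("INACTIVE",)):
    """Yield result lines whose second column is one of the wanted statuses."""
    for line in lines:
        columns = line.split()
        if len(columns) >= 2 and columns[1] in wanted:
            yield line.rstrip()


sample = [
    "example.org    ACTIVE 200",
    "example.invalid    INACTIVE ***",
    "Subject    Status    HTTP Code",
]
for line in only_with_status(sample):
    print(line)
```

Passing wanted=("INACTIVE", "INVALID") keeps both kinds of entries that usually need to be edited out of a list.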

Coloration is wrong

Bug description

As per https://twitter.com/zero_dot1/status/1193291314319765506 (from @ZeroDot1), we can see that when testing the coloration (at the end) is wrong and does not reflect the test result correctly.

Modifications under .PyFunceble.yaml

Nothing relevant.

Reproduction

Steps to reproduce the behavior:

Run a test with the --syntax argument.

Expected behavior

It should be like green (like this) if VALID or ACTIVE > 50% and red (like in the link) for INACTIVE or INVALID > 50%
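
The expected rule can be written down as a small function. This is only a sketch of the behaviour described here, not PyFunceble's actual code; the function name is hypothetical and the exact 50/50 case, which the description leaves open, returns no colour:

```python
def summary_colour(counts):
    """Pick the summary colour from a mapping of status name to number of subjects."""
    positive = counts.get("ACTIVE", 0) + counts.get("VALID", 0)
    negative = counts.get("INACTIVE", 0) + counts.get("INVALID", 0)
    total = positive + negative
    if total == 0:
        return None  # nothing tested yet
    if positive / total > 0.5:
        return "green"
    if negative / total > 0.5:
        return "red"
    return None  # exactly 50/50 is not covered by the description


print(summary_colour({"VALID": 8, "INVALID": 2}))
```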

2019-11-09_23-32

some active domains showing as inactive

version 1.7.0

amongst these I show the main a4.tl domain as inactive but it should work:

a4.tl		INACTIVE *** 
apptrk.a4.tl	ACTIVE 302 
els.a4.tl	ACTIVE 403 
jrs.a4.tl	ACTIVE *** 
ldap.a4.tl	ACTIVE *** 
preroll.a4.tl	ACTIVE 403 
sdk.a4.tl	ACTIVE 403

https://www.instra.com/en/whois/whois-result/a4_tl

Another 2:

adform.net
adformdsp.net

https://reports.internic.net/cgi/whois?whois_nic=adform.net&type=domain
https://reports.internic.net/cgi/whois?whois_nic=adformdsp.net&type=domain

more

adkmob.com	INACTIVE *** 
bp.adkmob.com	ACTIVE 403 
ssdk.adkmob.com	ACTIVE 403 
adleads.com	INACTIVE *** 

https://reports.internic.net/cgi/whois?whois_nic=adleads.com&type=domain

0.0.0.0 a4.tl
0.0.0.0 adformdsp.net
0.0.0.0 adform.net
0.0.0.0 adkmob.com
0.0.0.0 adleads.com
0.0.0.0 admoda.com
0.0.0.0 adsmogo.mobi
0.0.0.0 adsmogo.net
0.0.0.0 adywind.com
0.0.0.0 adzerk.net
0.0.0.0 alexajstrack.com
0.0.0.0 applifier.info
0.0.0.0 appnexus.net
0.0.0.0 apxadtracking.net
0.0.0.0 atti.com
0.0.0.0 avazunativeads.com
0.0.0.0 cpro.baidu.cn
0.0.0.0 bayctrk.com
0.0.0.0 billymobile.com
0.0.0.0 cb-cdn.com
0.0.0.0 cedexis-radar.net
0.0.0.0 chartboosts.com
0.0.0.0 clickkydsp.com
0.0.0.0 cnbc7.com

package as a snap/flatpak package?

I was wondering if PyFunceble could be packaged as a snap package or flatpak. I'm not a python update expert and it looks like the documentation shows 3 ways to install and 3 ways to update depending on the python, pip or github way you do it. I just got a message that I needed to update my PyFunceble to a new version and I don't remember how I installed it in the 1st place :/

I hate to propose yet another way to implement it but if it was an isolated installation it could be packaged with its dependencies, its environment would be self-contained- regardless of Linux flavor, and updating would be as simple as "sudo snap refresh" which finds updates for any snaps that need updating.

Just a thought :)

Feature Request: sort hostfile output in a hierarchical order

The app currently sorts the domains in alphabetical order but I was wondering if it would be more readable in a hierarchical order.

Example hosts file sorted in hierarchical domain order:

So if I have a full domain and its subdomains as AAAA.BBBB.CCCC.DDDD.TLD
instead of sorting it purely like that I'd sort by DDDD then CCCC,BBBB,AAAA and finally the TLD. This puts all of the subdomains grouped together in the list.

Instead of all of Google's entries being spread all over a hosts file, they'd get clumped together based on the google.com domain.

0.0.0.0 adservice.google.com
0.0.0.0 googleadapis.l.google.com
0.0.0.0 s0-2mdn-net.l.google.com
0.0.0.0 ssl-google-analytics.l.google.com
0.0.0.0 www-google-analytics.l.google.com
0.0.0.0 pagead2.googleadservices.com
0.0.0.0 partner.googleadservices.com
0.0.0.0 www.googleadservices.com
0.0.0.0 googleadservices.com
0.0.0.0 ssl.google-analytics.com
0.0.0.0 www.google-analytics.com
0.0.0.0 google-analytics.com
0.0.0.0 chart.googleapis.com
0.0.0.0 ad-creatives-public.commondatastorage.googleapis.com
0.0.0.0 imasdk.googleapis.com
0.0.0.0 ade.googlesyndication.com
0.0.0.0 pagead2.googlesyndication.com
0.0.0.0 tpc.googlesyndication.com
0.0.0.0 www.googletagmanager.com
0.0.0.0 www.googletagservices.com
0.0.0.0 redirector.googlevideo.com
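
A sort key along these lines could produce that grouping. It is a minimal sketch, assuming the TLD is simply the last label (multi-label suffixes such as co.uk are not handled) and using hypothetical function names. Note that with this key a bare domain sorts before its own subdomains, which differs slightly from the hand-sorted list above:

```python
def hierarchy_key(domain):
    """Sort key: labels from right to left (TLD excluded), then the TLD as tie-breaker."""
    labels = domain.lower().rstrip(".").split(".")
    return (tuple(reversed(labels[:-1])), labels[-1])


def sort_hierarchically(domains):
    """Return the domains grouped by their parent domain."""
    return sorted(domains, key=hierarchy_key)


domains = [
    "www.google-analytics.com",
    "adservice.google.com",
    "google-analytics.com",
    "l.google.com",
    "pagead2.googlesyndication.com",
]
for domain in sort_hierarchically(domains):
    print(domain)
```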

--quiet needs some output to keep Travis-CI awake

Is your feature request related to a problem? Please describe.
--quiet of course gives no output (as expected) 😹 but it causes Travis-CI to fail because no output is received.

Describe the solution you'd like
A one line feedback message to Travis-CI every 30-60 seconds "PyFunceble - Testing in Progress"

Describe alternatives you've considered

Additional context
With multiprocess some builds can create very big logs so we eventually want PyFunceble to be quiet and not create build logs that are too big for Travis-CI
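
One way to get such a keep-alive line is a background thread that prints while the real work runs. This is a minimal sketch with a hypothetical helper name, not an existing PyFunceble option; the interval and message follow the request above:

```python
import contextlib
import sys
import threading


@contextlib.contextmanager
def heartbeat(interval=30.0, message="PyFunceble - Testing in Progress", stream=None):
    """Print `message` every `interval` seconds while the body of the `with` runs."""
    stream = stream if stream is not None else sys.stdout
    stop = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            print(message, file=stream, flush=True)

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        yield
    finally:
        stop.set()
        worker.join()
```

Wrapping a long, silent test run in `with heartbeat(): ...` keeps the build log small while still producing regular output.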

.pyfunceble.yaml

Bug description

Installing the -dev version for obtaining your latest commit from ``

Modifications under .PyFunceble.yaml

None, it's empty....

Reproduction

Steps to reproduce the behavior:

  1. sudo -H python3 -m pip install --upgrade git+https://gitlab.com/funilrys/PyFunceble@dev
  2. pyfunceble -m -p 4 -db --database-type mariadb -f
Try to merge upstream configuration file into /home/$user/.config/PyFunceble/.PyFunceble.yaml ? [y/n] y
Traceback (most recent call last):
  File "/usr/local/bin/pyfunceble", line 8, in <module>
    sys.exit(tool())
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/cli/__init__.py", line 1041, in tool
    raise exception
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/cli/__init__.py", line 1024, in tool
    PyFunceble.cconfig.Merge(PyFunceble.CONFIG_DIRECTORY)
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/config/merge.py", line 108, in __init__
    self._load()
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/config/merge.py", line 188, in _load
    self._save()
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/config/merge.py", line 143, in _save
    PyFunceble.helpers.Dict(self.new_config).to_yaml_file(self.path_to_config)
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/helpers/dict.py", line 345, in to_yaml_file
    sort_keys=sort_keys,
  File "/usr/lib/python3/dist-packages/yaml/__init__.py", line 200, in dump
    return dump_all([data], stream, Dumper=Dumper, **kwds)
TypeError: dump_all() got an unexpected keyword argument 'sort_keys'
ll /home/$USER/.config/PyFunceble/.PyFunceble.yaml
-rw-rw-r-- 1 $USER $USER 0 Nov 17 23:03  /home/$USER/.config/PyFunceble/.PyFunceble.yaml

Expected behavior

Leaving this to your imagination 😄

Screenshots

🌟 🎉 🍰 We now support IPv6! 🍰 🎉 🌟
A configuration key is missing.
Try to merge upstream configuration file into /home/$USER/.config/PyFunceble/.PyFunceble.yaml ? [y/n] n
Traceback (most recent call last):
  File "/usr/local/bin/pyfunceble", line 8, in <module>
    sys.exit(tool())
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/cli/__init__.py", line 1041, in tool
    raise exception
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/cli/__init__.py", line 1024, in tool
    PyFunceble.cconfig.Merge(PyFunceble.CONFIG_DIRECTORY)
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/config/merge.py", line 108, in __init__
    self._load()
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/config/merge.py", line 202, in _load
    raise PyFunceble.exceptions.ConfigurationFileNotFound()
PyFunceble.exceptions.ConfigurationFileNotFound

Versions

OS: for example Arch Linux (5.0.5-arch1-1-ARCH)
Python Version: Python 2.7.16, 3.7.3, 3.7.4
PyFunceble Version: for example 1.2.0

pyfunceble -v
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/cli/__init__.py", line 95, in tool
    PyFunceble.load_config(generate_directory_structure=False)
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/__init__.py", line 682, in load_config
    cconfig.Load(CONFIG_DIRECTORY, custom)
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/config/load.py", line 90, in __init__
    self.__load_it()
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/config/load.py", line 101, in __load_it
    self._load_config_file()
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/config/load.py", line 300, in _load_config_file
    self._install_iana_config()
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/config/load.py", line 409, in _install_iana_config
    iana_link = self.data["links"]["iana"]
TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/pyfunceble", line 8, in <module>
    sys.exit(tool())
  File "/usr/local/lib/python3.7/dist-packages/PyFunceble/cli/__init__.py", line 1039, in tool
    PyFunceble.LOGGER.exception()
AttributeError: 'NoneType' object has no attribute 'exception'

Additional context

Add any other context about the problem here.

Bug while testing (INVALID) URLs

@mitchellkrogza said:

Broken builds suddenly - https://travis-ci.org/mitchellkrogza/Phishing-URL-Testing-Database-of-Link-Statuses/jobs/532790699#L365-L366

Full traceback:

Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.7.1/bin/PyFunceble", line 10, in <module>
    sys.exit(_command_line())
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/__init__.py", line 1460, in _command_line
    link_to_test=ARGS.link,
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/dispatcher.py", line 122, in __init__
    FileCore(url_file_path, "url").read_and_test_file_content()
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/file_core.py", line 613, in read_and_test_file_content
    self._test_line(line)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/file_core.py", line 517, in _test_line
    status = self.__process_test(subject)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/file_core.py", line 367, in __process_test
    return self.url(subject)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/file_core.py", line 256, in url
    subject, subject_type="file_url", filename=self.file
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/status.py", line 700, in __init__
    "http_status_code": HTTPCode(self.subject, "url").get(),
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/http_code.py", line 188, in get
    http_code = self._access()
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/PyFunceble/http_code.py", line 146, in _access
    verify=PyFunceble.CONFIGURATION["verify_ssl_certificate"],
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/requests/api.py", line 101, in head
    return request('head', url, **kwargs)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/requests/sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/requests/sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/requests/models.py", line 313, in prepare
    self.prepare_url(url, params)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/requests/models.py", line 387, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '=======': No schema supplied. Perhaps you meant http://=======?
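
The traceback shows a line made only of "=" characters reaching the HTTP request. A guard of the following kind would classify such a line before any request is made. It is a sketch using only the standard library, with a hypothetical function name, and not the fix that was applied in PyFunceble:

```python
from urllib.parse import urlsplit


def is_testable_url(candidate):
    """Return True when the string has an http(s) scheme and a network location."""
    try:
        parts = urlsplit(candidate.strip())
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


for line in ["=======", "http://example.com/login", "example.com"]:
    print(line, is_testable_url(line))
```

Lines that fail the check could then be reported as INVALID instead of raising an exception.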

Print version of PyFunceble into percentage.txt file

Is your feature request related to a problem? Please describe.
It would be useful to have the version number printed into the percentage.txt file. This would help diagnose big time differences between tests and versions of dev.

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

-uf all invalid, but OK as local file

Bug description

If you use PyFunceble with -uf 'url', all records are marked as INVALID, but if you download the same file and then test it with -f, things run as expected.

Modifications under .PyFunceble.yaml

nothing

Reproduction

Steps to reproduce the behavior:

pyfunceble -m -p 8 -db --database-type mariadb -uf 'https://gitlab.com/my-privacy-dns/external-sources/antipopads/raw/master/hosts'

Subject                                                                                              Status      HTTP Code 
---------------------------------------------------------------------------------------------------- ----------- ----------
jyahmckzsbh.com                                                                                      INVALID     ***       
nuowoczmvits.com                                                                                     INVALID     ***       
dsdiztki.bid                                                                                         INVALID     ***       
lwtsrwwlfd.com                                                                                       INVALID     ***       
wftduglf.com                                                                                         INVALID     ***       
wget 'https://gitlab.com/my-privacy-dns/external-sources/antipopads/raw/master/hosts'
pyfunceble -m -p 8 -db --database-type mariadb -f hosts

Subject                                                                                              Status      HTTP Code 
---------------------------------------------------------------------------------------------------- ----------- ----------
xrkfqpbubaq.com                                                                                      ACTIVE      ***       
htabtzmi.bid                                                                                         INACTIVE    ***       

Expected behavior

Tests running equally

Versions

OS: Ubuntu Bionic
Python Version: 3.7
PyFunceble Version: pyfunceble 2.2.0. (Green Galago: Skitterbug)

Additional context

sudo -H python3 -m pip install --upgrade PyFunceble prior to the test...

with sudo -H python3 -m pip install --upgrade PyFunceble-dev

the result is:

pyfunceble -m -p 8 -db --database-type mariadb -uf 'https://gitlab.com/my-privacy-dns/external-sources/antipopads/raw/master/hosts'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/cli/__init__.py", line 95, in tool
    PyFunceble.load_config(generate_directory_structure=False)
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/__init__.py", line 682, in load_config
    cconfig.Load(CONFIG_DIRECTORY, custom)
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/config/load.py", line 90, in __init__
    self.__load_it()
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/config/load.py", line 101, in __load_it
    self._load_config_file()
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/config/load.py", line 300, in _load_config_file
    self._install_iana_config()
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/config/load.py", line 409, in _install_iana_config
    iana_link = self.data["links"]["iana"]
TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/pyfunceble", line 8, in <module>
    sys.exit(tool())
  File "/usr/local/lib/python3.6/dist-packages/PyFunceble/cli/__init__.py", line 1039, in tool
    PyFunceble.LOGGER.exception()
AttributeError: 'NoneType' object has no attribute 'exception'

-dns switch to use custom dns server is not working in Windows, still uses OS settings

Tested on - pyfunceble 2.6.6.dev (Green Galago: Skitterbug) with Python 3.7.4 (Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 19:29:22) [MSC v.1916 32 bit (Intel)] on win32)

-dns switch to use custom dns server is not working in Windows 7, still uses OS settings for DNS.

I also tried setting in .PyFunceble.yaml but it still goes to OS defined DNS server.

Is this a PyFunceble or Python issue?

[GUIDE] Running PyFunceble in Conda Virtual Environments

This is my recommended way of running PyFunceble on just about any distro.

@funilrys guided me on this some time ago and I would actually never run PyFunceble now in any other way so kudos on this goes to him.

In fact, I don't run anything to do with Python now without it running inside a Conda virtual environment. Distributions like Ubuntu are especially troublesome with Python issues, which are easily solved by just running Python in Conda environments.

@funilrys feel free to add to or improve this in any way.

# -------------------------------
# Setup Conda Python Environments
# -------------------------------

# 1. Add Conda Path to .bashrc (add line below to bottom of bashrc)
export PATH="${HOME}/miniconda/bin:${PATH}"

# 2. Reload your bashrc
source .bashrc

# 3. Download Conda
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh

# 4. Install Conda
bash miniconda.sh -b -p ${HOME}/miniconda

# 5. Setup Conda
hash -r
conda config --set always_yes yes --set changeps1 no

# 6. Update Conda
conda update -q conda

# 7. Create an Environment (EXAMPLE: creating an environment called pyfuncebletesting with Python version 3.7.3)
conda create -q -n pyfuncebletesting python="3.7.3"

# 8. Activate this environment you just created
source activate pyfuncebletesting

# 9. Query Python and Pip versions inside this environment
python -VV
pip --version

# 10. Install PyFunceble in this environment (pyfuncebletesting)
pip install PyFunceble

# 11. Create the directory where you are going to run PyFunceble and output results
mkdir /home/myuser/pyfuncebletesting

# 12. Export the Working Path to PyFunceble before running PyFunceble
export PYFUNCEBLE_CONFIG_DIR=/home/myuser/pyfuncebletesting/

# 13. Run PyFunceble Testing
PyFunceble -m -p 50 -ex --plain --idna -f mylist.txt

# 14. When finished - Deactivate the environment
source deactivate pyfuncebletesting

# Your results? Exactly where they should be: in the output/ folder inside the directory you created in step 11

# ----------------------------
# Run tests again another day?
# ----------------------------

# 1. First Update Conda
conda update -q conda

# 2. Activate your environment
source activate pyfuncebletesting

# 3. Upgrade your environment
pip install --upgrade pip
pip install PyFunceble --upgrade

# 4. Export the Path to PyFunceble before running PyFunceble
export PYFUNCEBLE_CONFIG_DIR=/home/myuser/pyfuncebletesting/

# 5. Run PyFunceble
PyFunceble -m -p 50 -ex --plain --idna -f mylist.txt

# 6. When finished - Deactivate the environment
source deactivate pyfuncebletesting

Print execution time into percentage.txt output file

Is your feature request related to a problem? Please describe.
Would be a nice feature to have the execution time recorded into the percentage.txt file.

Describe the solution you'd like
Just print the final execution time from the -ex parameter into the percentage.txt file, as shown on the console in the screenshot below.

Describe alternatives you've considered

Additional context
Screenshot_2019-05-26_16-49-37

Have PyFunceble prepare Travis-CI for committing

It would be great if PyFunceble could always prepare itself for committing back to the repo, so that we no longer have to do this in our repo scripts as below. It could just pull the environment variables and run this sequence before it runs --cmd or --cmd-before-end:

    git remote rm origin
    git remote add origin https://${GH_TOKEN}@github.com/${TRAVIS_REPO_SLUG}.git
    git config --global user.email "${GIT_EMAIL}"
    git config --global user.name "${GIT_NAME}"
    git config --global push.default simple
    git checkout "${GIT_BRANCH}"

IndexError: list index out of range

Installed latest stable and get this msg:

$ PyFunceble --version
PyFunceble 1.3.0. (Blue Bontebok: Dragonfly)
user@hp2570p:~/repos/PyFunceble-lists/blackjack$ PyFunceble -uf https://raw.githubusercontent.com/BlackJack8/iOSAdblockList/master/Hosts.txt

Traceback (most recent call last):
  File "/home/user/.local/bin/PyFunceble", line 11, in <module>
    sys.exit(_command_line())
  File "/home/user/.local/lib/python3.6/site-packages/PyFunceble/__init__.py", line 1120, in _command_line
    link_to_test=ARGS.link,
  File "/home/user/.local/lib/python3.6/site-packages/PyFunceble/core.py", line 129, in __init__
    self._entry_management()
  File "/home/user/.local/lib/python3.6/site-packages/PyFunceble/core.py", line 301, in _entry_management
    self.file_url()
  File "/home/user/.local/lib/python3.6/site-packages/PyFunceble/core.py", line 1064, in file_url
    PyFunceble.repeat(list_to_test[-1]),
IndexError: list index out of range

Appkit path guessing not working

Describe the bug
AppKit for macOS is not working, and may not even be needed.

As far as I can see, the AppKit Python module is used to figure out what the config directory on macOS should be. I do not see why this needs to be any different from what Linux does, so I changed line 108 to

if system().lower() == "linux" or system().lower() == "darwin":

Works flawlessly

Modifications under .PyFunceble.yaml
nothing

To Reproduce
Steps to reproduce the behavior:

  1. python setup.py install
  2. pyfunceble google.de

Expected behavior
Working script telling me google.de is valid domain

Screenshots
srsly? Screenshots of a terminal?

Traceback (most recent call last):
  File "./pyfunceble", line 11, in <module>
    load_entry_point('PyFunceble==1.0.0', 'console_scripts', 'pyfunceble')()
  File "/Users/sebastian/.pyenv/versions/funceble/lib/python3.6/site-packages/pkg_resources/__init__.py", line 487, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/Users/sebastian/.pyenv/versions/funceble/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2728, in load_entry_point
    return ep.load()
  File "/Users/sebastian/.pyenv/versions/funceble/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2346, in load
    return self.resolve()
  File "/Users/sebastian/.pyenv/versions/funceble/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2352, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/Users/sebastian/.pyenv/versions/funceble/lib/python3.6/site-packages/PyFunceble-1.0.0-py3.6.egg/PyFunceble/__init__.py", line 141, in <module>
    from AppKit import (  # pylint: disable=import-error
ModuleNotFoundError: No module named 'AppKit'

Versions (please complete the following information):

  • OS: macos mojave
  • Python Version: 3.6.7
  • PyFunceble Version: master

Additional context
Add any other context about the problem here.

[Feature request]: Generate WWW list

Describe the solution you'd like
Generate an additional list with WWW subdomains.
Some lists contain entries with the www subdomain, such as WWW.BEISPIEL.COM, while many others simply omit them. I therefore suggest generating, for each repository, an additional WWW list alongside the ACTIVE, INVALID, etc. lists.
This would be very helpful for many people.

Additional context
The folder structure should then look like this.
ACTIVE
INACTIVE
INVALID
VALID
WWW
.keep
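
As a rough illustration of the request, here is a minimal Python sketch that derives the missing www entries from a list of tested domains. The function name `www_variants` is hypothetical and not part of PyFunceble; how the result would be written into a WWW folder is left open.

```python
def www_variants(domains):
    """Return the 'www.' counterpart of every domain that lacks one.

    Hypothetical helper for illustration only, not a PyFunceble API.
    """
    seen = set(domains)
    variants = []
    for domain in domains:
        if domain.startswith("www."):
            # Already a www entry, nothing to generate.
            continue
        candidate = "www." + domain
        if candidate not in seen:
            seen.add(candidate)
            variants.append(candidate)
    return variants


# example.org already has its www entry in the list, so only one is generated.
print(www_variants(["example.com", "www.example.org", "example.org"]))
```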

The program freezes every 2 - 5 hours

Bug description

The program freezes every 2 or 3 hours.

Data Sheet:
Hardware / Software: Intel Xeon 2 core, 16 GB RAM, HDD 2 TB, Ubuntu 18.04.2 LTS x64
python -V: 2.7.15+
pip3 --version: pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)
pip --version: pip 9.0.1 from /usr/lib/python2.7/dist-packages (python 2.7)
run: PyFunceble -m -p 150 -f list
Installation method: pip3 install -r requirements.txt && pip3 install PyFunceble
File to be processed: Unix/Linux list format (UTF-8 .txt) with 5 million lines of domains

Measures taken: CTRL + C and run it again
The -p parameter has been changed (-p 200, -p 150, -p 100, -p 50 and -p has been removed), and the result is the same. I have tested on other computers with superior hardware and the result is the same

PS: What is the recommended (error-proof) installation method?
Problem with uninstall with this method:
#38 (comment)

Git Large File Storage (LFS) Support

Is your feature request related to a problem? Please describe.

Yes, indeed a problem: commits fail when a file exceeds GitHub's 100 MB file size limit. This was seen on https://github.com/mitchellkrogza/Phishing.Database all of yesterday, with no error message to help us discover what was going wrong in the PyFunceble commits.

I traced the error this morning by reintroducing my own commit script which then revealed the error message being thrown back when we push our commit.

remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com. 
remote: error: File input-source/ALL-feeds-URL.list is 127.30 MB; this exceeds GitHub's file size limit of 100.00 MB

See > https://travis-ci.org/mitchellkrogza/Phishing.Database/builds/534100063

Describe the solution you'd like
I fixed this and got commit to work by simply adding this into my commit script

git lfs install
git lfs track "*.list"

We should have this LFS support above included into PyFunceble

Additionally, we should improve the way PyFunceble gets errors back from Git: PyFunceble should capture any message containing remote: error: and report it back to us, so we know we have hit a problem. Right now, such failures lead us in circles when we try to diagnose them.
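
A minimal Python sketch of that idea, assuming the push is run through `subprocess` (Git prints the `remote:` messages on stderr). The names `remote_errors` and `push_and_report` are hypothetical and do not reflect how PyFunceble actually runs Git.

```python
import subprocess


def remote_errors(git_output):
    """Collect the 'remote: error:' lines from git's output text."""
    return [
        line.strip()
        for line in git_output.splitlines()
        if line.strip().startswith("remote: error:")
    ]


def push_and_report(args=("git", "push")):
    """Run git push and return (exit_code, remote error lines).

    Hypothetical wrapper: the command and its arguments are assumptions.
    """
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Remote messages are relayed on stderr, so that is what gets scanned.
    return result.returncode, remote_errors(result.stderr)
```

A caller could then print the returned lines, or fail the build, whenever the list is not empty.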

Describe alternatives you've considered
Alternatives would be

a) to split all our list files on various projects into chunks of 20 MB and adapt our scripts to loop through them, thereby keeping all list files of domains and URLs well under the 20 MB limit

b) to zip them before committing and unzip them as a new build starts. This could lead to very large objects, causing our repo sizes to grow quickly and requiring regular use of BFG Repo Cleaner

Additional context

Working commit using LFS > https://travis-ci.org/mitchellkrogza/Phishing.Database/builds/534100948

Error in INVALID output

Bug description

Found a few misses in the INVALID/list after testing https://raw.githubusercontent.com/Clefspeare13/pornhosts/master/domains%20to%20check.txt

cat output/domains/INVALID/list
# File generated by PyFunceble (v2.45.0.dev) / https://github.com/funilrys/PyFunceble
# Date of generation: 2019-11-18T01:14:07.846084

a
added
and
checked
collectionofbestporn
command,
curl
did
double
file
hosts
list
needed.
not
of
properly
respond
the
which
will
www.adultcashtraffic.com
www.blog.gfrevenge.com
www.hqporner.comstudiowow-girls
www.largehdtube.comen
www.media.the-adult-company.com
www.pornblade.comcategoryanal-porn
www.porndoepremium.comcategories
www.sexdating
www.spankbang
yallainternethotnig

The errors I see are:

www.adultcashtraffic.com
www.blog.gfrevenge.com
www.media.the-adult-company.com
gfrevenge.com
   Domain Name: GFREVENGE.COM
   Registry Domain ID: 1312074137_DOMAIN_COM-VRSN
   Registrar WHOIS Server: whois.eurodns.com
   Registrar URL: http://www.EuroDNS.com
   Updated Date: 2019-10-27T04:31:37Z
   Creation Date: 2007-11-02T00:31:15Z
   Registry Expiry Date: 2020-11-02T00:31:15Z
   Registrar: EuroDNS S.A.
   Registrar IANA ID: 1052
   Registrar Abuse Contact Email: [email protected]
   Registrar Abuse Contact Phone: +352.27220150
   Domain Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
   Name Server: DNS1.P03.NSONE.NET
   Name Server: DNS2.P03.NSONE.NET
   Name Server: DNS3.P03.NSONE.NET
   Name Server: DNS4.P03.NSONE.NET
   Name Server: SDNS3.ULTRADNS.BIZ
   Name Server: SDNS3.ULTRADNS.COM
   Name Server: SDNS3.ULTRADNS.NET
   Name Server: SDNS3.ULTRADNS.ORG
   DNSSEC: unsigned
whois adultcashtraffic.com
Expired 
whois the-adult-company.com

   Domain Name: THE-ADULT-COMPANY.COM
   Registry Domain ID: 1370101655_DOMAIN_COM-VRSN
   Registrar WHOIS Server: whois.safenames.net
   Registrar URL: http://www.safenames.net
   Updated Date: 2019-08-04T05:05:18Z
   Creation Date: 2008-01-03T13:29:55Z
   Registry Expiry Date: 2021-01-03T13:29:55Z
   Registrar: SafeNames Ltd.
   Registrar IANA ID: 447
   Registrar Abuse Contact Email: [email protected]
   Registrar Abuse Contact Phone: +44.1908200022
   Domain Status: clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited
   Domain Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
   Domain Status: clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited
   Name Server: NS1.XMODELS-LIVE.CH
   Name Server: NS2.XMODELS-LIVE.CH
   Name Server: NS3.XMODELS-LIVE.CH
   DNSSEC: unsigned

This means two out of three should have been added to the ACTIVE/list.

Reproduction

pyfunceble --plain -h -m -p 4 -db --database-type mariadb -f 'https://raw.githubusercontent.com/Clefspeare13/pornhosts/master/domains%20to%20check.txt'

Expected behavior

Only invalid entries, or entries with an invalid TLD, are added to INVALID/list.

Versions

OS: Disco
Python Version: 3.7.3
PyFunceble Version: pyfunceble -v pyfunceble 2.45.0.dev (Green Galago: Skitterbug)

Additional context

Have you seen the issue on GitLab?

Add a function to check a list of URLs for invalid URLs only.

Is your feature request related to a problem? Please describe.
It takes far too long to test all URLs of a list (slow Internet connection).
It would therefore be very good if a list of URLs could be checked for incorrect URLs only.
No DNS query or similar would be made, only a check that every URL is well formed and free of errors such as a dot at the end of a URL.

Describe the solution you'd like
When I check a list of domains I only want to check if all URLs are correct.
And if there are invalid URLs these should be output in a file.

Describe alternatives you've considered
The alternative is to check everything manually, but it takes too much time.

Additional context
Example domains for testing:

0.0.0.0 194.58.122.146\032stratum.aikapool.com
0.0.0.0 POOL.moneropool.com
0.0.0.0 aikapool.com.
0.0.0.0 VPWCHCDEVWEB001.cryptopia.co.nz
0.0.0.0 aikapool.comwww.aikapool.com

All these URLs should be output to an extra file with a switch --invalidcheck.
Such a feature is very useful because it allows you to create accurate updates and save a lot of time.

Expiration date not detected

@hawkeye116477 said:

I'd also like to inform you that your script wrongly detects some domains. For example, pogotowie-komputerowe-warszawa.com.pl is still registered. I assume that's probably because it doesn't correctly extract the expiration date for those domains.

Domain Syntax test not 100% accurate

Describe the bug
When testing for domain syntax a valid case is rejected with "INVALID".

Modifications under .PyFunceble.yaml
None

To Reproduce
Steps to reproduce the behavior:

$ pyfunceble -d google.de. -s
google.de. INVALID
$ curl google.de.
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.de/">here</A>.

Expected behavior
Domains ending with a dot are verified valid and tested if they are available

Versions (please complete the following information):

  • OS: macos mojave
  • Python Version: 3.6.7
  • PyFunceble Version: pyfunceble 1.3.0. (Blue Bontebok: Dragonfly)

Additional context
I guess this is nitpicking over what is a valid domain syntax and what is not. Also, the mentioned syntax might be quite uncommon.
But if syntax validation is offered, the check should return reliable results.
Additional Information http://www.dns-sd.org./TrailingDotsInDomainNames.html
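
To make the expectation concrete, here is a minimal Python sketch of a syntax check that accepts one trailing dot (the fully qualified form) before validating the labels. It is an illustration only, not PyFunceble's actual checker. The function name `is_valid_domain_syntax`, the two-label minimum and the letters-digits-hyphen rule are all assumptions.

```python
import re

# One DNS label: 1 to 63 letters, digits or hyphens, no hyphen at either end.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_domain_syntax(subject):
    """Syntax-only check that tolerates a single trailing dot.

    Hypothetical helper for illustration, not a PyFunceble API.
    """
    # A single trailing dot marks a fully qualified name: drop it first.
    name = subject[:-1] if subject.endswith(".") else subject
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    # Assumption: a testable domain has at least a name and a TLD.
    if len(labels) < 2:
        return False
    return all(_LABEL.match(label) for label in labels)


print(is_valid_domain_syntax("google.de."))
print(is_valid_domain_syntax("google.de.."))
```

Only the final dot is stripped, so a subject ending in two dots still contains an empty label and is rejected.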

Proposals to improve PyFunceble

Problems:

  1. The installation method (described here) is confusing and does not match the manual (some commands require privileges and others do not, the installation is mixed with the execution, and the env, etc)
  2. The minimum hardware and OS resources are unknown
  3. It has no debug mode or logs, therefore, there will be no information when an error occurs (For example, sometimes it freezes, without being able to determine the cause)
  4. There are inconsistencies between what the manual says and the creator's suggestions on issues
  5. There are no technical performance data, no warnings about program consumption and how to control it. I have consulted other projects that use this program and do not provide this technical data either
  6. It becomes unstable and collapses or freezes if large lists (+ 3 M) are used

Possible bugs:

  1. Freezing: The program crashes on Ubuntu 18.04.x x64 and large lists (+ 3 M) and the only way to unlock it is with ctrl+c. It happens with small and large lists. The cause is unknown because the program has no debug mode or logs
  2. Wrong instructions: According to the instructions, when ctrl+c is pressed to interrupt the program, the program must be executed with the --clean flag. This is very bad because all work is lost
  3. Warnings: The --clean flag must have a warning of what it does, to avoid partial or total loss of work
  4. auto-continue system fail: The auto-continue system is failing, since when the program is interrupted or frozen, it does not start where it was left, and as a result it is generating duplicates in the output.
  5. Inconsistencies in the output: When processing a list, 3 files are generated in the hosts folder (ACTIVE/hosts, INACTIVE/hosts, INVALID/hosts). However, once the source list has been processed, if we take, for example, the INACTIVE/hosts file and reprocess it, the output should in theory be the same. This is not the case, because this inactive list can become partially active. So the result is not reliable.
  6. Run in modes and log file: It is necessary that the program has execution modes (debugging, safe, normal, minimal, etc.) so that it does not compromise the stability of the system and know more thoroughly the problems that may arise. The program also needs a log file to facilitate the audit and diagnosis of problems.
  7. Virtual Env: The suggested virtual environment (python3-virtualenv) is not working as it should

Hardware Test:
I have performed different tests in physical environments with Ubuntu 18.04.3 x64 and large lists (+ 3 M). This is the result:

PC1: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz, RAM 32028 MiB
a. PyFunceble -m -p 200 -f file = system collapses
b. PyFunceble -m -p 150 -f file = freezes after a while running
c. PyFunceble -m -p 100 -f file = freezes after a while running
d. PyFunceble -m -p 50 -f file = test aborted; see 'CPU usage'
e. PyFunceble -f file = stable but slower than a bash script

PC2: Intel(R) Xeon(TM) CPU E5-2603 v4 @ 1.70 GHz, RAM 15903 MiB
a. PyFunceble -m -p 200 -f file = system collapses
b. PyFunceble -m -p 150 -f file = freezes after a while running
c. PyFunceble -m -p 100 -f file = freezes after a while running
d. PyFunceble -m -p 50 -f file = freezes after a while running
e. PyFunceble -f file = stable but slower than a bash script

CPU usage:
In all tests the program reaches 100% CPU usage with large lists (+ 3 M).
Screenshot 2019-08-07 11-48-21

Speed test: PyFunceble vs bash
Bash:
#!/bin/bash
# Print the HTTP status code followed by the URL, one URL per line of source.txt.
while IFS= read -r LINE; do
    curl -o /dev/null --silent --head --write-out '%{http_code}' "$LINE"
    echo " $LINE"
done < source.txt
PyFunceble:
PyFunceble -f source.txt
Results after +1 hour:
PyFunceble: 1364 processed lines (in hosts/ACTIVE hosts/INACTIVE hosts/INVALID)
Bash: 2930 processed lines

Conclusion:
This application is only faster than a simple bash script with the "-m -p" flags, but with them it becomes unstable and freezes or collapses the system. I suggest that it be improved in this regard so that it is usable. Regards

database-type mysql

Bug description

When I try to use the --database-type mysql on travis I get the following error

Modifications under .PyFunceble.yaml

cat dev-tools/.pyfunceble-env 
PYFUNCEBLE_DB_CHARSET=utf8mb4
PYFUNCEBLE_DB_HOST=localhost
PYFUNCEBLE_DB_NAME=PyFunceble
PYFUNCEBLE_DB_PASSWORD=
PYFUNCEBLE_DB_PORT=3306
PYFUNCEBLE_DB_USERNAME=root

These should be the correct settings to use, according to the Travis documentation.

Reproduction

Test string:

PyFunceble --travis -h -db -m -p 4 -ex --dns 127.0.0.1 --cmd-before-end \
        "bash ${TRAVIS_BUILD_DIR}/dev-tools/FinalCommit.sh" --plain --autosave-minutes 20 \
        --commit-autosave-message "V1.${yeartag}.${monthtag}.${TRAVIS_BUILD_NUMBER}  [Auto Saved]" \
        --commit-results-message "V1.${yeartag}.${monthtag}.${TRAVIS_BUILD_NUMBER}" \
        -f ${testfile}

Tested on the dev version, installed with:

install:
  - pip3 install --upgrade pip
  - pip3 install PyFunceble-dev

Expected behavior

A clear and concise description of what you expected to happen.

Screenshots

If applicable, add screenshots to help explain your problem.

Versions

PyFunceble Version: Dev

Additional context

Onion addresses are not an invalid URL

Bug description

When checking the URLs in a list, .onion addresses are always recognized as invalid.

Expected behavior

I expect that .onion addresses will simply be skipped.

Example of a .onion address from my list.

# File generated by PyFunceble (v2.17.9.dev) / https://github.com/funilrys/PyFunceble
# Date of generation: Sun 03 Nov 23:11:59 CET 2019 

xrqwig7erykgll4z.onion

System

  • Kernel: 5.3.8-arch1-1 x86_64
  • Desktop: Cinnamon 4.2.4
  • Distro: Arch Linux
  • Shell: bash 5.0.11

Url Testing --url-file marking domains with port numbers like :80 :81 :8080 etc as INVALID

Is your feature request related to a problem? Please describe.
URL testing marks links that have a port number after the domain name (:80, :81, :8080, etc.) as INVALID and never tests them.
https://github.com/mitchellkrogza/Phishing.Database/tree/master/phishing-links/output/domains/INVALID/list

Describe the solution you'd like
Test all links that include a random port number
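
One way to picture this, as a minimal Python sketch using the standard library: split the port off before the domain syntax check, so that the port alone never makes a link INVALID. The function name `split_host_and_port` is hypothetical, and this is not how PyFunceble parses URLs.

```python
from urllib.parse import urlsplit


def split_host_and_port(url):
    """Return (hostname, port) for a URL that includes a scheme.

    Hypothetical helper for illustration. The port is None when the URL
    does not specify one. Without a scheme such as http://, urlsplit
    does not recognise the host part.
    """
    parts = urlsplit(url)
    return parts.hostname, parts.port


print(split_host_and_port("http://example.com:8080/login"))
print(split_host_and_port("http://example.com/login"))
```

The hostname returned here could be passed to the domain syntax check, and the full URL kept for the HTTP test.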

Describe alternatives you've considered

Additional context
Add any other context or screenshots about the feature request here.

We should not produce output in a certain case

We should not write or produce output if an element which is in the database is still ACTIVE or INVALID on retest.

@dnmTX said (anudeepND/blacklist#27 (comment)):

@funilrys I got your point, but it makes me wonder what good they are doing in a folder that is designed to collect invalid domains that came from the original lists during filtering. In our case here they're no longer present there (in the original lists). Maybe a subfolder for collecting "old, no longer present invalid domains"? They could pile up there and keep the main folder tight, with only the fresh ones.

Failing after a few tests

Proposal logo for PyFunceble

Hi Sir
If you are interested, I will donate a logo for your project. However, before I start on it I need to ask your permission first ;). If I get permission, I will need details of what you want the logo to look like 😄

PyFunceble needs some improvements in its adblocker syntax support

Bug description

PyFunceble 2.2.0 does not seem to me to parse:

Modifications under .PyFunceble.yaml

None that I'm aware of.

Reproduction

Steps to reproduce the behavior:

  1. Run PyFunceble -ad -f https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/badware.txt
  2. See that only 19 of the ~180 domains in that list get processed by PyFunceble.

Expected behavior

See that ~180 domains get processed by PyFunceble.

Screenshots

image

Versions

OS: Windows 10 May 2019 Update
Python Version: 3.6.8, according to Cygwin
PyFunceble Version: 2.2.0

Additional context

||example.org$document is a uBlock Origin-created syntax that remains unsupported in ABP-likes or in AdGuard, and it's considered a very important syntax to use in uBO-specific lists, especially anti-malware ones.
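
For illustration, a minimal Python sketch of extracting the domain from such a filter line, whatever option follows the `$`. The function name `domain_from_filter` and the regular expression are assumptions. This is not PyFunceble's AdBlock parser and it covers only the simple `||domain` form.

```python
import re

# "||" + domain, optionally followed by "^", "/" or "$" and anything after it.
_FILTER = re.compile(r"^\|\|([a-z0-9.-]+)(?:[\^/$].*)?$", re.IGNORECASE)


def domain_from_filter(line):
    """Return the domain of a '||domain...' filter line, or None.

    Hypothetical helper for illustration only.
    """
    line = line.strip()
    # Skip blanks, comments ("!") and exception rules ("@@").
    if not line or line.startswith("!") or line.startswith("@@"):
        return None
    match = _FILTER.match(line)
    return match.group(1).lower() if match else None


print(domain_from_filter("||example.org$document"))
print(domain_from_filter("||example.org^$third-party"))
```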

Memory leak? at least a huge memory consumption

Bug description

A clear and concise description of what the bug is.
I'm running this app (and nothing else) on a virtual host with 4 GB of RAM, but after a while it has eaten all the memory and finally fails.

$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3850        3014         326          29         509         596
Swap:             0           0           0
5 minutes later...
$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3850        3097         242          29         510         513
Swap:             0           0           0

Modifications under .PyFunceble.yaml

Ask @mitchellkrogza running on his Badd-Boyz-Hosts

Reproduction

Steps to reproduce the behavior:
Open source at https://github.com/spirillen/Dead-Domains

Expected behavior

Memory should be freed as each query is processed and written to disk.

Versions

OS: Debian 10, (Buster)
Python Version: Python 3.7.3
PyFunceble Version: pyfunceble 2.2.0. (Green Galago: Skitterbug)

Additional context

None

Feature Request: convert IDN's to punycode

I've seen this done in a pfSense add-on called pfBlockerNG. It will consume a bunch of DNSBL host files and also convert any International Domain Names into the appropriate punycode.

https://en.wikipedia.org/wiki/Punycode

https://nakedsecurity.sophos.com/2017/04/19/phishing-with-punycode-when-foreign-letters-spell-english-words/

https://thehackernews.com/2017/04/unicode-Punycode-phishing-attack.html

https://arstechnica.com/information-technology/2017/04/chrome-firefox-and-opera-users-beware-this-isnt-the-apple-com-you-want/

So instead of outputting https://www.аррӏе.com you'd get https://www.xn--80ak6aa92e.com/ since it's actually using Cyrillic to obfuscate www.apple.com.

You can copy & paste that first apple url into your address bar to see that it takes you to a proof-of-concept site.

Sample code for converting Unicode to ASCII is linked on the Wikipedia page and in the pfBlockerNG add-on.
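
A minimal Python sketch of the conversion, using the `punycode` codec from the standard library and adding the `xn--` prefix per label. The function name `to_punycode` is hypothetical. The sketch also deliberately skips the IDNA normalisation and validation steps that a full implementation would apply first.

```python
def to_punycode(hostname):
    """Convert each non-ASCII label of a hostname to its xn-- form.

    Hypothetical helper for illustration only: no IDNA mapping or
    validation is performed before encoding.
    """
    labels = []
    for label in hostname.split("."):
        if all(ord(char) < 128 for char in label):
            # Plain ASCII labels are kept as they are.
            labels.append(label)
        else:
            encoded = label.encode("punycode").decode("ascii")
            labels.append("xn--" + encoded)
    return ".".join(labels)


print(to_punycode("www.b\u00fccher.example"))
```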

Feature request - Testing of urls

Would be great to see a feature added to achieve the following

  • test a full URL (or an input file of URLs), not just the domains

  • test only for http status codes 200 OK, 404 NOT FOUND, 410 GONE and 403 FORBIDDEN (no whois checks so will be much faster)

  • produce simple output files of URLs ACTIVE, URLS INACTIVE
