
autoproxy's People

Contributors

dchrostowski, dependabot[bot]


autoproxy's Issues

GET //listjobs.json fails

When running docker-compose up scrapyd spider_scheduler, the following errors occur:

[+] Running 4/4
 ⠿ Container autoproxy_redis      Recreated                                                                                       0.3s
 ⠿ Container autoproxy_db         Recreated                                                                                       0.3s
 ⠿ Container autoproxy_scrapyd    Recreated                                                                                       0.1s
 ⠿ Container autoproxy_scheduler  Recreated                                                                                       0.1s
Attaching to autoproxy_scheduler, autoproxy_scrapyd
autoproxy_scrapyd    | Adding password for user scrapy
autoproxy_scrapyd    | /start/scrapy.cfg
autoproxy_scrapyd    | Starting nginx: nginx.
autoproxy_scrapyd    | 2023-01-11T00:49:42+0000 [-] Loading /usr/local/lib/python3.7/dist-packages/scrapyd/txapp.py...
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [-] Basic authentication disabled as either `username` or `password` is unset
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [-] Scrapyd web console available at http://127.0.0.1:6801/
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [-] Loaded.
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 20.3.0 (/usr/bin/python3 3.7.3) starting up.
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [-] Site starting on 6801
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x7fc560f26e10>
autoproxy_scrapyd    | 2023-01-11T00:49:43+0000 [Launcher] Scrapyd 1.3.0 started: max_proc=32, runner='scrapyd.runner'
autoproxy_scrapyd    | 2023-01-11T00:49:44+0000 [_GenericHTTPChannelProtocol,0,127.0.0.1] Unhandled Error
autoproxy_scrapyd    | 	Traceback (most recent call last):
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/http.py", line 2284, in allContentReceived
autoproxy_scrapyd    | 	    req.requestReceived(command, path, version)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/http.py", line 946, in requestReceived
autoproxy_scrapyd    | 	    self.process()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/server.py", line 235, in process
autoproxy_scrapyd    | 	    self.render(resrc)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/server.py", line 302, in render
autoproxy_scrapyd    | 	    body = resrc.render(self)
autoproxy_scrapyd    | 	--- <exception caught here> ---
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/webservice.py", line 21, in render
autoproxy_scrapyd    | 	    return JsonResource.render(self, txrequest).encode('utf-8')
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/utils.py", line 21, in render
autoproxy_scrapyd    | 	    r = resource.Resource.render(self, txrequest)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/resource.py", line 265, in render
autoproxy_scrapyd    | 	    return m(request)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/webservice.py", line 88, in render_POST
autoproxy_scrapyd    | 	    spiders = get_spider_list(project, version=version)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/utils.py", line 134, in get_spider_list
autoproxy_scrapyd    | 	    raise RuntimeError(msg.encode('unicode_escape') if six.PY2 else msg)
autoproxy_scrapyd    | 	builtins.RuntimeError: Traceback (most recent call last):
autoproxy_scrapyd    | 	  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
autoproxy_scrapyd    | 	    "__main__", mod_spec)
autoproxy_scrapyd    | 	  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
autoproxy_scrapyd    | 	    exec(code, run_globals)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/runner.py", line 46, in <module>
autoproxy_scrapyd    | 	    main()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/runner.py", line 43, in main
autoproxy_scrapyd    | 	    execute()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/cmdline.py", line 145, in execute
autoproxy_scrapyd    | 	    cmd.crawler_process = CrawlerProcess(settings)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/crawler.py", line 267, in __init__
autoproxy_scrapyd    | 	    super(CrawlerProcess, self).__init__(settings)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/crawler.py", line 145, in __init__
autoproxy_scrapyd    | 	    self.spider_loader = _get_spider_loader(settings)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/crawler.py", line 347, in _get_spider_loader
autoproxy_scrapyd    | 	    return loader_cls.from_settings(settings.frozencopy())
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/spiderloader.py", line 61, in from_settings
autoproxy_scrapyd    | 	    return cls(settings)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/spiderloader.py", line 25, in __init__
autoproxy_scrapyd    | 	    self._load_all_spiders()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
autoproxy_scrapyd    | 	    for module in walk_modules(name):
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/utils/misc.py", line 73, in walk_modules
autoproxy_scrapyd    | 	    submod = import_module(fullpath)
autoproxy_scrapyd    | 	  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
autoproxy_scrapyd    | 	    return _bootstrap._gcd_import(name[level:], package, level)
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
autoproxy_scrapyd    | 	  File "/tmp/autoproxy-1673398183-akv6mkes.egg/autoproxy/spiders/ip_adress.py", line 7, in <module>
autoproxy_scrapyd    | 	ModuleNotFoundError: No module named 'scrapy_autoproxy'
autoproxy_scrapyd    |
autoproxy_scrapyd    |
autoproxy_scrapyd    | 2023-01-11T00:49:44+0000 [twisted.python.log#info] "127.0.0.1" - - [11/Jan/2023:00:49:43 +0000] "POST /addversion.json HTTP/1.0" 200 2308 "-" "Python-urllib/3.7"
autoproxy_scheduler  | Packing version 1673398183
autoproxy_scheduler  | Deploying to project "autoproxy" in http://scrapyd:6800/addversion.json
autoproxy_scheduler  | Server response (200):
autoproxy_scheduler  | {"node_name": "3a4c186ac61f", "status": "error", "message": "Traceback (most recent call last):\n  File \"/usr/lib/python3.7/runpy.py\", line 193, in _run_module_as_main\n    \"__main__\", mod_spec)\n  File \"/usr/lib/python3.7/runpy.py\", line 85, in _run_code\n    exec(code, run_globals)\n  File \"/usr/local/lib/python3.7/dist-packages/scrapyd/runner.py\", line 46, in <module>\n    main()\n  File \"/usr/local/lib/python3.7/dist-packages/scrapyd/runner.py\", line 43, in main\n    execute()\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/cmdline.py\", line 145, in execute\n    cmd.crawler_process = CrawlerProcess(settings)\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/crawler.py\", line 267, in __init__\n    super(CrawlerProcess, self).__init__(settings)\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/crawler.py\", line 145, in __init__\n    self.spider_loader = _get_spider_loader(settings)\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/crawler.py\", line 347, in _get_spider_loader\n    return loader_cls.from_settings(settings.frozencopy())\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/spiderloader.py\", line 61, in from_settings\n    return cls(settings)\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/spiderloader.py\", line 25, in __init__\n    self._load_all_spiders()\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/spiderloader.py\", line 47, in _load_all_spiders\n    for module in walk_modules(name):\n  File \"/usr/local/lib/python3.7/dist-packages/scrapy/utils/misc.py\", line 73, in walk_modules\n    submod = import_module(fullpath)\n  File \"/usr/lib/python3.7/importlib/__init__.py\", line 127, in import_module\n    return _bootstrap._gcd_import(name[level:], package, level)\n  File \"<frozen importlib._bootstrap>\", line 1006, in _gcd_import\n  File \"<frozen importlib._bootstrap>\", line 983, in _find_and_load\n  File \"<frozen importlib._bootstrap>\", line 967, in _find_and_load_unlocked\n  File \"<frozen importlib._bootstrap>\", line 668, in _load_unlocked\n  File \"<frozen importlib._bootstrap>\", line 638, in _load_backward_compatible\n  File \"/tmp/autoproxy-1673398183-akv6mkes.egg/autoproxy/spiders/ip_adress.py\", line 7, in <module>\nModuleNotFoundError: No module named 'scrapy_autoproxy'\n"}
autoproxy_scheduler  |
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): scrapyd:6800
autoproxy_scrapyd    | 2023-01-11T00:49:44+0000 [twisted.python.log#info] "127.0.0.1" - - [11/Jan/2023:00:49:44 +0000] "GET /daemonstatus.json HTTP/1.0" 200 89 "-" "python-requests/2.22.0"
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:http://scrapyd:6800 "GET //daemonstatus.json HTTP/1.1" 200 89
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): scrapyd:6800
autoproxy_scrapyd    | 2023-01-11T00:49:44+0000 [twisted.python.log#info] "127.0.0.1" - - [11/Jan/2023:00:49:44 +0000] "GET /listprojects.json HTTP/1.0" 200 71 "-" "python-requests/2.22.0"
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:http://scrapyd:6800 "GET //listprojects.json HTTP/1.1" 200 71
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): scrapyd:6800
autoproxy_scrapyd    | 2023-01-11T00:49:45+0000 [_GenericHTTPChannelProtocol,3,127.0.0.1] Unhandled Error
autoproxy_scrapyd    | 	Traceback (most recent call last):
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/http.py", line 2284, in allContentReceived
autoproxy_scrapyd    | 	    req.requestReceived(command, path, version)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/http.py", line 946, in requestReceived
autoproxy_scrapyd    | 	    self.process()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/server.py", line 235, in process
autoproxy_scrapyd    | 	    self.render(resrc)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/server.py", line 302, in render
autoproxy_scrapyd    | 	    body = resrc.render(self)
autoproxy_scrapyd    | 	--- <exception caught here> ---
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/webservice.py", line 21, in render
autoproxy_scrapyd    | 	    return JsonResource.render(self, txrequest).encode('utf-8')
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/utils.py", line 21, in render
autoproxy_scrapyd    | 	    r = resource.Resource.render(self, txrequest)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/twisted/web/resource.py", line 265, in render
autoproxy_scrapyd    | 	    return m(request)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/webservice.py", line 114, in render_GET
autoproxy_scrapyd    | 	    spiders = get_spider_list(project, runner=self.root.runner, version=version)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/utils.py", line 134, in get_spider_list
autoproxy_scrapyd    | 	    raise RuntimeError(msg.encode('unicode_escape') if six.PY2 else msg)
autoproxy_scrapyd    | 	builtins.RuntimeError: Traceback (most recent call last):
autoproxy_scrapyd    | 	  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
autoproxy_scrapyd    | 	    "__main__", mod_spec)
autoproxy_scrapyd    | 	  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
autoproxy_scrapyd    | 	    exec(code, run_globals)
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/runner.py", line 46, in <module>
autoproxy_scrapyd    | 	    main()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapyd/runner.py", line 43, in main
autoproxy_scrapyd    | 	    execute()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/cmdline.py", line 114, in execute
autoproxy_scrapyd    | 	    settings = get_project_settings()
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/utils/project.py", line 69, in get_project_settings
autoproxy_scrapyd    | 	    settings.setmodule(settings_module_path, priority='project')
autoproxy_scrapyd    | 	  File "/usr/local/lib/python3.7/dist-packages/scrapy/settings/__init__.py", line 294, in setmodule
autoproxy_scrapyd    | 	    module = import_module(module)
autoproxy_scrapyd    | 	  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
autoproxy_scrapyd    | 	    return _bootstrap._gcd_import(name[level:], package, level)
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
autoproxy_scrapyd    | 	  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
autoproxy_scrapyd    | 	ModuleNotFoundError: No module named 'autoproxy'
autoproxy_scrapyd    |
autoproxy_scrapyd    |
autoproxy_scrapyd    | 2023-01-11T00:49:45+0000 [twisted.python.log#info] "127.0.0.1" - - [11/Jan/2023:00:49:44 +0000] "GET /listspiders.json?project=default HTTP/1.0" 200 1662 "-" "python-requests/2.22.0"
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:http://scrapyd:6800 "GET //listspiders.json?project=default HTTP/1.1" 200 1662
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): scrapyd:6800
autoproxy_scrapyd    | 2023-01-11T00:49:45+0000 [twisted.python.log#info] "127.0.0.1" - - [11/Jan/2023:00:49:44 +0000] "GET /listjobs.json HTTP/1.0" 200 92 "-" "python-requests/2.22.0"
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:http://scrapyd:6800 "GET //listjobs.json HTTP/1.1" 200 92
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): scrapyd:6800
autoproxy_scrapyd    | 2023-01-11T00:49:45+0000 [twisted.python.log#info] "127.0.0.1" - - [11/Jan/2023:00:49:44 +0000] "GET /listjobs.json HTTP/1.0" 200 92 "-" "python-requests/2.22.0"
autoproxy_scheduler  | DEBUG:urllib3.connectionpool:http://scrapyd:6800 "GET //listjobs.json HTTP/1.1" 200 92
autoproxy_scheduler  | Traceback (most recent call last):
autoproxy_scheduler  |   File "/scheduler/spider_scheduler.py", line 268, in <module>
autoproxy_scheduler  |     for spider in itertools.cycle(spider_gen):
autoproxy_scheduler  |   File "/scheduler/spider_scheduler.py", line 215, in spider_generator
autoproxy_scheduler  |     for spider in self.project_spiders[project]:
autoproxy_scheduler  | KeyError: 'autoproxy'
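The final KeyError suggests the scheduler assumes every project it deploys ends up in its project-to-spiders map; because the "autoproxy" deploy failed (the ModuleNotFoundError above), the key never exists. A minimal defensive sketch, assuming only the `project_spiders` attribute name from the traceback (the function name and everything else here is hypothetical):

```python
def spiders_for(project_spiders, project):
    """Return the spider list for a project, or an empty list when the
    deploy failed and the project never made it into the map."""
    spiders = project_spiders.get(project)
    if spiders is None:
        print(f"warning: project {project!r} has no deployed spiders, skipping")
        return []
    return spiders

# The scheduler loop could then cycle over whatever projects did deploy
# instead of crashing on the one that did not.
project_spiders = {"default": ["example_spider"]}
print(spiders_for(project_spiders, "default"))    # → ['example_spider']
print(spiders_for(project_spiders, "autoproxy"))  # → []
```

This only masks the symptom, of course; the underlying fix is making the scrapy_autoproxy package importable inside the scrapyd container so the deploy succeeds.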

Error in StorageManager.sync_to_db

autoproxy_db | 2019-12-20 18:05:11.738 UTC [135] ERROR: duplicate key value violates unique constraint "address_port_unique"
autoproxy_db | 2019-12-20 18:05:11.738 UTC [135] DETAIL: Key (address, port)=(103.88.234.251, 53085) already exists.
autoproxy_db | 2019-12-20 18:05:11.738 UTC [135] STATEMENT: INSERT INTO "proxies" ("address", "port", "protocol") VALUES ('103.88.234.251', '53085', 'http') RETURNING "proxy_id"
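The INSERT fails because the (address, port) pair already exists. In Postgres, the insert can be made idempotent with `ON CONFLICT ON CONSTRAINT address_port_unique DO NOTHING`. A runnable sketch using SQLite's `INSERT OR IGNORE` as a stand-in for the Postgres clause (the table shape is assumed from the log; only address, port, and protocol columns are shown):

```python
import sqlite3

# In Postgres the equivalent statement would be:
#   INSERT INTO proxies (address, port, protocol) VALUES (%s, %s, %s)
#   ON CONFLICT ON CONSTRAINT address_port_unique DO NOTHING
#   RETURNING proxy_id
# SQLite's INSERT OR IGNORE stands in here so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE proxies (
        proxy_id INTEGER PRIMARY KEY,
        address  TEXT, port TEXT, protocol TEXT,
        UNIQUE (address, port)      -- analogue of address_port_unique
    )""")

def insert_proxy(conn, address, port, protocol):
    conn.execute(
        "INSERT OR IGNORE INTO proxies (address, port, protocol) VALUES (?, ?, ?)",
        (address, port, protocol))

# Re-inserting the same (address, port) no longer raises; the row is kept once.
insert_proxy(conn, "103.88.234.251", "53085", "http")
insert_proxy(conn, "103.88.234.251", "53085", "http")
print(conn.execute("SELECT COUNT(*) FROM proxies").fetchone()[0])  # → 1
```

That said, per the next issue the real root cause appears to be upstream: the Redis-to-DB sync not knowing the proxy already exists.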

RedisManager.get_proxy_by_address_and_port method

This method returns a list of established proxies from the database but neglects to search for newly scraped ones (prefixed with the Redis key pt_).

This causes a duplicate key database error.
The line

        proxy_keys = self.redis.keys('p*')

should probably be changed to

        proxy_keys = self.redis.keys('p*') + self.redis.keys('pt*')
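A runnable sketch of the proposed change against a minimal stub standing in for the Redis client (Redis KEYS uses glob patterns, which fnmatch approximates). One caveat worth noting: in glob semantics 'p*' already matches keys beginning with pt_, so this sketch assumes established proxies actually use a p_ prefix and uses the unambiguous pattern pair; a set union also deduplicates any overlap so no proxy is synced twice. The key names below are made up for illustration:

```python
from fnmatch import fnmatch

class FakeRedis:
    """Stub for the real Redis client; KEYS with a glob pattern."""
    def __init__(self, data):
        self._data = data
    def keys(self, pattern):
        return [k for k in self._data if fnmatch(k, pattern)]

# p_* = established proxies, pt_* = newly scraped ones (the pt_ prefix comes
# from the issue text; the p_ prefix for established keys is an assumption).
redis = FakeRedis({"p_1": None, "p_2": None, "pt_3": None, "q_4": None})
proxy_keys = sorted(set(redis.keys("p_*")) | set(redis.keys("pt_*")))
print(proxy_keys)  # → ['p_1', 'p_2', 'pt_3']
```

With the pt_* keys included, the sync would see the pending proxies and avoid re-inserting rows that trigger the address_port_unique violation above.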

gatherproxy's domain has changed

Found proxygather.com on Google. The site is nearly the same as it was under the old domain, but there are a few differences, and the spider needs a small update.
