Comments (12)

hao-lee avatar hao-lee commented on May 31, 2024 2

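@kezhenxu94 I spent the last two evenings on this, and tonight it took just 20 minutes to finally get it up and running. It is unbelievably powerful. I had long heard that ElasticSearch is powerful and second to none among today's tools, and now that I have seen it, it really lives up to that. I think this project is great and extremely useful for finding a rental. I originally planned to write my own crawler, but it would never have been this complete. I think the project can keep growing: the code itself is a good example to learn from, and in practical terms it beats all kinds of rental agents. I picked up Docker on the fly and am a complete novice, but I do know Python. I want to study the code and gradually submit some PRs, and I would also like to improve the documentation and promote the project on Zhihu. The project's value deserves to be discovered, and eventually there should be an Organization dedicated to maintaining it.

There is now a WeChat official account called “暖房” that uses machine learning to filter out agent listings. I think it is worth borrowing from, but that is for later.

In short, this project is awesome. Before setting it up I thought it would be a hassle; after setting it up I think it is amazing.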

from house-renting.

hao-lee avatar hao-lee commented on May 31, 2024 1

One more thing to add:

Crawler exits abnormally

Core error message:

    def write(self, data, async=False):
                              ^
SyntaxError: invalid syntax

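This error occurs because the twisted library that scrapy depends on has problems supporting Python 3.7. Upstream has not fixed it yet, so there is nothing to be done for now.

The full error output is as follows: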

2018-09-06 13:37:19 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: house_renting)
2018-09-06 13:37:19 [scrapy.utils.log] INFO: Overridden settings: {'AUTOTHROTTLE_DEBUG': True, 'AUTOTHROTTLE_ENABLED': True, 'AUTOTHROTTLE_MAX_DELAY': 10, 'AUTOTHROTTLE_START_DELAY': 10, 'AUTOTHROTTLE_TARGET_CONCURRENCY': 2.0, 'BOT_NAME': 'house_renting', 'COMMANDS_MODULE': 'house_renting.commands', 'CONCURRENT_REQUESTS_PER_DOMAIN': 1, 'COOKIES_ENABLED': False, 'DOWNLOAD_DELAY': 10, 'DOWNLOAD_TIMEOUT': 30, 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'house_renting.spiders', 'RETRY_TIMES': 3, 'SPIDER_MODULES': ['house_renting.spiders'], 'TELNETCONSOLE_ENABLED': False, 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1 Safari/605.1.15 '}
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/local/lib/python3.7/site-packages/scrapy/cmdline.py", line 149, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python3.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python3.7/site-packages/scrapy/cmdline.py", line 156, in _run_command
    cmd.run(args, opts)
  File "/house-renting/crawler/house_renting/commands/crawl.py", line 17, in run
    self.crawler_process.crawl(spider_name, **opts.spargs)
  File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 167, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 195, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 200, in _create_crawler
    return Crawler(spidercls, self.settings)
  File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 52, in __init__
    self.extensions = ExtensionManager.from_crawler(self)
  File "/usr/local/lib/python3.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python3.7/site-packages/scrapy/middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "/usr/local/lib/python3.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
    mod = import_module(module)
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.7/site-packages/scrapy/extensions/telnet.py", line 12, in <module>
    from twisted.conch import manhole, telnet
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 154
    def write(self, data, async=False):
                              ^
SyntaxError: invalid syntax
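
For readers hitting the same traceback: `async` became a reserved keyword in Python 3.7, so the parameter name in `twisted/conch/manhole.py` no longer parses. The following is a minimal sketch for confirming that on a given interpreter. The helper name `compiles` and the renamed parameter `is_async` are only illustrative and are not taken from twisted or from this project.

```python
import keyword
import sys

# The signature from the traceback, reduced to a standalone snippet.
FAILING_SNIPPET = "def write(self, data, async=False):\n    pass\n"

# Hypothetical rename, only to show that a non-keyword parameter name parses.
RENAMED_SNIPPET = "def write(self, data, is_async=False):\n    pass\n"


def compiles(source: str) -> bool:
    """Return True if `source` parses on the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("Python version:", ".".join(map(str, sys.version_info[:3])))
    print("'async' is a keyword:", keyword.iskeyword("async"))
    print("original signature parses:", compiles(FAILING_SNIPPET))
    print("renamed signature parses:", compiles(RENAMED_SNIPPET))
```

On Python 3.7 or later the original signature should fail to parse, which matches the `SyntaxError` above.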

kezhenxu94 avatar kezhenxu94 commented on May 31, 2024

Thanks, I will add this to the Wiki later.

hao-lee avatar hao-lee commented on May 31, 2024

Pulling the redis image is very slow

Switch to a Docker registry mirror located in China: open or create the file /etc/docker/daemon.json and write the following content:

{
    "registry-mirrors": [
        "http://18817714.m.daocloud.io"
    ],
    "insecure-registries": []
}

Note: I found the URL http://18817714.m.daocloud.io in a blogger's post. If it no longer works, please apply for your own.
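
If /etc/docker/daemon.json already exists, overwriting it would discard other settings. The sketch below merges a mirror into the existing file instead. The function names are made up for this example, writing to that path normally requires root, and the Docker daemon typically has to be restarted before the change takes effect.

```python
import json
from pathlib import Path


def add_registry_mirror(config: dict, mirror: str) -> dict:
    """Return a copy of `config` with `mirror` listed under "registry-mirrors"."""
    updated = dict(config)
    mirrors = list(updated.get("registry-mirrors", []))
    if mirror not in mirrors:  # avoid duplicate entries
        mirrors.append(mirror)
    updated["registry-mirrors"] = mirrors
    return updated


def update_daemon_json(path: Path, mirror: str) -> None:
    """Merge `mirror` into the JSON file at `path`, creating the file if needed."""
    text = path.read_text().strip() if path.exists() else ""
    existing = json.loads(text) if text else {}
    merged = add_registry_mirror(existing, mirror)
    path.write_text(json.dumps(merged, indent=4) + "\n")


# Example call (commented out because it modifies a system file):
# update_daemon_json(Path("/etc/docker/daemon.json"), "http://18817714.m.daocloud.io")
```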

hao-lee avatar hao-lee commented on May 31, 2024

@kezhenxu94 I am not very familiar with Docker and do not know whether the Python version can be pinned to 3.6. Otherwise the crawler cannot run.

hao-lee avatar hao-lee commented on May 31, 2024

Solved: just specify the Python version in the Dockerfile.
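
Pinning the image version fixes the container, but the crawler can still be started outside Docker on a newer interpreter. A possible complement is a fail-fast check at startup, sketched below. It assumes the installed twisted release still uses `async` as a parameter name; `FIRST_BROKEN`, `is_supported`, and `ensure_supported` are hypothetical names and not part of this project.

```python
import sys

# Assumption: the pinned twisted release cannot be imported on Python 3.7+.
FIRST_BROKEN = (3, 7)


def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version is older than FIRST_BROKEN."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor < FIRST_BROKEN


def ensure_supported(version=None) -> None:
    """Raise a readable error instead of the SyntaxError deep inside twisted."""
    if not is_supported(version):
        found = ".".join(str(part) for part in (version or sys.version_info)[:3])
        raise RuntimeError(
            "Python %s is not supported by the pinned twisted release; "
            "use Python 3.6 or upgrade twisted." % found
        )


# Typical use, near the top of the crawler's entry point:
# ensure_supported()
```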

kezhenxu94 avatar kezhenxu94 commented on May 31, 2024

@hao-lee The Python version has been changed.

andkirby avatar andkirby commented on May 31, 2024

Oh, man, have you ever heard about English language?.. :)

hao-lee avatar hao-lee commented on May 31, 2024

@andkirby emmm...As this is a tool for renting house in China, I think this may be not very useful to people from US or other countries... 😅

andkirby avatar andkirby commented on May 31, 2024

@hao-lee, yeah, indeed. :D I found out what's this for later. :)
I just found the same error here and would like to get a solution but... :) Anyway, GoogleTranslate works fine, and sometimes funny. :^D

hao-lee avatar hao-lee commented on May 31, 2024

@andkirby 😅 Okay......

Maxiaoyu0 avatar Maxiaoyu0 commented on May 31, 2024

A question: I deployed this to Azure with k8s and ran into the same problem, but surely I cannot make this change on every node?
