Comments (12)
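@kezhenxu94 I spent the last two evenings on this, and tonight it took just 20 minutes to finally get it running. It is unbelievably powerful. I had long heard that ElasticSearch is powerful and second to none among today's tools, and seeing it in action confirms that. I think this project is great and extremely useful for renting a home. I originally wanted to write my own crawler, but it certainly would not have been this complete. I think the project can keep growing: the code itself is a good example to learn from, and in practical terms it can beat all kinds of rental agents. I am a beginner who picked up Docker on the fly, but I do know Python. I would like to study the code and gradually submit some PRs, improve the documentation, and promote the project on Zhihu. Its value deserves to be discovered, and eventually an Organization should be set up to maintain it.
There is now a WeChat official account called “暖房” that uses machine learning to filter out agent listings. I think that is worth borrowing from, but that is for later.
In short, this project is awesome: before setting it up I thought it was a hassle, and after setting it up I think it is amazing.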
from house-renting.
One more thing to add:
The crawler exits abnormally
Core error message:
def write(self, data, async=False):
^
SyntaxError: invalid syntax
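This error occurs because the twisted library that scrapy depends on does not work on Python 3.7, where async became a reserved keyword and can no longer be used as a parameter name. Upstream has not fixed this yet, so there is nothing we can do on our side for now.
The full error output is as follows: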
2018-09-06 13:37:19 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: house_renting)
2018-09-06 13:37:19 [scrapy.utils.log] INFO: Overridden settings: {'AUTOTHROTTLE_DEBUG': True, 'AUTOTHROTTLE_ENABLED': True, 'AUTOTHROTTLE_MAX_DELAY': 10, 'AUTOTHROTTLE_START_DELAY': 10, 'AUTOTHROTTLE_TARGET_CONCURRENCY': 2.0, 'BOT_NAME': 'house_renting', 'COMMANDS_MODULE': 'house_renting.commands', 'CONCURRENT_REQUESTS_PER_DOMAIN': 1, 'COOKIES_ENABLED': False, 'DOWNLOAD_DELAY': 10, 'DOWNLOAD_TIMEOUT': 30, 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'house_renting.spiders', 'RETRY_TIMES': 3, 'SPIDER_MODULES': ['house_renting.spiders'], 'TELNETCONSOLE_ENABLED': False, 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1 Safari/605.1.15 '}
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 11, in <module>
sys.exit(execute())
File "/usr/local/lib/python3.7/site-packages/scrapy/cmdline.py", line 149, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/usr/local/lib/python3.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
func(*a, **kw)
File "/usr/local/lib/python3.7/site-packages/scrapy/cmdline.py", line 156, in _run_command
cmd.run(args, opts)
File "/house-renting/crawler/house_renting/commands/crawl.py", line 17, in run
self.crawler_process.crawl(spider_name, **opts.spargs)
File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 167, in crawl
crawler = self.create_crawler(crawler_or_spidercls)
File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 195, in create_crawler
return self._create_crawler(crawler_or_spidercls)
File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 200, in _create_crawler
return Crawler(spidercls, self.settings)
File "/usr/local/lib/python3.7/site-packages/scrapy/crawler.py", line 52, in __init__
self.extensions = ExtensionManager.from_crawler(self)
File "/usr/local/lib/python3.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/usr/local/lib/python3.7/site-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/usr/local/lib/python3.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
mod = import_module(module)
File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/usr/local/lib/python3.7/site-packages/scrapy/extensions/telnet.py", line 12, in <module>
from twisted.conch import manhole, telnet
File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 154
def write(self, data, async=False):
^
SyntaxError: invalid syntax
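The failure at the bottom of the traceback can be reproduced without scrapy or twisted installed. The following is a minimal sketch: it compiles the same function signature that appears in twisted/conch/manhole.py and reports whether the running interpreter accepts it. The helper name `compiles` and the label `<manhole-sketch>` are made up for this illustration.

```python
import keyword
import sys

# The same signature that the traceback points at in twisted/conch/manhole.py.
SOURCE = "def write(self, data, async=False):\n    pass\n"


def compiles(source):
    """Return True if `source` parses on the running interpreter."""
    try:
        compile(source, "<manhole-sketch>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("Python", sys.version_info[:2])
    print("async is a keyword:", keyword.iskeyword("async"))
    print("signature compiles:", compiles(SOURCE))
```

On Python 3.7 or later the signature should fail to compile, which is the same SyntaxError the crawler hits while importing twisted.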
from house-renting.
Thanks a lot, I will add this to the Wiki later.
from house-renting.
Pulling the redis image is very slow
Switch to a Docker registry mirror hosted in China: open or create the file /etc/docker/daemon.json and write the following content:
{
"registry-mirrors": [
"http://18817714.m.daocloud.io"
],
"insecure-registries": []
}
Note: I found the URL http://18817714.m.daocloud.io on someone's blog. If it stops working, please apply for a mirror address of your own.
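If /etc/docker/daemon.json already exists and contains other settings, overwriting it would lose them. Here is a minimal sketch, assuming the file is plain JSON, that adds a mirror to the existing content instead. The function name `add_registry_mirror` is made up for this illustration, and it only builds the new text; writing it back to the file is left to you.

```python
import json


def add_registry_mirror(existing_text, mirror_url):
    """Return daemon.json text with `mirror_url` added to registry-mirrors.

    `existing_text` is the current file content ("" if the file does not
    exist yet). Keys already present in the file are preserved.
    """
    config = json.loads(existing_text) if existing_text.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    # Mirror URL taken from the comment above; it may no longer work.
    print(add_registry_mirror("", "http://18817714.m.daocloud.io"), end="")
```

Saving the result to /etc/docker/daemon.json needs root, and the Docker daemon typically has to be restarted or reloaded before it picks up the change.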
from house-renting.
@kezhenxu94 I am not very familiar with Docker. Is it possible to pin the Python version to 3.6? Otherwise the crawler cannot run.
from house-renting.
Solved: just specify the Python version in the Dockerfile.
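Pinning the version in the Dockerfile is the actual fix. As a complement, a small guard could make the problem obvious if someone runs the crawler on a newer interpreter anyway. This is only a sketch: `is_supported` and `require_supported_python` are hypothetical helpers, not part of the project, and the 3.7 cutoff is an assumption based on the error reported in this thread.

```python
import sys

# Assumption: 3.7 is the first version that rejects the twisted code in this thread.
FIRST_BROKEN = (3, 7)


def is_supported(version_info=None):
    """Return True when the interpreter is older than FIRST_BROKEN."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < FIRST_BROKEN


def require_supported_python():
    """Exit with a clear message instead of a SyntaxError deep inside twisted."""
    if not is_supported():
        sys.exit(
            "Python %d.%d is not supported by the installed twisted; use 3.6"
            % tuple(sys.version_info[:2])
        )


if __name__ == "__main__":
    print("supported interpreter:", is_supported())
```

Calling `require_supported_python()` before anything imports scrapy would stop the run with a readable message rather than the long traceback.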
from house-renting.
@hao-lee The Python version has been changed.
from house-renting.
Oh, man, have you ever heard about English language?.. :)
from house-renting.
@andkirby emmm...As this is a tool for renting house in China, I think this may be not very useful to people from US or other countries... 😅
from house-renting.
@hao-lee, yeah, indeed. :D I found out what's this for later. :)
I just found the same error here and would like to get a solution but... :) Anyway, GoogleTranslate works fine, and sometimes funny. :^D
from house-renting.
@andkirby 😅 Okay......
from house-renting.
A question: I deployed this to Azure with k8s and ran into the same problem, but surely I cannot go and change this on every node?
from house-renting.