sinaspider's Introduction

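Sina_Spider1: "Sina Weibo crawler walkthrough (can fetch 13 million records a day)"
Sina_Spider2: "Sina Weibo distributed crawler walkthrough"
Sina_Spider3: "Sina Weibo crawler walkthrough (updated 2016-12-01)"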

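Sina_Spider1 is the single-machine version.
Sina_Spider2 builds on Sina_Spider1 and uses the scrapy_redis module to run distributed.
Sina_Spider3 adds Cookie pool maintenance and optimizes the seed queue and the dedup queue.

Each version is described in detail in its own blog post. If you run into a problem, please leave a comment where possible so that people who hit the same problem later can find it. You can also join the QQ group: 微博爬虫交流群 (Weibo crawler discussion group).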




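Update 2016-12-15:
Some users report that the crawler keeps showing 0 pages crawled and collects no data.
1. Comment out the line LOG_LEVEL = 'INFO' in settings.py so that the default "DEBUG" log level is used, then run the program to check whether pages are being requested normally.
2. The program deduplicates, so if you want to clear the data and start over you must also delete the dedup queue in redis. Otherwise the starting IDs are recorded as already crawled and nothing gets fetched. To clear the redis data, run cleanRedis.py.
3. Weibo has also started limiting by IP. Crawling too fast may produce 403 responses, and large-scale crawling needs a proxy pool.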
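The cleanup in step 2 can be pictured with a minimal sketch. This is not the contents of cleanRedis.py; it only illustrates deleting keys by pattern with a redis-py style client. The helper name `delete_matching` and the example pattern are assumptions made for illustration.

```python
def delete_matching(client, pattern):
    """Delete every redis key matching `pattern`; return how many keys were removed.

    `client` is expected to behave like a redis-py client
    (it must provide scan_iter(match=...) and delete(key)).
    """
    removed = 0
    for key in client.scan_iter(match=pattern):
        removed += client.delete(key)
    return removed


# Example usage (assumes a local redis server and the redis-py package;
# the "dupefilter:*" pattern is a guess and must match your own key names):
#   import redis
#   delete_matching(redis.Redis(host="localhost", port=6379), "dupefilter:*")
```

Check the key names in your own redis instance before deleting anything, since a pattern that is too broad would also remove the seed queue or the stored Cookies.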



---------------------------------------------------------------------------
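Update 2017-03-23:
Starting shortly after 3 p.m. yesterday, Weibo made some changes, and the old way of getting Cookies without a captcha no longer works. I used to hunt for login routes that skip the captcha; probably too many people have been crawling lately, so that route was shut down.
So we face the captcha head-on. Logging in through the normal flow is also less likely to get blocked. This update mainly changes how Cookies are obtained; everything else is as before (cookies.py was modified and yumdama.py was added).
With the captcha, the difficulty and complexity both go up a little, which may be a hurdle for people without programming experience.
There are two main ways to handle the captcha: typing it in manually, or having a captcha-solving service fill it in automatically (manual entry is simple to set up; a solving service suits large-scale crawling).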

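Manual flow:
1. Download PhantomJS.exe and put it in the Python installation directory (this applies to Windows; for Linux, please search online).
2. Run launch.py to start the crawler. It will ask for a captcha along the way: open the newly generated aa.png in the project directory, type the captcha, and press Enter.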

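Captcha-service flow:
1. Download PhantomJS.exe and put it in the Python installation directory.
2. Install the Python module PIL (please search for instructions; it may be a bumpy road).
3. Captcha solving: I use http://www.yundama.com/ (honestly not an ad). Fill in username, password, and appkey in yumdama.py (accuracy is quite high; a normal weibo.cn captcha is 4 characters, and 1 yuan buys 200 recognitions).
(If you keep getting 302 and debugging shows that yumdama.py keeps returning an empty string, try changing apiurl in yumdama.py to 'http://api.yundama.net:5678/api.php'. It is around line 38, and the original value is 'http://api.yundama.com/api.php'.)
4. Set IDENTIFY=2 in cookies.py and run launch.py to start the crawler.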



---------------------------------------------------------------------------
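Update 2017-04-05:
Since April 1, Weibo has tightened its IP limits, and 403 Forbidden now comes up easily. The fix is to add proxies. After the December 2016 code update many more people started crawling Weibo, which probably sent a lot of useless traffic to weibo.cn. So the code is not updated this time, to filter out some crawler beginners. If you still need to crawl at scale, add a few lines to middleware.py to attach a proxy; it is not hard. Without proxies, the crawler still runs if you slow it down a bit more.
Many of you probably need Weibo data for a thesis. Ask in the group for someone who already has the data, since buying proxies is not cheap either.
(I have not crawled much Weibo myself and have no data on hand.)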
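The "few lines in middleware.py" could look roughly like the sketch below. It relies on the fact that Scrapy's built-in HttpProxyMiddleware reads the proxy address from request.meta["proxy"]. The class name `RandomProxyMiddleware` and the proxy addresses are placeholders and not part of the project; wiring the class into DOWNLOADER_MIDDLEWARES and supplying the proxy list are left out.

```python
import random


class RandomProxyMiddleware(object):
    """Downloader-middleware sketch: attach a randomly chosen proxy to each request."""

    def __init__(self, proxies):
        # Placeholder values such as "http://127.0.0.1:8888"; supply your own proxies.
        self.proxies = list(proxies)

    def process_request(self, request, spider):
        if self.proxies:
            # Scrapy's HttpProxyMiddleware picks the proxy up from request.meta["proxy"].
            request.meta["proxy"] = random.choice(self.proxies)
        # Returning None lets Scrapy continue processing the request normally.
```

Choosing the proxy per request spreads traffic over the pool; a real setup would also need to drop proxies that stop working.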



---------------------------------------------------------------------------
Update 2017-04-07:
Some people are still using SinaSpider1, so its Cookie-fetching code has been updated as well. Usage is the same as for SinaSpider3; see the update notes above.



---------------------------------------------------------------------------
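Update 2017-04-10:
Many people ask where to buy Weibo accounts. Taobao restricts this fairly strictly, so a direct search may find nothing. If you need them, search for the shop names 账号素材生产基地 or 互联网账号营销中心 and look at the items in the shop; there are links for returning customers. They are occasionally out of stock. Decide for yourself how many to buy. This is not an ad; ignore it if you do not need it.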



---------------------------------------------------------------------------
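Update 2017-04-26:
Since yesterday afternoon the weibo.cn login has changed again: the old login page is closed and login now goes through m.weibo.com, where a pattern-unlock captcha may appear. I have a vague feeling that a few people on Weibo's anti-crawler team are quietly watching me from the shadows and might invite me for "tea" any day now. Sigh. The pattern-unlock captcha can probably be cracked too, but I am busy lately and will not have time to look into it for about two weeks. If you need it, you can wait, or work on it yourselves by recognizing the pattern pixel by pixel.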



---------------------------------------------------------------------------
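Update 2017-05-09:
1. http://weibo.cn was changed to https://weibo.cn.
2. For cracking the pattern-unlock captcha, see the blog post [《图形解锁破解(附Python代码)》](http://blog.csdn.net/bone_ace/article/details/71056741) ("Cracking the pattern-unlock captcha, with Python code"). Please update the Cookie-fetching module of the Weibo crawler yourself.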


sinaspider's People

Contributors

benheart, bone-ace, jiawen-yan, liuxingming, magic282, tangbotony

sinaspider's Issues

Got it running: a summary of deployment steps from scratch

After installing python2.7:
sudo pip install scrapy==1.0.5
sudo pip install pymongo
sudo pip install pyasn1
sudo pip install scrapy-redis

All four libraries must be installed. In particular, **scrapy must be pinned to version 1.0.5**; the latest version raises errors.

To run the distributed version on a single machine, add this to settings.py:
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

Otherwise you get
Failed to instantiate dupefilter class '%s': %s", 'scrapy.dupefilters.RFPDupeFilter

Syntax error in the process_response() function

image

The spot circled in red has no "location" key.
A 200 response looks like this:
{'Proc_Node': ['web359.mweibo.bx.sinanode.com'], 'Set-Cookie': ['_T_WM=339ff7f97fb5c147c8e465d4e9eb1aaa; expires=Fri, 07-Apr-2017 03:27:31 GMT; path=/; domain=.weibo.cn; httponly', 'WEIBOCN_FROM=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.weibo.cn'], 'Lb_Node': ['layer7-002.mweibo.hk.sinanode.com'], 'Vary': ['Host,Accept-Encoding'], 'X-Log-Uid': ['6039348406'], 'Server': ['Tengine'], 'Date': ['Wed, 08 Mar 2017 03:27:31 GMT'], 'Content-Type': ['text/html; charset=utf-8'], 'Age': ['3']}

Also,
raise IgnoreRequest makes scrapy silently ignore this error, which makes debugging harder.
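The header dump above contains no Location entry, so a direct headers["location"] lookup on such a response fails. A minimal sketch of a guarded lookup follows, assuming the headers behave like a dict whose values may be lists, as in the dump. The helper name `redirect_target` is hypothetical and is not the project's process_response().

```python
def redirect_target(headers):
    """Return the Location header value, or None when there is no redirect target."""
    for name in ("Location", "location"):
        value = headers.get(name)
        if value:
            # Values may be a list (as in the dump above) or a single value.
            return value[0] if isinstance(value, (list, tuple)) else value
    return None


ok_response_headers = {"Server": ["Tengine"], "Age": ["3"]}
print(redirect_target(ok_response_headers))
```

Returning None for a normal 200 response lets the caller decide what to do, instead of the lookup raising inside the middleware.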

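Error running sinaspider3 on mac

Could someone explain what causes the error below? redis is started and the weibo accounts are read correctly, but the check for whether the account already exists in redis, made before fetching the cookie, raises an error.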

/anaconda/bin/python SinaSpider-master/Sina_spider3/launch.py
2017-02-09 18:55:25 [scrapy] INFO: Scrapy 1.0.5 started (bot: ['Sina_spider3'])
2017-02-09 18:55:25 [scrapy] INFO: Optional features available: ssl, http11, boto
2017-02-09 18:55:25 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'Sina_spider3.spiders', 'CONCURRENT_REQUESTS': 1, 'SPIDER_MODULES': ['Sina_spider3.spiders'], 'BOT_NAME': ['Sina_spider3'], 'SCHEDULER': 'Sina_spider3.scrapy_redis.scheduler.Scheduler', 'REDIRECT_ENABLED': False, 'DOWNLOAD_DELAY': 10}
2017-02-09 18:55:25 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
2017-02-09 18:55:25 [Sina_spider3] DEBUG: Reading URLs from redis list 'Sina_spider3:start_urls'
Unhandled error in Deferred:
2017-02-09 18:55:26 [twisted] CRITICAL: Unhandled error in Deferred:

2017-02-09 18:55:26 [twisted] CRITICAL:
Traceback (most recent call last):
File "/anaconda/lib/python2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
result = g.send(result)
File "/anaconda/lib/python2.7/site-packages/scrapy/crawler.py", line 71, in crawl
self.engine = self._create_engine()
File "/anaconda/lib/python2.7/site-packages/scrapy/crawler.py", line 83, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/anaconda/lib/python2.7/site-packages/scrapy/core/engine.py", line 68, in __init__
self.downloader = downloader_cls(crawler)
File "/anaconda/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 69, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/anaconda/lib/python2.7/site-packages/scrapy/middleware.py", line 56, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/anaconda/lib/python2.7/site-packages/scrapy/middleware.py", line 34, in from_settings
mw = mwcls.from_crawler(crawler)
File "/Users/yang/Documents/Code/SinaSpider-master/Sina_spider3/Sina_spider3/middleware.py", line 40, in from_crawler
return cls(crawler.settings, crawler)
File "/Users/yang/Documents/Code/SinaSpider-master/Sina_spider3/Sina_spider3/middleware.py", line 36, in __init__
initCookie(self.rconn, crawler.spider.name)
File "/Users/yang/Documents/Code/SinaSpider-master/Sina_spider3/Sina_spider3/cookies.py", line 66, in initCookie
if rconn.get("%s:Cookies:%s--%s" % (spiderName, weibo[0], weibo[1])) is None: # 'SinaSpider:Cookies:account--password'; None means it does not exist.
KeyError: 0

Process finished with exit code 0

Crawl speed drops over time

Hello.
I am using SinaSpider1. After running continuously for 24 hours, the speed drops to half of what it was. Why is that?

Slow crawl speed

Even on a 4G connection I only get 3-4k per second.
About 40 accounts, with these settings:
DOWNLOAD_DELAY = 2 # delay between requests
CONCURRENT_REQUESTS_PER_DOMAIN = 100
CONCURRENT_REQUESTS = 70
Any help appreciated!
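A rough sanity check on these settings: DOWNLOAD_DELAY is the time Scrapy waits between consecutive requests to the same site, so with a nonzero delay the request rate to weibo.cn is bounded by the delay rather than by the concurrency values. The sketch below only does that arithmetic. The function name is made up for illustration, and it ignores Scrapy's delay randomization and the time each response takes.

```python
def max_requests_per_hour(download_delay):
    """Rough upper bound on requests per hour to one domain for a delay in seconds."""
    if download_delay <= 0:
        raise ValueError("download_delay must be positive for this estimate")
    return 3600.0 / download_delay


# DOWNLOAD_DELAY = 2, as in the settings above
print(max_requests_per_hour(2))
```

Under this estimate, raising CONCURRENT_REQUESTS alone would not be expected to help while the delay stays at 2 seconds.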

KeyError: 'pop from an empty set'

Hi all, why do I get the error below when I run it?
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\scrapy\core\engine.py", line 126, in _next_request
request = next(slot.start_requests)
File "C:\Users\chinchilla77\SinaSpider\Sina_spider1\Sina_spider1\spiders\spiders.py", line 22, in start_requests
ID = self.scrawl_ID.pop()
KeyError: 'pop from an empty set'
I am new to this, please help :(
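Python's set.pop() raises KeyError: 'pop from an empty set' when the set has no elements, so the traceback suggests that scrawl_ID was empty when start_requests ran. A minimal sketch of draining a seed set without hitting that error is shown below; `drain_seed_ids` is a hypothetical helper, not code from spiders.py.

```python
def drain_seed_ids(seed_ids):
    """Yield IDs from a set until it is empty, never calling pop() on an empty set."""
    while seed_ids:
        yield seed_ids.pop()


for user_id in drain_seed_ids({5235640836}):
    print(user_id)
```

If the loop body never runs, the seed set was empty to begin with, which is worth checking before looking anywhere else.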

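Chinese captcha problem

I plugged SinaSpider3's captcha recognition into SinaSpider1. Chinese-character captchas show up often, and verification fails after I type the Chinese captcha in. Any ideas?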

Single-machine version fails to run, help needed

/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 /Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py --multiproc --qt-support --client 127.0.0.1 --port 59058 --file /data/apps/codes/weibo/Begin.py
warning: Debugger speedups using cython not found. Run '"/System/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python" "/Applications/PyCharm.app/Contents/helpers/pydev/setup_cython.py" build_ext --inplace' to build.
pydev debugger: process 8313 is connecting

Connected to pydev debugger (build 163.10154.50)
/data/apps/codes/weibo/Sina_spider1/spiders/spiders.py:4: ScrapyDeprecationWarning: Module `scrapy.spider` is deprecated, use `scrapy.spiders` instead
  from scrapy.spider import CrawlSpider
2017-02-13 19:40:30 [scrapy] INFO: Scrapy 1.2.2 started (bot: Sina_spider1)
2017-02-13 19:40:30 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'Sina_spider1.spiders', 'SPIDER_MODULES': ['Sina_spider1.spiders'], 'LOG_LEVEL': 'INFO', 'DOWNLOAD_DELAY': 2, 'BOT_NAME': 'Sina_spider1'}
2017-02-13 19:40:31 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
Get Cookie Success!( Account:17012010793 )
Get Cookies Finish!( Num:1)
2017-02-13 19:40:31 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'Sina_spider1.middleware.UserAgentMiddleware',
 'Sina_spider1.middleware.CookiesMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-02-13 19:40:31 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-02-13 19:40:31 [scrapy] INFO: Enabled item pipelines:
['Sina_spider1.pipelines.MongoDBPipleline']
2017-02-13 19:40:31 [scrapy] INFO: Spider opened
2017-02-13 19:40:31 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-02-13 19:40:32 [scrapy] ERROR: Error downloading <GET http://weibo.cn/5235640836/follow>
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/Library/Python/2.7/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
  File "/Library/Python/2.7/site-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/__init__.py", line 65, in download_request
    return handler.download_request(request, spider)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 60, in download_request
    return agent.download_request(request)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 285, in download_request
    method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1631, in request
    parsedURI.originForm)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1408, in _requestWithEndpoint
    d = self._pool.getConnection(key, endpoint)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1294, in getConnection
    return self._newConnection(key, endpoint)
  File "/Library/Python/2.7/site-packages/twisted/web/client.py", line 1306, in _newConnection
    return endpoint.connect(factory)
  File "/Library/Python/2.7/site-packages/twisted/internet/endpoints.py", line 788, in connect
    EndpointReceiver, self._hostText, portNumber=self._port
  File "/Library/Python/2.7/site-packages/twisted/internet/_resolver.py", line 174, in resolveHostName
    onAddress = self._simpleResolver.getHostByName(hostName)
  File "/Library/Python/2.7/site-packages/scrapy/resolver.py", line 21, in getHostByName
    d = super(CachingThreadedResolver, self).getHostByName(name, timeout)
  File "/Library/Python/2.7/site-packages/twisted/internet/base.py", line 276, in getHostByName
    timeoutDelay = sum(timeout)
TypeError: 'float' object is not iterable
2017-02-13 19:40:32 [scrapy] INFO: Closing spider (finished)
2017-02-13 19:40:32 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 1,
 'downloader/exception_type_count/exceptions.TypeError': 1,
 'downloader/request_bytes': 1003,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 2, 13, 11, 40, 32, 108493),
 'log_count/ERROR': 1,
 'log_count/INFO': 7,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2017, 2, 13, 11, 40, 31, 780254)}
2017-02-13 19:40:32 [scrapy] INFO: Spider closed (finished)

Process finished with exit code 0

Above is the error stack from the run. Any help is appreciated.

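Crawler gets 302 errors

Hello. The crawler used to work fine, but once I set DOWNLOAD_DELAY to 0, and after about half an hour of crawling the 302 errors below appeared:
Redirecting (302) to GET http://m.weibo.cn/security from GET http://weibo.cn/2139359753/fans
Redirecting (302) to GET http://m.weibo.cn/security from GET http://weibo.cn/attgroup/opening?uid=2139359753
At first I assumed the IP had been blocked, but it has not recovered after several days, and opening
http://weibo.cn/2139359753/fans
directly in a browser works fine. What could the problem be?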

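About Weibo's anti-crawler mechanism

Weibo now limits crawl frequency, but from Weibo's point of view it is a dilemma: they want search engines to crawl their data, yet they need to keep other crawlers from loading the servers.

I tried changing the crawler's UA to something like the Baidu spider's, and could crawl a lot of data at high frequency without simulating a login. This suggestion could be added to the README.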

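About scrapy-redis distributed crawling

Hello. I recently ran into problems with distributed scrapy and cannot get the distributed processing to work correctly. Could you explain in detail what to watch out for in distributed scrapy? Many thanks.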

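SinaSpider discussion group 537549079

Borrowing the author's space for a moment: I have set up QQ group 537549079. Everyone is welcome to join and discuss crawling Weibo data. I am using the distributed version and have many problems, and I hope we can work through them together.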

sina_spider1

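In the Follows and Fans collections of the mongodb results, what do the columns 1, 10, 100, 101, ... after _id mean? Are the values that follow all IDs of the followed users and the fans?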

from Sina_spider1.items import InformationItem, TweetsItem, FollowsItem, FansItem

Where does Sina_spider1.items come from?

from scrapy.spider import CrawlSpider
from scrapy.selector import Selector
from scrapy.http import Request
from Sina_spider1.items import InformationItem, TweetsItem, FollowsItem, FansItem

Traceback (most recent call last):
File "<pyshell#37>", line 1, in <module>
from Sina_spider1.items import InformationItem, TweetsItem, FollowsItem, FansItem
ImportError: No module named Sina_spider1.items
What causes this error when I run it?

Single-machine version raises 'list' object has no attribute 'iteritems'

Traceback (most recent call last):
File "Begin.py", line 3, in <module>
cmdline.execute("scrapy crawl sinaSpider".split())
File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 108, in execute
settings = get_project_settings()
File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/project.py", line 60, in get_project_settings
settings.setmodule(settings_module_path, priority='project')
File "/usr/local/lib/python2.7/dist-packages/scrapy/settings/__init__.py", line 285, in setmodule
self.set(key, getattr(module, key), priority)
File "/usr/local/lib/python2.7/dist-packages/scrapy/settings/__init__.py", line 260, in set
self.attributes[name].set(value, priority)
File "/usr/local/lib/python2.7/dist-packages/scrapy/settings/__init__.py", line 55, in set
value = BaseSettings(value, priority=priority)
File "/usr/local/lib/python2.7/dist-packages/scrapy/settings/__init__.py", line 91, in __init__
self.update(values, priority)
File "/usr/local/lib/python2.7/dist-packages/scrapy/settings/__init__.py", line 317, in update
for name, value in six.iteritems(values):
File "/usr/local/lib/python2.7/dist-packages/six.py", line 599, in iteritems
return d.iteritems(**kw)
AttributeError: 'list' object has no attribute 'iteritems'

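Single-machine version: missing module?

Hi, as soon as I run it I get ImportError: cannot import name CrawlSpider. Is the CrawlSpider module built in, or did you write it yourself? If you wrote it, could you share it?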

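Incomplete Fans data

In the Information collection some Ids have several thousand fans, but only a few get stored. Pagination seems to be limited: after a few pages it cannot go any further. Is there a good workaround? Thanks :)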

Running Begin.py fails with 'pop from an empty set'

The error is as follows:
2017-04-29 11:04:01 [scrapy.core.engine] ERROR: Error while obtaining start requests
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 127, in _next_request
request = next(slot.start_requests)
File "/Users/voidwalker/Downloads/爬虫和web程序/SinaSpider/Sina_spider1/Sina_spider1/spiders/spiders.py", line 64, in start_requests
ID = self.scrawl_ID.pop()
KeyError: 'pop from an empty set'

SinaSpider3 dedup seems to have a bug

self.server.setbit(self.key + str(uid / 4000000000), uid % 4000000000, 1)

key = "dupefilter:%s" % int(time.time())

The key here is not unique, so deduplication should fail.

Could the author please confirm?
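To make the concern concrete, here is a minimal sketch of a bitmap dedup whose key does not depend on the current time. It keeps the divisor 4000000000 from the setbit call quoted above, which stays below the 2**32-bit limit of a Redis string. The prefix "dupefilter:uids:" and both function names are assumptions for illustration, not the project's code. Integer division is used explicitly because under Python 3 the quoted `uid / 4000000000` would produce a float.

```python
SHARD_SIZE = 4000000000  # same divisor as in the setbit call quoted above


def bitmap_slot(uid, prefix="dupefilter:uids:"):
    """Map a numeric uid to a (redis_key, bit_offset) pair with a stable key."""
    shard, offset = divmod(uid, SHARD_SIZE)
    return "%s%d" % (prefix, shard), offset


def seen_before(server, uid):
    """Mark uid as seen; return True when it had already been recorded."""
    key, offset = bitmap_slot(uid)
    # SETBIT returns the previous value of the bit: 1 means the uid was already set.
    return server.setbit(key, offset, 1) == 1
```

With a stable prefix, every process and every restart addresses the same bitmaps, which is what a shared dedup queue needs. `server` is expected to be a redis-py style client.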

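Spent a day tinkering and finally got it running: sharing the pitfalls I hit

Pitfalls:
a. Wrong scrapy version. Do not install the default version; use sudo pip install scrapy==1.0.5. If the default version is already installed, remove it with sudo pip uninstall scrapy and then install with sudo pip install scrapy==1.0.5.
b. After installing mongo, install pymongo, and then it runs. Ignore the earlier errors: errors that look unrelated to mongo all disappear once the pymongo package is installed!
The errors look like this:
[twisted] CRITICAL: Unhandled error in Deferred:

Traceback (most recent call last):
File "c:\python27\lib\site-packages\scrapy\cmdline.py", line 150, in _run_command
cmd.run(args, opts)
File "c:\python27\lib\site-packages\scrapy\commands\crawl.py", line 57, in run

self.crawler_process.crawl(spname, **opts.spargs)

and so on.
c. Download address for the mongo management tool mongoBooster: http://mongobooster.com/downloads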

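How to control each spider's crawl frequency when there are several spiders

For example, SinaSpider2 is split into two spiders, info and tweet. User info does not need to be refreshed every time (only the post count and fan count change often), whereas tweets need to be checked for updates at intervals. How can different spiders be given different crawl frequencies?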
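One possible approach, sketched under assumptions: keep a "last crawled" timestamp per user and per spider, and only enqueue a user again once that spider's refresh interval has passed. The interval values, the dictionary, and the function name below are all made up for illustration; nothing here comes from SinaSpider2.

```python
import time

# Hypothetical refresh intervals in seconds, one entry per spider name.
REFRESH_INTERVAL = {
    "info": 7 * 24 * 3600,  # profile data: about once a week
    "tweet": 3600,          # tweets: about once an hour
}


def is_due(spider_name, last_crawled, now=None):
    """Return True when a user should be crawled again by the given spider."""
    now = time.time() if now is None else now
    return now - last_crawled >= REFRESH_INTERVAL[spider_name]


print(is_due("tweet", last_crawled=0, now=3600))
print(is_due("info", last_crawled=0, now=3600))
```

The timestamps could live in redis or mongodb next to the user record, so that each spider consults its own schedule before pushing a seed onto its queue.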

Will the middleware change the cookie per spider? (new)

I wonder whether the customized CookiesMiddleware below changes the cookie per spider. Each spider normally needs several requests to crawl its target, so will the middleware (randomly) change the cookie during this process for each spider? (We provide many cookies for the project, which contains several spiders.)

import random

class CookiesMiddleware(object):
    """ Switch Cookie """

    def process_request(self, request, spider):
        # `cookies` is the project's list of cookie dicts
        cookie = random.choice(cookies)
        request.cookies = cookie

I'm tailoring your code for my weibo.com crawler. Weibo.com's anti-crawler is far more powerful than that of weibo.cn and m.weibo.cn.
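In Scrapy, a downloader middleware's process_request is called for every request, so the random.choice call above picks a cookie per request and not per spider. If one cookie per spider is wanted instead, a minimal sketch could look like the following. The class name `StickyCookiesMiddleware` and its constructor argument are assumptions, and hooking it into the project settings is left out.

```python
import random


class StickyCookiesMiddleware(object):
    """Pick one cookie per spider name and reuse it for all of that spider's requests."""

    def __init__(self, cookies):
        self.cookies = list(cookies)  # pool of cookie dicts, assumed non-empty
        self.assigned = {}            # spider name -> chosen cookie

    def process_request(self, request, spider):
        if spider.name not in self.assigned:
            self.assigned[spider.name] = random.choice(self.cookies)
        request.cookies = self.assigned[spider.name]
        # Returning None lets Scrapy continue processing the request.
```

Whether sticky cookies are preferable depends on the target site: rotating per request spreads load over the accounts, while a fixed cookie looks more like one continuous session.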

Running Sina_spider2 on a single machine fails with AssertionError: Tried to stop a LoopingCall that was not running.

Running Sina_spider2 on a single machine fails. The Cookies are fetched successfully and the middleware and pipelines are loaded, but the spider closes immediately after it opens. Can Sina_spider2 run on a single machine at all? AssertionError: Tried to stop a LoopingCall that was not running.

settings.py

SCHEDULER = 'scrapy_redis.scheduler.Scheduler'
SCHEDULER_PERSIST = True
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderPriorityQueue'
REDIE_URL = None
REDIS_HOST = 'localhost'
REDIS_PORT = 6379

Error output

......
2016-07-02 08:15:15 [scrapy] INFO: Enabled item pipelines:
['Sina_spider2.pipelines.MongoDBPipleline']
2016-07-02 08:15:15 [scrapy] INFO: Spider opened
2016-07-02 08:15:15 [scrapy] INFO: Closing spider (shutdown)
Unhandled error in Deferred:
2016-07-02 08:15:15 [twisted] CRITICAL: Unhandled error in Deferred:


Traceback (most recent call last):
  File "/Users/georgezou/Documents/Coding/github/SinaSpider/Sina_spider2/Sina_spider2/commands/crawlall.py", line 37, in run
    self.crawler_process.crawl(spidername, **opts.spargs)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 163, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 167, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1274, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/Library/Python/2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 87, in crawl
    yield self.engine.close()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 100, in close
    return self._close_all_spiders()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 340, in _close_all_spiders
    dfds = [self.close_spider(s, reason='shutdown') for s in self.open_spiders]
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 298, in close_spider
    dfd = slot.close()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 44, in close
    self._maybe_fire_closing()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 51, in _maybe_fire_closing
    self.heartbeat.stop()
  File "/Library/Python/2.7/site-packages/twisted/internet/task.py", line 202, in stop
    assert self.running, ("Tried to stop a LoopingCall that was "
exceptions.AssertionError: Tried to stop a LoopingCall that was not running.
2016-07-02 08:15:15 [twisted] CRITICAL: 
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/Library/Python/2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 87, in crawl
    yield self.engine.close()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 100, in close
    return self._close_all_spiders()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 340, in _close_all_spiders
    dfds = [self.close_spider(s, reason='shutdown') for s in self.open_spiders]
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 298, in close_spider
    dfd = slot.close()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 44, in close
    self._maybe_fire_closing()
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 51, in _maybe_fire_closing
    self.heartbeat.stop()
  File "/Library/Python/2.7/site-packages/twisted/internet/task.py", line 202, in stop
    assert self.running, ("Tried to stop a LoopingCall that was "
AssertionError: Tried to stop a LoopingCall that was not running.

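A suggestion for getting weibo.cn cookies

The captcha step can actually be skipped. Inspired by the author: first log in through weibo.com's captcha-free entry (marking the location as a frequent login location under Weibo account security removes the captcha), then open weibo.cn directly in phantomjs. weibo.cn will already be logged in, and the cookies can be read at that point.

I have implemented this myself. The code is below, for reference only: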

import logging
import time

from selenium import webdriver  # requires a Selenium release that still includes PhantomJS support


def init_phantomjs_driver():
    headers = {
        'Cookie': 'YF-Ugrow-G0=b02489d329584fca03ad6347fc915997; SUB=_2AkMvgPj2dcPxrAFYnPgWyGvkZYpH-jycVZEAAn7uJhMyOhgv7nBSqSVOKynW2PbhU4768kfRGZgNPwXeRA..; SUBP=0033WrSXqPxfM72wWs9jqgMF55529P9D9WWEFXHsNpvgJdQjr1GM.e765JpVF020SKM7e0571hMc',  # weibo.com cookie when not logged in
    }
    for key, value in headers.items():
        webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = value
    useragent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'
    webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.settings.userAgent'] = useragent

    # local path of the phantomjs executable
    driver = webdriver.PhantomJS(executable_path='xxxxxxx path to phantomjs xxxxxxx')
    driver.set_window_size(1366, 768)
    return driver


# The enclosing function header was missing from the post; this name and signature are assumed.
# `account` is a dict with the keys 'usn' and 'pwd'.
def get_weibo_cn_cookie(account):
    browser = init_phantomjs_driver()
    browser.get("http://weibo.com")
    time.sleep(3)
    failure = 0
    # Retry the login form up to 5 times while the logged-out home page is still showing.
    while "微博-随时随地发现新鲜事" == browser.title and failure < 5:
        failure += 1
        username = browser.find_element_by_name("username")
        pwd = browser.find_element_by_name("password")
        login_submit = browser.find_element_by_class_name('W_btn_a')
        username.clear()
        username.send_keys(account['usn'])
        pwd.clear()
        pwd.send_keys(account['pwd'])
        login_submit.click()
        time.sleep(5)

        # if browser.find_element_by_class_name('verify').is_displayed():
        #     logging.error("Verify code is needed! (Account: %s)" % account)

    # Logged in on weibo.com: open weibo.cn in the same session and collect its cookies.
    if "我的首页 微博-随时随地发现新鲜事" in browser.title:
        browser.get('http://weibo.cn/')
        cookie = dict()
        if "我的首页" in browser.title:
            for elem in browser.get_cookies():
                cookie[elem["name"]] = elem["value"]
        # p2 = persist_iics.Persist()
        # p2.save_account_cookies(accounts[0][0], cookie, datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
        logging.error('Account cookies updated! (Account_id: %s)' % account['usn'])
        return cookie

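Error running Sina_spider1

At first I ran D:\SinaSpider-master\Sina_spider1\Begin.py (my path) and nothing happened, not even an error.
So I tried running it from the command line, without touching any of the code, and it failed with the errors below. Someone had suggested changing twisted to a version below 16, so I tried that too and switched to Twisted-16.5.0, but it still does not work. Any pointers would be much appreciated.
The errors are as follows: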
D:\SinaSpider-master\Sina_spider1>scrapy crawl sinaSpider -s LOG_LEVEL=ERROR
D:\SinaSpider-master\Sina_spider1\Sina_spider1\spiders\spiders.py:4: ScrapyDeprecationWarning: Module scrapy.spider is deprecated, use scrapy.spiders instead
from scrapy.spider import CrawlSpider
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5235640836/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5235640836/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5235640836>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5235640836/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5676304901/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5676304901/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5676304901>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5676304901/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5871897095/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5871897095/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5871897095>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5871897095/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2139359753/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2139359753/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=2139359753>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2139359753/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5579672076/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5579672076/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5579672076>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5579672076/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2517436943/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2517436943/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=2517436943>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2517436943/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5778999829/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5778999829/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5778999829>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5778999829/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5780802073/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5780802073/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5780802073>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5780802073/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2159807003/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2159807003/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=2159807003>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2159807003/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3378940452/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3378940452/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=3378940452>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3378940452/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1885080105/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1885080105/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=1885080105>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1885080105/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5778836010/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5778836010/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5778836010>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5778836010/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5762793904/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5762793904/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5762793904>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5762793904/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5722737202/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5722737202/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5722737202>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5722737202/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3105589817/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3105589817/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3105589817/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=3105589817>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5882481217/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5882481217/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5882481217>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5882481217/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5831264835/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5831264835/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5831264835/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5831264835>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1756807885/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1756807885/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=1756807885>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1756807885/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3637185102/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3637185102/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=3637185102>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/3637185102/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2717354573/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2717354573/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2717354573/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=2717354573>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1934363217/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1934363217/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=1934363217>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1934363217/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5336500817/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5336500817/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5336500817>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5336500817/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1431308884/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1431308884/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=1431308884>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/1431308884/profile?filter=1&page=1>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5818747476/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5818747476/fans>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5818747476>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5073111647/follow>
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
    cookie = random.choice(cookies)
  File "c:\python27\lib\random.py", line 275, in choice
    return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5818747476/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5073111647/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5073111647/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5073111647>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5398825573/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5398825573/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/5398825573/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=5398825573>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2501511785/follow>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2501511785/fans>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/attgroup/opening?uid=2501511785>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
2017-04-24 16:01:07 [scrapy.core.scraper] ERROR: Error downloading <GET http://weibo.cn/2501511785/profile?filter=1&page=1>
Traceback (most recent call last):
File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1260, in _inlineCallbacks
result = g.send(result)
File "c:\python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "D:\SinaSpider-master\Sina_spider1\Sina_spider1\middleware.py", line 19, in process_request
cookie = random.choice(cookies)
File "c:\python27\lib\random.py", line 275, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
IndexError: list index out of range
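
Every traceback above ends at `random.choice(cookies)` with an `IndexError`, which means the `cookies` list was empty when the middleware ran. The likely cause is that no account logged in successfully, though that is an assumption. A minimal sketch of a guard that fails with a clearer message is shown below. `pick_cookie` and `EmptyCookiePoolError` are hypothetical names for illustration and are not part of the project's `middleware.py`.

```python
import random


class EmptyCookiePoolError(RuntimeError):
    """Hypothetical error raised when no login cookies are available."""


def pick_cookie(cookies):
    """Return a random cookie, or raise a descriptive error if the pool is empty."""
    if not cookies:
        # random.choice() on an empty list raises a bare IndexError;
        # raising our own error makes the real problem visible in the log.
        raise EmptyCookiePoolError(
            "cookie pool is empty: check that at least one account logged in"
        )
    return random.choice(cookies)
```

Inside `process_request`, a call such as `cookie = pick_cookie(cookies)` could stand in for the bare `random.choice(cookies)`, so the log reports the missing cookies once instead of an `IndexError` for every request.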

Help needed: Sina accounts

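I have added a proxy pool and set up distributed crawling, but I just noticed that Taobao no longer sells Weibo accounts. Where can I buy accounts?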

Crawler exits with an error after a single failed attempt to get cookies

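I am using an overseas VPS. The crawler works when I test it in a virtual machine, but on the VPS a single failed attempt to get cookies immediately produces "Unhandled error in Deferred:". How can I fix this?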

URLError:
2016-08-26 16:22:18 [boto] ERROR: Unable to read instance data, giving up
2016-08-26 16:22:18 [requests.packages.urllib3.connectionpool] INFO: Starting new HTTPS connection (1): login.sina.com.cn
2016-08-26 16:22:18 [requests.packages.urllib3.connectionpool] DEBUG: "POST /sso/login.php?client=ssologin.js(v1.4.15) HTTP/1.1" 200 None
Unhandled error in Deferred:
2016-08-26 16:22:18 [twisted] CRITICAL: Unhandled error in Deferred:
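
One possible workaround for a single failed login aborting the run is to retry the cookie fetch a few times before giving up. The sketch below assumes that a failed login either raises an exception or returns an empty value. `fetch_with_retry` and its parameters are hypothetical and are not part of the project's `cookies.py`.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch() up to `attempts` times; return its first non-empty result or None."""
    for attempt in range(1, attempts + 1):
        try:
            cookie = fetch()
            if cookie:
                return cookie
        except Exception:
            # Broad catch for illustration only: network errors such as
            # URLError are swallowed here so that the next attempt can run.
            pass
        if attempt < attempts:
            sleep(delay)  # wait before the next attempt
    return None
```

The caller would pass whatever function performs the login for one account, then skip that account if the result is `None` rather than letting the error propagate.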
