geekan / scrapy-examples

3.1K 234.0 1.0K 17.82 MB

Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc.

Python 96.73% Shell 0.29% Ruby 0.09% HTML 1.49% Jupyter Notebook 1.41%

scrapy-examples's People

wujingke xiaozhenliang lichhaosysu igledaniel wangfeng3769 wudi senonchen wangzhiyuan fubuki hezila mrmign wangeldon nicholaswang maxliaops booox nsdown jspenc72 ghotiv yourmoonlight curious-boy vicjung crmiv a781947241 huhuuu qingqingqing anguscupid wyrover lightning-li youngcun sharkhoo yian454 yolanda1989 boosheng theringer jackzhch vs-studio ciryiping dabing1022 madre fashtimedotcom foxweek etongle friedpine molock catherine123 mindis ericfourrier narakai igorcosta ssfbest isensen rh-xu orchestor wenkezhou huokedu davionxiang kcstewart duyet jackwang429 cedric-coroir pucca601 easonbryant dindins prehawk1999 hq20051252 wangmiao1981 luis-wang liwei123o0 magic007 donghuiliu charlotteliu daur1020 yiliu1 leeomar minozmel subodhkr24 wysroy pierre9972 qsli phongtnit pagenotfound4o4 last2003 zjuwangfei toobit usersnames brother-simon ccdpowell mirasole smalljune javiermaly zhouyunan randy-ran jamesblunt fengkaungluanzhua albert2lyu lovejavaee yyhthu ccagg aprilara heianxing

scrapy-examples's Issues

Scrapy (parsing/scraping)

WordPress password-protected site

I'm the admin of a password-protected site. Is there a way to download the entire WordPress site?

Zhihu followees/followers

Cannot access followees/followers without logging in.

The doubanbook spider does not scrape any content

As the title says.

Is there a version that runs on Windows?

On Linux, running ./startproject.sh creates a new Scrapy project skeleton. Is there a similar script for Windows that creates a Scrapy project skeleton directly?

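The program does not run; only the Douban files ran

File "/Users/lifuyi/www/scrapy-examples/misc/middleware.py", line 3, in <module>
    from agents import AGENTS
ModuleNotFoundError: No module named 'agents'

I don't know why this error is raised.
Also, could the project be made compatible with Python 3? So far I have changed the print and except statements, and I am still running into some package import problems.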
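
A possible cause of the traceback above, offered as a guess rather than a confirmed diagnosis: Python 3 no longer resolves `from agents import AGENTS` relative to the importing file's own package, so a sibling module has to be imported explicitly (`from .agents import AGENTS`). The sketch below reproduces this with a throwaway package; the names `demo_misc` and `demo_agents` are hypothetical and only stand in for the layout the traceback suggests.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Create a throwaway package with one implicit and one explicit import."""
    pkg = os.path.join(root, "demo_misc")
    os.makedirs(pkg)
    files = {
        "__init__.py": "",
        "demo_agents.py": "AGENTS = ['example-agent/1.0']\n",
        # Python 2 style: relies on implicit relative imports.
        "implicit.py": "from demo_agents import AGENTS\n",
        # Python 3 style: explicit relative import of the sibling module.
        "explicit.py": "from .demo_agents import AGENTS\n",
    }
    for name, body in files.items():
        with open(os.path.join(pkg, name), "w") as handle:
            handle.write(body)


def try_import(module_name):
    """Return the module's AGENTS list, or None if the import fails."""
    try:
        return importlib.import_module(module_name).AGENTS
    except ImportError:
        return None


root = tempfile.mkdtemp()
build_demo_package(root)
sys.path.insert(0, root)
importlib.invalidate_caches()

implicit_result = try_import("demo_misc.implicit")  # fails on Python 3
explicit_result = try_import("demo_misc.explicit")  # succeeds
print(implicit_result, explicit_result)
```

If the real `agents` module does sit next to `middleware.py`, the equivalent change there would be a one-line edit of the import.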

How can we store googlescholar data in a MySQL database?

Please give me an example.
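
A minimal sketch of a storage pipeline, under stated assumptions: the class name `SqlStoragePipeline`, the `papers` table, and the `title`/`url` fields are all hypothetical and are not taken from the googlescholar project. To stay runnable without a database server, it uses the standard-library `sqlite3` module as a stand-in. With a MySQL driver such as PyMySQL, `connect` would instead return that driver's connection and `placeholder` would be `%s`.

```python
import sqlite3


class SqlStoragePipeline(object):
    """Hypothetical item pipeline that writes each item to a SQL table."""

    def __init__(self, connect=lambda: sqlite3.connect(":memory:"), placeholder="?"):
        self.connect = connect          # factory returning a DB-API connection
        self.placeholder = placeholder  # "?" for sqlite3, "%s" for MySQL drivers
        self.conn = None

    def open_spider(self, spider):
        self.conn = self.connect()
        self.conn.cursor().execute(
            "CREATE TABLE IF NOT EXISTS papers (title VARCHAR(255), url VARCHAR(255))"
        )
        self.conn.commit()

    def process_item(self, item, spider):
        # Parameterized query: values are passed separately, never formatted in.
        sql = "INSERT INTO papers (title, url) VALUES ({0}, {0})".format(self.placeholder)
        self.conn.cursor().execute(sql, (item["title"], item["url"]))
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.conn.close()


# Drive the pipeline by hand, the way Scrapy would during a crawl.
pipeline = SqlStoragePipeline()
pipeline.open_spider(spider=None)
pipeline.process_item({"title": "Example paper", "url": "https://example.com/paper"}, spider=None)
stored = pipeline.conn.execute("SELECT title, url FROM papers").fetchall()
pipeline.close_spider(spider=None)
print(stored)
```

In a real project the class would also need to be listed in the `ITEM_PIPELINES` setting before Scrapy calls it.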

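The qqnews spider runs OK, but the results are not saved to the JSON file. Could you take a look?

The qqnews spider runs without errors, but the results are not saved to the JSON file written by the JsonWithEncodingPipeline class. An initial analysis shows that process_item is never executed. Why?
While debugging I found that only the __init__ and close_spider methods of JsonWithEncodingPipeline ran, and process_item did not. Why is that?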
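
One possible explanation, not a confirmed diagnosis: Scrapy calls `process_item` once per item the spider yields, so a run in which only `__init__` and `close_spider` fire is consistent with the spider yielding no items at all (for example, because its selectors no longer match the page). The sketch below imitates that lifecycle in plain Python; `JsonLinesPipeline` and `run` are hypothetical stand-ins, not the repository's `JsonWithEncodingPipeline`.

```python
import io
import json
import os
import tempfile


class JsonLinesPipeline(object):
    """Hypothetical pipeline that records which lifecycle methods were called."""

    def __init__(self, path):
        self.calls = ["__init__"]
        self.file = io.open(path, "w", encoding="utf-8")

    def process_item(self, item, spider):
        self.calls.append("process_item")
        # ensure_ascii=False keeps non-ASCII text readable in the output file.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + u"\n")
        return item

    def close_spider(self, spider):
        self.calls.append("close_spider")
        self.file.close()


def run(items, path):
    """Imitate a crawl: one process_item call per yielded item, then close."""
    pipeline = JsonLinesPipeline(path)
    for item in items:
        pipeline.process_item(item, spider=None)
    pipeline.close_spider(spider=None)
    return pipeline.calls


out_dir = tempfile.mkdtemp()
empty_calls = run([], os.path.join(out_dir, "empty.json"))
full_calls = run([{"title": u"示例"}], os.path.join(out_dir, "full.json"))
print(empty_calls)
print(full_calls)
```

If this is what is happening, the place to look would be the spider's parse callbacks rather than the pipeline.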

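Hello, sorry to bother you. I have a beginner question, thanks!

Take the Douban example: https://github.com/geekan/scrapy-examples/blob/master/doubanbook/doubanbook/spiders/douban_spider.py

Rule(sle(allow=("/subject/\d+/?$")), callback='parse_2'), this line matches the subject links found on the main page.

I'm not sure which line of code scrapes the content of the sub-pages.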
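
As a rough illustration of how that rule behaves: in a Scrapy CrawlSpider, the link extractor keeps links whose URL matches the `allow` pattern, each kept link is downloaded, and the response is passed to the method named by `callback`. So the sub-page scraping would be whatever `parse_2` does with its response. The sketch below only reproduces the URL filter with the standard `re` module; the sample URLs and the helper `is_followed` are made up for illustration.

```python
import re

# Same pattern as in the Rule above, written as a raw string.
ALLOW = re.compile(r"/subject/\d+/?$")


def is_followed(url):
    """Return True if the rule's allow pattern matches somewhere in the URL."""
    return ALLOW.search(url) is not None


# Hypothetical URLs, chosen only to exercise the pattern.
samples = [
    "https://book.douban.com/subject/1234567/",
    "https://book.douban.com/subject/1234567",
    "https://book.douban.com/subject/1234567/comments",
    "https://book.douban.com/tag/fiction",
]
results = [is_followed(url) for url in samples]
for url, followed in zip(samples, results):
    print(url, followed)
```

Only the first two sample URLs would be handed to `parse_2`; the trailing `$` excludes deeper paths such as the comments page.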

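Two main problems

There are currently two main problems that prevent the project from being used with Scrapy 1.3 and Python 3.5.

1) from urlparse import urlparse needs to be changed to from urllib.parse import urlparse
2) Parentheses need to be added after every print,
for example print e becomes print(e)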
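
The two changes above can be sketched in a form that runs on both Python 2 and Python 3. This is one common compatibility pattern, not necessarily what the project adopted; the helper `host_of` is a hypothetical name.

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 fallback


def host_of(url):
    """Return the network location (host) part of a URL."""
    return urlparse(url).netloc


host = host_of("https://github.com/geekan/scrapy-examples")
print(host)

try:
    int("not a number")
except ValueError as e:  # "except ValueError, e" is rejected by Python 3
    print(e)  # print with parentheses works on both versions
```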

How is your template inherited?

The template is not among the templates Scrapy provides by default, is it?

I just got here and saw "3 years ago" .....................?_?

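The doubanbook project cannot crawl the subject pages

When I run the project, it only crawls the tag URLs and then ends immediately without printing any item information. What is the problem?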

Is everything written in Python 2?
