web_spiders's Introduction

README

Install

npm install chromedriver
npm install -g phantomjs

Create

kimurai generate project web_spiders
kimurai generate spider kkyyycc

Test

kimurai console kkyyycc --url http://www.kkyyy.cc/frim/15.html

# Links to the detail pages listed on this page
response.xpath('//div[@class="fed-part-layout fed-back-whits"]/ul/li/a').map{|e| e[:href]}
# Link to the next page ('下页' means "next page")
response.xpath('//div[@class="fed-page-info fed-text-center"]/a').search("[text()*='下页']")[0][:href]
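
The hrefs collected above may be relative to the site root, in which case they need to be resolved against the page URL before they can be requested. A minimal sketch using only Ruby's standard library, assuming relative hrefs; `absolute_urls` is a hypothetical helper and the sample paths are placeholders, not values taken from a real response:

```ruby
require "uri"

# Hypothetical helper: resolve hrefs scraped from a page against that page's URL.
# nil and blank hrefs are skipped; already-absolute hrefs pass through unchanged.
def absolute_urls(hrefs, base_url)
  hrefs.compact.reject { |href| href.strip.empty? }.map do |href|
    URI.join(base_url, href.strip).to_s
  end
end

# Placeholder inputs shaped like the site's URLs.
list_url = "http://www.kkyyy.cc/frim/15.html"
sample_hrefs = ["/movie/38825.html", nil, "http://www.kkyyy.cc/movie/1.html"]

detail_urls = absolute_urls(sample_hrefs, list_url)
puts detail_urls
```

Inside a spider, the same list could then be handed to whatever method parses the detail pages.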

kimurai console kkyyycc --url http://www.kkyyy.cc/movie/38825.html
# Cover image URL, read from the data-original attribute
response.xpath("//dt[@class='fed-deta-images fed-list-info fed-col-xs3']/a").attr('data-original').value

# Title, with the '【美剧】' ("US TV series") prefix removed
response.xpath("//dd[@class='fed-deta-content fed-col-xs7 fed-col-sm8 fed-col-md10']/h1/a").children.first.text.gsub('【美剧】', '').strip


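# Flatten the detail list into an array of text nodes
array = response.xpath("//dd[@class='fed-deta-content fed-col-xs7 fed-col-sm8 fed-col-md10']/ul/li").children.map(&:text)
# Take the entry that follows the '更新:' ("Updated:") label and split it on '/'
categories = array[array.index('更新:') + 1].split('/')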
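
The lookup above raises a `NoMethodError` when the label is absent, because `Array#index` then returns `nil`. A minimal sketch of a more defensive lookup in plain Ruby; `value_after` is a hypothetical helper, and the sample array is placeholder data shaped like the output of `.children.map(&:text)`:

```ruby
# Hypothetical helper: return the text that follows a label in a flat list
# of node texts, or nil when the label (or the value after it) is missing.
def value_after(texts, label)
  index = texts.index(label)
  return nil if index.nil?

  value = texts[index + 1]
  value.nil? ? nil : value.strip
end

# Placeholder data, not taken from a real response.
texts = ["label-1:", "x", "更新:", " a/b/c "]

# Fall back to an empty string so a missing label yields an empty list.
categories = (value_after(texts, "更新:") || "").split("/")
missing = (value_after(texts, "not-there:") || "").split("/")

p categories
p missing
```

With this shape, a page that lacks the label produces an empty list rather than aborting the crawl.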


# Open the downloads page in the browser, then select the download list items
browser.visit(downloads_page_link)
browser.current_response.xpath("//ul[@class='down-list']/li")


# Save the item to kkyyycc.json as pretty-printed JSON
save_to "kkyyycc.json", item, format: :pretty_json
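
The README does not show how `item` is built. The sketch below assumes it is a plain hash assembled from the values extracted above. All field names and values are hypothetical placeholders, and Ruby's standard `json` library is used only to approximate what one pretty-printed item might look like; Kimurai does the actual file writing in `save_to`:

```ruby
require "json"

# Hypothetical item layout: the keys are illustrative, not taken from the spider.
item = {
  url: "http://www.kkyyy.cc/movie/38825.html",
  title: "placeholder title",
  cover: "placeholder-cover-url",
  categories: ["a", "b"],
  downloads: []
}

# Approximate the pretty-printed form of a single item.
json = JSON.pretty_generate(item)
puts json
```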

Start the web dashboard

bundle exec kimurai dashboard

URL: `http://localhost:3001/spiders`

Run

bundle exec kimurai crawl kkyyycc

Set up the crawler environment on the deployment machine

kimurai setup [email protected] --ask-auth-pass

Deploy to a Raspberry Pi. I authenticate with sshpass; setting up passwordless (key-based) SSH login works as well.

kimurai deploy [email protected] --ask-auth-pass --repo-url [email protected]:xxx/web_spiders_demo.git

web_spiders's People

Contributors

mendd
