Routers News

Routers is a collection of web crawlers for various popular technology news sources.

It exposes a command-line interface to these crawlers, so the discerning tech-news enthusiast never has to leave the comfort of their terminal.

It Currently Supports:

Technology News Sources

  • Ars Technica
  • Wired.com

Technology Blogs

  • TechCrunch
  • Mashable

Mainstream News Sources

  • New York Times
  • USA Today
  • L.A. Times

Other Random Stuff

  • GitHub
  • The Oatmeal
  • xkcd

(This categorization is loose; please feel free to shuffle things around.)

It's Also An Experiment

It is my hope that open-sourcing a collection of news scrapers will bring together a community that builds a powerful set of real-time news aggregation tools.

Installation

npm install routers-news -g

Usage

Listing News Sources

routers-news --sources

Outputs:

Routers News Sources:

  news:
    major:
      NewYorkTimes: The New York Times Bits blog.
      LATimes: The business and culture of our digital lives, from the L.A. Times.
      USAToday: Power up with breaking news on personal technology, electronics, gaming and computers.
    tech:
      Wired.com: Wired magazine is a monthly US technology publication.
      ArsTechnica: Ars Technica is a technology news site catering to PC enthusiasts.
      TechCrunch: A network of technology-oriented blogs and other web properties.
  other:
    Github: Trending and featured repos on Github.com

Displaying Headlines

routers-news --source=github

Outputs:

[1] MacLemon / CongressChecklist
  https://github.com/MacLemon/CongressChecklist

[2] dejan / rails_panel
  https://github.com/dejan/rails_panel

[3] feross / md5-password-cracker.js
  https://github.com/feross/md5-password-cracker.js

[4] shadowsocks / shadowsocks-go
  https://github.com/shadowsocks/shadowsocks-go

[5] bcoe / routers-news
  https://github.com/bcoe/routers-news

[6] andrew / 24pullrequests
  https://github.com/andrew/24pullrequests

[7] nkohari / jwalk
  https://github.com/nkohari/jwalk

[8] lockitron / selfstarter
  https://github.com/lockitron/selfstarter

[9] twitter / bower
  https://github.com/twitter/bower

[10]  Spaceman-Labs / SMPageControl
  https://github.com/Spaceman-Labs/SMPageControl

Loading Articles

routers-news --source=github --article=5

Outputs:

bcoe / routers-news:


A crawler for various popular tech news sources. Read technology news from the comfort of your CLI.
      — Read more
---------
https://github.com/bcoe/routers-news

The Crawlers

The news crawlers used by Routers come in two varieties:

  • Page scrapers which use CSS selectors to extract content from news sources.
  • RSS/Atom feed parsers, which crawl articles using an RSS or Atom news feed.

Examples of both can be found in the lib/sources directory.
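To give a feel for the feed-parser variety, here is a minimal, self-contained sketch. It is not the code in lib/sources: the names parseRssItems, extractTag, formatHeadlines and sampleFeed are made up for illustration, and the regex-based extraction is a simplification that only handles plain <item> elements. A real crawler would more likely rely on a proper XML or feed parser.

```javascript
// Hypothetical helpers, not part of routers-news. A regex is enough for this
// sketch, but a real crawler should use a proper XML parser.
function parseRssItems(xml) {
  const items = [];
  const itemPattern = /<item>([\s\S]*?)<\/item>/g;
  let match;
  while ((match = itemPattern.exec(xml)) !== null) {
    const body = match[1];
    items.push({
      title: extractTag(body, 'title'),
      link: extractTag(body, 'link'),
    });
  }
  return items;
}

// Return the text of the first <tag>...</tag> pair, or null if it is absent.
function extractTag(xml, tag) {
  const m = new RegExp('<' + tag + '>([\\s\\S]*?)</' + tag + '>').exec(xml);
  if (!m) return null;
  // Unwrap a CDATA section if present, then trim whitespace.
  return m[1].replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, '$1').trim();
}

// Mirror the numbered headline layout shown under "Displaying Headlines".
function formatHeadlines(items) {
  return items
    .map((item, i) => '[' + (i + 1) + '] ' + item.title + '\n  ' + item.link)
    .join('\n\n');
}

// Made-up feed so the sketch runs without network access.
const sampleFeed = [
  '<rss><channel>',
  '<item><title>First story</title><link>https://example.com/1</link></item>',
  '<item><title><![CDATA[Second story]]></title><link>https://example.com/2</link></item>',
  '</channel></rss>',
].join('\n');

console.log(formatHeadlines(parseRssItems(sampleFeed)));
```

A page scraper would follow the same shape, swapping the regex extraction for CSS selectors run against the fetched HTML.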

Contributing

It's easy to add a new news source:

  • fork the routers-news repo.
  • clone it locally.
  • run npm install to install the libraries locally.
  • create a new crawler in the lib/sources directory (everything in this hierarchy is automatically loaded).
  • to test your crawler run: node ./bin/routers-news.js.
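The note that everything under lib/sources is loaded automatically can be pictured with a small sketch. This is only an assumption about how such loading might work, not the project's actual loader: loadSources, the key format, and the shape of the exported object are all hypothetical, and the demo builds a throwaway directory rather than touching a real checkout.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical loader: recursively require every .js file under `dir` and
// key each module by its path relative to the root (without the extension).
function loadSources(dir, root = dir, sources = {}) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      loadSources(fullPath, root, sources);
    } else if (entry.name.endsWith('.js')) {
      const key = path
        .relative(root, fullPath)
        .slice(0, -3) // drop the ".js" extension
        .split(path.sep)
        .join('/');
      sources[key] = require(fullPath);
    }
  }
  return sources;
}

// Demo: build a throwaway hierarchy so the sketch runs anywhere.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'sources-'));
fs.mkdirSync(path.join(demoRoot, 'other'));
fs.writeFileSync(
  path.join(demoRoot, 'other', 'example.js'),
  "module.exports = { description: 'An example source.' };\n"
);

const loaded = loadSources(demoRoot);
console.log(Object.keys(loaded));
```

With a loader along these lines, dropping a new file into the hierarchy is enough for it to be picked up, with no central registry to edit.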

You can also help a ton by:

  • reporting when crawlers are broken.
  • extending the crawlers; I'd love to have:
    • Dates.
    • Authors.
    • Better image extraction.
  • improving the CLI client.

Help make our dreams of a collaborative web-crawler a reality :)

Copyright

Copyright (c) 2012 Benjamin Coe and Joshua Hull and Gabriel Silk. See LICENSE.txt for further details.
