wiki_scraping's Introduction

Wiki_Scraping

Wiki Scraping is a simple Python web scraping tool that scrapes details of educational institutions from Wikipedia based on user input.

This repository is part of Gof-2018. Don't forget to register yourself before contributing.

Project Status

The committed code currently scrapes only the motto of MIT. Later, the code can be extended to fetch any data from the Wikipedia page of any college, school, or organization, as the user requires.

Contributions

Contribution rules

  • Create a new branch in your forked repository before you start working
  • Write a proper commit message for every commit
  • Always rebase onto the master branch of DSC-Galgotias/Wiki_Scraping before you start working, to avoid merge conflicts
  • Include a proper description when you open a pull request
  • Try to keep pull requests small to minimize merge conflicts

Getting Started

  • Fork this repo (use the Fork button at the top of the page)
  • Clone the repository to your local machine
        $ git clone https://github.com/DSC-Galgotias/Wiki_Scraping.git
    
  • Change to the Wiki_Scraping directory
        $ cd Wiki_Scraping
    
  • Install dependencies:
        $ python setup.py install
    
  • Create a new branch
        $ git checkout -b my-new-branch
    
  • Add your contribution
  • Commit and push
        $ git add .
    
        $ git commit -m "your-commit-msg"
    
        $ git push origin my-new-branch
    
  • Create a new pull request from your forked repository to the master branch of DSC-Galgotias/Wiki_Scraping

How to use

  • Open a terminal
  • Go to the Wiki_Scraping directory
  • Install dependencies:
        $ python setup.py install
    
  • Run Wiki_Info.py:
        $ python Wiki_Info.py
    

Technologies used

Remaining Enhancements

  • Basic GUI
  • User input about the organization
  • Multiple options for what to look up about the college, school, or organization, such as fees, employers, etc.
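
As a rough illustration of the user-input enhancement, the sketch below turns a typed organization name into an English Wikipedia article URL. The function name `build_wiki_url` and the constant `WIKI_BASE` are hypothetical and not part of the current code. It assumes the typed name matches the article title, so a real tool would still need to handle missing or ambiguous pages.

```python
from urllib.parse import quote

# English Wikipedia article prefix (assumption: only the English site is used).
WIKI_BASE = "https://en.wikipedia.org/wiki/"


def build_wiki_url(name):
    """Turn a free-text organization name into a Wikipedia article URL."""
    # Drop surrounding whitespace, collapse repeated whitespace, and join
    # the words with underscores, which is how Wikipedia writes spaces in
    # article titles.
    title = "_".join(name.split())
    # Percent-encode characters that are not safe in a URL path.
    return WIKI_BASE + quote(title, safe="_(),")


print(build_wiki_url("Massachusetts Institute of Technology"))
```

The returned URL could then be passed to whatever code downloads and parses the page.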

Join the Slack channel: Slack

Contributing

To learn more about how to contribute, check out this guide.

wiki_scraping's People

Contributors

adam2809, anirudh-prakash, bharathkumarravichandran, chicobentojr, gprisco, kaus-rai, vankineenitawrun

wiki_scraping's Issues

Creation of UI

We need to create a dynamic and responsive UI. The current UI was created only for testing purposes.

Difference in xpath of different colleges

Wikipedia pages for different colleges don't keep the same data at the same location. For example, the XPath of the President's name on MIT's Wikipedia page may not be the same as the XPath on Stanford's Wikipedia page.
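
One possible way around this, sketched below, is to look up a row by its header label instead of by a fixed position. This is not the project's current approach. The function `infobox_value` and the two snippets are hypothetical, and the snippets are simplified, well-formed stand-ins with made-up values rather than real Wikipedia markup. Real pages are not well-formed XML, so an actual scraper would need an HTML parser in place of `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Simplified stand-ins for two infoboxes (made-up values, not copied from
# Wikipedia). The same fields appear in a different order on each "page".
PAGE_A = """
<table class="infobox">
  <tr><th>Motto</th><td>Example motto A</td></tr>
  <tr><th>President</th><td>Example President A</td></tr>
</table>
"""
PAGE_B = """
<table class="infobox">
  <tr><th>Type</th><td>Private</td></tr>
  <tr><th>President</th><td>Example President B</td></tr>
  <tr><th>Motto</th><td>Example motto B</td></tr>
</table>
"""


def infobox_value(html, label):
    """Return the cell text of the row whose header equals `label`, or None."""
    table = ET.fromstring(html)
    for row in table.iter("tr"):
        header = row.find("th")
        cell = row.find("td")
        if header is None or cell is None:
            continue  # skip rows that are not simple label/value pairs
        if "".join(header.itertext()).strip() == label:
            return "".join(cell.itertext()).strip()
    return None


print(infobox_value(PAGE_A, "President"))
print(infobox_value(PAGE_B, "President"))
```

Because the match is on the label text, the same call works for both snippets even though the President row sits at a different position in each.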

Creation of JSON data

Create a JSON file containing all the information present in the wiki_infobox class.
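
A minimal sketch of what this could look like, assuming the infobox is a table of label/value rows. The names `infobox_to_dict` and `infobox_to_json` are hypothetical, and `SAMPLE` is a simplified, well-formed stand-in with made-up values. Real Wikipedia markup would need an HTML parser rather than `xml.etree.ElementTree`.

```python
import json
import xml.etree.ElementTree as ET

# Simplified stand-in for an infobox table (made-up values).
SAMPLE = """
<table class="infobox">
  <tr><th>Motto</th><td>Example motto</td></tr>
  <tr><th>Type</th><td>Private</td></tr>
  <tr><td>Row without a label is skipped</td></tr>
</table>
"""


def infobox_to_dict(html):
    """Collect every label/value row of the table into a dictionary."""
    table = ET.fromstring(html)
    data = {}
    for row in table.iter("tr"):
        header, cell = row.find("th"), row.find("td")
        if header is not None and cell is not None:
            key = "".join(header.itertext()).strip()
            data[key] = "".join(cell.itertext()).strip()
    return data


def infobox_to_json(html):
    """Serialize the collected rows as a JSON string."""
    # ensure_ascii=False keeps non-ASCII text (e.g. Latin mottos) readable.
    return json.dumps(infobox_to_dict(html), indent=2, ensure_ascii=False)


print(infobox_to_json(SAMPLE))
```

Writing the returned string to a file with `open(path, "w", encoding="utf-8")` would produce the JSON file described above.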
