In this project, I built my own scraper. The main goal is to scrape information from https://www.jumia.mg/maisons-a-vendre&xhr=9jdcu and store the data in a JSON file.
- Web scraping is a programmatic method of extracting data from websites. When you browse the web, you consume a ton of publicly available information. As a user, all of this information is presented to you as unstructured data in the form of HTML documents. Now imagine that you could take all of these pages and turn them into structured data, pick out the pieces you like, and export it all to a database or spreadsheet. There are many ways to scrape data from websites, but in this project I have built my own scraper.
To get a local copy of the repository, run the following commands in your terminal:
$ cd <folder>
$ git clone https://github.com/rindrajosia/capstone-scraper.git
- Go to the bin folder, open the main file, and replace "last_page" with the number of the page that you want to scrape.
- In your terminal, go to the folder where you saved the repository and type:
- bundle install
- Open '/tmp/search_result.json' to see the scraped data.
- To run the tests, go to the folder where you saved the repository in your terminal and type: rspec
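The steps above mention a "last_page" setting and a JSON result file. The sketch below shows one way these two pieces could fit together, using only the Ruby standard library. The `?page=` query parameter, the `BASE_URL` constant, the placeholder records and the example output file name are all assumptions made for illustration; check the main file in the bin folder for what the project actually does.

```ruby
require 'json'
require 'tmpdir'

# Assumed values for illustration; the real URL pattern is defined in the project itself.
BASE_URL = 'https://www.jumia.mg/maisons-a-vendre'
last_page = 3

# Build one URL per results page, from page 1 up to last_page.
page_urls = (1..last_page).map { |page| "#{BASE_URL}?page=#{page}" }

# Stand-in for the scraping step: one placeholder record per page, no network access.
results = page_urls.map { |url| { source: url, title: 'placeholder' } }

# Write to the system temp directory so this sketch does not touch the project's own output file.
output_path = File.join(Dir.tmpdir, 'search_result_example.json')
File.write(output_path, JSON.pretty_generate(results))

# Read the file back to confirm it contains valid JSON.
loaded = JSON.parse(File.read(output_path))
puts "Saved #{loaded.length} records to #{output_path}"
```

Keeping the page range in a single variable means that changing how much is scraped only requires editing one value.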
- Ruby
- Nokogiri
- Rspec
👤 Rindra Josia
- GitHub: @rindrajosia
- Twitter: @rindrajosia
- LinkedIn: linkedin
Contributions, issues and feature requests are welcome!
Feel free to check the issues page.
Give a ⭐️ if you like this project!
- Project from Microverse
- Originally taken from The Odin Project