Shopping web scraper: scrapes products from different Dutch supermarkets, processes and organizes the results.
A personal project to practise with Scrapy, MongoDB, Redis, Docker and Flask.
Shop scraper is delivered as Docker containers and is ready to run with docker-compose:
> docker-compose up
If you need to rebuild old images, start the app by running the script:
> ./run.sh
When the app is up and running, you can open the web app in a browser at 0.0.0.0:5000.
There you will find the search page. Type the search term and press Search to see the results page.
Note: type the search term in Dutch, since the supermarkets of interest are in the Netherlands. Here you can see the results from the different supermarkets in a single table, with product name, shop price, unit price and unit of measure, image and a link to the shop.
All the columns are sortable (except the 'image' column), and there is a search bar where you can search for a particular product.
The project is composed of several units, each running in its own Docker container (except for the client, which is served by the Flask backend). The units communicate by message passing through Redis over Docker networks.
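As an illustration of message passing through Redis, here is a minimal sketch in which the client back-end pushes a search request onto a Redis list and the broker pops it. It is not the project's actual code: the queue name `search_requests`, the message layout and both function names are assumptions, and the project may use pub/sub rather than a list. The functions only rely on `lpush` and `brpop`, so a small in-memory stand-in is included to let the sketch run without a Redis server.

```python
import json
from collections import defaultdict, deque

# Hypothetical queue name; the project's real key names may differ.
SEARCH_QUEUE = "search_requests"


def send_search_request(client, term):
    """Client back-end side: push a search request onto a Redis list."""
    client.lpush(SEARCH_QUEUE, json.dumps({"term": term}))


def receive_search_request(client, timeout=1):
    """Broker side: wait for a request; return None if nothing arrives in time."""
    item = client.brpop(SEARCH_QUEUE, timeout=timeout)
    if item is None:
        return None
    _key, payload = item  # brpop returns a (key, value) pair
    return json.loads(payload)


class InMemoryRedis:
    """Stand-in exposing only lpush/brpop, for running the sketch offline."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def lpush(self, name, *values):
        for value in values:
            self._lists[name].appendleft(value)
        return len(self._lists[name])

    def brpop(self, keys, timeout=0):
        # Unlike real Redis, this stand-in never blocks: it returns None when empty.
        queue = self._lists[keys]
        if not queue:
            return None
        return keys, queue.pop()


if __name__ == "__main__":
    client = InMemoryRedis()
    send_search_request(client, "melk")
    print(receive_search_request(client))
```

With a running Redis container, a `redis.Redis(...)` client from the redis-py package could be passed in place of `InMemoryRedis`, since it provides the same two methods.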
The units are:
client: Flask client and backend
broker: a broker which routes request traffic from the client back-end to the crawlers
crawling: crawling of products from web shops, their processing and insertion into the database
data_collections: database schema definition
data_processing: tools to compare products, e.g., to check whether two products are the same, or to normalize product measure units and quantities
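To give an idea of what normalizing measure units and quantities can look like, the sketch below converts strings such as `500 g` or `1,5 l` to a base unit and derives a unit price. It is an illustration only, not the code in data_processing: the function names, the unit table and the accepted spellings are all assumptions.

```python
import re

# Conversion to one base unit per dimension (kg, l, piece).
# The unit spellings listed here are assumptions, not taken from the project.
_TO_BASE = {
    "kg": ("kg", 1.0),
    "g": ("kg", 0.001),
    "gram": ("kg", 0.001),
    "l": ("l", 1.0),
    "liter": ("l", 1.0),
    "cl": ("l", 0.01),
    "ml": ("l", 0.001),
    "stuk": ("stuk", 1.0),
    "stuks": ("stuk", 1.0),
}

# A number (with '.' or ',' as decimal separator) followed by a unit name.
_QUANTITY_RE = re.compile(r"^\s*(\d+(?:[.,]\d+)?)\s*([a-zA-Z]+)\s*$")


def normalize_quantity(text):
    """Parse a string such as '500 g' or '1,5 l' into (amount, base_unit)."""
    match = _QUANTITY_RE.match(text)
    if match is None:
        raise ValueError(f"cannot parse quantity: {text!r}")
    amount = float(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    if unit not in _TO_BASE:
        raise ValueError(f"unknown unit: {unit!r}")
    base_unit, factor = _TO_BASE[unit]
    return round(amount * factor, 6), base_unit


def unit_price(price, quantity_text):
    """Return (price per base unit, base unit), e.g. euros per kg."""
    amount, base_unit = normalize_quantity(quantity_text)
    if amount == 0:
        raise ValueError("quantity must be greater than zero")
    return round(price / amount, 2), base_unit
```

For example, `normalize_quantity("500 g")` and `normalize_quantity("0,5 kg")` both give `(0.5, "kg")`, so two products labelled differently by different shops can be compared on the same scale.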