smileagain6698 / scrappy_100

This project is a Python-based web scraper that extracts information such as social media links, tech stack details, meta titles and descriptions, payment gateways, and website languages from various websites, storing the data in a MySQL database.

Python 100.00%

🌐Website Data Scraper - 100

📖Description

This script scrapes data from a list of websites and extracts the following information:

Social Media Links
Tech Stack (MVC, CMS, JS type, etc.)
Meta Title
Meta Description
Payment Gateways (e.g., PayPal, Stripe, Razorpay)
Website Language
Category of Website
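As a rough illustration of how several of these fields can be pulled out of a page, here is a minimal sketch. It is not the project's actual code. To stay runnable without extra installs it uses the standard library's html.parser as a stand-in for BeautifulSoup, and the names PageInfoParser, extract_page_info, SOCIAL_DOMAINS, and PAYMENT_KEYWORDS are hypothetical. Tech stack and category detection are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical lookup tables; the project's real lists are not shown in this README.
SOCIAL_DOMAINS = ("facebook.com", "twitter.com", "instagram.com",
                  "linkedin.com", "youtube.com")
PAYMENT_KEYWORDS = {"PayPal": "paypal", "Stripe": "stripe", "Razorpay": "razorpay"}


class PageInfoParser(HTMLParser):
    """Collects the title, meta description, language, and social links."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.language = ""
        self.social_links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute values may be None, hence the "or" below
        if tag == "html":
            self.language = attrs.get("lang") or ""
        elif tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "a":
            href = attrs.get("href") or ""
            host = urlparse(href).netloc.lower()
            # Match the domain itself or any subdomain such as www.facebook.com
            if any(host == d or host.endswith("." + d) for d in SOCIAL_DOMAINS):
                self.social_links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_page_info(html):
    """Return a dict of scraped fields for one HTML document."""
    parser = PageInfoParser()
    parser.feed(html)
    parser.close()
    lowered = html.lower()
    # Naive keyword match: a gateway counts as present if its name appears anywhere
    gateways = [name for name, kw in PAYMENT_KEYWORDS.items() if kw in lowered]
    return {
        "meta_title": parser.title.strip(),
        "meta_description": parser.description,
        "language": parser.language,
        "social_links": parser.social_links,
        "payment_gateways": gateways,
    }
```

In practice the HTML string would come from something like requests.get(url).text. The keyword match for payment gateways is deliberately simple and can produce false positives, for example on a blog post that merely mentions PayPal.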

The extracted data is stored in a MySQL database. The file 1.csv is an export of that database.
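The storage step might look roughly like the sketch below, which uses mysql-connector-python with a parameterized INSERT. The table name websites, its columns, and the helpers to_row and save_record are assumptions made for illustration; the project's real schema is not described in this README. The info argument is assumed to be a dict of scraped fields, with list values serialized as JSON text.

```python
import json

# Hypothetical table and column names; adjust to the actual schema.
INSERT_SQL = (
    "INSERT INTO websites "
    "(url, meta_title, meta_description, language, social_links, payment_gateways) "
    "VALUES (%s, %s, %s, %s, %s, %s)"
)


def to_row(url, info):
    """Flatten one scraped record into a tuple matching INSERT_SQL."""
    return (
        url,
        info.get("meta_title", ""),
        info.get("meta_description", ""),
        info.get("language", ""),
        json.dumps(info.get("social_links", [])),      # list stored as JSON text
        json.dumps(info.get("payment_gateways", [])),  # list stored as JSON text
    )


def save_record(url, info, **connect_kwargs):
    """Insert one record.

    connect_kwargs (host, user, password, database) are passed straight
    to mysql.connector.connect.
    """
    # Imported lazily so to_row can be used without the driver installed.
    import mysql.connector

    conn = mysql.connector.connect(**connect_kwargs)
    try:
        cursor = conn.cursor()
        # Placeholders let the driver escape values, avoiding SQL injection
        cursor.execute(INSERT_SQL, to_row(url, info))
        conn.commit()
        cursor.close()
    finally:
        conn.close()
```

A call would then look like save_record("https://example.com", info, host="localhost", user="root", password="...", database="..."), with the credentials replaced by your own.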

⚙️Requirements

Python 3.x
requests library
beautifulsoup4 library
lxml library
mysql-connector-python library
python-Wappalyzer library
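To confirm that the libraries above are installed before running the scraper, a small check like the following can help. This is an optional sketch and not part of the repository; the helper name missing_packages is hypothetical. Note that the pip package name and the import name differ for some of these libraries.

```python
import importlib.util

# pip package name -> module name used in import statements
REQUIRED = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "lxml": "lxml",
    "mysql-connector-python": "mysql.connector",
    "python-Wappalyzer": "Wappalyzer",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    missing = []
    for package, module in required.items():
        try:
            found = importlib.util.find_spec(module) is not None
        except ImportError:
            # Raised for dotted names when the parent package is absent
            found = False
        if not found:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```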

🔍Setup Instructions

Clone the repository.

git clone https://github.com/smileagain6698/scrappy_100.git

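Create and activate a virtual environment (the activation command below is for Windows; on macOS or Linux, use source venv/bin/activate instead):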

python -m venv venv
venv\Scripts\activate

Install the required Python libraries:

pip install requests beautifulsoup4 lxml mysql-connector-python python-Wappalyzer

Set up your MySQL database.
Add two extensions to VS Code: SQLTools (database management) and SQLTools MySQL/MariaDB/TiDB Driver.
Open SQLTools, select MySQL as the database type, and update the connection settings to match your database, using the settings.json file as a template. The password is your MySQL database password.
Test the connection, then save it.
Run the script in a terminal:
```
python main.py
```