This script extracts the 20 most frequently used words from a Wikipedia article. It uses regular expressions and stop word removal to clean the text, then prints the results as a table.
Usage:
python main.py your_article_name_here yes
Example Run:
python web_scraper_example/main.py Journal yes
| Word | Frequency | Frequency Percentage |
|-----------+-------------+------------------------|
| also | 3 | 7.5 |
| journal | 3 | 7.5 |
| public | 2 | 5 |
| business | 2 | 5 |
| called | 1 | 2.5 |
| used | 1 | 2.5 |
| related | 1 | 2.5 |
| daybyday | 1 | 2.5 |
| word | 1 | 2.5 |
| press | 1 | 2.5 |
| latin | 1 | 2.5 |
| diurnalis | 1 | 2.5 |
| events | 1 | 2.5 |
| term | 1 | 2.5 |
| several | 1 | 2.5 |
| whose | 1 | 2.5 |
| century | 1 | 2.5 |
| daily | 1 | 2.5 |
| use | 1 | 2.5 |
| records | 1 | 2.5 |
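The cleaning and counting steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the function name `top_words`, the tiny `STOP_WORDS` set, and the exact regular expression are all assumptions. The sample output suggests two details that the sketch mirrors: punctuation appears to be stripped rather than split on ("daybyday"), and the percentages look as if they are computed over all words left after cleaning (3 of 40 gives 7.5), not just the top 20.

```python
import re
from collections import Counter

# Hypothetical, very small stop word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "to", "it", "or"}


def top_words(text, limit=20, stop_words=STOP_WORDS):
    """Return (word, count, percentage) rows for the most frequent words."""
    # Lowercase, then drop everything except letters and whitespace,
    # so "day-by-day" becomes "daybyday".
    cleaned = re.sub(r"[^a-z\s]", "", text.lower())
    # Split on whitespace and discard stop words.
    words = [w for w in cleaned.split() if w not in stop_words]
    total = len(words)
    counts = Counter(words)
    # Percentage is relative to all remaining words, not only the top rows.
    return [
        (word, count, round(100 * count / total, 2))
        for word, count in counts.most_common(limit)
    ]


if __name__ == "__main__":
    sample = "The journal is a day-by-day record. A journal of the press."
    for word, count, pct in top_words(sample):
        print(f"| {word} | {count} | {pct} |")
```

`Counter.most_common` keeps the counting and ranking in one step. Fetching the article text and rendering the final table are left out here because the source does not say how main.py does either.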