usage: crawler.py [-h] [-s] time_limit subdomain

positional arguments:
  time_limit     crawling time limit in seconds
  subdomain      crawling subdomain (e.g. en, de, fr)

optional arguments:
  -h, --help     show this help message and exit
  -s, --summary  collect summaries instead of full articles
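
The help text above could be produced by an argparse setup along the following lines. This is a minimal sketch reconstructed from the usage message alone, not the script's actual source: the `build_parser` name, the `int` type for `time_limit`, and the `store_true` action for `--summary` are all assumptions.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction from the help text; the real crawler.py may differ.
    parser = argparse.ArgumentParser(prog="crawler.py")
    # Assumption: the time limit is parsed as a whole number of seconds.
    parser.add_argument("time_limit", type=int,
                        help="crawling time limit in seconds")
    parser.add_argument("subdomain",
                        help="crawling subdomain (e.g. en, de, fr)")
    # Assumption: -s is a boolean switch that defaults to False.
    parser.add_argument("-s", "--summary", action="store_true",
                        help="collect summaries instead of full articles")
    return parser


# Example: 60 seconds on the English subdomain, summaries only.
args = build_parser().parse_args(["60", "en", "--summary"])
print(args.time_limit, args.subdomain, args.summary)
```

On the command line, the same arguments would be passed as `python3 crawler.py -s 60 en`.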
bornabesic / wikipedia-crawler
Python 3 script for collecting articles from one of Wikipedia's language-specific subdomains