dafeng097 / wp2txt
This project is forked from yohasebe/wp2txt.
WP2TXT extracts plain text from Wikipedia dump files (XML, compressed with Bzip2), stripping all MediaWiki markup and other metadata.
License: MIT License
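
To make the description above concrete, here is a minimal Python sketch of the general idea: stream a Bzip2-compressed XML dump, pull out each page's title and text, and strip a few common MediaWiki constructs. It is not WP2TXT's own code or interface; the function names `strip_markup` and `iter_pages`, the regular expressions, and the tiny in-memory sample are all assumptions made for illustration, and real dumps contain nested templates, tables, and references that this sketch does not handle.

```python
import bz2
import io
import re
import xml.etree.ElementTree as ET


def strip_markup(wikitext):
    """Rough cleanup of a few MediaWiki constructs (hypothetical helper)."""
    # Drop simple, non-nested templates such as {{stub}}.
    text = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)
    # Turn [[target|label]] into label and [[target]] into target.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)
    # Remove bold/italic quote runs ('' and ''').
    text = re.sub(r"'{2,}", "", text)
    # Reduce "== Heading ==" lines to the bare heading text.
    text = re.sub(r"^=+\s*(.*?)\s*=+\s*$", r"\1", text, flags=re.M)
    return text.strip()


def iter_pages(source):
    """Yield (title, plain_text) pairs from a bz2-compressed XML dump.

    `source` may be a file name or a binary file object.
    """
    title = None
    with bz2.open(source, "rb") as xml_file:
        for _event, elem in ET.iterparse(xml_file, events=("end",)):
            tag = elem.tag.rsplit("}", 1)[-1]  # ignore any XML namespace
            if tag == "title":
                title = elem.text
            elif tag == "text":
                yield title, strip_markup(elem.text or "")
            elif tag == "page":
                elem.clear()  # release the finished page's children


# Tiny in-memory stand-in for a dump file (made-up sample data).
sample = (
    b"<mediawiki><page><title>Ruby</title><revision>"
    b"<text>'''Ruby''' is a [[programming language|language]]. {{stub}}</text>"
    b"</revision></page></mediawiki>"
)
pages = list(iter_pages(io.BytesIO(bz2.compress(sample))))
print(pages)
```

Streaming with `iterparse` rather than loading the whole document matters because full Wikipedia dumps are far too large to hold in memory at once.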