tugstugi / mongolian-bert
Pre-trained Mongolian BERT models
Hello,
Have you done sentence similarity with BERT for the Mongolian language?
Do you have any advice?
Is there an example GitHub repo?
Thanks
I would like to get in touch with you. Could I contact you by email or on Facebook?
Please share your email address and Facebook account.
Hello,
I cannot reproduce the data download step when starting from scratch.
$ python3 datasets/dl_and_preprop_mn_wiki.py
downloading https://dumps.wikimedia.org/mnwiki/20181220/mnwiki-20181220-pages-articles-multistream.xml.bz2...
1MB [00:00, 1081.56MB/s]
Traceback (most recent call last):
File "wikiextractor/WikiExtractor.py", line 3238, in <module>
main()
File "wikiextractor/WikiExtractor.py", line 3228, in main
args.compress, args.processes)
File "wikiextractor/WikiExtractor.py", line 2847, in process_dump
for line in input:
File "/usr/local/lib/python3.6/fileinput.py", line 250, in __next__
line = self._readline()
File "/usr/local/lib/python3.6/fileinput.py", line 364, in _readline
return self._readline()
File "/usr/local/lib/python3.6/bz2.py", line 219, in readline
return self._buffer.readline(size)
File "/usr/local/lib/python3.6/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/local/lib/python3.6/_compression.py", line 103, in read
data = self._decompressor.decompress(rawblock, size)
OSError: Invalid data stream
tmp_mn_wiki/*/*.txt
done!
My guess is that the *.bz2 file is not being downloaded. And when the download fails, does the next step still run anyway?
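One way to confirm that guess would be to check the downloaded file before the extraction step runs. The following is only a minimal sketch: `looks_like_bz2` is a hypothetical helper and is not part of `dl_and_preprop_mn_wiki.py`, and it only probes the start of the file, so it would not catch a dump that is truncated further in.

```python
import bz2


def looks_like_bz2(path, probe_bytes=1024):
    """Return True if the file at `path` starts with a readable bzip2 stream.

    Hypothetical helper: it decompresses only the first `probe_bytes` bytes,
    which is enough to reject an HTML error page or an empty download.
    """
    try:
        with bz2.open(path, "rb") as f:
            f.read(probe_bytes)
        return True
    except (OSError, EOFError):
        # OSError: the data is not a bzip2 stream ("Invalid data stream").
        # EOFError: the file is empty or ends before the stream is complete.
        return False
```

Calling this on the downloaded dump right after the download, and stopping when it returns False, would keep the later steps from running on a bad file.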