osfans / rime-tool
Open-source collection of Rime code-table schemas
License: GNU General Public License v3.0
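With a million lines of data, grouping and subtotaling in Excel is far too slow.
@osfans, when you have time, could you write a Python 3 program that groups and counts the entries? Thanks!
File encoding: GBK and UTF-8
File format: word, tab, code
Code format: Weasel (小狼毫) format, space-separated
Code characters: letters, digits, pinyin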
好 hao
好 hao
工作 gong1 zuo4
工作 gong1 zuo4
工作 gong1 zuo4
樸 piáo
樸 piáo
樸 piáo
樸 piáo
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
好 hao 2
工作 gong1 zuo4 3
樸 piáo 4
這好像 zhè hǎo xiàng 6
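The request above can be sketched in Python 3 roughly as follows. This is a minimal sketch, not the maintainer's implementation. It assumes that the first block of sample lines is the input and the last four lines are the desired output (word, tab, code, tab, count), and that counts are reported in order of first appearance. The function names `decode_bytes`, `summarize`, and `summarize_file` are hypothetical. The encoding guess (UTF-8 first, then GBK) is a heuristic, and a GBK file that happens to be valid UTF-8 would be misread.

```python
from collections import Counter


def decode_bytes(raw: bytes) -> str:
    """Try UTF-8 first (tolerating a BOM), then fall back to GBK (heuristic)."""
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("gbk")


def summarize(lines):
    """Count identical (word, code) pairs, keeping first-seen order."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only; the code itself may contain spaces.
        word, _, code = line.partition("\t")
        counts[(word, code)] += 1
    return [f"{word}\t{code}\t{n}" for (word, code), n in counts.items()]


def summarize_file(src_path, dst_path):
    """Read src_path (GBK or UTF-8) and write the counted list as UTF-8."""
    with open(src_path, "rb") as f:
        text = decode_bytes(f.read())
    with open(dst_path, "w", encoding="utf-8", newline="\n") as f:
        for row in summarize(text.splitlines()):
            f.write(row + "\n")
```

Calling `summarize_file("in.txt", "out.txt")` with placeholder file names would write one counted row per distinct entry. A `Counter` makes a single pass over the data, so a million lines should not be a problem for memory or speed on an ordinary machine.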
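@osfans, a question about the source of a word list:
In the Rime beginners' group (Rime菜鸟群), under group files / input method materials (输入法素材),
you posted the file "汉典词汇_带拼音-2015年11月.xlsx".
Searching github.com and Baidu turns up nothing for the term "汉典词汇" (Handian vocabulary).
Could you provide the address of the source (the original file)?
Thanks.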
https://github.com/chromezh/unicode_nushu
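Nüshu is a script used exclusively by women that circulated in the Xiaoshui River basin of Jiangyong County, Hunan.
Nüshu is a script created by women, used by women, and written specifically about women's lives and feelings. It records a local dialect (tuhua) and has about a thousand individual characters, whose forms sit in a slanted, elongated rhombic frame, with a graceful and flowing style. Nüshu is a monosyllabic script in which each syllable represents a group of homophonous words with different meanings. Like a wildflower, it lay hidden for centuries in the remote villages of the Dupangling mountains in Hunan, growing and fading on its own. Just as it was on the verge of extinction, it was discovered by scholars in 1982. For more than 20 years since, scholars have worked hard and made major contributions to deciphering, studying, and rescuing Nüshu.
Nüshu has been included in the Unicode character set, so I used Rime to build a Unicode Nüshu input method.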
I have implemented a new input schema, the Unicode Nvshu Input Method. You are welcome to include it in this Rime collection.
Nüshu (simplified Chinese: 女书; traditional Chinese: 女書; pinyin: Nǚshū [nỳʂú]; literally: "women's script"), is a syllabic script derived from Chinese characters that was used exclusively among women in Jiangyong County in Hunan province of southern China. Nüshu has been included in the Unicode Standard since June 2017.
Unlike the standard written Chinese, which is logographic (with each character representing a word or part of a word), Nüshu is phonetic, with each of its approximately 600-700 characters representing a syllable. This is about half the number required to represent all the syllables in Tuhua, as tonal distinctions are frequently ignored, making it "the most revolutionary and thorough simplification of Chinese characters ever attempted". Zhou Shuoyi, described as the only male to have mastered the script, compiled a dictionary listing 1,800 variant characters and allographs.
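As a small illustration of what inclusion in Unicode means for an input schema, the sketch below enumerates the Nüshu code points in Python. It is not taken from the unicode_nushu repository. It assumes, to the best of my knowledge of Unicode 10.0, that the Nüshu characters occupy U+1B170 through U+1B2FB; the constant and function names are hypothetical.

```python
# Assumed range of assigned Nüshu characters in Unicode 10.0.
NUSHU_FIRST = 0x1B170
NUSHU_LAST = 0x1B2FB


def nushu_characters():
    """Return every character in the assumed Nüshu range, in code point order."""
    return [chr(cp) for cp in range(NUSHU_FIRST, NUSHU_LAST + 1)]


def is_nushu(ch):
    """True if the single character ch falls inside the assumed Nüshu range."""
    return NUSHU_FIRST <= ord(ch) <= NUSHU_LAST
```

A dictionary for a Rime schema would pair each of these characters with a code. The codes themselves come from the schema author's data and are not reproduced here.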
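I'm here to chip in as well.
[...] C:\Python 34\DLLs\sqlite3.dll and then it works.
[...] C:\\Program Files (x86)\\Rime\\weasel-0.9.30\\data\\, trime-tool.py needs to handle this case.
[...] use_preset_vocabulary: true and max_phrase_length: 7, trime.db is then over 40 MB, which I fear will be very sluggish. If the maximum phrase length for the imported preset vocabulary (八股文) is changed to 3, it is a little over 20 MB. Possible cause: to cover a wider range of regions, each character has relatively many readings on average, so the number of automatically encoded phrases from the preset vocabulary multiplies.
I'll go on to report this to the trime project as well.
[...] dict.yaml actually needs no manual handling. But until you try it, one question lingers: "schema.yaml was given on the command line, so is dict.yaml needed or not?"
In the new repository there are usage instructions (PDF) and a schema file, to be used together with terra_pinyin.dict.yaml; it is compatible with toneless Hanyu Pinyin.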
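@osfans, could you help write a Python 3 program for filtering entries?
File encoding: GBK and UTF-8
File format: word, tab, code
Code format: Weasel (小狼毫) format, space-separated
Code characters: letters, digits, pinyin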
好 hao
好 hao
工作 gong1 zuo4
工作 gong1 zuo4
工作 gong1 zuo4
樸 piáo
樸 piáo
樸 piáo
樸 piáo
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
這好像 zhè hǎo xiàng
樸
工作
工作 gong1 zuo4
工作 gong1 zuo4
工作 gong1 zuo4
樸 piáo
樸 piáo
樸 piáo
樸 piáo
Thanks.
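The filtering request above can be sketched in Python 3 as follows. This is a minimal sketch under one reading of the sample: the first block is the input list, the two bare words (樸, 工作) are the filter list, and the remaining lines are the expected output, kept in input order. The function names `load_words` and `filter_entries` are hypothetical. The sketch works on text lines that have already been decoded and leaves the reading of GBK or UTF-8 files out.

```python
def load_words(lines):
    """Build the set of words to keep, one word per line, ignoring blanks."""
    return {line.strip() for line in lines if line.strip()}


def filter_entries(entries, wanted):
    """Keep only the entries whose word (the text before the first tab) is wanted."""
    kept = []
    for line in entries:
        line = line.rstrip("\r\n")
        word = line.split("\t", 1)[0]
        if word in wanted:
            kept.append(line)
    return kept
```

Looking words up in a set keeps the cost of each check roughly constant, so the filter stays a single pass even over very large lists.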