Comments (5)

hotoo commented on August 15, 2024

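pinyin ships its own character dictionary (1.3 MB) and phrase dictionary (1.8 MB), and it also depends on a word-segmentation library whose dictionary is about 10 MB. These are on-disk file sizes; memory use at runtime may be somewhat higher.

Originally this module only had a Web version, which covered a limited set of common characters and offered no word segmentation or phrase support.
To make lookups as fast as possible, it loaded the whole dictionary into memory, trading space for time.

It now has a Node version as well, which adds word segmentation and phrases to improve accuracy for polyphonic characters (characters with more than one reading) and provides a fairly complete pinyin character dictionary.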

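In theory, memory use should stay basically stable once the dictionaries have been loaded (please let me know if you find a memory leak).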
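One rough way to check the "loaded once, then stable" expectation is to sample the heap before loading, after loading, and across repeated lookups. The sketch below uses Node's built-in `process.memoryUsage()`; the `buildToyDict` helper and its generated entries are hypothetical stand-ins, not the real pinyin dictionaries.

```javascript
// Hypothetical stand-in for a large dictionary; NOT the real pinyin data.
function buildToyDict(size) {
  const dict = new Map();
  for (let i = 0; i < size; i++) {
    // Appending i keeps every key unique.
    dict.set(String.fromCharCode(0x4e00 + (i % 20000)) + i, "pin" + i);
  }
  return dict;
}

// Current V8 heap usage in megabytes.
function heapUsedMB() {
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

const before = heapUsedMB();
const dict = buildToyDict(200000);
const afterLoad = heapUsedMB();
console.log(`before load: ${before.toFixed(1)} MB`);
console.log(`after load:  ${afterLoad.toFixed(1)} MB`);

// Repeated lookups allocate only short-lived strings, so the heap
// should hover around the post-load figure instead of climbing.
const samples = [];
for (let round = 0; round < 5; round++) {
  for (let i = 0; i < 100000; i++) {
    dict.get("x" + i);
  }
  samples.push(heapUsedMB());
  console.log(`round ${round}: ${samples[round].toFixed(1)} MB`);
}
```

The numbers fluctuate with garbage collection, so a steady upward trend over many rounds is the signal to look for, not any single reading.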

So far I have not found a really effective way to balance performance against memory use. If you have a good idea, please share it :)

from pinyin.

wanming commented on August 15, 2024

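How can I use the Web version in Node.js? I am using the Node.js cluster module, and sharing memory between workers is a bit of a hassle. I changed `if(isNode)` to `if(false)` in index.js, but that produced an error. BTW, what are the differences between the Web version and the Node version? I only need polyphonic-character handling, not tone marks, and everything I convert is fairly common characters or words.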

hotoo commented on August 15, 2024

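If polyphonic characters have to be resolved through word segmentation, it is hard to slim things down. Otherwise, I could consider providing a lightweight Node version that only does single-character pinyin conversion.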

wanming commented on August 15, 2024

OK, I will keep using it as it is for now. Thanks for your help.

TooBug commented on August 15, 2024

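I also need a lightweight version and do not need phrase handling. Could you provide an option that controls whether phrases are used, so that the phrase dictionary is only loaded when the option is enabled? I tried optimizing this myself: with phrases removed, getting the pinyin for "你好" on my machine (2013 MacBook Pro, 8 GB RAM) drops from 1100+ ms to about 150 ms, and memory use should also drop by more than half.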
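A minimal sketch of what such an opt-in option could look like, assuming the phrase data can be loaded lazily on first use. Everything here is hypothetical: `toPinyin`, the `usePhrases` option, `loadPhraseDict`, and the tiny inline dictionaries are illustration only and are not the pinyin package's actual API or data.

```javascript
// Toy single-character dictionary; the real one is far larger.
const CHAR_DICT = new Map([
  ["你", "nǐ"],
  ["好", "hǎo"],
  ["重", "zhòng"],
  ["庆", "qìng"],
]);

let phraseDict = null;
let phraseDictLoads = 0; // counts how many times the expensive load ran

// Hypothetical loader: in a real module this step might read a large file.
function loadPhraseDict() {
  if (phraseDict === null) {
    phraseDictLoads++;
    phraseDict = new Map([["重庆", ["chóng", "qìng"]]]);
  }
  return phraseDict;
}

// Hypothetical API: phrases are consulted only when usePhrases is true,
// so callers that never enable it never pay for the phrase dictionary.
function toPinyin(text, options = {}) {
  const usePhrases = options.usePhrases === true;
  const chars = Array.from(text);
  const result = [];
  let i = 0;
  while (i < chars.length) {
    if (usePhrases && i + 1 < chars.length) {
      // Naive two-character match, a stand-in for real word segmentation.
      const pair = chars[i] + chars[i + 1];
      const phrases = loadPhraseDict();
      if (phrases.has(pair)) {
        result.push(...phrases.get(pair));
        i += 2;
        continue;
      }
    }
    // Fall back to the per-character reading, or keep unknown characters.
    result.push(CHAR_DICT.get(chars[i]) || chars[i]);
    i++;
  }
  return result;
}

console.log(toPinyin("你好"));
console.log(toPinyin("重庆"));
console.log(toPinyin("重庆", { usePhrases: true }));
```

With this toy data, the per-character path reads 重庆 as `zhòng qìng`, while the phrase path gives `chóng qìng`. That is the accuracy being traded away for the lower startup cost and memory use.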

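P.S. The reason I looked into this is that I built a contact-lookup feature on WeChat. The server is very low-spec, and phrase lookup takes so long that requests often exceed WeChat's 5 s response timeout.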
