
jpinyin's Introduction

jpinyin - An open-source Java library for converting Chinese characters to pinyin

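JPinyin is an open-source Java library for converting Chinese characters to pinyin. It builds on the functionality of PinYin4j with several improvements.

Main features of JPinyin

1. Accurate, comprehensive character dictionary

Of the 20,903 Chinese characters in the Unicode range 4E00-9FA5 plus 3007 (〇), JPinyin converts all except 46 variant characters, which have no standard pinyin.

2. Fast conversion

In testing, JPinyin converts the 20,902 Chinese characters in the Unicode range 4E00-9FA5 in about 100 ms.

3. Multiple pinyin output formats

JPinyin supports several output formats: with tone marks, without tones, with tone numbers, and first-letter abbreviations.

4. Recognition of common polyphonic characters

JPinyin recognizes common polyphonic characters, including those that appear in phrases, idioms, and place names.

5. Conversion between Simplified and Traditional Chinese

6. Support for user-defined dictionaries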

Maven

    <dependency>
       <groupId>com.github.stuxuhai</groupId>
       <artifactId>jpinyin</artifactId>
       <version>1.1.7</version>
    </dependency>

Usage

    String str = "你好世界";
    PinyinHelper.convertToPinyinString(str, ",", PinyinFormat.WITH_TONE_MARK); // nǐ,hǎo,shì,jiè
    PinyinHelper.convertToPinyinString(str, ",", PinyinFormat.WITH_TONE_NUMBER); // ni3,hao3,shi4,jie4
    PinyinHelper.convertToPinyinString(str, ",", PinyinFormat.WITHOUT_TONE); // ni,hao,shi,jie
    PinyinHelper.getShortPinyin(str); // nhsj
    PinyinHelper.addPinyinDict("user.dict");  // add a user-defined dictionary

Thanks

@snakezzl

@yuanboliu

@xamous

@lkfs

@latifrons

@baimoon

@chorar

@ShaomingLi


jpinyin's Issues

‘好好学习’ is converted incorrectly to pinyin without tones

Converting 好好学习 to pinyin without tones should give "hao,hao,xue,xi", but the actual result is "hao,xue,xi".
The source code contains the method below, and its deduplication step is what causes the problem:

    /**
     * Converts pinyin with tone marks to pinyin without tones.
     *
     * @param pinyinArrayString
     *            pinyin with tone marks
     * @return pinyin without tones
     */
    private static String[] convertWithoutTone(String pinyinArrayString) {
        String[] pinyinArray;
        for (int i = ALL_MARKED_VOWEL.length() - 1; i >= 0; i--) {
            char originalChar = ALL_MARKED_VOWEL.charAt(i);
            char replaceChar = ALL_UNMARKED_VOWEL.charAt((i - i % 4) / 4);
            pinyinArrayString = pinyinArrayString.replace(String.valueOf(originalChar), String.valueOf(replaceChar));
        }
        // Replace ü with v in the pinyin
        pinyinArray = pinyinArrayString.replace("ü", "v").split(PINYIN_SEPARATOR);

        // ***** Stripping tones can leave duplicate pinyin, so deduplicate *****
        LinkedHashSet<String> pinyinSet = new LinkedHashSet<String>();
        for (String pinyin : pinyinArray) {
            pinyinSet.add(pinyin);
        }

        return pinyinSet.toArray(new String[pinyinSet.size()]);
    }
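The effect described in this report can be reproduced without the library. The sketch below is only an illustration: `ToneStripDemo`, `splitWithDedup`, `splitKeepingRepeats`, and `SEPARATOR` are hypothetical names, and it assumes, as the report does, that the deduplicating step sees the pinyin of the whole phrase. If the library only calls the method on the readings of a single character, the real cause may lie elsewhere.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;

class ToneStripDemo {
    // Hypothetical stand-in for the library's PINYIN_SEPARATOR constant.
    static final String SEPARATOR = ",";

    // Mirrors the tail of the quoted method: a LinkedHashSet keeps
    // insertion order but silently drops repeated syllables.
    public static String[] splitWithDedup(String pinyin) {
        LinkedHashSet<String> set = new LinkedHashSet<>(Arrays.asList(pinyin.split(SEPARATOR)));
        return set.toArray(new String[0]);
    }

    // Alternative that keeps every syllable, repeats included.
    public static String[] splitKeepingRepeats(String pinyin) {
        return pinyin.split(SEPARATOR);
    }

    public static void main(String[] args) {
        String toneless = "hao,hao,xue,xi";
        System.out.println(String.join(",", splitWithDedup(toneless)));      // hao,xue,xi
        System.out.println(String.join(",", splitKeepingRepeats(toneless))); // hao,hao,xue,xi
    }
}
```

Deduplication is reasonable for the alternative readings of one character, where stripping tones can turn two readings into the same string, but it loses information when applied to a sequence of characters.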

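Custom dictionary fails to load after a Spring Boot project is packaged

Environment

  • Spring Boot 2.6.6
  • Java 1.8

Cause

Once the project is packaged as a jar,
files under resources live inside the jar file and have no real path on disk, so the custom dictionary cannot be read.

Reproduction

Custom dictionary location: \src\main\resources\drug_mutil_pinyin.dic
The dictionary is loaded as follows: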

    @Test
    void testAddMutilPinyinDict() throws IOException {
        ClassPathResource resource = new ClassPathResource("drug_mutil_pinyin.dic");
        PinyinHelper.addMutilPinyinDict(resource.getFile().getPath());
    }

Result in development: no exception
Result after packaging:

class path resource [drug_mutil_pinyin.dic] cannot be resolved to absolute file path because it does not reside in the file system: jar:file:/C:/dev/workspeace/java/demo/target/demo-0.0.1-SNAPSHOT.jar!/BOOT-INF/classes!/drug_mutil_pinyin.dic
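One possible workaround, sketched here under assumptions rather than taken from the project, is to read the dictionary as a stream and copy it to a temporary file, then pass that file's path to `PinyinHelper.addMutilPinyinDict`, which the reproduction above calls with a path string. `DictExtractor`, `copyToTempFile`, and `extractClasspathResource` are hypothetical helper names, and the sketch uses only the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class DictExtractor {
    // Copies a stream to a temporary file so that path-based APIs can read it.
    public static Path copyToTempFile(InputStream in, String suffix) throws IOException {
        Path temp = Files.createTempFile("jpinyin-dict-", suffix);
        temp.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
        Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        return temp;
    }

    // Resolves a classpath resource by name; this also works when the
    // resource is packed inside a jar, because no file path is needed.
    public static Path extractClasspathResource(String name) throws IOException {
        try (InputStream in = DictExtractor.class.getClassLoader().getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + name);
            }
            return copyToTempFile(in, ".dic");
        }
    }

    // Assumed usage with the call from the report:
    // Path dict = DictExtractor.extractClasspathResource("drug_mutil_pinyin.dic");
    // PinyinHelper.addMutilPinyinDict(dict.toString());
}
```

In a Spring application, `resource.getInputStream()` on the `ClassPathResource` could be passed to `copyToTempFile` instead, since the stream remains readable inside a jar while `getFile()` does not.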
