libchewing's Introduction

	           _                   _
	       ___| |__   _____      _(_)_ __   __ _
	      / __| '_ \ / _ \ \ /\ / / | '_ \ / _` |
	     | (__| | | |  __/\ V  V /| | | | | (_| |
	      \___|_| |_|\___| \_/\_/ |_|_| |_|\__, |
	                                       |___/
	               https://chewing.im/

libchewing - The intelligent phonetic input method library

Chewing (酷音) is an intelligent phonetic input method (Zhuyin/Bopomofo) and one of the most popular choices for Traditional Chinese users. Chewing was inspired by proprietary intelligent Zhuyin input methods on Microsoft Windows, namely Wang-Xin by Eten, Microsoft New Zhuyin, and Nature Zhuyin (aka Going). The Chewing developer maintains the project as a fully open-source effort, positioning it as a leading libre intelligent phonetic solution among major operating environments.

libchewing releases can be verified with the following minisign public key:

RWRzJFnXiLZleAyCIv1talBjyRewelcy9gzYQq9pd3SKSFBPoy57sf5s

Status

1. System bridge integration

Chewing has been integrated into various input frameworks in Unix-like systems and even in Microsoft Windows and Android. On these systems, the Chewing package is typically divided into two parts: libchewing, which manages the actual character selection logic, and an input framework interface for display and preference settings.

  • Active integrations:
  • Inactive integrations: SCIM, standalone Microsoft Windows 32/64-bit (windows-chewing), mozc, uim, ucimf, JMCCE, xcin, IIIMF, standalone Mac OS X (SpaceChewing), Sun's Java Desktop System Input Method Framework, OpenVanilla Input Method Framework (versions prior to 1.0), and OXIM.

2. Supported phonetic keyboard layouts

  • DaChen (default)
  • Hsu
  • IBM
  • Gin-Yieh
  • Eten
  • Eten 26 keys
  • Dvorak
  • Dvorak Hsu
  • HanYu PinYin
  • Taiwan Huayu Luomapinyin
  • MPS2 Pinyin
  • Colemak-DH ANSI
  • Colemak-DH Ortholinear

3. External and unmerged projects

libchewing provides a straightforward API and design, enabling third-party projects to deploy innovative features. Here are some examples:

Minimal Build Tools Requirement

The following tools are used to build libchewing. Not every tool is needed for every build. For example, if you compile with clang, gcc and Visual Studio are not needed. The versions listed here are the minimum versions known to build libchewing. If a tool you use is older than the listed version, libchewing might not build.

  • Build tools:
    • cmake >= 3.21.0
  • Toolchain / IDE:
    • clang >= 3.2 OR gcc >= 4.6.3
    • Rust >= 1.70
    • Build Tools for Visual Studio 2022 for MSVC build
  • Documentation tools:
    • texinfo >= 4.8

Build and Installation

Use the default preset:

cmake --preset default --install-prefix /usr
cmake --build build
cmake --build build -t test
cmake --build build -t install

Build the Rust implementation:

cmake --preset rust-release --install-prefix /usr
cmake --build build
cmake --build build -t test
cmake --build build -t install

Check other supported presets:

cmake --list-presets

Cross-build

Define a cmake-toolchains file to cross-compile.

Example cross-build instructions:

cmake --preset default --toolchain arm-none-linux-gnueabi.cmake
cmake --build build

Build on Windows with MinGW

To build libchewing on Windows, you need to set up MinGW and MSYS on your system. The MinGW and MSYS installer is available at the following link:

https://sourceforge.net/projects/mingw/files/Installer/mingw-get-inst/

In "Select Components" during installing, please select the following items:

  • MinGW Compiler Suite -> C Compiler
  • MSYS Basic System

After installation, run [MinGW directory]\msys\1.0\msys.bat (default is C:\MinGW\msys\1.0\msys.bat) to enter the MSYS shell.

Build on Windows with Build Tools for Visual Studio 2022

To build libchewing on Windows and link it to other programs built with MSVC, you need to use the MSVC toolchain. To install the build environment:

Open an administrator command prompt (cmd.exe):

winget install Microsoft.VisualStudio.2022.BuildTools
winget install Ninja-build.Ninja
winget install Kitware.CMake
winget install Rustlang.Rustup

Optional development tools:

winget install Git.Git
winget install VSCodium.VSCodium

Reboot, then open Visual Studio Installer and install C/C++ components.

Open the x64 Native Tools Command Prompt for VS 2022:

rustup default stable
cmake -G Ninja --preset rust

Now you have the build environment for libchewing. You can follow the installation steps to build with cmake.

Build on macOS

To build libchewing on macOS, you will need the tools listed in the requirements. Since macOS does not ship with these tools, building them from source can be a tricky task.

A simple way to install these tools is through Homebrew, a package manager for macOS. Once Homebrew is installed, run the following commands to install the tools you need:

brew install cmake
brew install rustup
rustup default stable

Minimum Supported Rust Version

To ensure libchewing can be built on various Linux distributions, we use the minimum Rust version available from the next release branch of major distributions. Data source: https://repology.org/project/rust/versions

  • Current MSRV: 1.70.0 (Debian unstable)

Usage

Chewing enables users to input Chinese by its pronunciation, using either Bopomofo/Zhuyin or Hanyu pinyin. It also supports Chinese punctuation marks, as well as both normal and full-shape numbers and the English alphabet.

The following sections are based on the assumption that you are using the default configuration. This includes the default/DaChen Bopomofo keyboard layout on an en_US keyboard, along with the default key-binding.

Glossary

Preedit Buffer: This is the area where your typing is stored before being sent to the applications (such as Firefox) you are using.

Mode: This determines how Chewing responds to keyboard input.

Editing mode

This mode facilitates the typing of normal Chinese characters and punctuation and is typically the default working mode.

In this mode, alphanumeric characters and punctuation marks are interpreted as Bopomofo symbols or punctuation marks. When these symbols form Chinese characters, the system chooses the most appropriate character based on the context in the preedit buffer.

Entering complete Chinese sentences is advantageous as it allows the system to perform auto-correction. To confirm the output, pressing Enter will commit the characters in the preedit buffer.

In case of errors, characters can be selected by moving the cursor with {Left} or {Right}, followed by pressing {Down} to enter Candidate Selection mode for word choice.

Auto-correction for a specific phrase can be overridden by pressing {Tab} at the end of the sentence.

Memorization of 2, 3, or 4-word phrases is possible by pressing {Ctrl-2}, {Ctrl-3}, or {Ctrl-4} at the phrase's end.

The behavior of the Shift key changes in this mode. Using Shift with an alphanumeric key outputs corresponding full-shape Chinese symbols if "Easy Symbol Input" is enabled, or outputs corresponding half-shape lowercase English alphabets if "Easy Symbol Input" is disabled.

For inputting Chinese symbols, aside from enabling "Easy Symbol Input" mode, pressing {Ctrl-1} or {`} opens a symbol selection dialog. After selecting the category, the {Down} key can be used to choose symbols as one would for characters.

Key binding   API name                   Functionality
-----------   --------                   -------------
Caps Lock     chewing_handle_Capslock    Toggle Temporary English sub-mode
Down          chewing_handle_Down        Enter Candidate Selection mode
Shift-Space   chewing_handle_ShiftSpace  Toggle Half/Full Shape sub-mode
Enter         chewing_handle_Enter       Commit the content in preedit buffer
                                         to active application window
Tab           chewing_handle_Tab         Break the auto-correction.
Ctrl-1        chewing_handle_CtrlNum     Open symbol selection dialog
Ctrl-2        chewing_handle_CtrlNum     Remember 2-word phrase.
Ctrl-3        chewing_handle_CtrlNum     Remember 3-word phrase.
Ctrl-4        chewing_handle_CtrlNum     Remember 4-word phrase.

Half/Full Shape sub-mode

This sub-mode is for inputting half-shape and full-shape characters. Half-shape characters are essentially normal English characters, while full-shape (full-width) characters are variants of the same letters, digits, and symbols that occupy the full width of a Chinese character (for example, Ａ instead of A).

Key binding   API name                   Functionality
-----------   --------                   -------------
Shift-Space   chewing_handle_ShiftSpace  Toggle Half/Full Shape sub-mode

Temporary English sub-mode

This sub-mode is for temporarily entering English text.

Key binding   API name                   Functionality
-----------   --------                   -------------
Caps Lock     chewing_handle_Capslock    Toggle Temporary English sub-mode

Candidate Selection mode

This mode is for choosing the candidate. It first displays the longest phrases that match the pronunciation, followed by progressively shorter phrases, down to single characters. Pressing {Down} cycles back to the longest phrases.

For example, after entering "w91o3g4" and pressing {Down}, Chewing displays the 3-word candidate "台北市". Pressing {Down} again shows the 2-word candidate "北市". Another press of {Down} brings up 1-word candidates "市" and "是". Pressing {Down} once more cycles back to the 3-word candidate "台北市".
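The cycling described above can be pictured with a small sketch. It is not libchewing's implementation: `next_candidate_length` is a hypothetical helper, and it assumes that every length from the longest match down to a single character has at least one candidate.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of libchewing: given the phrase length
 * currently shown and the longest matching length, return the length
 * shown after the next press of Down.
 */
int next_candidate_length(int current_len, int longest_len)
{
    assert(longest_len >= 1);
    assert(current_len >= 1 && current_len <= longest_len);

    /* Step down to shorter phrases; after single characters, wrap around. */
    return current_len > 1 ? current_len - 1 : longest_len;
}
```

For the "台北市" example the longest length is 3, so repeated calls walk through 3, 2, 1 and then return to 3.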

Key binding   API name                   Functionality
-----------   --------                   -------------
Down          chewing_handle_Down        Next bunch of candidates in
                                         different length
Left          chewing_handle_Left        Previous page of candidates
Right         chewing_handle_Right       Next page of candidates
1, 2, ...0    chewing_handle_Default     Select 1st, 2nd, ... 10th candidate

Bypass mode

This mode is active whenever the preedit buffer is empty. It enables the use of movement keys (such as cursor keys and page up/page down) and popular key bindings (such as Ctrl-A, Ctrl-S).

For a brief overview of using the libchewing APIs, please refer to the simplified example in the file contrib/simple-select.c.

History

Libchewing is derived from the original Chewing input method, a module of XCIN that focuses on intelligent phonetic (Bopomofo/Zhuyin) processing and was initially intended for use with the X Window System. This input method module was developed by Lu-chuan Kung (lckung) and Kang-pen Chen (kpchen), and was sponsored by Tsan-sheng Hsu from Academia Sinica between 1999 and 2001.

However, the original authors eventually ceased the development of Chewing, and its strong coupling with XCIN limited its application in broader contexts. Additionally, there was a similar input method, bimsphone, which was included in the XCIN server. Like Chewing, bimsphone also lacked a convenient API for further development. In 2002, Jim Huang, along with others, formed the Chewing core team and extended the work of Kung and Chen. The Chewing core team renamed the project "New Chewing" to differentiate their work from the original. Nevertheless, the English name has remained "Chewing," which is recognized by various input method frameworks as well.

License

Except for the following source code, all other source code is licensed under the GNU LGPL v2.1 (Lesser General Public License v2.1), or (at your option) any later version. See "COPYING" for details:

  • The directory "thirdparty/sqlite-amalgamation" contains the SQLite3 source, which is in the public domain. For more information, see https://www.sqlite.org/copyright.html.
  • The file "cmake/FindCurses.cmake" is modified from the CMake source and is licensed under the BSD 3-Clause license.

Authors & Contact Information

See "AUTHORS" for details.

libchewing's People

Contributors

aitjcize, billy4195, blue119, bors[bot], buganini, chocobo1, czchen, definite, dimotsai, elleryq, fourdollars, hialan, hiroshiyui, hiunnhue, ianchen-tw, jserv, kanru, kcwu, kidwm, kito-cheng, lantw44, mlouielu, pcman, peterdavehello, school510587, shaform, shengyenpeng, stanleyding, vicamo, yan12125

libchewing's Issues

Too many ways to input a symbol

Currently we can use the following methods to input a symbol:

  • Symbol mode: See test/test-symbol.c.
  • Easy symbol mode: See test/test-easy-symbol.c.
  • Fullshape mode: See test/test-fullshape.c.
  • Special symbol: See test/test-special.c.
  • Change preedit symbol by select: The example keystroke is u <CB>a<D>1<E> and u <CB>1<D>1<E>.

I think we need to reduce the number of ways to input a symbol. It is too complicated.

data/Makefile.am touches "timestamp" while clean removes gendata_stamp

Index: libchewing/data/Makefile.in
===================================================================
--- libchewing.orig/data/Makefile.in    2013-02-16 11:42:56.028902101 +0800
+++ libchewing/data/Makefile.in 2013-02-16 11:42:56.024902101 +0800
@@ -496,7 +496,7 @@
        echo "chewing-definition.h exists."; \
    fi  
    $(MAKE) gendata && \
-   touch "timestamp" > $@
+   touch "gendata_stamp" > $@

 gendata:
    $(tooldir)/sort_word$(EXEEXT) $(top_srcdir)/data/phone.cin
Index: libchewing/data/Makefile.am
===================================================================
--- libchewing.orig/data/Makefile.am    2013-02-16 11:05:23.228839264 +0800
+++ libchewing/data/Makefile.am 2013-02-16 11:43:22.044902827 +0800
@@ -43,7 +43,7 @@
        echo "chewing-definition.h exists."; \
    fi  
    $(MAKE) gendata && \
-   touch "timestamp" > $@
+   touch "gendata_stamp" > $@

 gendata:
    $(tooldir)/sort_word$(EXEEXT) $(top_srcdir)/data/phone.cin

Provide APIs to set system path and hash path

Currently the path can only be set through an environment variable, which should only be used to set the default path. Users will want to set the path after the library has been loaded. We will need APIs to:

  1. set system path
  2. set user path
  3. reload dict data

Enclose all state transition in a function

Currently each function handles state transitions on its own, which is error-prone. The idea is to define two functions, getState() and setState(), to handle state transitions such as English Mode -> Chinese Mode.
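A minimal sketch of what such a pair of functions could look like follows. The mode names and the table of permitted transitions are assumptions made for illustration; the issue does not list libchewing's real states or rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mode set; the real states are not listed in the issue. */
typedef enum {
    MODE_CHINESE,
    MODE_ENGLISH,
    MODE_CANDIDATE_SELECTION,
    MODE_COUNT
} InputMode;

typedef struct {
    InputMode mode;
} InputState;

/* allowed[from][to] is true when the transition is permitted (assumed rules). */
static const bool allowed[MODE_COUNT][MODE_COUNT] = {
    [MODE_CHINESE] = { [MODE_ENGLISH] = true, [MODE_CANDIDATE_SELECTION] = true },
    [MODE_ENGLISH] = { [MODE_CHINESE] = true },
    [MODE_CANDIDATE_SELECTION] = { [MODE_CHINESE] = true },
};

InputMode getState(const InputState *st)
{
    assert(st != NULL);
    return st->mode;
}

/* Returns 0 and updates the mode on a permitted transition, -1 otherwise. */
int setState(InputState *st, InputMode next)
{
    assert(st != NULL);
    if ((int)next < 0 || (int)next >= MODE_COUNT)
        return -1;
    if (next == st->mode)
        return 0; /* staying in the same mode is a no-op */
    if (!allowed[st->mode][next])
        return -1;
    st->mode = next;
    return 0;
}
```

Keeping the rules in one table means a forbidden transition is rejected in a single place instead of being re-checked by every handler.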

src/hash.c:249:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

src/hash.c:249:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
src/hash.c:250:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
src/hash.c:251:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
src/hash.c:252:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

We should fix this and convert the binary format to a platform-independent format.
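Warnings of this kind usually arise when a byte buffer is read through a pointer cast to a wider integer type; the report does not show the offending lines, so that is an assumption. One common way to avoid both the aliasing problem and the dependence on host byte order is to encode and decode integers byte by byte. The helper names and the choice of little-endian below are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from four bytes, without pointer casts. */
uint32_t read_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value as four little-endian bytes. */
void write_u32_le(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v & 0xffu);
    p[1] = (unsigned char)((v >> 8) & 0xffu);
    p[2] = (unsigned char)((v >> 16) & 0xffu);
    p[3] = (unsigned char)((v >> 24) & 0xffu);
}
```

Because the functions only touch `unsigned char` objects, they give the same file contents on any host, whatever its alignment rules or byte order.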

Create mid-level API for internal use and external use

Currently we call high-level APIs like chewing_handle_CtrlNum from other high-level APIs. This means that changing the implementation of one high-level API also affects the other high-level APIs that use it.

We should create mid-level APIs that operate on the internal state and dictionary data. These APIs will also be useful to external users.

Example: select_candidate, add_phrase (note that these APIs differ from the low-level APIs, which operate on data directly. The mid-level APIs operate on buffers and candidates.)

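Chewing does not seem to be able to output the enumeration comma (、)

I have been using Chewing for quite a while, and one thing has always been inconvenient: I cannot find a way to type the Chinese enumeration comma (頓號, 、). As a stopgap I ended up editing

/usr/share/libchewing3/chewing/swkb.dat

and changing the mapping of the letter l to 、.

Is there a better way?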

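libchewing-0.3.3 drops characters under ibus-chewing 1.3.10

My operating system is openSUSE 12.1, and the input method is ibus-chewing 1.3.10 with ibus-1.4.0 and libchewing 0.3.3.
If the maximum length of the preedit buffer is not an integer multiple of the length of the phrase being typed, the first one or two characters are dropped when the buffer overflows and is committed automatically.
For example, if the buffer holds 4 Chinese characters and you keep typing "輸入法輸入法輸入法輸入法..." without pressing Enter, the final output becomes "法法法法法...".
I downloaded, compiled, and installed libchewing 0.3.2, and the problem went away.
I also verified that when ibus-chewing auto-commits and calls chewing_commit_String() in libchewing, it really does return only "法" instead of "輸入法".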

Add tsi.src sanity check in sort_dic

To prevent issues like 50101fe, we should add a sanity check for tsi.src. The idea is to collect all possible phones for each word, then compare the phones in each phrase against them to ensure that a phrase does not contain a phone that does not belong to the corresponding word.
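The check can be sketched as a lookup of each (character, phone) pair of a phrase in a table built from the single-word entries. The `WordPhone` type, the function names, and the linear search are illustrative assumptions, not the layout sort_dic actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One single-word entry: a character and one pronunciation it may take. */
typedef struct {
    const char *word;  /* one UTF-8 character */
    const char *phone; /* one Bopomofo syllable */
} WordPhone;

/* Return 1 if (word, phone) appears in the single-word table, 0 otherwise. */
int phone_allowed(const WordPhone *table, size_t n,
                  const char *word, const char *phone)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].word, word) == 0 &&
            strcmp(table[i].phone, phone) == 0)
            return 1;
    }
    return 0;
}

/*
 * Check a phrase given as parallel arrays of characters and phones.
 * Returns -1 if every pair is allowed, otherwise the index of the first
 * character whose phone is not known for that character.
 */
int check_phrase(const WordPhone *table, size_t n,
                 const char *const *words, const char *const *phones,
                 size_t len)
{
    assert(table != NULL && words != NULL && phones != NULL);
    for (size_t i = 0; i < len; i++) {
        if (!phone_allowed(table, n, words[i], phones[i]))
            return (int)i;
    }
    return -1;
}
```

A real tool would probably index the table by character rather than scanning it, but the pass/fail rule stays the same.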

Fix automake warning message

Automake version 1:1.13.3-1.1ubuntu2 shows the following warning messages.

src/Makefile.am:2: warning: 'INCLUDES' is the old name for 'AM_CPPFLAGS' (or '*_CPPFLAGS')
src/common/Makefile.am:1: warning: 'INCLUDES' is the old name for 'AM_CPPFLAGS' (or '*_CPPFLAGS')
src/porting_layer/src/Makefile.am:7: warning: 'INCLUDES' is the old name for 'AM_CPPFLAGS' (or '*_CPPFLAGS')
src/tools/Makefile.am:1: warning: 'INCLUDES' is the old name for 'AM_CPPFLAGS' (or '*_CPPFLAGS')
test/Makefile.am:66: warning: 'INCLUDES' is the old name for 'AM_CPPFLAGS' (or '*_CPPFLAGS')

Memory leaks detected in valgrind-check

Steps to reproduce:

cd test
rm uhash.dat
make valgrind-check

==2524== 20 bytes in 3 blocks are definitely lost in loss record 1 of 2
==2524== at 0x4C2380C: calloc (vg_replace_malloc.c:467)
==2524== by 0x402C5BD: AlcUserPhraseSeq (hash.c:30)
==2524== by 0x402D3C6: HashInsert (hash.c:101)
==2524== by 0x402F125: UserUpdatePhrase (userphrase.c:157)
==2524== by 0x402AD21: AutoLearnPhrase (chewingutil.c:675)
==2524== by 0x4028B59: chewing_handle_Enter (chewingio.c:688)
==2524== by 0x401421: main (testchewing.c:214)
==2524==
==2524== 24 bytes in 3 blocks are definitely lost in loss record 2 of 2
==2524== at 0x4C2380C: calloc (vg_replace_malloc.c:467)
==2524== by 0x402C5CD: AlcUserPhraseSeq (hash.c:31)
==2524== by 0x402D3C6: HashInsert (hash.c:101)
==2524== by 0x402F125: UserUpdatePhrase (userphrase.c:157)
==2524== by 0x402AD21: AutoLearnPhrase (chewingutil.c:675)
==2524== by 0x4028B59: chewing_handle_Enter (chewingio.c:688)
==2524== by 0x401421: main (testchewing.c:214)

src/tools/sort.c generates data with incorrect word sequence on FreeBSD

For example, the correct sequence is "佛坲仏".

仏佛坲 (FreeBSD)

load into word_data[]

0x4e50570 佛
0x4e50578 坲
0x4e50580 仏

pass to compare_word_by_phone()

0x4e2b228 仏
0x4e2b230 佛
0x4e2b238 坲

佛坲仏 (Linux)

load into word_data[]

0x4e4ed60 佛
0x4e4ed68 坲
0x4e4ed70 仏

pass to compare_word_by_phone()

0x4e4ed60 佛
0x4e4ed68 坲
0x4e4ed70 仏
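One reading of the dumps above, which the report itself does not state, is that the FreeBSD sort hands the comparator copies of the elements at different addresses, so any tie-break that relies on element addresses no longer reflects the input order. qsort() is also not guaranteed to be stable. A portable approach is to store an explicit sequence number in each record and use it to break ties. The `WordRecord` type and its fields are hypothetical and may not match src/tools/sort.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: the real structure in sort.c may differ. */
typedef struct {
    unsigned phone;   /* phonetic key used for sorting */
    int seq;          /* position in the input file, filled in while loading */
    const char *word; /* UTF-8 text of the word */
} WordRecord;

/* Order by phone; records with equal phones keep their input order via seq. */
int compare_word_stable(const void *a, const void *b)
{
    const WordRecord *x = a;
    const WordRecord *y = b;

    if (x->phone != y->phone)
        return x->phone < y->phone ? -1 : 1;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

void sort_words(WordRecord *words, size_t n)
{
    assert(words != NULL || n == 0);
    if (n > 1)
        qsort(words, n, sizeof(words[0]), compare_word_stable);
}
```

With the sequence number as the final key, no two records compare equal, so every conforming qsort() produces the same order.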

Buffer overflow when the 50th character is the easy symbol `Orz'

Currently, the length of chiSymbolBuf is 50, and the maximum chiSymbolBufLen is 49. This means that when the 50th character is entered, libchewing first stores it in chiSymbolBuf[49] and later auto-commits it because the maximum chiSymbolBufLen is exceeded. However, the easy symbol for L contains three characters, Orz. In the worst case, libchewing stores Orz in chiSymbolBuf[49] through chiSymbolBuf[51], which overflows the buffer.

This unit test is used to test this issue. Please turn it on when the issue is fixed.
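A defensive pattern for the overflow described in this report is to check that a whole multi-character symbol fits before writing any of it. The sketch below uses a hypothetical `SymbolBuf` with one `char` per slot; libchewing's real buffer element type and commit logic are not shown in the issue.

```c
#include <assert.h>
#include <stddef.h>

#define SYMBOL_BUF_CAPACITY 50 /* mirrors the buffer length quoted in the report */

/* Hypothetical buffer of symbol slots; one slot per character. */
typedef struct {
    char slots[SYMBOL_BUF_CAPACITY];
    size_t len;
} SymbolBuf;

/*
 * Append count characters only if all of them fit.
 * Returns 0 on success, -1 if the buffer would overflow (nothing is written).
 */
int symbol_buf_append(SymbolBuf *buf, const char *chars, size_t count)
{
    assert(buf != NULL && chars != NULL);
    assert(buf->len <= SYMBOL_BUF_CAPACITY);

    if (count > SYMBOL_BUF_CAPACITY - buf->len)
        return -1;
    for (size_t i = 0; i < count; i++)
        buf->slots[buf->len + i] = chars[i];
    buf->len += count;
    return 0;
}
```

A caller that receives -1 could commit the existing contents first and then retry the append.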

Provide public API to manipulate/query system and user dict.

Copied from TODO.

The following APIs are proposed for user dictionary manipulation.

  • int chewing_userphrase_enumerate(ChewingContext *ctx)
    • Start enumerating user phrases.
    • Return 0 on success, -1 otherwise.
  • int chewing_userphrase_has_next(ChewingContext *ctx, size_t *phrase_len, size_t *bopomofo_len)
    • Return 1 if there is a next user phrase, 0 otherwise.
    • If the return value is 1, phrase_len and bopomofo_len hold the buffer lengths, including the terminating null, needed for phrase_buf and bopomofo_buf.
  • int chewing_userphrase_get(ChewingContext *ctx, char *phrase_buf, size_t phrase_len, char *bopomofo_buf, size_t bopomofo_len)
    • Get the current user phrase. The length of phrase_buf shall be at least phrase_len, and the length of bopomofo_buf shall be at least bopomofo_len.
    • bopomofo_buf is UTF-8 Bopomofo in initial, medial, final, tone (聲母, 介母, 韻母, 聲調) order. It is separated by spaces.
    • Return 0 on success, -1 otherwise.
  • int chewing_userphrase_add(ChewingContext *ctx, char *phrase_buf, char *bopomofo_buf)
    • Add a new user phrase.
    • Return 0 on success, -1 otherwise.
  • int chewing_userphrase_remove(ChewingContext *ctx, char *phrase_buf, char *bopomofo_buf)
    • Remove a user phrase.
    • Return 0 on success, -1 otherwise.
  • int chewing_userphrase_lookup(ChewingContext *ctx, char *phrase_buf, char *bopomofo_buf)
    • Return 1 if the user phrase is present, 0 otherwise.

Cannot build libchewing.info in FreeBSD

The following error message appears when building libchewing.info on FreeBSD.

[  2%] Generating doc/libchewing.info
/home/czchen/src/chewing/libchewing/doc/libchewing.texi:575: Unknown command `leq'.
/home/czchen/src/chewing/libchewing/doc/libchewing.texi:575: Misplaced {.
/home/czchen/src/chewing/libchewing/doc/libchewing.texi:575: Misplaced }.
/home/czchen/src/chewing/libchewing/doc/libchewing.texi:576: Unknown command `leq'.
/home/czchen/src/chewing/libchewing/doc/libchewing.texi:576: Misplaced {.
/home/czchen/src/chewing/libchewing/doc/libchewing.texi:576: Misplaced }.
makeinfo: Removing output file `/home/czchen/src/chewing/cmake-   libchewing/doc/libchewing.info' due to errors; use --force to preserve.
*** [doc/libchewing.info] Error code 1

Set up proper default configuration values

The default value of candPerPage is 0, which causes a divide-by-zero exception. For example, the following code will crash:

ChewingContext *ctx = chewing_new();
chewing_set_selKey( ctx, SELECT_KEY, ARRAY_SIZE( SELECT_KEY ) );
chewing_set_maxChiSymbolLen( ctx, 16 );
chewing_handle_Default( ctx, '`' );

The chewing_set_candPerPage API does not check its input either, so the following code will also crash:

chewing_set_candPerPage( ctx, 0 );
chewing_handle_Default( ctx, '`' );

We should set a proper default value for every configuration option and reject any nonsensical configuration passed in through an API call.
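A minimal sketch of this idea keeps the value in a small structure that is initialised to a usable default and only changed through a checking setter. The `Config` type, the function names, and the accepted range of 1 to 10 (chosen to match the ten selection keys listed earlier) are assumptions, not libchewing's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limits; the issue does not say which range should be accepted. */
#define MIN_CAND_PER_PAGE 1
#define MAX_CAND_PER_PAGE 10
#define DEFAULT_CAND_PER_PAGE 10

typedef struct {
    int cand_per_page;
} Config;

/* Start from a usable default instead of zero. */
void config_init(Config *cfg)
{
    assert(cfg != NULL);
    cfg->cand_per_page = DEFAULT_CAND_PER_PAGE;
}

/* Returns 0 on success, -1 if n is out of range (the old value is kept). */
int config_set_cand_per_page(Config *cfg, int n)
{
    assert(cfg != NULL);
    if (n < MIN_CAND_PER_PAGE || n > MAX_CAND_PER_PAGE)
        return -1;
    cfg->cand_per_page = n;
    return 0;
}

/* Number of pages needed for total candidates; the divisor is never zero. */
int config_page_count(const Config *cfg, int total)
{
    assert(cfg != NULL && total >= 0);
    return (total + cfg->cand_per_page - 1) / cfg->cand_per_page;
}
```

Because the invariant `cand_per_page >= 1` holds after `config_init` and is preserved by the setter, the division in `config_page_count` cannot fault.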

Define SUCCESS and ERROR return type

Currently our API and internal functions are not very consistent about return values. Should a non-zero result mean success or error? We should define SUCCESS and ERROR symbols instead of returning raw integers.
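One possible shape for such symbols is a small enumeration shared by all functions. The names `Result`, `RESULT_SUCCESS`, and `RESULT_ERROR`, the values 0 and -1, and the example function are all hypothetical; the issue only asks that the two symbols exist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names and values; the issue only asks for SUCCESS and ERROR. */
typedef enum {
    RESULT_SUCCESS = 0,
    RESULT_ERROR = -1
} Result;

/*
 * Example of a function using the shared convention: map a selection key
 * to a zero-based candidate index. Keys 1..9 map to 0..8 and key 0 maps
 * to 9, following the "1, 2, ...0" order of the selection keys.
 */
Result digit_key_to_index(char key, int *index_out)
{
    assert(index_out != NULL);
    if (key < '0' || key > '9')
        return RESULT_ERROR;
    *index_out = (key == '0') ? 9 : key - '1';
    return RESULT_SUCCESS;
}
```

Callers then compare against the named symbol rather than guessing whether zero means success.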

Add API to select candidate from all choices

This is a mid-level API that could be useful. Currently we can only select a candidate from the current page, even though we can enumerate all candidates. To fill the gap we should have this API.

Crash in genkeystroke

Steps to reproduce:

  1. cd test; ./genkeystroke t.txt
  2. Input hk4g4<H> 3 1<B>

Expected result:

Actual result:

crash

The backtrace shows that we used an invalid address in show_interval_buffer (gen_keystroke.c:140), which might indicate that we got a wrong interval.

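Cannot output characters after pressing Backspace

Environment:
freebsd 10-current
xfce-4.10
ibus-1.4.1 (ibus-chewing-1.4.2)
gcin-2.7.8

I have applied the THL patch, but it should be unrelated, because the same thing happens when testing in Zhuyin mode.

At first I can select candidates and output characters, but once I press Backspace to delete backwards, characters stop appearing, and after that nothing can be typed until I restart. This happens in both ibus and gcin. For example, for "測" I press ㄘ -> ㄜ -> ˋ in order, and it simply drops out right after the fourth tone is pressed.

From my testing, this starts happening after 2ca7235; before that there is no problem. I have not tested other input method frameworks, sorry.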

Merge phone.cin to tsi.src

sort_dic uses the "single word phrase" entries in tsi.src to check whether there is any problem in tsi.src. However, the current tsi.src lacks many "single word phrase" entries, so sort_dic generates many error messages when it runs. We need to merge all words in phone.cin into tsi.src to stop these error messages.

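Usage frequency and ordering of symbols

左右括號 (left/right brackets) = 〔〕【】《》(){}﹙﹚『』﹛﹜﹝﹞<>≦≧﹤﹥「」
Taking this as an example, 「」 should come before 『』. Why is the order reversed?

In addition, I personally think
≦≧ could be moved to the end, with another copy placed under the math symbols.

Likewise, for
上下括號 (top/bottom brackets) = ︵︶︷︸︹︺︻︼︽︾〈〉︿﹀∩∪﹁﹂﹃﹄
∩∪ should also appear under the math symbols.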

Possible memory leak in hash.c

Steps to reproduce:

cd test && make vcheck

Result:

==21867== HEAP SUMMARY:
==21867== in use at exit: 132 bytes in 6 blocks
==21867== total heap usage: 495 allocs, 489 frees, 112,503 bytes allocated
==21867==
==21867== 16 bytes in 2 blocks are still reachable in loss record 1 of 3
==21867== at 0x4C280A4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21867== by 0x402F351: ReadHashItem_bin (hash.c:256)
==21867== by 0x402FB45: InitHash (hash.c:564)
==21867== by 0x40292B8: chewing_Init (chewingio.c:165)
==21867== by 0x40146F: main (testchewing.c:174)
==21867==
==21867== 20 bytes in 2 blocks are still reachable in loss record 2 of 3
==21867== at 0x4C280A4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21867== by 0x402F3A7: ReadHashItem_bin (hash.c:266)
==21867== by 0x402FB45: InitHash (hash.c:564)
==21867== by 0x40292B8: chewing_Init (chewingio.c:165)
==21867== by 0x40146F: main (testchewing.c:174)
==21867==
==21867== 96 bytes in 2 blocks are still reachable in loss record 3 of 3
==21867== at 0x4C280A4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21867== by 0x402FAC6: InitHash (hash.c:575)
==21867== by 0x40292B8: chewing_Init (chewingio.c:165)
==21867== by 0x40146F: main (testchewing.c:174)
==21867==
==21867== LEAK SUMMARY:
==21867== definitely lost: 0 bytes in 0 blocks
==21867== indirectly lost: 0 bytes in 0 blocks
==21867== possibly lost: 0 bytes in 0 blocks
==21867== still reachable: 132 bytes in 6 blocks
==21867== suppressed: 0 bytes in 0 blocks
==21867==
==21867== For counts of detected and suppressed errors, rerun with: -v
==21867== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

Use -version-number instead of -version-info in libtool

Currently we have two build systems (autotools and cmake), and they use different versioning schemes (current/revision/age vs. major/minor/revision). This difference causes confusion when updating the version for API changes.

Since libtool provides -version-number, which uses the same versioning scheme as cmake, we can use it to replace -version-info so that autotools and cmake share the same scheme.

Is the Python binding working?

cd libchewing-0.3.3/python
python test.py

On my amd64 laptop, it crashes with a segfault.
On my i386 server, it outputs nothing.

Both run FreeBSD.

Is it just not working for me, or is it broken?

Support platform independent binary data

Copied from TODO.

  • Support data versioning
  • Explicit data struct size
  • Remove text data support
  • Explicit data endianness
  • All data structs shall be aligned to avoid misalignment problems on several platforms (e.g. ARM, SPARC, ...)

Updating `1' in the preedit buffer inserts a new symbol instead

See the code: when the preedit buffer contains 1 and the user presses the Down key to select a candidate, libchewing calls HaninSymbolInput() to handle it. However, the state transition of isSymbol when calling HaninSymbolInput() is SYMBOL_CATEGORY_CHOICE > SYMBOL_CHOICE_INSERT, so the new symbol is inserted before 1 instead of replacing it.
