harfbuzz / harfbuzz
HarfBuzz text shaping engine
Home Page: http://harfbuzz.github.io/
License: Other
Needed for example:
https://github.com/googlei18n/noto-fonts/issues/473
cc @jfkthame
See:
https://bugzilla.gnome.org/show_bug.cgi?id=706898
This will change our Python API, but we knew that. Any volunteers?
Right now we have "1.0" for everything. Put the true 0.9.x versions in which they were introduced instead...
Hi. Could you please tell me why HarfBuzz always reorders U+0E3A (Thai Phinthu) to the lowest position in the stack?
I can't find any particular reason why it should do that for the Thai national language. Moreover, it is not the desired behavior for some minority languages that use the Thai script.
In the screenshot above, I typed Thai consonants and U+0E3A before other low vowels, but HarfBuzz still displays it at the lowest position (the first two are actual words from Melayu-Pattani; the last one is just for testing mkmk). For comparison, FontForge renders it as in the screenshot below (the font with the mkmk feature can be found here: https://github.com/BoonUni/boonjot).
From ICU 54 Milestone 1 announcement:
"LayoutEngine - deprecated in favor of HarfBuzz"
http://site.icu-project.org/download/54m1
http://bugs.icu-project.org/trac/ticket/10530
I guess congratulations of sort are in order :)
A.
*** Building harfbuzz *** [1/38]
make -j 3
make all-recursive
make[1]: Entering directory '/home/aveyond/jhbuild/checkout/harfbuzz'
Making all in src
make[2]: Entering directory '/home/aveyond/jhbuild/checkout/harfbuzz/src'
make[3]: Entering directory '/home/aveyond/jhbuild/checkout/harfbuzz/src/hb-ucdn'
make[3]: Leaving directory '/home/aveyond/jhbuild/checkout/harfbuzz/src/hb-ucdn'
make all-recursive
make[3]: Entering directory '/home/aveyond/jhbuild/checkout/harfbuzz/src'
make[4]: Entering directory '/home/aveyond/jhbuild/checkout/harfbuzz/src'
GEN libharfbuzz.la
GEN harfbuzz-icu.pc
GEN harfbuzz.pc
/usr/bin/ld: .libs/libharfbuzz_la-hb-blob.o: relocation R_X86_64_32 against `hb_blob_destroy' can not be used when making a shared object; recompile with -fPIC
.libs/libharfbuzz_la-hb-blob.o: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
Makefile:1097: recipe for target 'libharfbuzz.la' failed
make[4]: *** [libharfbuzz.la] Error 1
make[4]: Leaving directory '/home/aveyond/jhbuild/checkout/harfbuzz/src'
Makefile:1726: recipe for target 'all-recursive' failed
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory '/home/aveyond/jhbuild/checkout/harfbuzz/src'
Makefile:1018: recipe for target 'all' failed
make[2]: *** [all] Error 2
make[2]: Leaving directory '/home/aveyond/jhbuild/checkout/harfbuzz/src'
Makefile:478: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/aveyond/jhbuild/checkout/harfbuzz'
Makefile:409: recipe for target 'all' failed
make: *** [all] Error 2
*** Error during phase build of harfbuzz: ########## Error running make -j 3 *** [1/38]
Fedora 21 Beta
When I run ./autogen.sh, I get:
/autogen.sh
-n checking for ragel...
/usr/local/bin/ragel
-n checking for pkg-config...
/usr/local/bin/pkg-config
-n checking for libtoolize...
*** No libtoolize (libtool) found, please install it ***
autogen.sh seems to just run which libtoolize. However, because Mac OS X already includes an old version of libtool, the Homebrew environment installs "libtoolize" as "glibtoolize". I wonder if ./autogen.sh could be made aware of that and check for "glibtoolize" first.
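The lookup order suggested above could be sketched like this. It is only an illustration in Python rather than a patch to autogen.sh (which is a shell script); find_libtoolize and its parameters are hypothetical names.

```python
import shutil


def find_libtoolize(candidates=("glibtoolize", "libtoolize"), which=shutil.which):
    """Return the path of the first candidate found on PATH, or None.

    "glibtoolize" is tried first because Homebrew installs GNU libtoolize
    under that name; the `which` parameter exists only to make testing easy.
    """
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

If nothing is found, the caller would print the same "No libtoolize (libtool) found" message as today.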
From mailing list discussion:
Yes, separate x-prefixes for script and lang would be good! (That'd allow for 3-letter script tags).
Sent from my mobile phone.
On 07.05.2015, at 04:28, Martin Hosken wrote:
Dear Behdad,
I can extend the BCP 47 extension to also choose the script system if it's
available. Eg, a language setting of "x-hbotdeva" will choose "deva" whereas
"x-hbotdev2" will choose "dev2". This works for script tags that have four
letters (ie, not 'lao ', 'yi ', 'nko ', and 'vai '). We would only recognize the three-letter ones as language system tags.
This will be useful for choosing 'math' script as well.
+1
Or you could use x-hbscdev2 (as in script) to separate the namespaces.
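To make the proposal concrete, here is a rough sketch of how such private-use subtags might be parsed. This is not HarfBuzz code: ot_tag_from_language is a hypothetical helper, the "hbsc" prefix is only the suggestion from this thread, and upper-casing language system tags is an assumption based on the usual OpenType convention.

```python
def ot_tag_from_language(language):
    """Return (kind, tag) for an x-hbot/x-hbsc private-use subtag, else None.

    kind is "script" or "langsys"; tag is padded with spaces to 4 characters.
    """
    subtags = language.lower().split("-")
    if "x" not in subtags:
        return None
    for subtag in subtags[subtags.index("x") + 1:]:
        prefix, rest = subtag[:4], subtag[4:]
        if prefix == "hbsc" and 2 <= len(rest) <= 4:
            # Separate namespace: always a script tag, so 'lao ' etc. work.
            return ("script", rest.ljust(4))
        if prefix == "hbot" and len(rest) == 4:
            # Shared namespace: four letters are read as a script tag.
            return ("script", rest)
        if prefix == "hbot" and 2 <= len(rest) <= 3:
            # Shorter tags are only recognized as language system tags.
            return ("langsys", rest.upper().ljust(4))
    return None
```

With the shared "hbot" prefix, "x-hbotlao" can only mean a language system; with "hbsc", "x-hbsclao" selects the script 'lao '.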
According to OpenType specification for its script tags:
http://www.microsoft.com/typography/otspec/scripttags.htm
Both Hiragana and Katakana are mapped to the same OpenType script tag, 'kana'. As of OpenType 1.6, this is the only case where multiple scripts map to a single OpenType script tag.
This should really be done on the caller side, since collating scripts earlier helps split the text into fewer runs and allows ligatures across these scripts, but doing it in HB would also be nice.
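A caller-side version of that collation might look like the sketch below. The mapping table is deliberately tiny and merge_runs is a hypothetical helper; only the Hira/Kana to 'kana' mapping comes from the issue text.

```python
# ISO 15924 script code -> OpenType script tag (small illustrative subset).
OT_SCRIPT_TAG = {"Hira": "kana", "Kana": "kana", "Latn": "latn", "Hani": "hani"}


def merge_runs(runs):
    """Merge neighbouring (script, text) runs that share an OpenType tag.

    Returns a list of (ot_tag, text) pairs, so Hiragana followed by
    Katakana becomes a single 'kana' run.
    """
    merged = []
    for script, text in runs:
        tag = OT_SCRIPT_TAG[script]
        if merged and merged[-1][0] == tag:
            merged[-1] = (tag, merged[-1][1] + text)
        else:
            merged.append((tag, text))
    return merged
```

Shaping the merged run in one go is what would let lookups see glyphs from both scripts.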
pango-view has a --width option which automatically breaks lines that are longer than a specified width. Are there any plans to add something similar to hb-view?
Something along these lines:
It's quite straightforward. We should...
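For the line-breaking request above, a greedy wrap over per-word advances is one way it could work. This is only a sketch under assumptions: wrap_words and its parameters are hypothetical, and the advances would come from shaping each word first (for example by summing the glyph advances).

```python
def wrap_words(words, advances, space_advance, max_width):
    """Greedily pack words into lines no wider than max_width.

    words and advances are parallel lists; a word wider than max_width
    still gets a line of its own rather than being dropped.
    """
    lines, current, width = [], [], 0
    for word, advance in zip(words, advances):
        extra = advance if not current else space_advance + advance
        if current and width + extra > max_width:
            lines.append(current)
            current, width = [word], advance
        else:
            current.append(word)
            width += extra
    if current:
        lines.append(current)
    return lines
```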
According to Roozbeh these scripts have a cursive connection and as such can't take letter-spacing:
Arabic, Syriac, N'Ko, Manichaean, Psalter Pahlavi, Mandaic, Mongolian, Phags-pa,
Devanagari, Bengali, Gurmukhi, Modi, Sharada, Syloti Nagri, Tirhuta,
Ogham
Not sure how useful it is to enumerate them somehow in HarfBuzz. @jfkthame WDYT?
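If the enumeration lived on the caller side instead, it could be as small as the following sketch. The script names are the ones from Roozbeh's list; allows_letter_spacing is a hypothetical helper, not an existing HarfBuzz function.

```python
# Scripts listed above as having a cursive connection.
CURSIVE_SCRIPTS = frozenset({
    "Arabic", "Syriac", "N'Ko", "Manichaean", "Psalter Pahlavi", "Mandaic",
    "Mongolian", "Phags-pa", "Devanagari", "Bengali", "Gurmukhi", "Modi",
    "Sharada", "Syloti Nagri", "Tirhuta", "Ogham",
})


def allows_letter_spacing(script_name):
    """Return False for scripts where letter-spacing would break joining."""
    return script_name not in CURSIVE_SCRIPTS
```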
hi:
Thanks for HarfBuzz. Now I want to display at least two languages in one string, such as "abc**ป็" (three languages: English, Chinese, Thai).
How can I achieve this? Can you give an example? Thank you very much.
BR
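One common approach is to split the string into runs of a single script and shape each run separately with a suitable font. The sketch below shows only the splitting step, under loud assumptions: the code point ranges are a coarse stand-in for the real Unicode Script property, itemize is a hypothetical helper, and the sample string in the usage note is made up.

```python
# Coarse (first, last, ISO 15924 script) ranges; illustrative only.
RANGES = [
    (0x0041, 0x005A, "Latn"),
    (0x0061, 0x007A, "Latn"),
    (0x0E00, 0x0E7F, "Thai"),
    (0x4E00, 0x9FFF, "Hani"),
]


def script_of(ch):
    """Return the script of ch, or None for anything not in RANGES."""
    cp = ord(ch)
    for first, last, script in RANGES:
        if first <= cp <= last:
            return script
    return None


def itemize(text):
    """Split text into (script, substring) runs.

    Characters with no known script (spaces, digits, punctuation) are
    attached to the neighbouring run instead of starting a new one.
    """
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and (script is None or runs[-1][0] in (None, script)):
            prev_script, prev_text = runs[-1]
            runs[-1] = (prev_script or script, prev_text + ch)
        else:
            runs.append((script, ch))
    return runs
```

For example, itemize("abc中文ป็") yields three runs (Latn, Hani, Thai); each would then go into its own buffer with the matching script and language set before shaping.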
Behdad,
could you add an "ot-dumb" shaper that would only run the font-defined GSUB+GPOS features, but would not perform any script-specific shaping?
That shaper would not perform any "intelligent" shaping, would not split the shaping process into stages (pre-shaping, shaping, post-shaping), would not apply any features by default (i.e. would treat all features as discretionary and off), and would apply all the GSUB then GPOS features at once, in font-defined features order.
Such a shaper would be most useful for testing the features during font development and debugging.
Currently it sees it as integer.
After #90 I can get most of the *_from_string() functions to work from Python, except language_from_string(), which causes a “double free or corruption” crash. Running under valgrind shows the following:
==27049== Invalid free() / delete / delete[] / realloc()
==27049== at 0x4C2B200: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27049== by 0x770363A: g_boxed_free (in /usr/lib/libgobject-2.0.so.0.4200.2)
==27049== by 0x7099C5F: boxed_del (pygi-boxed.c:45)
==27049== by 0x4E808C2: PyObject_Call (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F11BB6: PyEval_CallObjectWithKeywords (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ED6C2B: slot_tp_del (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ED1C89: subtype_dealloc (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ECFED9: tupledealloc (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F15C30: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F18AEF: PyEval_EvalCodeEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F16FE3: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F170E9: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
==27049== Address 0x9f18ce0 is 0 bytes inside a block of size 3 free'd
==27049== at 0x4C2B200: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27049== by 0x770363A: g_boxed_free (in /usr/lib/libgobject-2.0.so.0.4200.2)
==27049== by 0x7099C5F: boxed_del (pygi-boxed.c:45)
==27049== by 0x4E808C2: PyObject_Call (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F11BB6: PyEval_CallObjectWithKeywords (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ED6C2B: slot_tp_del (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ED1C89: subtype_dealloc (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ECFED9: tupledealloc (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F15C30: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F18AEF: PyEval_EvalCodeEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F16FE3: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F170E9: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
and
==27049== Invalid free() / delete / delete[] / realloc()
==27049== at 0x4C2B200: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27049== by 0xA22AADB: finish (hb-common.cc:229)
==27049== by 0xA22AADB: free_langs() (hb-common.cc:243)
==27049== by 0x5455DB1: __run_exit_handlers (in /usr/lib/libc-2.21.so)
==27049== by 0x5455E04: exit (in /usr/lib/libc-2.21.so)
==27049== by 0x5440806: (below main) (in /usr/lib/libc-2.21.so)
==27049== Address 0x9f1e7b0 is 0 bytes inside a block of size 3 free'd
==27049== at 0x4C2B200: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27049== by 0x770363A: g_boxed_free (in /usr/lib/libgobject-2.0.so.0.4200.2)
==27049== by 0x7099C5F: boxed_del (pygi-boxed.c:45)
==27049== by 0x4E808C2: PyObject_Call (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F11BB6: PyEval_CallObjectWithKeywords (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ED6C2B: slot_tp_del (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ED1C89: subtype_dealloc (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4ECFED9: tupledealloc (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F15C30: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F18AEF: PyEval_EvalCodeEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F16FE3: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
==27049== by 0x4F170E9: PyEval_EvalFrameEx (in /usr/lib/libpython2.7.so.1.0)
I cannot really tell what is going on, but it looks like either a mismatch between the way the string is allocated and the use of free() here, or a double free. Maybe there is a magic introspection annotation to fix this, but I don’t know what it is (I tried (transfer full), but it made no difference).
After compiling today’s (2015-02-09) repo and trying to run src/sample.py with a sample font and text, I’m consistently getting:
Traceback (most recent call last):
File "sample.py", line 42, in <module>
infos = hb.buffer_get_glyph_infos (buf)
File "/usr/local/lib/python2.7/site-packages/gi/module.py", line 313, in __getattr__
return getattr(self._introspection_module, name)
File "/usr/local/lib/python2.7/site-packages/gi/module.py", line 134, in __getattr__
self.__name__, name))
AttributeError: 'gi.repository.HarfBuzz' object has no attribute 'buffer_get_glyph_infos'
This is the error I get when invoking make:
git.mk: Generating .gitignore
make all-recursive
make[1]: Entering directory `/home/mimiko/src/harfbuzz'
Making all in src
make[2]: Entering directory `/home/mimiko/src/harfbuzz/src'
../missing --run ragel -e -F1 -o "hb-buffer-deserialize-json.hh.tmp" "hb-buffer-deserialize-json.rl" && \
mv "hb-buffer-deserialize-json.hh.tmp" "hb-buffer-deserialize-json.hh" || ( rm -f "hb-buffer-deserialize-json.hh.tmp" && false )
../missing --run ragel -e -F1 -o "hb-buffer-deserialize-text.hh.tmp" "hb-buffer-deserialize-text.rl" && \
mv "hb-buffer-deserialize-text.hh.tmp" "hb-buffer-deserialize-text.hh" || ( rm -f "hb-buffer-deserialize-text.hh.tmp" && false )
../missing --run ragel -e -F1 -o "hb-ot-shape-complex-indic-machine.hh.tmp" "hb-ot-shape-complex-indic-machine.rl" && \
mv "hb-ot-shape-complex-indic-machine.hh.tmp" "hb-ot-shape-complex-indic-machine.hh" || ( rm -f "hb-ot-shape-complex-indic-machine.hh.tmp" && false )
hb-ot-shape-complex-indic-machine.ri:122:81: parse error
make[2]: *** [hb-ot-shape-complex-indic-machine.hh] Error 1
make[2]: Leaving directory `/home/mimiko/src/harfbuzz/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/mimiko/src/harfbuzz'
make: *** [all] Error 2
I tried master and 0.9.24.
0.9.25 complains about Automake 1.13, which is not available.
I'm on Debian Wheezy x86_64, updated and upgraded.
Hi there!
hb-view is really great to render beautiful text as svg for usage on the web.
It is also possible to use a text file for multiline input.
Now I was wondering if it is possible to align the text to the right or to center it with hb-view.
Any help is highly appreciated.
Thanks
Andreas
When we tried to use up-to-date harfbuzz-ng with pango 1.36.3, we hit a crash. The detailed stack traces are reported at crbug.com/503858 and in a comment on a patch review.
As far as I looked at the stack traces, this one seems to be offending:
0x00007ffff67c92b0 in hb_shape (font=0x32719220ab0, buffer=0x327191e4620,
features=0x7fffffffcc80, num_features=0)
at ../../third_party/harfbuzz-ng/src/hb-shape.cc:386
So Pango calls hb_shape with an invalid features pointer (probably only if num_features == 0), and the current harfbuzz-ng does not like this.
Is it possible not to read features if num_features == 0?
Various fonts define three-letter ccmp ligatures for modifier tone letters. But they don't seem to be working in HarfBuzz. It seems that a ligature is formed by the first two glyphs, and then the third never forms.
For example, test the sequence 02E8, 02E9, 02E7 in Arial or Noto Sans. I get two glyphs, while I should be getting one glyph.
For example with Songti.ttc
./hb-shape Songti.ttc He --direction t --font-funcs ft
Sounds like a FreeType bug to me, but not sure.
Just creating this as a tracking bug after discussion in https://code.google.com/p/chromium/issues/detail?id=470884#c24
CC @kojiishi
Unicode has a double diacritic+CGJ mechanism for rendering a diacritic over or under a double diacritic. For example, the sequence u͡͏́i (u + combining double inverted breve + CGJ + combining acute + i) should result in the acute accent rendered above the double breve.
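The role of CGJ here can be checked with Python's unicodedata module. The sketch below only inspects the character sequence (it does not touch HarfBuzz or any font): CGJ has combining class 0, so it stops normalization from moving the acute (class 230) in front of the double breve (class 234).

```python
import unicodedata

# u + COMBINING DOUBLE INVERTED BREVE + CGJ + COMBINING ACUTE ACCENT + i
with_cgj = "u\u0361\u034F\u0301i"
without_cgj = "u\u0361\u0301i"


def combining_classes(text):
    """Return the canonical combining class of every character in text."""
    return [unicodedata.combining(ch) for ch in text]


print(combining_classes(with_cgj))  # [0, 234, 0, 230, 0]

# Without CGJ, canonical reordering puts the acute before the double breve.
print(unicodedata.normalize("NFD", without_cgj) == "u\u0301\u0361i")  # True

# With CGJ in between, the original order survives normalization.
print(unicodedata.normalize("NFD", with_cgj) == with_cgj)  # True
```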
As this doesn't seem to work in HarfBuzz with any of the Windows fonts, Roboto, Noto, or DejaVu, I'm assuming it's a HarfBuzz bug.
Here's the text of the Unicode standard, from Unicode 7.0 (http://www.unicode.org/versions/Unicode7.0.0/ch07.pdf), page 325 and 326:
Dear developers,
I'm afraid to ramble but I have a scabrous question I'd like to submit...
It seems that HarfBuzz substitutes letters followed by a combining diacritic with their corresponding base form. Is it a good idea to lose the first alternative? There is a case where the second is worse: when the letter has contextual alternates (and "Ignore combining marks" is set) but its accented form does not. The contextual alternates are then lost, as in the case of aring below.
Regards,
Philippe
New Tai Lue just changed model from logical encoding (like most Indic) to visual encoding (like Thai, Lao, and Tai Viet).
Almost all data in New Tai Lue is already using the visual model (they basically ignored the Unicode recommendation).
The change is going to officially happen in Unicode 8.0, but I think we should make the change in HarfBuzz immediately, because almost all the data in the language is in visual order (see http://www.unicode.org/L2/L2014/14195-newtailue.txt).
Hi,
I would like the help of the HarfBuzz team to integrate HarfBuzz into the Unity3D engine so that we can render Indic fonts. Presently the Unity3D engine does not render Persian, Arabic and Indic fonts. I am not sure how to use HarfBuzz in Unity3D. Can anyone guide/help me in implementing HarfBuzz in Unity3D?
Regards,
Abu Saad Papa
Dear developers,
On this page, http://pecita.eu, the embedded font is not used as it should be.
The embedded font is used if it is replaced with an older version (for example, the version located at http://pecita.eu/b/).
The font is also used if it is locally installed.
Regards,
Philippe Cochy
I am facing some issues with glyph choice when using buffer direction TTB. (That's on HarfBuzz master)
./hb-view /usr/share/fonts/truetype/wqy/wqy-microhei.ttc --features=-vert,-vrt2 --font-size 20 --direction ttb "【2009 年 11 月 4 日美國加訊】,「不僅徹生態,遊戲。」" | display
./hb-view /usr/share/fonts/truetype/simsun/simsun.ttc --font-size 20 --direction ttb "【2009 年 11 月 4 日美國加訊】,「不僅徹生態,遊戲。」" | display
Where SimSun is available on Windows, WenQuanYi Micro Hei for example on Ubuntu.
I've experimented with --features=+vert,+vrt2 and --features=-vert,-vrt2, but that didn't lead to any changes. Am I missing something?
According to Roozbeh, those are:
Arabic, Syriac, N'Ko, Manichaean, Psalter Pahlavi, Mandaic, Mongolian, Phags-pa,
Devanagari, Bengali, Gurmukhi, Modi, Sharada, Syloti Nagri, Tirhuta,
Ogham
For example, letterspacing should be disabled for these scripts.
There are cases where the input script/lang does not match the script/langsys under which the font provides 'vert'.
It'd be great if HB has either:
Longer description follows.
Scripts/langsys do not work well for multi-script languages such as CJK. Historically, font and engine vendors worked around the problem in different ways, so engines need to be tolerant when looking up 'vert' for CJK text, sometimes to support broken fonts and sometimes because of those differing workarounds.
As a result, most implementations display correctly even when fonts have wrong or broken scripts/langsys. Safari and IE render such pages/fonts correctly. Blink has its own workaround today, but it would be ideal if HB did this.
Would be rather handy.
clang complains:
..\..\third_party\harfbuzz-ng\src\hb-ot-shape-complex-use.cc(242,1) : warning(clang): unused function 'set_use_properties' [-Werror,-Wunused-function]
set_use_properties (hb_glyph_info_t &info)
^
Hi,
the prebuilt binaries on http://harfbuzz.org for windows do not contain their introspection typelib, making python bindings unusable on win32. A companion bug has been filed against PyGObject, perhaps coordination would be prudent.
Dear developers,
Want to evaluate my suggestion?
The mechanism of contextual alternates should not be affected by the presence of a diacritical.
If "x"+"y" is substituted with "(x.alt)"+"(y.alt)",
then "x"+"U+0300"+"y" should be substituted with "(x.alt)"+"U+0300"+"(y.alt)".
Test case at http://pecita.net/Aghja/accents.xhtml
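The requested behaviour can be illustrated with a toy substitution that skips combining marks when matching context. This is a sketch of the idea only, not of how HarfBuzz or any font implements calt; the rule table and the function names are hypothetical.

```python
import unicodedata

# Hypothetical contextual rule: "x" followed by "y" -> alternates for both.
ALTERNATES = {("x", "y"): ("x.alt", "y.alt")}


def is_mark(glyph):
    """Treat single characters with a non-zero combining class as marks."""
    return len(glyph) == 1 and unicodedata.combining(glyph) != 0


def apply_calt(glyphs):
    """Apply ALTERNATES to neighbouring base glyphs, leaving marks in place."""
    out = list(glyphs)
    bases = [i for i, glyph in enumerate(out) if not is_mark(glyph)]
    k = 0
    while k + 1 < len(bases):
        first, second = bases[k], bases[k + 1]
        replacement = ALTERNATES.get((out[first], out[second]))
        if replacement:
            out[first], out[second] = replacement
            k += 2
        else:
            k += 1
    return out
```

Here apply_calt(["x", "\u0300", "y"]) gives ["x.alt", "\u0300", "y.alt"], which is the outcome asked for above.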
Regards,
Philippe Cochy
(sorry, issue went to this repo instead of fonttools, fixed now)
While improving Lohit Bengali under the lohit2 project, we decided to remove the half forms, which can easily be achieved by positioning the base consonant and the virama/halant mark.
TTF available @ https://pravins.fedorapeople.org/Lohit-Bengali.ttf
$ hb-shape /usr/share/fonts/lohit-bengali/Lohit-Bengali.ttf হ্ণি
[habeng=0+403|viramabeng=0@-90,-21+0|ivowelbeng=2+220|nnabeng=2+464]
Here I think ivowelbeng should be reordered to the initial position.
Harfbuzz version: 0.9.34
Hello.
It is not very important but I am curious to understand ...
In this minimal font, the Latin dz digraph is transformed into a d followed by a z by a ccmp lookup.
When an accent is applied to the digraph, the accent moves to the first letter, ignoring (for GPOS mark positioning) the separation done by the ccmp lookup.
http://pecita.eu/ccmpDigraph/ccmpDigraph.tar.bz2
Introduce ways to disable functionalities of the "ot" shaping backend which are not canonical to OpenType Layout, including:
etc.
Behdad already stated his intent to do this, and this is the issue to track it.
$ HB_SHAPER_LIST='ot' ./hb-shape "/System/Library/Fonts/LucidaGrande.ttc" "סְשְ"
[shevahebrew=2@64,0+0|shinhebrew=2+1467|shevahebrew=0@-32,0+0|samekhhebrew=0+1361]
$ HB_SHAPER_LIST='coretext' ./hb-shape "/System/Library/Fonts/LucidaGrande.ttc" "סְשְ"
[shevawidehebrew=3+0|shinhebrew=2+1467|shevahebrew=1+0|samekhhebrew=0+1361]
If I interpret the utility output correctly, the first digit after the = is the cluster index in both cases.
I would expect the cluster index information from CoreText to be identical to the one returned when using opentype shaping.
This breaks LayoutTests/fast/text/international/hebrew-selection.html in Blink when moving from CoreText to HarfBuzz.
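To compare the two outputs mechanically, the default hb-shape text output can be parsed with a small helper. The sketch assumes each item has the form name=cluster, optionally followed by @x_offset,y_offset and +x_advance (with an optional ,y_advance); parse_hb_shape is a hypothetical name, not part of the HarfBuzz utilities.

```python
import re

GLYPH_RE = re.compile(
    r"(?P<name>[^=|]+)=(?P<cluster>\d+)"
    r"(?:@(?P<x_offset>-?\d+),(?P<y_offset>-?\d+))?"
    r"(?:\+(?P<x_advance>-?\d+)(?:,(?P<y_advance>-?\d+))?)?"
)


def parse_hb_shape(line):
    """Parse one line of hb-shape output into a list of glyph dicts.

    Missing offsets and advances are reported as 0.
    """
    glyphs = []
    for item in line.strip().strip("[]").split("|"):
        match = GLYPH_RE.fullmatch(item)
        if match is None:
            raise ValueError(f"unrecognized glyph item: {item!r}")
        glyphs.append({
            key: value if key == "name" else int(value or 0)
            for key, value in match.groupdict().items()
        })
    return glyphs


ot = parse_hb_shape(
    "[shevahebrew=2@64,0+0|shinhebrew=2+1467|shevahebrew=0@-32,0+0|samekhhebrew=0+1361]"
)
coretext = parse_hb_shape(
    "[shevawidehebrew=3+0|shinhebrew=2+1467|shevahebrew=1+0|samekhhebrew=0+1361]"
)
print([g["cluster"] for g in ot])        # [2, 2, 0, 0]
print([g["cluster"] for g in coretext])  # [3, 2, 1, 0]
```

The cluster lists make the difference explicit: the ot shaper merges each mark into its base's cluster, while the coretext output keeps one cluster per character.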
These all have different behavior right now:
hb-view /dev/null test
touch null; hb-view null test
cat /dev/null | hb-view - test