Comments (26)
Maybe Face.postscript_name.decode("utf-8")?
from freetype-py.
A postscript name could be in any encoding, so always assuming utf-8 is not a good idea.
from freetype-py.
Read the postscript reference manual.
Where is it available?
Or is there any way to get the name of the encoding of postscript_name? With that name, I could easily convert it to a string.
from freetype-py.
I would also like to know how to decode an sfnt name:
import freetype
face = freetype.Face("F5AJJI3A.TTF")
for i in range(face.sfnt_name_count):
    name = face.get_sfnt_name(i)
    print(name.string)  # can return bytes
Here is a font example: https://mega.nz/file/S9ERDRpQ#bcPhS06kv-D5jt64aTNDbZVd6gZr6ZfJDYT91yYsoWk
from freetype-py.
For the SFNT name, see https://freetype.org/freetype2/docs/reference/ft2-sfnt_names.html
For postscript_name, see https://freetype.org/freetype2/docs/reference/ft2-base_interface.html#ft_get_postscript_name=
from freetype-py.
The postscript name is in plain ASCII; the SFNT name is in SJIS encoding, as the combination of platform/encoding/language IDs says. You need to call one of the Python decoding functions to decode the bytes as sjis.
The postscript name is Fj-Ima310; the SFNT name should decode to "Fjイーマ310" from "Fj\x83C\x81[\x83}310".
from freetype-py.
In your code above, you also need to read name.platform_id, name.encoding_id and name.language_id before deciding how to decode name.string in general.
from freetype-py.
>>> name = face.get_sfnt_name(1)
>>> print((name.string).decode("sjis"))
Fjイーマ310
>>> print(name.encoding_id)
0
>>> print(name.language_id)
11
>>> print(name.platform_id)
1
(1,0,11) is Japanese SJIS. There is a table linked from https://freetype.org/freetype2/docs/reference/ft2-sfnt_names.html which tells you what (platform, encoding, language) = (1,0,11) means. You basically need to check that the combination is (1,0,11) and then pass "sjis" as the decode argument.
from freetype-py.
Extracted from the freetype doc -
#define TT_PLATFORM_MACINTOSH 1
#define TT_MAC_ID_ROMAN 0
#define TT_MAC_LANGID_JAPANESE 11
from freetype-py.
Some of the other entries look broken, in this font.
>>> name = face.get_sfnt_name(8)
>>> print(name.platform_id)
3
>>> print(name.encoding_id)
2
>>> print(hex(name.language_id))
0x411
#define TT_PLATFORM_MICROSOFT 3
#define TT_MS_ID_SJIS 2
#define TT_MS_LANGID_JAPANESE_JAPAN 0x0411
This suggests it is in SJIS too. However, it won't decode as sjis; it has to be decoded as utf-16-be:
>>> print(name.string.decode("sjis"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'shift_jis' codec can't decode byte 0xfc in position 7: illegal multibyte sequence
>>> print(name.string.decode("utf-16-be"))
Fjイーマ310
>>>
I.e. this font is slightly broken in some of its sfnt names.
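The session above (an entry tagged as SJIS whose bytes are really utf-16-be) suggests a try-then-fall-back decode. Here is a minimal sketch, assuming a hypothetical helper named decode_with_fallback that is not part of freetype-py. Falling back to utf-16-be is a workaround heuristic for fonts like this one, not something the name table tags promise:

```python
def decode_with_fallback(raw, declared_codec, fallback_codec="utf-16-be"):
    """Decode raw sfnt name bytes, trying the declared codec first.

    Hypothetical helper (not part of freetype-py). Returns None if
    neither codec can decode the bytes.
    """
    for codec in (declared_codec, fallback_codec):
        try:
            return raw.decode(codec)
        except UnicodeDecodeError:
            continue
    return None


# Bytes shaped like the (3, 2, 0x411) entry discussed above:
# tagged as SJIS, but actually stored as utf-16-be.
mislabelled = "Fjイーマ310".encode("utf-16-be")
print(decode_with_fallback(mislabelled, "shift_jis"))
```

Note the limitation: bytes that happen to be valid in the declared codec are returned as decoded by it, even if the result is nonsense, so this only catches the cases where the declared codec fails outright.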
from freetype-py.
I know that the encoding depends on platform_id and encoding_id (if platform_id = 0, it also depends on language_id).
My problem is that I don't understand how to get the right encoding from these parameters. Which method should I call? In your example, you hardcoded sjis.
But if I remove all the sfnt names except the one from platformID 3, I can't decode it. Still, I get the correct name with Windows and with libass, which uses freetype: https://mega.nz/file/Oo9ygbJA#7Ri7rlZ0oCxS6slXtfxIpP_VJa1HE6h24PcRRtGDr0E
from freetype-py.
Freetype itself does not care about text encodings. It is a library that maps text in any encoding (sometimes localized, and sometimes even custom, like a private collection of symbols) to shapes. There is nothing inside freetype to call.
There are a few combinations of platform/encoding/language that mean Unicode (the newer common standard). The rest mean the corresponding localised encoding from back in the 1980s, before Unicode. Japan used SJIS, and still does.
As I said, this particular font is buggy in the (3,2,0x411) strings. (3,2,0x411) is Japanese and SJIS, but the bytes are wrongly in utf-16-be.
There is no quick way of setting the decoding parameter, given there are 10+ common localised encodings (CJK alone accounts for 4; simplified Chinese = gb18030 vs traditional = big5). The logic is fairly messy:
If (combo = one of the unicode ones)
    Do unicode
Else if (combo is one of lang1)
    Do lang1
Else if (combo is one of lang 2)
    ... etc
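The if/else chain above can be sketched as a small lookup in Python. This is only an illustration: the function name pick_codec and the tables are assumptions of this sketch, not an API of FreeType or freetype-py, and the tables are deliberately incomplete. The (1, 0, 11) entry simply follows what was observed for this font earlier in the thread.

```python
# Illustrative and incomplete tables (assumptions of this sketch).
# Microsoft platform (3) encodings that mean Unicode.
UNICODE_COMBOS = {(3, 1), (3, 10)}

# (platform_id, encoding_id, language_id) combos needing a specific codec.
LANGUAGE_OVERRIDES = {
    (1, 0, 11): "shift_jis",  # Mac / Roman / Japanese, as seen in this thread
}

# (platform_id, encoding_id) -> Python codec name for localised encodings.
LOCALISED_CODECS = {
    (3, 2): "shift_jis",  # TT_MS_ID_SJIS
    (3, 3): "gb18030",    # simplified Chinese
    (3, 4): "big5",       # traditional Chinese
    (1, 0): "mac_roman",  # TT_MAC_ID_ROMAN
}


def pick_codec(platform_id, encoding_id, language_id):
    """Return a Python codec name for an sfnt name record, or None if unknown."""
    key = (platform_id, encoding_id)
    # Platform 0 is the Unicode platform; a few Microsoft combos are Unicode too.
    if platform_id == 0 or key in UNICODE_COMBOS:
        return "utf-16-be"
    override = LANGUAGE_OVERRIDES.get((platform_id, encoding_id, language_id))
    if override is not None:
        return override
    return LOCALISED_CODECS.get(key)


print(pick_codec(1, 0, 11))
print(pick_codec(3, 1, 0x409))
```

A caller would decode name.string with the returned codec and treat None as "leave the bytes alone"; a real version needs many more entries.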
As I write for a third time now, this particular font is buggy in its (3,2,0x411) name strings. Anyway, you can run "ftdump -n ..." on most fonts, and ftdump will try to decode all the strings to Unicode or hex for you. (ftdump is one of the freetype2-demo programs written by the freetype people to demonstrate the freetype APIs; it is available on most Linux platforms and can be built for Windows too.) The actual decoding routine, "Print_Sfnt_Names", is about 110 lines (the total is about 1400, so nearly 10% of it), from about line 340 onwards, and it is not a pretty thing: it is a few large, nested "switch (x_id) case: ..." statements.
Consider that even the freetype people need 110 lines of C code to demonstrate how to decode the sfnt names, and that code only converts the utf-16-be ones and does nothing for the others. Utf-16-be is special: it is the native encoding of the first Apple Mac in the 1980s, when TrueType was created. That's your answer: you need to copy those 110 lines of C code, convert them to Python, and add a few lines to decode arbitrary names for arbitrary fonts, if that's your goal.
I'll write it a fourth time: this particular font is buggy (i.e. off-spec) in the names department. Don't use it for testing your code in this area.
from freetype-py.
I have just been reminded in #156 that in our example directory, there is a python version of ftdump.py : https://github.com/rougier/freetype-py/blob/master/examples/ftdump.py - you can see the piles of "if ... elif ..." for the name decoding part.
from freetype-py.
Ok, thank you.
The font is not really "buggy", but it is a special case. With a modified version of fonttools, I can decode everything correctly.
from freetype-py.
The font is buggy. The platform/encoding/language tags for the sfnt names don't reflect their encoding correctly. Maybe it is not seriously buggy, but it is buggy nonetheless. If fonttools shows every string in human-readable form (more than "ftdump -n" is able to show), then it is behaving in a friendly though off-spec (i.e. buggy) manner.
from freetype-py.
Since the freetype and Adobe documentation say that a postscript name should only contain ASCII characters, this seems to be a solution:
if font.postscript_name is not None:
    try:
        decoded_postscript_name = font.postscript_name.decode("ASCII")
    except UnicodeDecodeError:
        print("The font you specified contains an invalid postscript name")
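The same idea can be wrapped in a small function so that the caller always gets either a str or None. This is a sketch only: decode_postscript_name is a hypothetical helper name, not something freetype-py provides, and returning None for non-ASCII bytes is one possible policy among several.

```python
def decode_postscript_name(raw):
    """Return the PostScript name as str, or None if absent or not ASCII.

    Hypothetical helper (not part of freetype-py).
    """
    if raw is None:
        return None
    if isinstance(raw, str):
        # Already text; nothing to decode.
        return raw
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # Off-spec name: let the caller decide what to do with the raw bytes.
        return None


print(decode_postscript_name(b"Fj-Ima310"))
```

With freetype-py, the argument would be face.postscript_name.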
from freetype-py.
BTW you can see "copyright 1998" for this particular font. Some of the specs/docs were written later.
It is a work-around: font designers and font editing software do all sorts of things until the community (font creators and font-consuming techs) reaches consensus about what is good and what to avoid, and the spec gets updated to reflect that consensus. Often old buggy fonts, which are sufficiently useful nonetheless, do not get updated.
I think "contains ascii only" is a "recommendation". Many fonts were created with non-ascii names (for non-English markets, like Japanese in this case) before it was stated to be a poor practice.
from freetype-py.
If it is an encoding issue, then as I said, the correct way is in the reference
The Adobe reference doesn't say how to decode it.
It only says how to create a postscript name.
what exactly is your problem?
Face.postscript_name can return bytes.
It should always return a string.
It seems freetype always returns ASCII bytes, so I think freetype-py should do this:
if font.postscript_name is not None:
    try:
        decoded_postscript_name = font.postscript_name.decode("ASCII")
    except UnicodeDecodeError:
        print("The font you specified contain an invalid postscript name")
from freetype-py.
@JeremieBergeron I have already pointed out that the correct way to interpret those bytes is as in the ftdump.py example. The example does return a string. It is not a neat two-line answer, but it is the answer. The reason this particular code does not work on this particular font is that the font is buggy, as in off-spec. That the font still (partially) works (in some circumstances, for some usage) is beside the point. Some other part of the font is not buggy; that's what you are claiming, really.
from freetype-py.
If you are proposing copying those 100 lines of ftdump.py into the core as a wrapper, that's debatable.
from freetype-py.
Why are you talking about ftdump.py?
It does not decode the byte:
freetype-py/examples/ftdump.py
Line 34 in 0084212
from freetype-py.
I am not sure what you are asking here. There is an implicit conversion on print. As I explained quite a few times, localised names are handled as in ftdump. If the font name is not ascii, it is not ascii. Blindly converting to ascii seems wrong.
There is a better way of encoding localised names (and some font vendors still get it wrong). Historically the postscript name is anything that the font vendor put there, and it works for their intended purpose. It looks as if font vendors put ascii names, localized names (for the intended locale), utf8 names recently in some cases, and postscript-encoded hex in others. What it should be was specified later.
If you think the conversion to ascii should be done, it could be added on the client side...
from freetype-py.
I am talking about postscript_name. The freetype documentation says: "Retrieve the ASCII PostScript name of a given face, if available. This only works with PostScript, TrueType, and OpenType fonts."
So it always returns ASCII bytes.
Of course, this wouldn't work if I were trying to directly decode a name in the os2 table, but that's totally different. (Also, the code in ftdump does not always pick the right encoding; see what fonttools has done.)
from freetype-py.
In the case of it being completely normal and ascii, print(font.postscript_name.decode("ASCII")) and print(font.postscript_name) are not that different, visually. One might argue not to convert - python 3 strings internally are not single byte representations, so that will surprise some other people.
from freetype-py.