
Comments (7)

christianparpart commented on August 10, 2024

@shreevatsa I just checked. I get the same output as you on macOS, but it works flawlessly on non-macOS systems (same software version).

So the only difference is how we discover fonts on macOS. This is something I can investigate tonight (I'm out with family during the day). I'll keep you posted.

I'd like to note one very important thing though (having quickly scanned through your blog article): Unicode is not specified for terminals. Not at all. This is all undefined, implementation-dependent behaviour. We (as in some terminal developers) are indeed trying to get to the current century when it comes to Unicode. Every TE has its own priorities there. We, for example, focus on complex grapheme cluster support, especially related to emoji, but also on ligature support. Both are very well supported in Contour.
Languages like Hebrew, RTL, etc.

I don't like macOS falling behind in font fallback (this is the issue here). I'll look into it later.

from contour.

shreevatsa commented on August 10, 2024

Thank you so much!

To clarify:

  • Yes I understand terminal behaviour is not specified with Unicode, in particular (in this case) for complex scripts and trying to force them into the grid-of-cells assumption of terminals. I appreciate that Contour here intends to support Unicode better than other terminals! In fact, while searching around I found some very sensible comments by you on another repo (alacritty/alacritty#3975) which is what emboldened me to report this here. :-)
  • Sorry about "zero font support" — what I meant was that we speakers/readers of languages with complex scripts have, over the years, come to recognize several "levels" of support for our scripts, in various environments' text stacks: (0) no font loaded at all, glyphs are shown as question marks/boxes/tofu (LastResort/.notdef), (1) some font is used, but the glyphs are just laid out next to each other with no CTL reordering, (2) text rendered mostly correctly and readably, with some corner cases that can be read with guesswork, (3) text rendered properly. I imagine readers of Arabic recognize further levels with RTL support, and with terminals I now realize there are further complications with rendering to the grid.
  • Thanks for clarifying that Unicode Grapheme cluster support refers to grapheme cluster segmentation — I must say we are all thankful for emoji, the "trojan horse" of Unicode support (in a good way: many programs that would otherwise not care about Unicode end up implementing varying levels of support for the sake of emojis, which ends up benefiting all the world's languages).

Back to this issue: building from source after #1536 (following the steps from this comment: #1510 (comment)), I can confirm that after the recent PR, there is some positive change as font fallback seems to be working:

image

(The rendering isn't great, with characters overlapping etc, but most other terminals have similar issues, and I understand that implementing something better, when there isn't even any specification yet, may not be within scope. In the meantime on a personal note, I was able to get my work done using eshell, which is not a terminal emulator (thankfully), and doesn't try to force text to a grid.)


Yaraslaut commented on August 10, 2024

Hi @shreevatsa,
Here is what I see for your text on my system:
image

You need to set up fonts accordingly in the Contour config file. Also, contour debug font.textshaping might give you some additional info.


christianparpart commented on August 10, 2024

I think you might have strict_spacing set to true. Try setting it to false :)


shreevatsa commented on August 10, 2024

Thank you @Yaraslaut and @christianparpart. It is heartening to know that some amount of support exists in principle.

For what it's worth, I was not able to get it to work on macOS:

➜  ~ contour debug font.textshaping
Warning: Could not find the Qt platform plugin "cocoa" in "" ((null):0, (null))
Fatal: This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
 ((null):0, (null))
[1]    73822 abort      contour debug font.textshaping

and

➜  ~ contour font-locator
[error] The configured text shaping engine CoreText does not yet support font feature settings. Ignoring.
Matching fonts using  : CoreText
Font description      : (family=monospace weight=Regular slant=Roman spacing=Proportional, strict_spacing=no)
Number of fonts found : 1
  path /System/Library/Fonts/Menlo.ttc


christianparpart commented on August 10, 2024

@shreevatsa the PR above actually implements proper font fallback on macOS. Thanks for reporting this. :)

> [ ... ] and Contour is the worst of them (zero font support), [ ... ]

I wanted to clarify something about the wording here.

"Zero font support" is impossible. Some font is always displayed, and switching between regular, bold, italic, and bold/italic all works (also on the latest stable release for macOS). What you maybe meant is font fallback support, which is what I addressed in #1536 (for macOS): apparently, when we switched from fontconfig to the native CoreText API on macOS, we did not implement font fallback, only basic font matching. #1536 requires macOS 13.1 or higher, however.

> which is surprising given the mention of Unicode Grapheme cluster support etc

Grapheme segmentation is something entirely different. It is part of UAX #29 and is implemented in libunicode. Grapheme cluster segmentation determines how many (UTF-32) Unicode code points form a single user-perceived character. This can range from 1 to many (e.g. 7), with zero-width joiners or even variation selectors included to alter the display. This is something most terminals don't get right. You can try a little test script which I once wrote to check our own terminal (not sure why I created a separate repo for that, I was probably a little bit over-motivated :D).
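
To make the "1 to many code points per user-perceived character" point concrete, here is a rough Python sketch. It is not the UAX #29 algorithm and not what libunicode does: the function `rough_clusters` is a hypothetical helper that only glues combining marks (which include variation selectors), zero-width joiners, and whatever follows a zero-width joiner onto the previous code point. That is enough to show the idea for a family emoji and for the Bengali sample text from this thread, but it will be wrong for many other inputs (Hangul jamo, regional indicators, and so on).

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def rough_clusters(text):
    """Very rough approximation of grapheme clusters (NOT full UAX #29)."""
    clusters = []
    for ch in text:
        joins_previous = (
            unicodedata.category(ch).startswith("M")  # combining marks, variation selectors
            or ch == ZWJ
            or bool(clusters and clusters[-1].endswith(ZWJ))
        )
        if clusters and joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# man + ZWJ + woman + ZWJ + girl + ZWJ + boy
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"
print(len(family))                  # 7 code points
print(len(rough_clusters(family)))  # 1 cluster

bengali = "\u09ac\u09be\u0982\u09b2\u09be \u09ad\u09be\u09b7\u09be"  # "বাংলা ভাষা"
print(len(bengali))                  # 10 code points
print(len(rough_clusters(bengali)))  # 5 clusters
```

A real terminal would use a proper UAX #29 implementation instead of these three ad hoc rules.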

For reference, I've put a small screenshot of the script's output here (this test script focuses solely on Unicode grapheme segmentation, shown by printing various emoji characters):

image


shreevatsa commented on August 10, 2024

Just for completeness, some concrete numbers for an example (from another repo wez/wezterm#1333 (comment)): in the example there, the text "বাংলা ভাষা" has, at a font size where the space character (and thus one "cell") is 8 pixels wide:

  • বাং 7+4+6=17 pixels (=17/8=2.125 cells)
  • লা 10+4=14 pixels (=14/8=1.75 cells)
  • (space) 8 pixels (=1 cell)
  • ভা 10+4=14 pixels (=14/8=1.75 cells)
  • ষা 8+4=12 pixels (=12/8=1.5 cells)

So ideally this would be 65 pixels = 8.125 cells wide, but if that's not possible, what I as a reader would prefer would be for cell-alignment to happen at word boundaries (so বাংলা = 17 + 14 pixels = 3.875 cells would be rounded up to 4 cells, then a space, then ভাষা = 26 pixels = 3.25 cells would be either squeezed to 3 or rounded up to 4 cells).
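
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the calculation, not of how any terminal lays out text; the names `CELL_PX` and `runs` are made up for illustration, and the pixel advances are the ones quoted in the list.

```python
import math

CELL_PX = 8  # one cell = width of the space character in the example

# (text, per-cluster pixel advances), copied from the list above
runs = [
    ("বাংলা", [17, 14]),
    (" ", [8]),
    ("ভাষা", [14, 12]),
]

# Ideal width if the text did not have to fit a grid at all.
ideal_px = sum(sum(advances) for _, advances in runs)
ideal_cells = ideal_px / CELL_PX

# Cell alignment at word boundaries: exact width per run, then rounded up.
exact_cells = [sum(advances) / CELL_PX for _, advances in runs]
rounded_up = [math.ceil(width) for width in exact_cells]

print(ideal_px, ideal_cells)  # 65 8.125
print(exact_cells)            # [3.875, 1.0, 3.25]
print(rounded_up)             # [4, 1, 4]
```

Rounding every run up gives 9 cells in total; squeezing the second word into 3 cells instead of 4 gives 8.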

I think Contour tries to render the whole thing 5 or 6 cells wide, as there are 5 graphemes and 6 glyphs (copy-pasted input was echo বাংলা ভাষা | wc — note there's a space before the |):

image

while Terminal and iTerm2 use 10 cells, rounding up each grapheme or maybe even each glyph (বাং 2.125 -> 3 cells, লা 1.75 -> 2 cells, space 1 cell, ভা 1.75 -> 2 cells, ষা 1.5 -> 2 cells):

image image

(renders better or more readable, but cursor movement goes haywire).
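
The two cell counts being compared (one cell per grapheme cluster versus rounding each cluster up) follow directly from the pixel advances quoted earlier. A small sketch, with `CELL_PX` and `cluster_px` as illustrative names only; it says nothing about how the terminals actually arrive at those widths:

```python
import math

CELL_PX = 8
# Pixel advances of the five clusters: বাং, লা, space, ভা, ষা
cluster_px = [17, 14, 8, 14, 12]

# One cell per grapheme cluster, regardless of its real width.
one_cell_each = len(cluster_px)

# Round every cluster up to a whole number of cells.
per_cluster_cells = [math.ceil(px / CELL_PX) for px in cluster_px]

print(one_cell_each)           # 5
print(per_cluster_cells)       # [3, 2, 1, 2, 2]
print(sum(per_cluster_cells))  # 10
```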

I understand there is no specification here and it's a research problem how best to render these.


Edit: I understand that the equation of "one grapheme cluster = one terminal cell" can make sense for cursor movement (I wonder what's happening with wide emoji or wide East Asian characters?), but if that needs to be retained, I think one simple hack (that would make the text both readable and usable, at cost of some ugliness) would be to scale glyphs so that they don't exceed one cell's width. For the grapheme clusters in the example above:

  • বাং: want to render it in one cell, but the font says it's 2.125 cells wide, so scale this whole run by 1/2.125.
  • লা: want to render it in one cell, but the font says it's 1.75 cells wide, so scale this whole run by 1/1.75.
  • (space): want to render it in one cell, and the font says it's 1 cell wide, so no scaling is needed (scale=1).

etc. Kitty and wezterm seem to be attempting something like this, but half-heartedly (only for some glyphs).
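
The scaling hack described above boils down to one formula per cluster: scale = min(1, cell width / cluster advance). A minimal sketch, assuming the same 8-pixel cell and the advances quoted earlier; `scale_to_one_cell` is a hypothetical helper, not a function from Contour, Kitty, or wezterm:

```python
CELL_PX = 8


def scale_to_one_cell(advance_px, cell_px=CELL_PX):
    """Horizontal scale factor that makes a cluster fit in one cell.

    Clusters that already fit are left alone (scale 1.0).
    """
    return min(1.0, cell_px / advance_px)


# Pixel advances of the five clusters: বাং, লা, space, ভা, ষা
for advance in [17, 14, 8, 14, 12]:
    print(advance, round(scale_to_one_cell(advance), 3))
```

For the first cluster this gives 8/17, which is the same as the 1/2.125 in the list above.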
