Omikhleia commented on June 9, 2024

Unless there's something clear to do here, I am going to suggest closing/rejecting this issue, which has been inactive for 2+ months:

  • Part of it is merely due to tuning configuration options for small columns, which is doable in the existing code base (via emergencyStretch, tolerance, etc.)
  • Part of it is due to existing TeX patterns as-they-are = Not Our Bug

from sile.

jodros commented on June 9, 2024

[image]

Another format of the first example. This time I couldn't see the frames, because -d frames isn't working well when I use parallel...

jodros commented on June 9, 2024

[image]

alerque commented on June 9, 2024

Let's split PT/RU into different issues, because language-specific problems don't always get resolved at the same time or via the same PR. Let's make this issue the PT one, please.

For hyphenation issues, the first thing to check is if we even have break points to work with. Evidently not:

$ ./sile
SILE v0.14.17.r373-g72965ad (LuaJIT 2.1.1700206165) [Rust]
> SILE.showHyphenationPoints("quando", "pt")
quando
> SILE.showHyphenationPoints("apaziguam", "pt")
apa-zi-guam

So at least for "quando", for some reason the patterns are not allowing any hyphenation there. According to PT language rules, where should the points be?

The screenshots are kind of hard to work with for this, because I can't tell whether other metrics (like not having any stretch available) might be contributing to poor break choices. Also, in many cases I can't even be sure I'm typing the same text you are entering. Can you post the actual XML/SIL input files you're testing too?

Omikhleia commented on June 9, 2024

Unless I misunderstood the screenshot, it doesn't look like a hyphenation issue, but rather a justification issue (overfull lines).

These examples have fairly short columns: have you tried loosening the justification constraints?

As in TeX, overfull lines are preferred by default over underfull lines when the constraints (on space stretching/shrinking, etc.) cannot be respected.

You can try tweaking, in order:

  • linebreak.emergencyStretch (e.g. set it around 1em, it's a delicate setting)
  • linebreak.tolerance (defaults to 500, something around 2000 might be necessary when the width is constrained, or even up to 5000 in very short columns)

There are other settings (pretolerance, and even the space stretchability) that might be changed too, but they are more difficult (IMHO) to tweak "correctly".
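As a minimal sketch, the two settings listed above could be set from a SIL document with the standard \set command. The values are only the starting points suggested here, not recommendations for every layout, and the surrounding document is a placeholder:

```sil
\begin{document}
% Extra stretch the line breaker may assume as a last resort
% (a delicate setting; 1em is only a starting point).
\set[parameter=linebreak.emergencyStretch, value=1em]
% Loosen the tolerance from its default of 500; very short
% columns may need values up to around 5000.
\set[parameter=linebreak.tolerance, value=2000]
Narrow-column text goes here.
\end{document}
```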

If this is indeed the issue at stake, then it pops up quite regularly, e.g. see #620 (comment)

I know the documentation mentions that we use TeX's paragraph shaping and also briefly explains the settings...
But perhaps we could make it clearer for casual readers (it's quite a FAQ, even in the TeX world), especially when most office software nowadays prefers underfull lines (at the risk of bad paragraphing in most cases).

Note that making these settings dynamically adaptable (e.g. depending on font size and target line width) could be an interesting exercise for an experimental package, as a possible helper to minimize the occurrence of these situations. We can easily modify the typesetter to account for such dynamic approaches, which was harder in old TeX (i.e. at least before LuaTeX added hooks in many places, though I don't know how much "hackability" it would now have here).

Omikhleia commented on June 9, 2024

(BTW, regarding quando, Typst doesn't hyphenate it either at this point (see https://typst.app/tools/hyphenate/). That's quite logical, as it uses the same TeX hyphenation patterns as SILE, but at least it shows that the problem comes from these original patterns and is not a SILE-specific issue.)

jodros commented on June 9, 2024

I ran showHyphenationPoints on some other words with the same issue and noticed that some of them are indeed missing rules, e.g.:

  • pri-meiro
  • re-cordo
  • to-mado
  • vai-dade
  • mal-dito

jodros commented on June 9, 2024

> You can try tweaking, in order:
>
>   • linebreak.emergencyStretch (e.g. set it around 1em, it's a delicate setting)
>   • linebreak.tolerance

I've tested this and can confirm that it sometimes solves the problem, thanks.

Omikhleia commented on June 9, 2024

> I ran showHyphenationPoints on some other words with the same issue and noticed that some of them are indeed missing rules, e.g.:
>
>   • pri-meiro
>   • re-cordo
>   • to-mado
>   • vai-dade
>   • mal-dito

But what should they be? SILE and Typst both use the TeX patterns, and both programs show the same hyphenation points here, don't they?

jodros commented on June 9, 2024

> But what should they be?

I forgot to say, they should be:

  • pri-mei-ro
  • re-cor-do
  • to-ma-do
  • vai-da-de
  • mal-di-to

Omikhleia commented on June 9, 2024

SILE is using (a Lua port of) https://github.com/hyphenation/tex-hyphen/blob/master/hyph-utf8/tex/generic/hyph-utf8/patterns/tex/hyph-pt.tex

So this is likely an issue for https://github.com/hyphenation/tex-hyphen (though it would then be easier if SILE were able to use the TeX patterns directly, or shipped with a conversion script, rather than having its own error-prone re-implementation as a Lua table).

Omikhleia commented on June 9, 2024

(This being said, one can also register exceptions manually, with \hyphenator:add-exceptions)
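For the words listed earlier in this thread, such exceptions might look like the sketch below. The lang option and the space-separated word list with hyphens marking the break points are how I understand the SILE manual, so treat the exact syntax as an assumption and check it against your SILE version:

```sil
\begin{document}
% Register manual break points for Portuguese words that the
% TeX-derived patterns currently under-hyphenate.
\hyphenator:add-exceptions[lang=pt]{pri-mei-ro re-cor-do to-ma-do vai-da-de mal-di-to}
Text in Portuguese goes here.
\end{document}
```

This only works around the gaps locally; the missing break points would still need to be reported against the upstream patterns.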
