Comments (17)

da-liii commented on July 24, 2024

Here is the way to reproduce it:
https://github.com/XmacsLabs/mogan/releases/tag/v1.2.2

Build and launch it, then manually import the Noto CJK SC .ttc file (you have to download the .ttc first).

And then open the doc (https://github.com/XmacsLabs/mogan/blob/v1.2.2/TeXmacs/doc/about/mogan/research.zh.tm), and change the font to Noto Serif CJK SC, and then export it to PDF.

Here is the font I'm using:
NotoSerifCJK-Regular.ttc

from pdf-writer.

da-liii commented on July 24, 2024

A simplified way to reproduce it is:

  • Install NotoSerifCJK-Regular.ttc to ~/Library/Fonts by double-clicking it and choosing install
  • Download and install Mogan Research 1.2.2 on macOS arm64 (because it is a release build, there is no debug info; the crash actually happens in PDFHummus)
  • Open Mogan Research and open the document below
  • Export it to PDF

Save the following snippet as crash.tm and then open it.

<TeXmacs|2.1.2>

<style|<tuple|generic|chinese>>

<\body>
  \<#5E2E\>\<#52A9\>\<#83DC\>\<#5355\>\<#4E0B\>\<#7684\>\<#6587\>\<#6863\>\<#9ED8\>\<#8BA4\>\<#4F1A\>\<#663E\>\<#793A\>\<#4E2D\>\<#6587\>\<#6587\>\<#6863\>\<#FF0C\>\<#5F53\>\<#4E2D\>\<#6587\>\<#6587\>\<#6863\>\<#4E0D\>\<#5B58\>\<#5728\>\<#FF08\>\<#6BD4\>\<#5982\>\<#6CA1\>\<#6709\>\<#88AB\>\<#7FFB\>\<#8BD1\>\<#FF09\>
</body>

<\initial>
  <\collection>
    <associate|font|Noto CJK SC>
    <associate|font-family|rm>
    <associate|page-medium|paper>
  </collection>
</initial>

According to the debugging info, the only font file involved is NotoSerifCJK-Regular.ttc.

And the crash depends on the content of the document.

da-liii commented on July 24, 2024

Another observation: with Songti SC, it works fine.

da-liii commented on July 24, 2024

#150 might be the related issue.

da-liii commented on July 24, 2024

Well, finally, I found that it crashes on the character U+9ED8:

https://old.unicode-table.com/cn/9ED8/

Here is the document to reproduce the bug:

<TeXmacs|2.1.2>

<style|<tuple|generic|chinese>>

<\body>
  \<#9ED8\>
</body>

<\initial>
  <\collection>
    <associate|font|Noto CJK SC>
    <associate|font-family|rm>
    <associate|page-medium|paper>
  </collection>
</initial>

da-liii commented on July 24, 2024

Tried SourceHanSerifSC-Regular.otf in https://github.com/adobe-fonts/source-han-serif , it crashes too.

da-liii commented on July 24, 2024
inFontIndex: 0
inCharStringIndex: 46956

// the dumped operators and operands
-44	16	363	29	96	30	6	29	227	30	7	-79
op 29
-73
op 29
33.5
op 11

op 18

op 11
84	55	-55	151	-125	27	55	43	-23	61	-20	61	-57	53	31	57	31	55	-29	54	142	63	-63	65	-59	65
op 20
-65
op 0

op 19
26
op 0
309	164
op 21
-14	-6
op 5
23	-37	23	-60	-47
op 26
50	-48	61	109	-143	89
op 8
100	29
op 21
-12	-7	27	-28	27	-48	4	-38
op 25
52	-43	54	108	-152	56
op 8
-196	-50
op 21
-15	-3	12	-45	8	-69	-10	-54
op 25
43	-59	77	110	-115	120
op 8
-85	-12
op 21
-18	1	1	-52	-29	-62	-26	-24
op 25
-19	-16	-10	-22	11	-20	14	-22	37	7	18	20	26	31	16	69	-21	90
op 8

op 19

op 0

op 0
40	573
op 21
-833
op 29

op 19
-75
op 0

op 11
-14	-4	17	-41	20	-64	1	-49
op 25

op 19

op 0

op 0
38	-41	47	90	-109	109
op 8
605	74
op 21
-11	-6	31	-39	38	-62	9	-49
op 25
61	-47	58	124	-186	79
op 8

op 19

op 10
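
To make the numeric dump above easier to compare with named output, here is a small lookup sketch. The function name `type2OperatorName` is hypothetical and not part of pdf-writer; the operator numbers are the one-byte operators of the Type 2 charstring format, and only those are covered.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not pdf-writer code): maps a one-byte Type 2
// charstring operator number to its name, so "op 19" reads as "hintmask".
std::string type2OperatorName(int op)
{
    switch (op) {
        case 0:  return "reserved";
        case 1:  return "hstem";
        case 3:  return "vstem";
        case 4:  return "vmoveto";
        case 5:  return "rlineto";
        case 6:  return "hlineto";
        case 7:  return "vlineto";
        case 8:  return "rrcurveto";
        case 10: return "callsubr";
        case 11: return "return";
        case 14: return "endchar";
        case 18: return "hstemhm";
        case 19: return "hintmask";
        case 20: return "cntrmask";
        case 21: return "rmoveto";
        case 22: return "hmoveto";
        case 23: return "vstemhm";
        case 24: return "rcurveline";
        case 25: return "rlinecurve";
        case 26: return "vvcurveto";
        case 27: return "hhcurveto";
        case 29: return "callgsubr";
        case 30: return "vhcurveto";
        case 31: return "hvcurveto";
        default: return "unknown";
    }
}
```

Read this way, an operand-less "op 0" or "op 10" right after a mask operator looks less like a real instruction and more like mask bytes being decoded as operators.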

da-liii commented on July 24, 2024

This operator sequence (without operands) occurs several times.

op 19

op 10

op 0

https://learn.microsoft.com/en-us/typography/opentype/spec/cff2charstr#one-byte-cff2-operators

The logic that advances the program counter for hintmask (19) needs to be reviewed.

Byte* CharStringType2Interpreter::InterpretHintMask(Byte* inProgramCounter)
{
	mStemsCount+= (unsigned short)(mOperandStack.size() / 2);

	EStatusCode status = mImplementationHelper->Type2Hintmask(mOperandStack,inProgramCounter);
	if(status != PDFHummus::eSuccess)
		return NULL;

	ClearStack();
	return inProgramCounter+(mStemsCount/8 + (mStemsCount % 8 != 0 ? 1:0));
}
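
As a minimal sketch of the arithmetic in that return statement: a hintmask or cntrmask operator is followed by one bit per declared stem hint, rounded up to whole bytes. The name `maskByteCount` is made up for illustration and is not a pdf-writer function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not pdf-writer code): number of mask bytes that
// follow a hintmask/cntrmask operator, one bit per stem, rounded up.
constexpr std::size_t maskByteCount(std::size_t stemCount)
{
    return (stemCount + 7) / 8;
}

// Equivalent to: stemCount / 8 + (stemCount % 8 != 0 ? 1 : 0)
static_assert(maskByteCount(6) == 1, "6 stems fit in one mask byte");
static_assert(maskByteCount(8) == 1, "8 stems still fit in one mask byte");
static_assert(maskByteCount(9) == 2, "9 stems need a second mask byte");
```

The rounding itself is simple, so if the program counter ends up in the wrong place, the stem count fed into it is the more likely suspect.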

galkahana commented on July 24, 2024

@darcy-shen where did you get the op code dump from?
it seems to show an error in the glyph code... as far as i understand it. op 10, which is callsubr, does not have an operand. that operand would be the subr index, hence the problem. as far as i can see this is where pdfwriter fails (i got my own tracer for glyphs in CharStringType2Tracer). i'll see if there's an expected behavior in such a case (e.g. skip?) which might work better, and also consider your suggestion re: reviewing hintmask... though i think it's ok... but i'll check.

da-liii commented on July 24, 2024

where did you get the op code dump from?

I skipped the callsubr here when there are no operands.

Byte* CharStringType2Interpreter::InterpretCallSubr(Byte* inProgramCounter)
{
	CharString* aCharString = NULL;
	if(mOperandStack.size() < 1)
-		return NULL;
+		return inProgramCounter;

galkahana commented on July 24, 2024

gotcha. does this help in displaying the glyph properly?

da-liii commented on July 24, 2024

does this help in displaying the glyph properly?

With that change the PDF is OK, but the glyph is missing. On the master branch, the exported PDF is corrupted.

da-liii commented on July 24, 2024

Here is my debugging pr: #239

galkahana commented on July 24, 2024

cool. I got something similar with CharStringType2Tracer.
i'm comparing it with a reference implementation called ttx

[on a mac just go brew install fonttools and ttx is one of the tools.]
[would upload output but github doesn't allow it. too big]

might not manage to resolve this today, but tomorrow i got all day so i'm fairly confident i'll be able to come up with a correction.

i'll compare and see where the problem is....there's definitely something, here's what they got:

          -44 16 363 29 96 30 6 29 227 30 7 -79 callgsubr
          84 55 -55 151 -125 27 55 43 -23 61 -20 61 -57 53 31 57 31 55 -29 54 142 63 -63 65 -59 65 cntrmask 011010100100101000000000
          hintmask 100000001010010100000000
          309 164 rmoveto
          -14 -6 rlineto
          23 -37 23 -60 -47 vvcurveto
          50 -48 61 109 -143 89 rrcurveto
          100 29 rmoveto
          -12 -7 27 -28 27 -48 4 -38 rlinecurve
          52 -43 54 108 -152 56 rrcurveto
          -196 -50 rmoveto
          -15 -3 12 -45 8 -69 -10 -54 rlinecurve
          43 -59 77 110 -115 120 rrcurveto
          -85 -12 rmoveto
          -18 1 1 -52 -29 -62 -26 -24 rlinecurve
          -19 -16 -10 -22 11 -20 14 -22 37 7 18 20 26 31 16 69 -21 90 rrcurveto
          hintmask 000010010000000000000000
          40 573 rmoveto
          -833 callgsubr
          -14 -4 17 -41 20 -64 1 -49 rlinecurve
          hintmask 000010010000000000000000
          38 -41 47 90 -109 109 rrcurveto
          605 74 rmoveto
          -11 -6 31 -39 38 -62 9 -49 rlinecurve
          61 -47 58 124 -186 79 rrcurveto
          hintmask 001010100000101000000000
          -344 -96 rmoveto
          -67 23 -6 -40 -15 -79 -14 -49 rlinecurve
          13 -5 27 43 25 53 14 33 rlinecurve
          10 -1 8 2 5 4 rrcurveto
          -176 -119 262 119 vlineto
          -290 -321 rmoveto
          29 vlineto
          hintmask 011010100001001000000000
          114 -96 9857 callsubr
          -95 hlineto
          -92 -9 -76 -7 -44 -2 30 -78 rcurveline
          9 2 10 8 5 12 188 32 138 30 103 22 -3 16 rcurveline
          -207 -21 rlineto
          90 172 vlineto
          14 9 262 callsubr
          -39 -48 rlineto
          -87 96 115 -32 8 hlineto
          19 28 15 6 hvcurveto
          295 -833 callsubr
          -72 -907 callgsubr
          -275 -1129 callgsubr
          -398 9 vlineto
          hintmask 001010100000101000000000
          24 22 12 6 hvcurveto
          118 59 rmoveto
          -118 262 118 hlineto
          hintmask 000101000000000000100000
          624 -174 -33 callgsubr
          -123 hlineto
          4 80 1 87 1 97 23 3 10 11 3 -215 callgsubr
          -1 -113 0 -100 -4 -90 4729 callsubr
          -988 callgsubr
          141 hlineto
          -14 -252 -50 -169 -181 -138 15 -16 rcurveline
          -1111 callgsubr
          220 133 57 173 16 259 21 -245 51 -185 120 -123 15 29 23 15 26 1 4 10 rcurveline
          -135 101 -77 180 -27 227 rrcurveto
          6830 callsubr
          hintmask 000101000000000000100000
          -29 29 -48 -96 callgsubr
          endchar
        </CharString>

galkahana commented on July 24, 2024

yeah defo a bug there right at the start when reading how many stems there should be. this results in the hummus interpreter figuring there's 6 stems, while there are at least 17 which results in reading too few bytes later.

galkahana commented on July 24, 2024

ok. figured it out. and the glyph shows correctly.

MR.
will merge soon. hope it's ok.

i had a bug (for 13 years, oh my) in the implementation of the cntrmask operator. as opposed to the correct implementation of hintmask, it was NOT CONSIDERING that it might be used in a special way that optimizes away the call to vstem.
as a result it got lots of operands... but did nothing with them.

as a result all later hintmask and cntrmask calls did not read enough of the following bytes (1 instead of 3) and this resulted in a wonderful mess, creating random operands.

anyways, fixed the code of cntrmask and it seems ok now.

please let me know if this resolves the problem on your end too.
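
The effect described above can be sketched with a toy stem counter. This is an illustration under stated assumptions, not the actual pdf-writer fix: `StemCounter` and its members are hypothetical names, and the operand counts come from reading the dump earlier in the thread, where hstemhm (op 18) appears to receive 12 operands and the first cntrmask (op 20) is preceded by 26.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model (not pdf-writer code) of how the stem count drives
// the number of mask bytes read after hintmask/cntrmask.
struct StemCounter {
    std::size_t stems = 0;
    bool countImplicitStemsOnCntrmask; // false reproduces the described bug

    explicit StemCounter(bool fixed) : countImplicitStemsOnCntrmask(fixed) {}

    // hstem, vstem, hstemhm, vstemhm: each operand pair declares one stem.
    void onStemOperator(std::size_t operandCount) { stems += operandCount / 2; }

    // hintmask: operands left on the stack are implicit vstem pairs.
    std::size_t onHintmask(std::size_t operandCount)
    {
        stems += operandCount / 2;
        return maskBytes();
    }

    // cntrmask: the same implicit-vstem rule applies; skipping it is the bug.
    std::size_t onCntrmask(std::size_t operandCount)
    {
        if (countImplicitStemsOnCntrmask)
            stems += operandCount / 2;
        return maskBytes();
    }

    // One bit per stem, rounded up to whole bytes.
    std::size_t maskBytes() const { return (stems + 7) / 8; }
};
```

With 12 hstemhm operands followed by a cntrmask carrying 26 operands, the buggy variant stays at 6 stems and reads 1 mask byte, while the fixed variant reaches 19 stems and reads 3, which matches the 24-bit masks in the ttx output.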

da-liii commented on July 24, 2024

It fixed my bug, thanks a lot. It will benefit all Mogan/GNU TeXmacs users on Linux who are using CJK characters.
