Comments (17)
Here is the way to reproduce it:
https://github.com/XmacsLabs/mogan/releases/tag/v1.2.2
build it and launch it and manually import the Noto CJK SC ttc (have to download the ttc first)
And then open the doc (https://github.com/XmacsLabs/mogan/blob/v1.2.2/TeXmacs/doc/about/mogan/research.zh.tm), and change the font to Noto Serif CJK SC, and then export it to PDF.
Here is the font I'm using:
NotoSerifCJK-Regular.ttc
from pdf-writer.
A simplified way to reproduce it is:
- Install NotoSerifCJK-Redular.ttc to ~/Library/Fonts via double click and install
- Download and install Mogan Research 1.2.2 on macOS arm64 (because it is a release version, there are no debug info, the crash actually happen in PDF HUmmus)
- Open Mogan Research and open the document below
- Export it to PDF
You should save the following snippets into crash.tm
and then open it.
<TeXmacs|2.1.2>
<style|<tuple|generic|chinese>>
<\body>
\<#5E2E\>\<#52A9\>\<#83DC\>\<#5355\>\<#4E0B\>\<#7684\>\<#6587\>\<#6863\>\<#9ED8\>\<#8BA4\>\<#4F1A\>\<#663E\>\<#793A\>\<#4E2D\>\<#6587\>\<#6587\>\<#6863\>\<#FF0C\>\<#5F53\>\<#4E2D\>\<#6587\>\<#6587\>\<#6863\>\<#4E0D\>\<#5B58\>\<#5728\>\<#FF08\>\<#6BD4\>\<#5982\>\<#6CA1\>\<#6709\>\<#88AB\>\<#7FFB\>\<#8BD1\>\<#FF09\>
</body>
<\initial>
<\collection>
<associate|font|Noto CJK SC>
<associate|font-family|rm>
<associate|page-medium|paper>
</collection>
</initial>
The only related font file is NotoSerifCJK-Regular.ttc
via debugging info.
And the crash depends on the content of the document.
from pdf-writer.
Another truth is that when using Songti SC, it works fine.
from pdf-writer.
#150 might be the related issue.
from pdf-writer.
Well, finally, I found that it will crash on the character 默
https://old.unicode-table.com/cn/9ED8/
here is the document to reproduce the bug:
<TeXmacs|2.1.2>
<style|<tuple|generic|chinese>>
<\body>
\<#9ED8\>
</body>
<\initial>
<\collection>
<associate|font|Noto CJK SC>
<associate|font-family|rm>
<associate|page-medium|paper>
</collection>
</initial>
from pdf-writer.
Tried SourceHanSerifSC-Regular.otf in https://github.com/adobe-fonts/source-han-serif , it crashes too.
from pdf-writer.
![image](https://private-user-images.githubusercontent.com/1267865/292884416-004c1065-0929-4327-beaf-f6e35912245e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE3NjczOTIsIm5iZiI6MTcyMTc2NzA5MiwicGF0aCI6Ii8xMjY3ODY1LzI5Mjg4NDQxNi0wMDRjMTA2NS0wOTI5LTQzMjctYmVhZi1mNmUzNTkxMjI0NWUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDcyMyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA3MjNUMjAzODEyWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9N2NmZTJiNjE3YTAzM2M1ZWNhNTRmZGE4MjI2NDcxZGYwYzI3NjcxYTU2OTM5ZTlkOTkyNjg5ZWZlMzBlMDI5NSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.G4rCwfBTLgd8HnONgZa23W-W0iWqaLczpwKZa8mGU_E)
inFontIndex: 0
inCharStringIndex: 46956
// the dumped op and oprand
-44 16 363 29 96 30 6 29 227 30 7 -79
op 29
-73
op 29
33.5
op 11
op 18
op 11
84 55 -55 151 -125 27 55 43 -23 61 -20 61 -57 53 31 57 31 55 -29 54 142 63 -63 65 -59 65
op 20
-65
op 0
op 19
26
op 0
309 164
op 21
-14 -6
op 5
23 -37 23 -60 -47
op 26
50 -48 61 109 -143 89
op 8
100 29
op 21
-12 -7 27 -28 27 -48 4 -38
op 25
52 -43 54 108 -152 56
op 8
-196 -50
op 21
-15 -3 12 -45 8 -69 -10 -54
op 25
43 -59 77 110 -115 120
op 8
-85 -12
op 21
-18 1 1 -52 -29 -62 -26 -24
op 25
-19 -16 -10 -22 11 -20 14 -22 37 7 18 20 26 31 16 69 -21 90
op 8
op 19
op 0
op 0
40 573
op 21
-833
op 29
op 19
-75
op 0
op 11
-14 -4 17 -41 20 -64 1 -49
op 25
op 19
op 0
op 0
38 -41 47 90 -109 109
op 8
605 74
op 21
-11 -6 31 -39 38 -62 9 -49
op 25
61 -47 58 124 -186 79
op 8
op 19
op 10
from pdf-writer.
This operators seq (without operand) occurs in 默
several times.
op 19
op 10
op 0
https://learn.microsoft.com/en-us/typography/opentype/spec/cff2charstr#one-byte-cff2-operators
The logic to increase the ipc
for hintmask (19)
needs to be reviewed.
Byte* CharStringType2Interpreter::InterpretHintMask(Byte* inProgramCounter)
{
mStemsCount+= (unsigned short)(mOperandStack.size() / 2);
EStatusCode status = mImplementationHelper->Type2Hintmask(mOperandStack,inProgramCounter);
if(status != PDFHummus::eSuccess)
return NULL;
ClearStack();
return inProgramCounter+(mStemsCount/8 + (mStemsCount % 8 != 0 ? 1:0));
}
from pdf-writer.
@darcy-shen where did you get the op code dump from?
it seems to show an error in the glyph code...as far as i understand it. op 10, which is callsubr, is not having an operand. that operand would be the subr index. hence the problem. as far as i can see this is where pdfwriter fails (i got my own tracer for glyphs in CharStringType2Tracer
. i'll see if there's an expected behavior in such a case (e.g. skip?) which might work better. and also consider your suggetion RE review hintmask...though i think it's ok...but i'll check.
from pdf-writer.
where did you get the op code dump from?
I skipped the callsubr here when there are no oprands.
Byte* CharStringType2Interpreter::InterpretCallSubr(Byte* inProgramCounter)
{
CharString* aCharString = NULL;
if(mOperandStack.size() < 1)
- return NULL;
+ return inProgramCounter;
from pdf-writer.
gotcha. does this help in displaying the glyph properly?
from pdf-writer.
does this help in displaying the glyph properly?
The PDF is ok but the glyph is missing. The exported pdf in the master branch is corrupted.
from pdf-writer.
Here is my debugging pr: #239
from pdf-writer.
cool. I got something similar with CharStringType2Tracer.
im comparing it with a reference implementaiton called ttx
[on a mac just go brew install fonttools
and ttx is one of them.]
[would upload output but github doesnt allow. too bid]
might not make it to resolve this today, but tomorrow i got all day so i'll fairly confident i'll be able to come up with a correction.
i'll compare and see where the problem is....there's definitely something, here's what they got:
-44 16 363 29 96 30 6 29 227 30 7 -79 callgsubr
84 55 -55 151 -125 27 55 43 -23 61 -20 61 -57 53 31 57 31 55 -29 54 142 63 -63 65 -59 65 cntrmask 011010100100101000000000
hintmask 100000001010010100000000
309 164 rmoveto
-14 -6 rlineto
23 -37 23 -60 -47 vvcurveto
50 -48 61 109 -143 89 rrcurveto
100 29 rmoveto
-12 -7 27 -28 27 -48 4 -38 rlinecurve
52 -43 54 108 -152 56 rrcurveto
-196 -50 rmoveto
-15 -3 12 -45 8 -69 -10 -54 rlinecurve
43 -59 77 110 -115 120 rrcurveto
-85 -12 rmoveto
-18 1 1 -52 -29 -62 -26 -24 rlinecurve
-19 -16 -10 -22 11 -20 14 -22 37 7 18 20 26 31 16 69 -21 90 rrcurveto
hintmask 000010010000000000000000
40 573 rmoveto
-833 callgsubr
-14 -4 17 -41 20 -64 1 -49 rlinecurve
hintmask 000010010000000000000000
38 -41 47 90 -109 109 rrcurveto
605 74 rmoveto
-11 -6 31 -39 38 -62 9 -49 rlinecurve
61 -47 58 124 -186 79 rrcurveto
hintmask 001010100000101000000000
-344 -96 rmoveto
-67 23 -6 -40 -15 -79 -14 -49 rlinecurve
13 -5 27 43 25 53 14 33 rlinecurve
10 -1 8 2 5 4 rrcurveto
-176 -119 262 119 vlineto
-290 -321 rmoveto
29 vlineto
hintmask 011010100001001000000000
114 -96 9857 callsubr
-95 hlineto
-92 -9 -76 -7 -44 -2 30 -78 rcurveline
9 2 10 8 5 12 188 32 138 30 103 22 -3 16 rcurveline
-207 -21 rlineto
90 172 vlineto
14 9 262 callsubr
-39 -48 rlineto
-87 96 115 -32 8 hlineto
19 28 15 6 hvcurveto
295 -833 callsubr
-72 -907 callgsubr
-275 -1129 callgsubr
-398 9 vlineto
hintmask 001010100000101000000000
24 22 12 6 hvcurveto
118 59 rmoveto
-118 262 118 hlineto
hintmask 000101000000000000100000
624 -174 -33 callgsubr
-123 hlineto
4 80 1 87 1 97 23 3 10 11 3 -215 callgsubr
-1 -113 0 -100 -4 -90 4729 callsubr
-988 callgsubr
141 hlineto
-14 -252 -50 -169 -181 -138 15 -16 rcurveline
-1111 callgsubr
220 133 57 173 16 259 21 -245 51 -185 120 -123 15 29 23 15 26 1 4 10 rcurveline
-135 101 -77 180 -27 227 rrcurveto
6830 callsubr
hintmask 000101000000000000100000
-29 29 -48 -96 callgsubr
endchar
</CharString>
from pdf-writer.
yeah defo a bug there right at the start when reading how many stems there should be. this results in the hummus interpreter figuring there's 6 stems, while there are at least 17 which results in reading too few bytes later.
from pdf-writer.
ok. figured it out. and the glyph shows correctly.
MR.
will merge soon. hope it's ok.
i have a bug (for 13 year oh my) in the implementation of cntrmask operator. as opposed to a correct implementation of hintmask it was NOT CONSIDERING it might be used in a special way that optimizes not having to call vstem.
as a result it got lots of operands...but did nothing with them.
as a result all later hintmasks and cntrmasks calls did not read enough of the following bytes (1 instead of 3) and this resulted in a wonderful mass with creating random operands.
anyways, fixed the code of cntrmask and it seems ok now.
please let me know if this resolves the problem on your end too.
from pdf-writer.
It fixed my bug, thanks a lot. It will benefit all Mogan/GNU TeXmacs users on Linux who are using CJK characters.
from pdf-writer.
Related Issues (20)
- Parse a screenplay into scene objects? HOT 2
- color emojis HOT 16
- Links are removed when documents are merged HOT 8
- Color inversion problem occurs when exporting images HOT 1
- infinite loop HOT 2
- Publish to github releases without PDFWriterTesting HOT 4
- U3D support, 10 years later HOT 10
- `Segmentation fault (core dumped)` just for adding `PDFWriter pdfWriter` in the `h` file HOT 11
- Fail to draw rectangle to pdf,if rectangle larger than(595x842) HOT 2
- german umlaute diacritic not rendered properly HOT 2
- Make a PDF with an image in memory HOT 4
- Double hex encoding in AbstractContentContext::TJHexLow HOT 3
- Perfectly kosher bug :) HOT 13
- bug: Stack overflow with Invalid PageTree HOT 1
- bug: Stack Overflow when loop in ParseNewObject -> ParseExistingInDirectStreamObject HOT 1
- bug: Requested Allocation Size too big HOT 2
- bug: Heap Use-after-free HOT 1
- bug: Allocation size too big when /Count is high HOT 2
- This line : ` status = pdfWriter.StartPDF(completePathToEmptyPages, ePDFVersion13, logConfiguration)` causes : `Segmentation fault (core dumped)` HOT 8
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pdf-writer.