Comments (16)
i have attached 2 PDF files: one printed with Microsoft Print to PDF and one generated using Hummus PDF-Writer (with my fork more precisely).
You can see that in the printed PDF, emojis are rendered with color shapes while with PDF-Writer they are rendered with monochrome shapes.
testPrint.pdf
testExportPDFWriter.pdf
from pdf-writer.
No such intention currently. fonts are assumed to be monochrome, where the color is determined by the graphic context.
frankly, up to this ask i wasn't aware that such things are represented by plain fonts.
i wonder what a PDF containing those things looks like. i mean, how is this represented in PDF, where a font is just glyph drawings and color is determined externally?
from pdf-writer.
reading about this exactly i see that what you suggest, as in drawing the glyphs as images or lineart, is a common practice in such cases. given that the regular font support in PDF doesn't allow for that. i wonder how this works with copying and pasting the text then. support of understanding text as text in PDF is pretty much reliant on defining text as something drawn with fonts.
do you have an example of a PDF that shows the emojis correctly? like from maybe indesign or some app that might do it properly?
from pdf-writer.
you can also reproduce it by printing to PDF from Microsoft Word, for instance, with some text and emojis set in the Segoe UI Emoji font.
from pdf-writer.
the font file is bigger in the printed PDF, i guess because of the colored emoji glyphs: they are not vectorized into the page content but stored within the font file.
from pdf-writer.
by the way, by tracing the Freetype font face with Segoe UI Emoji i found that it contains 3 cmaps, so i tried switching to the other cmap with the same platform id as the cmap selected by default, but it did not fix anything. Actually the 2 cmaps return the same glyph index for the unicode character code, so i guess fixing this issue is more complicated than switching the cmap with Freetype...
But as i suggested, i guess i could work around it by extracting a color bitmap from the emoji glyph using FT_Load_Glyph and drawing the bitmap instead of the character itself.
from pdf-writer.
it makes sense that switching cmaps didn't work. fonts in pdf (unless something changed recently) are inherently monochrome, meaning color info is expected to come from outside and all the PDF glyph drawing operators are just about creating paths. So there has to be some alternative implementation, which might be exactly what you are trying to do. it'd be interesting to see what word did (or indesign, if it supports those...this used to be my go-to). one can use the pdfhummus recrypt method to create a version of a pdf with decrypted content streams. if i get to it, i'll check and let us know what they did.
from pdf-writer.
ok, so actually it's not as bad as i thought.
here's the decrypted pdf:
testPrintDecrypt.pdf
if you read the text placement code in the page content stream you'll see that each emoji character is drawn (starting at line 77) by superimposing 4 chars, each with its own color value.
for example, the smiling emoji is drawn by placing the following glyphs one on top of the other, each time with an RGB color value (im translating the source 0..1 range to 0..255 so it's easier to understand):
0451 with (1.000000 0.690196 0.180392) = [255, 176, 46] (the yellow part)
0452 with (0.972549 0.192157 0.184314) = [248, 49, 47] (the red part)
0453 with (0.250980 0.164706 0.196078) = [64, 42, 50] (the sort of dark area)
0464 with (0.250980 0.164706 0.196078) = [64, 42, 50] (the sort of dark area)
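The 0..1 to 0..255 translation used in the list above can be sketched as below. This is only an illustration: the helper names `toByte` and `toRgb255` are made up for this example and are not part of PDF-Writer.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Convert one PDF color component (0..1) to an 8-bit value (0..255),
// rounding to the nearest integer.
int toByte(double component) {
    assert(component >= 0.0 && component <= 1.0);
    return static_cast<int>(std::lround(component * 255.0));
}

// Convert an RGB triple as written before the `rg` operator to 8-bit RGB.
// Hypothetical helper, named for this example only.
std::array<int, 3> toRgb255(double r, double g, double b) {
    return {toByte(r), toByte(g), toByte(b)};
}
```

For instance, `toRgb255(1.000000, 0.690196, 0.180392)` gives `{255, 176, 46}`, the yellow layer.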
So, a couple of conclusions from this:
- it's not a direct translation to pdf. you can also see that copying and pasting the text does NOT recreate it (you will see some bs chars instead of the emojis). so there's no direct support in PDF for it (at least as far as we can tell from this file)
- i don't believe word has something internal to come up with the colors and which chars to use, so i think this might be internal to the true type font (Segoe UI on my pc at least is true type, which i guess is also the case on your computer). oh...looks like the wikipedia page for open type emojis explains this well. there's a COLR table for that. (there are also other options, cause of course there are, where some fonts implement emojis as svgs or rasters with another table describing them...all the fun).
- you don't have to draw bitmaps. rather, figure out the chars and colors...and you can just place them with regular text operators, using the relevant glyph ids (this is where the GlyphUnicodeMapping input to TJ and its other text operator friends might come in handy), or char ids if they have any.
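To make the last point concrete, here is a rough sketch that builds the raw PDF operators for one layered glyph: each layer sets a fill color with `rg` and shows one glyph with `Tj` at the same origin. It does not use the PDF-Writer API. The font resource name `/F1`, the 2-byte hex glyph codes, and the names `GlyphLayer` and `layeredGlyphOperators` are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// One colored layer: a glyph code plus a fill color in PDF's 0..1 range.
struct GlyphLayer {
    std::uint16_t glyphCode;
    double r, g, b;
};

// Emit one BT..ET text object per layer, all positioned at (x, y), so the
// layers are painted on top of each other in the order given.
// Assumes the font is available as /F1 (hypothetical resource name) and
// that glyphs are addressed with 2-byte codes written as hex strings.
std::string layeredGlyphOperators(const std::vector<GlyphLayer>& layers,
                                  double x, double y, double fontSize) {
    assert(fontSize > 0.0);
    std::string out;
    char buf[192];
    for (const GlyphLayer& layer : layers) {
        std::snprintf(buf, sizeof buf,
                      "BT /F1 %.2f Tf %.2f %.2f Td %.6f %.6f %.6f rg <%04X> Tj ET\n",
                      fontSize, x, y, layer.r, layer.g, layer.b,
                      static_cast<unsigned>(layer.glyphCode));
        out += buf;
    }
    return out;
}
```

Each layer gets its own text object so the text position is reset before every glyph. A real implementation would more likely keep one text object and move back by the glyph advance.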
p.s.
later, having read the specs of the colr table...it's quite a lot. maybe if there are rasters and that option is shorter, it's better to do that as a quick solution.
and look at all this fun :) to support color fonts all one has to do is:
- support the colr table. two variants: 0 is simple superimposing, 1 includes matrix changes and support of gradients. multiple types of gradients, that is, AND blend modes (those are transparency layering modes).
- support raster images via CBDT and sbix tables. the latter may include embedded PDFs (which allows for lineart). luckily hummus supports all of the possible image formats. We'll have to have a strategy to reuse all this bollocks so it doesn't grow too much when chars are repeated.
- for dessert: SVG-in-OpenType, which basically means being able to render SVG to pdf. well...if we get this then at least we'll get additional support for SVG, which is not all that bad a thing to have in general.
just terrific :)
not exactly one's weekend project. i might start something, but can't make promises. there are some nice outcomes (especially the SVG rendering) if i go through with this...but it's a lot.
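For reference, the version 0 "simple superimposing" case boils down to a lookup like the one sketched below. It works on already-parsed records and does not read a font file. The record fields follow the COLR version 0 and CPAL layouts, but the container type, the single-palette simplification, and the function name `resolveLayers` are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// COLR v0 base glyph record: maps a base glyph to a run of layer records.
struct BaseGlyphRecord { std::uint16_t glyphId, firstLayerIndex, numLayers; };
// COLR v0 layer record: a glyph outline plus an index into the palette.
struct LayerRecord { std::uint16_t glyphId, paletteIndex; };
// CPAL color record, stored as blue, green, red, alpha.
struct PaletteColor { std::uint8_t blue, green, red, alpha; };

// Simplified in-memory view (one palette only, an assumption for brevity).
struct ColorFontTables {
    std::vector<BaseGlyphRecord> baseGlyphs;  // sorted by glyphId
    std::vector<LayerRecord> layers;
    std::vector<PaletteColor> palette;
};

struct ResolvedLayer {
    std::uint16_t glyphId;
    PaletteColor color;
    bool useForeground;  // palette index 0xFFFF: use the current text color
};

// Return the layers of a base glyph in paint order (bottom first), or an
// empty vector when the glyph has no color layers.
std::vector<ResolvedLayer> resolveLayers(const ColorFontTables& tables,
                                         std::uint16_t baseGlyphId) {
    std::vector<ResolvedLayer> resolved;
    auto it = std::lower_bound(
        tables.baseGlyphs.begin(), tables.baseGlyphs.end(), baseGlyphId,
        [](const BaseGlyphRecord& rec, std::uint16_t id) { return rec.glyphId < id; });
    if (it == tables.baseGlyphs.end() || it->glyphId != baseGlyphId) return resolved;

    const std::size_t first = it->firstLayerIndex;
    assert(first + it->numLayers <= tables.layers.size());
    for (std::size_t i = first; i < first + it->numLayers; ++i) {
        const LayerRecord& layer = tables.layers[i];
        ResolvedLayer out{layer.glyphId, PaletteColor{0, 0, 0, 255},
                          layer.paletteIndex == 0xFFFF};
        if (!out.useForeground) out.color = tables.palette.at(layer.paletteIndex);
        resolved.push_back(out);
    }
    return resolved;
}
```

Each resolved layer is then one monochrome glyph painted in its own color, which matches what the decrypted content stream shows.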
from pdf-writer.
Thanks for this detailed decrypt of colored emojis ;)
And wow it seems a lot to implement indeed for full support of it (especially for svg glyphs in font).
Actually i would only need support for Segoe UI Emoji color glyphs for now (as i use only this font for color emojis on Windows): if Segoe UI Emoji color emojis are encoded with a COLR table and a color palette in the font file, i do not need support for SVG glyphs in fonts for now.
Also you are right that PDF does not read and copy/paste well these colored emojis which is weird...
So monochrome emojis seem to be better for correct PDF text copy/pasting, and because of that using colored glyphs should remain an option in PDF.
By the way, the decrypted PDF uses only monochrome emojis, and not colored emojis ?
from pdf-writer.
i think what we need is using FT_Get_Color_Glyph_Layer to get each color layer glyph index and color from the base glyph index ?
from pdf-writer.
yes this could work
from pdf-writer.
@jacques-quidu I sat down to implement this colr table 0 version, which is the simplest one. if still useful to you, you can pull it. usage is like you originally did, but now the color emojis would show up.
attaching a sample PDF.
funny enough, copying and pasting the text from the pdf to someplace else both on my win and mac acrobats seem to correctly write the unicode text. im guessing that the word version doesn't set the original char value as the unicode value of the glyphs but rather attempts to map the parts, which creates the incorrect output. by some mysterious way (im guessing the algo of acrobat) this also doesn't result in multiplication of the chars per the layer, but just a nice single char per the text. so what i end up getting is the expected:
Segoe UI Emoji:☺☺☺, and some later text
anyways. figured i'll share. and we got an opacity setting operator now in content context as a side effect.
from pdf-writer.
oh. one thing. I did implement this on the WriteText operator, not on Tj and its friends. It's a rather high level function, and goes way beyond placing a Tj or something, so figured would only do this on the high level operator to avoid mistakes.
you can see the usage here.
from pdf-writer.
@galkahana thanks Gal i will give it a look when i find some time.
But yes i use low-level Tj operator and not high level writeText operator to draw text.
from pdf-writer.
ok, it works well indeed with the writeText method. so only with the Segoe UI Emoji font on Windows i now use the high-level writeText method instead of the low-level text methods, so that Segoe UI color emojis are properly rendered in PDF. i still use the low-level methods with other fonts to avoid resetting the font or color each time i write text.
from pdf-writer.
Thanks. That's good input.
from pdf-writer.