jacques-quidu avatar jacques-quidu commented on July 24, 2024 1

i have attached 2 PDF files: one printed with Microsoft Print to PDF and one generated using Hummus PDF-Writer (with my fork more precisely).
You can see that in the printed PDF, emojis are rendered with color shapes while with PDF-Writer they are rendered with monochrome shapes.
testPrint.pdf
testExportPDFWriter.pdf

from pdf-writer.

galkahana avatar galkahana commented on July 24, 2024

No such intention currently. fonts are assumed to be monochrome, where the color is determined by the graphic context.
frankly, up to this ask i wasn't aware that such things are represented by plain fonts.
i wonder what a PDF containing those things looks like. i mean, how is this represented in PDF, where a font is just glyph drawings and color is determined externally?

galkahana avatar galkahana commented on July 24, 2024

reading up on exactly this, i see that what you suggest, as in drawing the glyphs as images or lineart, is a common practice in such cases, given that the regular font support in PDF doesn't allow for that. i wonder how this works with copying and pasting the text then. understanding text as text in PDF is pretty much reliant on defining text as something drawn with fonts.

do you have an example of a PDF that shows the emojis correctly? like from maybe indesign or some app that might do it properly?

jacques-quidu avatar jacques-quidu commented on July 24, 2024

you can reproduce also by printing to PDF some text using font Segoe UI Emoji and emojis from Microsoft Word for instance.

jacques-quidu avatar jacques-quidu commented on July 24, 2024

the font file is bigger in the printed PDF, i guess because of the emojis' colored glyphs: they are not vectorized into the page content but stored within the font file.

jacques-quidu avatar jacques-quidu commented on July 24, 2024

by the way, by tracing the FreeType font face with Segoe UI Emoji i found that it contains 3 cmaps, so i tried switching to the other cmap with the same platform id as the cmap selected by default, but it did not fix anything. Actually the 2 cmaps return the same glyph index for the unicode character code, so i guess fixing this issue is more complicated than switching the cmap with FreeType...
But as i suggested, i guess i could work around this by extracting a color bitmap from the emoji glyph using FT_Load_Glyph and drawing the bitmap instead of the character itself.

galkahana avatar galkahana commented on July 24, 2024

it makes sense that switching cmaps didn't work. fonts in pdf (unless something changed recently) are inherently monochrome, meaning color info is expected to come from outside and all the PDF glyph drawing operators are just about creating paths. So there has to be some alternative implementation, which might be exactly what you are trying to do. it'd be interesting to see what word did (or indesign, if it supports those...this used to be my go-to). one can use the pdfhummus recrypt method to create a version of a pdf with decrypted content streams. if i get to it, i'll check and let you know what they did.

galkahana avatar galkahana commented on July 24, 2024

ok, so actually it's not as bad as i thought.
here's the decrypted pdf:
testPrintDecrypt.pdf

if you read the text placement code in the page content stream you'll see that each emoji character is drawn (starting at line 77) by superimposing 4 chars, each with different color values.
like the smiling emoji is basically placing the following glyphs one on top of the other, each time with a different RGB color value (i'm translating the source 0..1 range to 0..255 so it's easier to understand):

0451 with (1.000000 0.690196 0.180392) = [255, 176, 46] (the yellow part)
0452 with (0.972549 0.192157 0.184314) = [248, 49, 47] (the red part)
0453 with (0.250980 0.164706 0.196078) = [64, 42, 50] (the sort of dark area)
0464 with (0.250980 0.164706 0.196078) = [64, 42, 50] (the sort of dark area)
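
The 0..1 to 0..255 translation used in the list above is just a scale and a round. A minimal sketch in Python (the helper name `to_rgb255` is made up for illustration; the color operands are copied from the lines above):

```python
def to_rgb255(components):
    """Scale PDF color components in the 0..1 range to 0..255 integers."""
    return [round(c * 255) for c in components]


# Glyph codes and fill colors as quoted from the decrypted content stream.
layers = [
    ("0451", (1.000000, 0.690196, 0.180392)),
    ("0452", (0.972549, 0.192157, 0.184314)),
    ("0453", (0.250980, 0.164706, 0.196078)),
]

for glyph, rgb in layers:
    print(glyph, to_rgb255(rgb))
```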

So, a couple of conclusions from this:

  1. it's not a direct translation to pdf. you can also see that copying and pasting the text does NOT provide for a recreation of it (you will see some bs chars instead of the emojis). so there's no direct support in PDF for it (at least as far as we can tell from this file)
  2. i don't believe word has got something internal to come up with the colors and which chars to use, so i think this might be internal to the TrueType font (Segoe UI on my pc at least is TrueType, which i guess is also the case on your computer). oh...looks like the wikipedia page for OpenType emojis explains this well. there's a COLR table for that. (there are also other options, cause of course there are, where some fonts implement emojis as svgs or rasters with another table describing them...all the fun).
  3. you don't have to draw bitmaps. rather, figure out the chars and colors...and you can just place them with regular text operators, using the relevant glyph ids (this is where the GlyphUnicodeMapping input to TJ and its other text operator friends might come in handy), or char ids if they have any.
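
To make point 3 concrete, here is a rough sketch of what "placing layers with regular text operators" could look like as raw content stream text. It only builds a string and does not use the PDF-Writer API. The function name, the font resource name `F1`, and the use of `Tm` to reset the origin before each layer are assumptions for illustration; the thread does not show how Word repositions between layers. Writing glyph ids as 2-byte hex strings assumes a CID font with Identity-H encoding.

```python
def layered_glyph_ops(layers, x, y, font_name="F1", font_size=12):
    """Build content-stream text that stacks glyph layers at one origin.

    layers: list of (glyph_id, (r, g, b)), components in 0..1, bottom first.
    """
    lines = ["BT", f"/{font_name} {font_size} Tf"]
    for glyph_id, (r, g, b) in layers:
        # rg sets the non-stroking (fill) color used to paint the glyph.
        lines.append(f"{r:.6f} {g:.6f} {b:.6f} rg")
        # Tm resets the text matrix so every layer starts at the same point.
        lines.append(f"1 0 0 1 {x} {y} Tm")
        # Show one glyph, addressed by its 2-byte glyph id.
        lines.append(f"<{glyph_id:04X}> Tj")
    lines.append("ET")
    return "\n".join(lines)


smiley = [
    (0x0451, (1.000000, 0.690196, 0.180392)),
    (0x0452, (0.972549, 0.192157, 0.184314)),
]
print(layered_glyph_ops(smiley, 72, 700))
```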

p.s.
later, having read the specs of the COLR table...it's quite a lot. maybe if there are rasters and that option is shorter, better to do that as a quick solution.
and look at all this fun :) to support color fonts all one has to do is:

  • support the COLR table. two variants: 0 is simple superimposing, 1 includes matrix changes and support of gradients. multiple types of gradients, that is, AND blend modes (those are transparency layering modes).
  • support raster images via CBDT and sbix tables. the latter may include embedded PDFs (which allows for lineart). luckily hummus supports all of the possible image formats. We'll have to have a strategy to reuse all this bollocks so it doesn't grow too much when chars are repeated.
  • for dessert: SVG-in-OpenType, which basically means being able to render SVG to pdf. well...if we get this then at least we'll get additional support of SVG, which is not all that bad a thing to have in general.

just terrific :)

not exactly a one-weekend project. i might start something, but can't make promises. there are some nice outcomes (especially the SVG rendering) if i go through with this...but it's a lot.
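
For a sense of scale, the version 0 variant really is small. Below is a hedged sketch of reading a COLR version 0 table from raw bytes, following the record layout in the OpenType spec (a 14-byte header, 6-byte base glyph records, 4-byte layer records, all big-endian). The function name is made up, the sample bytes are synthetic rather than taken from Segoe UI Emoji, and the sketch simply rejects any table whose version is not 0.

```python
import struct


def parse_colr_v0(data):
    """Map base glyph id -> [(layer glyph id, palette index)], bottom first."""
    version, num_base, base_off, layer_off, num_layers = struct.unpack_from(
        ">HHIIH", data, 0
    )
    if version != 0:
        raise ValueError("only COLR version 0 is handled in this sketch")
    # LayerRecord: glyphID (uint16), paletteIndex (uint16).
    layers = [
        struct.unpack_from(">HH", data, layer_off + 4 * i)
        for i in range(num_layers)
    ]
    result = {}
    # BaseGlyphRecord: glyphID, firstLayerIndex, numLayers (all uint16).
    for i in range(num_base):
        gid, first, count = struct.unpack_from(">HHH", data, base_off + 6 * i)
        result[gid] = layers[first:first + count]
    return result


# Synthetic table: one base glyph (0x0450, invented) with two layers.
header = struct.pack(">HHIIH", 0, 1, 14, 20, 2)
base_records = struct.pack(">HHH", 0x0450, 0, 2)
layer_records = struct.pack(">HHHH", 0x0451, 3, 0x0452, 5)
table = header + base_records + layer_records
print(parse_colr_v0(table))
```

The palette indices still have to be resolved to actual colors through the font's CPAL table, which this sketch does not read.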

jacques-quidu avatar jacques-quidu commented on July 24, 2024

Thanks for this detailed decrypt of colored emojis ;)
And wow it seems a lot to implement indeed for full support of it (especially for svg glyphs in font).
Actually i would only need support for Segoe UI Emoji color glyphs for now (as i use only this font for color emojis on Windows): if Segoe UI Emoji color emojis are encoded with a COLR table and a color palette in the font file, i do not need support for SVG glyphs in fonts for now.
Also you are right that PDF does not read and copy/paste these colored emojis well, which is weird...
So monochrome emojis seem to be better for correct PDF text copy/pasting, and because of that, using colored glyphs should remain an option in PDF.
By the way, does the decrypted PDF use only monochrome emojis, and not colored emojis?

jacques-quidu avatar jacques-quidu commented on July 24, 2024

i think what we need is using FT_Get_Color_Glyph_Layer to get each color layer glyph index and color from the base glyph index ?
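
One detail worth noting: FT_Get_Color_Glyph_Layer hands back a layer glyph index and a color *index*, so the color itself still has to be looked up in a palette (in FreeType, via FT_Palette_Select). The Python sketch below only illustrates that lookup step on plain tuples and does not call FreeType. The function name and the sample palette are invented; the blue, green, red, alpha byte order follows the CPAL color record layout, and index 0xFFFF is the value COLR reserves for "use the text foreground color".

```python
FOREGROUND = 0xFFFF  # reserved palette index: use the current text color


def resolve_layers(layers, palette, foreground=(0, 0, 0, 255)):
    """Turn (glyph_id, palette_index) pairs into (glyph_id, rgb, alpha).

    palette entries and foreground are (blue, green, red, alpha) bytes.
    rgb and alpha are returned in the 0..1 range that PDF operators expect.
    """
    resolved = []
    for glyph_id, index in layers:
        b, g, r, a = foreground if index == FOREGROUND else palette[index]
        resolved.append((glyph_id, (r / 255, g / 255, b / 255), a / 255))
    return resolved


# Invented one-entry palette holding the yellow seen in the thread.
palette = [(46, 176, 255, 255)]
layers = [(0x0451, 0), (0x0453, FOREGROUND)]
for glyph_id, rgb, alpha in resolve_layers(layers, palette):
    print(hex(glyph_id), [round(c, 6) for c in rgb], alpha)
```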

galkahana avatar galkahana commented on July 24, 2024

yes this could work

galkahana avatar galkahana commented on July 24, 2024

@jacques-quidu I sat down to implement the COLR table version 0, which is the simplest one. if still useful to you, you can pull it. usage is like you originally did, but now the color emojis will show up.
attaching a sample PDF.

ColorEmojiColr.pdf

funny enough, copying and pasting the text from the pdf to someplace else, on both my win and mac acrobats, seems to correctly write the unicode text. i'm guessing that the word version doesn't set the original char value as the unicode value of the glyphs but rather attempts to map the parts, which creates the incorrect output. in some mysterious way (i'm guessing the algo of acrobat) this also doesn't result in multiplication of the chars per layer, but just a nice single char per emoji. so what i end up getting is the expected:
Segoe UI Emoji:☺☺☺, and some later text

anyways. figured i'll share. and we got an opacity setting operator now in content context as a side effect.

galkahana avatar galkahana commented on July 24, 2024

oh. one thing. I implemented this on the WriteText operator, not on Tj and its friends. It's a rather high level function, and goes way beyond placing a Tj or something, so i figured i would only do this on the high level operator to avoid mistakes.

you can see the usage here.

jacques-quidu avatar jacques-quidu commented on July 24, 2024

@galkahana thanks Gal, i will give it a look when i find some time.
But yes, i use the low-level Tj operator and not the high-level writeText operator to draw text.

jacques-quidu avatar jacques-quidu commented on July 24, 2024

ok, it works well indeed with the writeText method. so, only with the Segoe UI Emoji font on Windows, i now use the high-level writeText method instead of the low-level text methods so that Segoe UI color emojis are properly rendered in PDF. i still use the low-level methods with other fonts to avoid resetting the font or color each time i write text.

galkahana avatar galkahana commented on July 24, 2024

Thanks. That's good input.
