
Comments (11)

Hopding commented on May 19, 2024

Hello @kevinswartz. pdf-lib currently has poor support for special characters and emojis in embedded fonts. However, this is something that can be supported, and is an area that I'm hoping to improve over the next month or two.

That being said, pdf-lib supports the WinAnsi (aka Windows-1252) character set for non-embedded fonts. This character set includes several special (non-ASCII) characters, including ¶ (but not emojis). See here for an example, and #17 for more info.
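As a rough illustration of what "inside WinAnsi" means, here is a minimal sketch that flags characters outside Windows-1252. The helper names `isWinAnsi` and `findNonWinAnsiChars` are made up for this example and are not part of pdf-lib. The table is the Windows-1252 layout, which is only an approximation of the PDF spec's WinAnsiEncoding.

```javascript
// Code points that Windows-1252 places in its 0x80-0x9F range
// (euro sign, curly quotes, dashes, and so on).
const CP1252_EXTRAS = new Set([
  0x20ac, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021, 0x02c6, 0x2030,
  0x0160, 0x2039, 0x0152, 0x017d, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022,
  0x2013, 0x2014, 0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x017e, 0x0178,
]);

// Rough test: can this code point be represented in Windows-1252?
function isWinAnsi(codePoint) {
  return (
    (codePoint >= 0x20 && codePoint <= 0x7e) || // printable ASCII
    (codePoint >= 0xa0 && codePoint <= 0xff) || // Latin-1 supplement
    CP1252_EXTRAS.has(codePoint)
  );
}

// Returns the characters of `text` that fall outside Windows-1252.
// Array.from walks code points, so an emoji stays in one piece.
function findNonWinAnsiChars(text) {
  return Array.from(text).filter((ch) => !isWinAnsi(ch.codePointAt(0)));
}

console.log(findNonWinAnsiChars('Price: 5€ ¶ 😁')); // [ '😁' ]
```

Running a check like this before drawing text with a standard font should show up front which characters will not encode.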

Let me know if you need any clarification, or have any questions about this!

from pdf-lib.

kevinswartz commented on May 19, 2024

Thanks @Hopding! It's working for standard fonts. It looks like encodeText() isn't available for non-standard fonts? Is that something that's determined by the font data I embed, or just a difference in the call? Thanks!

Hopding commented on May 19, 2024

encodeText() is only available for the standard fonts. It's not currently available for embedded fonts at all. That's what I'm hoping to add support for in the next month or two.

kevinswartz commented on May 19, 2024

Ok! Thanks for your help.

Hopding commented on May 19, 2024

Version 0.6.0 is now published. It contains support for the full range of UTF-8 and UTF-16 characters included in embedded fonts (example). The full release notes are available here.

You can install this new version with npm:

npm install pdf-lib@0.6.0

It's also available on unpkg:

@kevinswartz If you're able to try this new release out, please let me know how it works for you!

kevinswartz commented on May 19, 2024

Thanks @Hopding! I tried this today, but haven't been able to get it working with emojis so far.

Here's the code I'm currently using to write text to the pdf, updated for v0.6.0. Do you see any issues with the way I'm embedding/encoding text? Thanks!

    var embeddedFontRef = null;
    var embeddedFont = null;
    if(me.isStandardFont(fontFamily)){
      var arr = me.pdfDoc.embedStandardFont(fontFamily);
      embeddedFontRef = arr[0];
      embeddedFont = arr[1];
    }
    else{
      var arr = me.pdfDoc.embedNonstandardFont(me.getFontData(fontFamily));
      embeddedFontRef = arr[0];
      embeddedFont = arr[1];
    }
    page.addFontDictionary(fontFamily, embeddedFontRef);
    
    var encodedText = null;
    if(embeddedFont && typeof embeddedFont.encodeText === "function"){
      console.log('encoding text', text);
      encodedText = embeddedFont.encodeText(text);
      console.log('text encoded', encodedText);
    }
    else{
      console.log('not encoding text');
      encodedText = text;
    }
    var contentStream = me.pdfDoc.createContentStream(
      window.PDFLib.drawText(encodedText, {
        x: (x / scale),
        y: toHeight - ((y + fontSize) / scale),
        size: (fontSize / scale),
        font: fontFamily,
        colorRgb: [0,0,0],
      })
    );

I'm logging the text I want to write before and after calling encodeText(). Here's what I see:

encoding text 😁
text encoded t {clone: ƒ, toString: ƒ, bytesSize: ƒ, copyBytesInto: ƒ, string: "0000"}

Hopding commented on May 19, 2024

What font are you using? Not all fonts support emojis. The string: "0000" value looks suspicious. I suspect it might indicate that the font you are embedding does not support the 😁 character.

(I'm thinking in the next major update of pdf-lib I'll update the encodeText() method to throw an error when an unsupported character is passed in. But I haven't done that yet in order to maintain backward compatibility in the API.)
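To illustrate why "0000" is a red flag: in TrueType/OpenType fonts, glyph ID 0 is the ".notdef" glyph used for characters the font does not cover. The sketch below assumes the encoded string is a run of 4-digit hex glyph IDs, which is an assumption about the output format and is not confirmed in this thread. `toyCmap`, `encodeAsGlyphHex`, and `containsNotdef` are hypothetical names, not pdf-lib APIs.

```javascript
// Hypothetical code point -> glyph ID table; a real font carries this in its cmap.
const toyCmap = new Map([
  [0x48, 43], // 'H'
  [0x69, 76], // 'i'
]);

// Encode each code point as a 4-digit hex glyph ID.
// Code points missing from the table fall back to glyph 0 (.notdef).
function encodeAsGlyphHex(text, cmap) {
  return Array.from(text)
    .map((ch) => (cmap.get(ch.codePointAt(0)) || 0).toString(16).padStart(4, '0'))
    .join('');
}

// True if any 4-digit chunk of the hex string is glyph 0.
function containsNotdef(hex) {
  for (let i = 0; i < hex.length; i += 4) {
    if (hex.slice(i, i + 4) === '0000') return true;
  }
  return false;
}

console.log(encodeAsGlyphHex('Hi', toyCmap)); // 002b004c
console.log(encodeAsGlyphHex('😁', toyCmap)); // 0000
console.log(containsNotdef(encodeAsGlyphHex('Hi😁', toyCmap))); // true
```

Under that assumption, a check like `containsNotdef` on the encoded output could act as a stopgap until encodeText() throws for unsupported characters.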

kevinswartz commented on May 19, 2024

Hi @Hopding
Here's the font I was using above: https://fonts.google.com/specimen/La+Belle+Aurore
It's very possible that the font doesn't support emojis. How would I check that?

I just tried encoding that same character in standard Helvetica and it looks like I'm getting an exception:

Error saving pdf changes Error: WinAnsi cannot encode "οΏ½"
    at Encoding.encodeUnicodeCodePoint (pdf-lib.js:7842)
    at Array.map (<anonymous>)
    at PDFStandardFontFactory.encodeTextAsGlyphs (pdf-lib.js:45607)
    at PDFStandardFontFactory.encodeText (pdf-lib.js:45562)
    at constructor.drawText (PDFGenerationHelper.js?_dc=1546964566632:122)

I can verify that I'm passing simply the 😁 in as the text string. Thanks again for your help!
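One possible reason the error message shows a broken character rather than the emoji itself: JavaScript strings are UTF-16, and 😁 (U+1F601) takes two code units. The `Array.map` frame in the stack trace may mean the text was processed one code unit at a time, but that is a guess and is not confirmed by the source. A small sketch of the difference:

```javascript
const text = '😁';

// String length and split('') count UTF-16 code units, not characters.
console.log(text.length); // 2
const codeUnits = text.split('').map((u) => u.charCodeAt(0).toString(16));
console.log(codeUnits); // [ 'd83d', 'de01' ]

// Array.from (or for...of) walks whole code points instead.
const codePoints = Array.from(text, (ch) => ch.codePointAt(0).toString(16));
console.log(codePoints); // [ '1f601' ]
```

Each half of the pair (`d83d`, `de01`) is a lone surrogate on its own, which is not a printable character and usually displays as a replacement symbol.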

Hopding commented on May 19, 2024

The standard fonts only support characters in the Win-1252 character set (also called WinAnsi in the PDF spec). This character set only covers the Latin alphabet, plus some punctuation and symbols. Definitely no emojis in there, unfortunately 🙁.

Google Fonts has a nifty field that lets you enter text and preview what it will look like in the selected font:
[Screenshot: the Google Fonts text preview field]

The preview just omits any characters that aren't supported by that font. For example, when we enter this string: This is a test: 😁˙∆ˆƒ©ç√∫ßå, it displays the following:
[Screenshot: the test string previewed in La Belle Aurore, with the unsupported characters missing]

This indicates that the La Belle Aurore font does not support these characters:

😁∆√∫

This is not at all uncommon. There's probably no single font that supports the entire Unicode character set. Such a font would be enormous!
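The same check can be done in code if you have the list of code points a font covers. The sketch below mimics the preview behaviour described above. The `supported` set is a hypothetical stand-in built by hand for this example; in practice it would have to come from a font parser reading the font's character map. `splitBySupport` is likewise a made-up helper name.

```javascript
// Hypothetical coverage set: pretend the font supports exactly these characters.
const supported = new Set(
  Array.from('This is a test: ˙ˆƒ©çßå', (ch) => ch.codePointAt(0))
);

// Split `text` into the characters the font would show and the ones it would omit.
function splitBySupport(text, supportedCodePoints) {
  const shown = [];
  const omitted = [];
  for (const ch of text) { // for...of iterates by code point
    (supportedCodePoints.has(ch.codePointAt(0)) ? shown : omitted).push(ch);
  }
  return { shown: shown.join(''), omitted: omitted.join('') };
}

const result = splitBySupport('This is a test: 😁˙∆ˆƒ©ç√∫ßå', supported);
console.log(result.shown); // This is a test: ˙ˆƒ©çßå
console.log(result.omitted); // 😁∆√∫
```

Anything in `omitted` would need a different font, or a fallback character, before being drawn.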

Hopding commented on May 19, 2024

I recently wrote a new integration test that also serves as a demo of the standard 14 fonts: standard_14_fonts_demo.pdf. It shows what each font looks like, and the entire character set it supports.

kevinswartz commented on May 19, 2024

Thanks for your help! That's very informative
