lionel-rowe commented on July 3, 2024

My PR #1087 will fix this if merged. It adds more conversion options to the text-to-unicode tool. One of those options is UTF-16 Code Units, of which 💯 ↔ \uD83D\uDCAF is an example.

{
  text: 'a 💩 b',
  skipAscii: true,
  results: {
    // eslint-disable-next-line unicorn/escape-case
    fullUnicode: String.raw`a \u{1f4a9} b`,
    // eslint-disable-next-line unicorn/escape-case
    utf16: String.raw`a \ud83d\udca9 b`,
    hexEntities: String.raw`a &#x1f4a9; b`,
    decimalEntities: String.raw`a &#128169; b`,
  },
},
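
For readers who want to see how the `utf16` and `fullUnicode` results in that test case differ, here is a minimal sketch. The function names `toUtf16Escapes` and `toCodePointEscapes` are hypothetical and are not taken from the PR; only the `skipAscii` option name mirrors the test case above.

```javascript
// Hypothetical helpers for illustration only; not the implementation from the PR.

// Escape each UTF-16 code unit as \uXXXX.
function toUtf16Escapes(text, { skipAscii = false } = {}) {
  let out = '';
  // Indexing a string by position walks UTF-16 code units, not code points,
  // so an astral character such as U+1F4A9 is visited as two surrogates.
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    out += skipAscii && unit < 0x80
      ? text[i]
      : '\\u' + unit.toString(16).padStart(4, '0');
  }
  return out;
}

// Escape each code point as \u{X...}.
function toCodePointEscapes(text, { skipAscii = false } = {}) {
  let out = '';
  // for...of iterates by code point, so a surrogate pair arrives as one item.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    out += skipAscii && cp < 0x80 ? ch : `\\u{${cp.toString(16)}}`;
  }
  return out;
}

console.log(toUtf16Escapes('a \u{1F4A9} b', { skipAscii: true }));
// a \ud83d\udca9 b
console.log(toCodePointEscapes('a \u{1F4A9} b', { skipAscii: true }));
// a \u{1f4a9} b
```

The only real difference between the two is the unit of iteration: code units for the UTF-16 form, code points for the full Unicode form.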

Note that it isn't accurate to call this "JSON Unicode", as the best* way of encoding 💯 in JSON is as a literal 💯 Unicode character. For example, JavaScript's JSON.stringify just uses the literal characters where possible, and JSON.parse accepts them with no problem:

JSON.stringify({ '💯': '💯' })
// result: '{"💯":"💯"}'

JSON.parse('{"💯":"💯"}')
// result: { '💯': '💯' }

Meanwhile, encoding it as a surrogate pair may yield correct results when parsed:

JSON.parse(String.raw`{"\uD83D\uDCAF":"\uD83D\uDCAF"}`)
// result: { '💯': '💯' }

But the JSON spec doesn't require that parsers convert the surrogate pair to its corresponding code point, so it's not guaranteed that even fully spec-compliant parsers will "do the right thing" here.
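
As an illustration of what "converting the surrogate pair to its corresponding code point" involves, the standard UTF-16 decoding arithmetic can be sketched as follows. The helper name `combineSurrogates` is made up for this example.

```javascript
// Hypothetical helper showing the standard UTF-16 surrogate-pair arithmetic.
function combineSurrogates(high, low) {
  // High surrogates are 0xD800-0xDBFF, low surrogates are 0xDC00-0xDFFF.
  if (high < 0xd800 || high > 0xdbff || low < 0xdc00 || low > 0xdfff) {
    throw new RangeError('not a valid high/low surrogate pair');
  }
  // Each surrogate carries 10 bits; the result is offset by 0x10000.
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

const codePoint = combineSurrogates(0xd83d, 0xdcaf);
console.log(codePoint.toString(16)); // 1f4af
console.log(String.fromCodePoint(codePoint) === '\uD83D\uDCAF'); // true
```

A parser that skips this step would hand back two separate code units instead of a single U+1F4AF character.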

*For most use cases. If you can't guarantee that the consumer will use UTF-8 as required by RFC 8259, but you can guarantee that the consumer will handle surrogate pairs correctly, then it'd make sense to use the surrogate pair. But that seems like a pretty niche situation.
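
For that niche case, one possible approach is to post-process the output of JSON.stringify so that every non-ASCII code unit becomes a \uXXXX escape. This is only a sketch: `stringifyAscii` is a hypothetical name, and it assumes the value is something JSON.stringify serializes to a string (it would throw for `undefined`, for instance).

```javascript
// Hypothetical helper: produce ASCII-only JSON by escaping non-ASCII code units.
function stringifyAscii(value) {
  // Without the `u` flag the regex matches individual UTF-16 code units,
  // so an astral character is emitted as two \uXXXX escapes (a surrogate pair).
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, (unit) =>
    '\\u' + unit.charCodeAt(0).toString(16).padStart(4, '0'));
}

const ascii = stringifyAscii({ '\u{1F4AF}': '\u{1F4AF}' });
console.log(ascii);
// {"\ud83d\udcaf":"\ud83d\udcaf"}
console.log(JSON.parse(ascii)['\u{1F4AF}'] === '\u{1F4AF}'); // true
```

The replacement is safe to run over the whole serialized string because non-ASCII characters can only occur inside JSON string literals.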

from it-tools.
