
Comments (2)

janlelis commented on June 12, 2024

Hey Andre,

Although REGEX_ANY matches a lot of emoji-related codepoints, it does not match some Unicode codepoints that are used by emoji but are also used outside of the emoji world, like U+200D ZERO WIDTH JOINER (ZWJ). That's exactly what is happening here: there is still a ZWJ in the data:

uniscribe '🛤🎯📮📘↕⭕🇬🇶🇼🇸📪🛎👨‍🌾šŸŗšŸš🤯'.gsub(Unicode::Emoji::REGEX_ANY, '')

200D ├─ ]‍[		├─ ZERO WIDTH JOINER

I've clarified this behavior in the README table.
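
To illustrate why a per-codepoint match leaves the joiner behind, here is a minimal sketch that does not use the gem at all. The two patterns are stand-ins built from Ruby's built-in `\p{Emoji}` property (an assumption for illustration, not the gem's actual REGEX_ANY or REGEX), and the names are made up:

```ruby
# Stand-in patterns, NOT the gem's constants.
per_codepoint  = /\p{Emoji}/                     # matches single emoji codepoints
whole_sequence = /\p{Emoji}(?:\u200D\p{Emoji})*/ # also consumes ZWJ-joined parts

# MAN + ZERO WIDTH JOINER + SHEAF OF RICE (the "man farmer" sequence)
farmer = "\u{1F468}\u200D\u{1F33E}"

# Helper to show what is left over as hex codepoints
hex_codepoints = ->(string) { string.codepoints.map { |cp| format("%04X", cp) } }

leftover_per_codepoint  = hex_codepoints.call(farmer.gsub(per_codepoint, ""))
leftover_whole_sequence = hex_codepoints.call(farmer.gsub(whole_sequence, ""))

p leftover_per_codepoint  # the joiner survives: ["200D"]
p leftover_whole_sequence # nothing is left: []
```

The joiner is not an emoji codepoint by itself, so only a pattern that understands whole sequences removes it together with its neighbours.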

What you want to do is to use REGEX which gives you better (and more robust) results. For example:

uniscribe '🛤🎯📮📘↕⭕🇬🇶🇼🇸📪🛎👨‍🌾šŸŗšŸš🤯'.gsub(Unicode::Emoji::REGEX, '')

Unfortunately, this will let through textual emoji like

2195 ├─ ↕		├─ UP DOWN ARROW

To work around this issue, you can also remove emoji that match REGEX_TEXT, for example like this:

'🛤🎯📮📘↕⭕🇬🇶🇼🇸📪🛎👨‍🌾šŸŗšŸš🤯'.gsub(Regexp.union(Unicode::Emoji::REGEX, Unicode::Emoji::REGEX_TEXT), '') == "" # => true
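
The same `Regexp.union` idea can be sketched without the gem. The two patterns below are rough stand-ins based on Ruby's built-in `\p{Emoji}` and `\p{Emoji_Presentation}` properties (an assumption for illustration; the gem's REGEX and REGEX_TEXT are more thorough), and the variable names are hypothetical:

```ruby
# Stand-in patterns, NOT the gem's constants.
emoji_style = /\p{Emoji_Presentation}/              # shown as emoji by default
text_style  = /(?!\p{Emoji_Presentation})\p{Emoji}/ # emoji shown as text by default

# "a", UP DOWN ARROW (textual), HEAVY LARGE CIRCLE (emoji style), "b"
input = "a\u2195\u2B55b"

only_emoji_style = input.gsub(emoji_style, "")
combined         = input.gsub(Regexp.union(emoji_style, text_style), "")

p only_emoji_style # the arrow is let through
p combined         # both symbols are gone
```

`Regexp.union` simply builds an alternation of the two patterns, so a single `gsub` pass covers both kinds of emoji.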

Please let me know whether this fixes your issue.

Actually, your feedback inspired me to add a REGEX_ALL to a future version of this gem, which will also match textual emoji; see #5.

from unicode-emoji.

janlelis commented on June 12, 2024

Closing; please re-open if the problem persists.
