php-unhomoglyph's Introduction

php-unhomoglyph

This is an incomplete project. Feel free to fork and develop.

Usage

require_once('Unhomoglyph.php');

Unhomoglyph::init('homoglyphCharmaps/extended.php');

$str1 = 'google.com';
$str2 = 'gοοgle.com'; // both o's are actually U+03BF (Greek small letter omicron)

// Different strings with identical skeletons are visually confusable.
if ($str1 !== $str2 && Unhomoglyph::skeleton($str1) === Unhomoglyph::skeleton($str2)) {
	echo "WARNING! $str1 looks like $str2 but they are NOT the same";
}
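
The check above relies on the idea of a skeleton: every character is replaced by a canonical lookalike, so two visually confusable strings collapse to the same value. Below is a minimal, self-contained sketch of that idea. It is not the library's implementation; the function name sketchSkeleton and the three-entry map are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical, tiny confusables map: homoglyph => canonical character.
// The real map lives in homoglyphCharmaps/extended.php and its format may differ.
$sketchMap = [
    "\u{03BF}" => 'o', // Greek small letter omicron
    "\u{0430}" => 'a', // Cyrillic small letter a
    "\u{0435}" => 'e', // Cyrillic small letter ie
];

// Replace every mapped homoglyph with its canonical character.
// strtr() with an array substitutes each key by its value in a single pass.
function sketchSkeleton(string $input, array $map): string
{
    return strtr($input, $map);
}

$genuine = 'google.com';
$spoofed = "g\u{03BF}\u{03BF}gle.com"; // both o's are Greek omicrons

var_dump($genuine === $spoofed); // bool(false)
var_dump(sketchSkeleton($genuine, $sketchMap) === sketchSkeleton($spoofed, $sketchMap)); // bool(true)
```

Comparing skeletons rather than raw strings is what lets the check flag lookalikes while still treating byte-identical strings as the same.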

The Character Map

The full lookalike/homoglyph character map is in homoglyphCharmaps/extended.php. It is organized by Unicode block, which may help if you only need to focus on certain ranges. A TODO is to use code generation to split the map into one file per block, so that only the needed blocks are included.
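
As a rough illustration of working with a single block, the sketch below filters a flat map down to one code point range. It assumes a homoglyph => canonical character array, which may not match the actual layout of extended.php, and it requires the mbstring extension for mb_ord(). The function name filterMapByRange and the sample map are hypothetical.

```php
<?php
// Hypothetical flat map: homoglyph => canonical character.
$sampleMap = [
    "\u{03BF}" => 'o', // Greek and Coptic block (U+0370..U+03FF)
    "\u{0430}" => 'a', // Cyrillic block (U+0400..U+04FF)
    "\u{0435}" => 'e', // Cyrillic block (U+0400..U+04FF)
];

// Keep only the entries whose key falls inside [$first, $last] (inclusive).
function filterMapByRange(array $map, int $first, int $last): array
{
    return array_filter(
        $map,
        static function ($char) use ($first, $last): bool {
            // Cast because PHP turns numeric-string keys into integers.
            $codePoint = mb_ord((string) $char, 'UTF-8');
            return $codePoint !== false && $codePoint >= $first && $codePoint <= $last;
        },
        ARRAY_FILTER_USE_KEY
    );
}

$cyrillicOnly = filterMapByRange($sampleMap, 0x0400, 0x04FF);
echo count($cyrillicOnly), "\n"; // prints 2
```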

Donate a Unicode Block!

The character map was originally based on https://github.com/nodeca/unhomoglyph, which is in turn based on http://www.unicode.org/Public/security/latest/confusables.txt. However, this map left a lot to be desired, so I've gone through roughly the first 5,000 Unicode characters manually and updated the mapping. Unicode has over 150,000 characters at the time of this writing. See a Unicode block that needs improvement? Donate it with a pull request!

Tools

Unhomoglyph::renderCharmapTable() - Renders a table of character => mapped character
Unhomoglyph::exportUpdatedOrganizedCharmap() - Sorts and re-organizes extended charmap, generating updated return array
Unhomoglyph::exportInverseGroupedCharmap() - Generates an array of skeleton character => homoglyphs
Unhomoglyph::wikipediaUnicodeBlockTableParserApp() - Tool to generate Unhomoglyph::$blockRanges from Wikipedia
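
To make the "skeleton character => homoglyphs" shape concrete, here is a small sketch of inverting and grouping a map. It is only an illustration of the data transformation, not the code behind Unhomoglyph::exportInverseGroupedCharmap(); the function name invertAndGroup and the sample map are hypothetical, and the real charmap format may differ.

```php
<?php
// Hypothetical flat map: homoglyph => skeleton character.
$flatMap = [
    "\u{03BF}" => 'o', // Greek small letter omicron
    "\u{043E}" => 'o', // Cyrillic small letter o
    "\u{0430}" => 'a', // Cyrillic small letter a
];

// Build skeleton character => list of homoglyphs that map to it.
function invertAndGroup(array $map): array
{
    $grouped = [];
    foreach ($map as $homoglyph => $skeletonChar) {
        // Cast because PHP turns numeric-string keys into integers.
        $grouped[$skeletonChar][] = (string) $homoglyph;
    }
    ksort($grouped); // stable, alphabetical order of skeleton characters
    return $grouped;
}

$grouped = invertAndGroup($flatMap);
echo implode(',', array_keys($grouped)), "\n"; // prints a,o
echo count($grouped['o']), "\n";               // prints 2
```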

Contributors

dliebner