Slug Generator Library

This library provides methods to generate slugs for URLs, filenames or any other target that has a limited character set. It’s based on PHP’s Transliterator class, which uses CLDR data to transform characters between different scripts (e.g. Cyrillic to Latin) or types (e.g. upper- to lower-case, or special characters to ASCII).

Usage

<?php
use Ausi\SlugGenerator\SlugGenerator;
use Ausi\SlugGenerator\SlugOptions;

$generator = new SlugGenerator;

$generator->generate('Hello Wörld!');  // Output: hello-world
$generator->generate('Καλημέρα');      // Output: kalemera
$generator->generate('фильм');         // Output: film
$generator->generate('富士山');         // Output: fu-shi-shan
$generator->generate('國語');           // Output: guo-yu

// Different valid character set, a specified locale and a delimiter
$generator = new SlugGenerator((new SlugOptions)
    ->setValidChars('a-zA-Z0-9')
    ->setLocale('de')
    ->setDelimiter('_')
);
$generator->generate('Äpfel und Bäume');  // Output: Aepfel_und_Baeume

Installation

To install the library, use Composer or download the source files from GitHub.

composer require ausi/slug-generator

Why create another slug library, aren’t there enough already?

There are many code snippets and some good libraries out there that create slugs, but I didn’t find anything that met my requirements. Options are often very limited, which makes it hard to customize them for different use cases. Some libraries ship large rulesets of their own that try to convert characters to ASCII, but none of them uses Unicode’s CLDR, which is the standard for transliteration rules and many other transforms.

But most importantly, no library was able to do the “correct” conversions, like Ö-Äpfel to OE-Aepfel for German or İNATÇI to inatçı for Turkish. Because the CLDR transliteration rules are context-sensitive, they know how to correctly convert to OE-Aepfel instead of Oe-Aepfel or OE-AEpfel. CLDR also takes the language into account and knows that the Turkish uppercase letter I has the lowercase form ı instead of i.
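
The context sensitivity can be observed directly with PHP's Transliterator class, without this library. The following is a minimal sketch, not the library's code. It assumes the intl extension is loaded and that the installed ICU data provides the transform IDs de-ASCII and tr-Lower; the helper name transliterateIfAvailable is made up for this illustration and returns null when a transform cannot be opened.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ausi/slug-generator): applies an ICU
 * transform by ID, or returns null when intl or that ID is unavailable.
 */
function transliterateIfAvailable(string $id, string $text): ?string
{
    if (!class_exists(\Transliterator::class)) {
        return null; // ext-intl is not loaded
    }

    try {
        // Returns null if the installed ICU data does not know the ID.
        $transliterator = \Transliterator::create($id);
    } catch (\IntlException $e) {
        return null; // thrown instead when intl.use_exceptions is enabled
    }

    if ($transliterator === null) {
        return null;
    }

    $result = $transliterator->transliterate($text);

    return $result === false ? null : $result;
}

// Assumed transform IDs; availability depends on the ICU version.
var_dump(transliterateIfAvailable('de-ASCII', 'Ö-Äpfel'));
var_dump(transliterateIfAvailable('tr-Lower', 'İNATÇI'));
```

On a system where both IDs exist, the two calls should reproduce the conversions described above (OE-Aepfel and inatçı); the exact output depends on the ICU data that PHP was built against.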

Options

All options can be set for the generator object itself with new SlugGenerator($options), or overridden per call with generate($text, $options). Options can be passed as an array or as a SlugOptions object.

delimiter, default "-"

The delimiter can be any string. It is used to separate words and is stripped from the beginning and the end of the slug.

$generator->generate('Hello World!');                         // Result: hello-world
$generator->generate('Hello World!', ['delimiter' => '_']);   // Result: hello_world
$generator->generate('Hello World!', ['delimiter' => '%20']); // Result: hello%20world
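
To make the "separate and strip" behaviour concrete, here is a rough sketch in plain PHP. It is not how the library is implemented (the library relies on ICU transforms); joinWithDelimiter is a hypothetical helper that only handles ASCII input and a fixed a-z0-9 character set.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the library's implementation: replaces each run
 * of characters outside a-z0-9 with the delimiter, then strips the delimiter
 * from both ends.
 */
function joinWithDelimiter(string $text, string $delimiter = '-'): string
{
    // ASCII lowercasing only; the library uses ICU transforms instead.
    $text = strtolower($text);

    // A callback is used so that "$" or "\" in the delimiter is taken literally.
    $slug = (string) preg_replace_callback(
        '/[^a-z0-9]+/',
        static fn (array $match): string => $delimiter,
        $text
    );

    if ($delimiter === '') {
        return $slug;
    }

    // Strip whole delimiters (possibly several characters long) from both ends.
    $quoted = preg_quote($delimiter, '/');

    return (string) preg_replace('/^(?:' . $quoted . ')+|(?:' . $quoted . ')+$/', '', $slug);
}

echo joinWithDelimiter('Hello World!', '%20'), "\n"; // hello%20world
```

Trimming with a pattern rather than trim() matters for multi-character delimiters such as %20, because trim() treats its second argument as a set of single characters.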

validChars, default "a-z0-9"

Valid characters that are allowed in the slug. The range syntax is the same as in character classes of regular expressions. For example abc, a-z0-9äöüß or \p{Ll}\-_.

$generator->generate('Hello World!');                             // Result: hello-world
$generator->generate('Hello World!', ['validChars' => 'A-Z']);    // Result: HELLO-WORLD
$generator->generate('Hello World!', ['validChars' => 'A-Za-z']); // Result: Hello-World
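
The range syntax can be pictured as the inside of a negated regular-expression character class. The sketch below shows only that filtering step, under the assumption that the value is dropped into [^...] unchanged; keepValidChars is a hypothetical helper, and unlike the library it does not transform invalid characters into valid ones first (so it would not turn "Hello" into "HELLO" for A-Z).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the library's implementation: keeps characters
 * matched by the given character class and replaces every other run with "-".
 */
function keepValidChars(string $text, string $validChars = 'a-z0-9'): string
{
    // Escape the regex delimiter so a "/" inside $validChars cannot end the pattern early.
    $class = str_replace('/', '\/', $validChars);

    // [^...] negates the class: each run of characters that is NOT valid becomes "-".
    $slug = preg_replace('/[^' . $class . ']+/u', '-', $text);
    if ($slug === null) {
        throw new InvalidArgumentException('Invalid UTF-8 input or character class.');
    }

    return trim($slug, '-');
}

echo keepValidChars('größe 42', 'a-z0-9äöüß'), "\n"; // größe-42
echo keepValidChars('größe 42'), "\n";               // gr-e-42
```

The u modifier is what allows non-ASCII literals such as äöüß and Unicode properties such as \p{Ll} inside the class.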

ignoreChars, default "\p{Mn}\p{Lm}"

Characters that should be completely removed and not replaced with a delimiter. It uses the same syntax as the validChars option.

$generator->generate("don't remove");                         // Result: don-t-remove
$generator->generate("don't remove", ['ignoreChars' => "'"]); // Result: dont-remove
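
The difference between "removed" and "replaced with a delimiter" is easiest to see with a combining mark, which is what the default \p{Mn} covers. The following plain-PHP sketch is only an illustration of that idea, not the library's implementation; slugifyIgnoring is a hypothetical helper with a fixed a-z0-9 character set and a fixed "-" delimiter.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the library's implementation: deletes ignored
 * characters first, then replaces the remaining invalid runs with "-".
 */
function slugifyIgnoring(string $text, string $ignoreChars = '\p{Mn}\p{Lm}'): string
{
    if ($ignoreChars !== '') {
        // Step 1: delete ignored characters outright, leaving no gap behind.
        $class = str_replace('/', '\/', $ignoreChars);
        $text = (string) preg_replace('/[' . $class . ']+/u', '', $text);
    }

    // Step 2: everything else outside a-z0-9 becomes a delimiter.
    $slug = (string) preg_replace('/[^a-z0-9]+/u', '-', $text);

    return trim($slug, '-');
}

// "e" followed by U+0301 (combining acute accent), i.e. a decomposed "é".
echo slugifyIgnoring("ne\u{0301}e"), "\n";     // nee
echo slugifyIgnoring("ne\u{0301}e", ''), "\n"; // ne-e
```

Without the ignore step, the leftover accent would split the word in two, which is presumably why nonspacing marks are ignored by default.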

locale, default ""

The locale that should be used for the Unicode transformations.

$generator->generate('Hello Wörld!');                        // Result: hello-world
$generator->generate('Hello Wörld!', ['locale' => 'de']);    // Result: hello-woerld
$generator->generate('Hello Wörld!', ['locale' => 'en_US']); // Result: hello-world

transforms, default Upper, Lower, Latn, ASCII, Upper, Lower

Internally the slug generator uses Transform Rules to convert invalid characters to valid ones. These rules can be customized by setting the transforms, preTransforms or postTransforms options. Usually setting preTransforms is desired as it applies the custom transforms prior to the default ones.

How Transform Rules (like Lower or ASCII) and rule sets (like a > b; c > d;) work is documented on the ICU website: http://userguide.icu-project.org/transforms

$generator->generate('Damn 💩!!');                                           // Result: damn
$generator->generate('Damn 💩!!', ['preTransforms' => ['💩 > Ice-Cream']]);  // Result: damn-ice-cream

$generator->generate('©');                                          // Result: c
$generator->generate('©', ['preTransforms' => ['© > Copyright']]);  // Result: copyright
$generator->generate('©', ['preTransforms' => ['Hex']]);            // Result: u00a9
$generator->generate('©', ['preTransforms' => ['Name']]);           // Result: n-copyright-sign
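
Rule sets such as © > Copyright can also be tried out on their own with PHP's Transliterator::createFromRules(), which helps when debugging a custom transform. This is a minimal sketch that assumes the intl extension is available; applyRuleSet is a hypothetical helper and says nothing about how the library applies its transforms internally.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ausi/slug-generator): applies an ICU rule
 * set to a string, or returns null if intl is missing or the rules are invalid.
 */
function applyRuleSet(string $rules, string $text): ?string
{
    if (!class_exists(\Transliterator::class)) {
        return null; // ext-intl is not loaded
    }

    try {
        $transliterator = \Transliterator::createFromRules($rules);
    } catch (\IntlException $e) {
        return null; // thrown instead when intl.use_exceptions is enabled
    }

    if ($transliterator === null) {
        return null; // the rule set could not be parsed
    }

    $result = $transliterator->transliterate($text);

    return $result === false ? null : $result;
}

var_dump(applyRuleSet('© > Copyright;', '© 2024'));
var_dump(applyRuleSet('a > b; c > d;', 'abcd'));
```

Each rule is written as source > target and terminated with a semicolon; characters that no rule matches are passed through unchanged.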

Sponsors

Thanks to Blackfire for sponsoring performance profiling tools for this project.

Contributors

abdusco, ausi, fiedsch, leofeyer

slug-generator's Issues

Got error from your library.

Can you please suggest which version of this library we should use?

An uncaught Exception was encountered

Type: Error

Message: Class 'Locale' not found

Filename: F:\XAMPP\htdocs\rockers_equity_development_new\vendor\ausi\slug-generator\src\SlugOptions.php

Line Number: 189

Add "transliteration" to project

To make the library easier to find, it would be good to add "transliteration" and similar words to the composer.json keywords, and maybe to GitHub's project description and tags.

How to force slug-generator not to convert a comma?

Hello, thanks for your great product.
I am writing a plugin for the Grav CMS, and I need commas to be kept when a line is processed.
For example, there was a line like this:
WordPress, Plugins WordPress
It became like this:
wordpress,plugins-wordpress

How do I get the comma to remain in the link?

Cyrillic/Russian generation

For example, I tried to convert the name Артём to artyom.

What I tried (the code for each variant follows its number):

$generator = new SlugGenerator;

//1

$generator = new SlugGenerator((new SlugOptions)->setLocale('ru'));

//2
$str = $generator->generate($str, ['locale' => 'ru']);

//3
$str = $generator->generate($str, ['locale' => 'uk', 'preTransforms' => ['uk-uk_Latn/BGN']]);

The last variant is taken from issue #19.

The result is artem in all cases, which is wrong.
How can I fix this?

Unsupported declare 'strict_types' + Fatal error

I'm getting this while using the generator. Any idea? Thank you!

Warning: Unsupported declare 'strict_types' in SlugGenerator.php on line 12
Fatal error: Default value for parameters with a class type hint can only be NULL in SlugGenerator.php on line 36

Japanese text is transliterated as Chinese (I think)

Hi,

I am trying to use the class like this:

echo $generator->generate('お風呂 リフォーム | 和歌山県のリフォーム・リノベーション', ['locale' => 'jp']);

The romanization of this Japanese text should be something like "O furo rifōmu | Wakayama ken no rifōmu rinobēshon", but the result I got is "o-feng-lu-rifomu-he-ge-shan-xiannorifomu-rinobeshon", which I think is a Chinese reading. Did I make a mistake?

Correction in Hindi Phrases

मिठाई - mithai (coming up as mitha-i)
खुशबू - khushbu (coming up as khasaba)
लेना - lena (coming up as lana)
पैसे - paise (coming up as pasa)
अब - ab (coming up as aba)

Support for PHP >8.1

Hi, I'm using this package with my Laravel application. I want to upgrade Laravel to version 10, which requires PHP 8.1.
Can you add support for this PHP version?

Thanks

Warning: unable to open ICU transliterator with id "Latin-Upper"

Steps to reproduce:

$ php -a
Interactive shell

php > Transliterator::create('Not-Exsits');
PHP Warning:  Transliterator::create(): transliterator_create: unable to open ICU transliterator with id "Not-Exsits" in php shell code on line 1
PHP Stack trace:
PHP   1. {main}() php shell code:0
PHP   2. Transliterator::create() php shell code:1

Warning: Transliterator::create(): transliterator_create: unable to open ICU transliterator with id "Not-Exsits" in php shell code on line 1

Call Stack:
    1.5453     388280   1. {main}() php shell code:0
    1.5456     388280   2. Transliterator::create() php shell code:1

PHP:
PHP 7.2.12 (cli) (built: Nov 9 2018 11:03:05) ( NTS )

intl:

version: 1.1.0
ICU version: 62.1
ICU Data version: 62.1
ICU TZData version: 2018e
ICU Unicode version: 11.0

Class 'SlugGenerator' not found

I installed the Slug Generator via Composer, but it doesn't work. Every time I try to use it, I get a PHP error: "Uncaught Error: Class 'Ausi\SlugGenerator' not found".

Tried it with including the autoload.php and also by directly including the source files.

What could be the problem, all the other required packages are working as they should.

Contao: Alias generation differs between news and page structure

Hello,

I installed the slug-generator in Contao 4.4.13 and just noticed that, with the setting "ASCII numbers and lowercase letters", the news module turns round brackets into 40 and 41 (their codes in the ASCII table) in the alias. If I create a new page in the page structure or have an alias regenerated, brackets are converted to hyphens/minus signs, as known from Contao 3.5.

Since other punctuation marks are replaced according to the familiar scheme in the news module, I assume this is a bug.

Regards,
Sebastian

Wrong transforms for Cyrillic letters

Hi, thanks for the bundle.
Can you help me understand why Cyrillic letters (I've checked Ukrainian and Russian) are not transformed as defined here?
https://github.com/unicode-org/cldr/blob/master/common/transforms/Ukrainian-Latin-BGN.xml
$slugGenerator->generate('щ', ['locale' => 'uk']) gives me s instead of shch.

########################################################################
#
# BGN Page 94 Rule 3.6
#
# шч becomes sh·ch
#
########################################################################
#
ШЧ → SH·CH ; # CYRILLIC CAPITAL LETTER SHA
Шч → Sh·ch ; # CYRILLIC CAPITAL LETTER SHA
шч → sh·ch ; # CYRILLIC SMALL LETTER SHA
Ш} $lower → Sh ; # CYRILLIC CAPITAL LETTER SHA
Ш → SH ; # CYRILLIC CAPITAL LETTER SHA
ш → sh ; # CYRILLIC SMALL LETTER SHA
Щ} $lower → Shch ; # CYRILLIC CAPITAL LETTER SHCHA
Щ → SHCH ; # CYRILLIC CAPITAL LETTER SHCHA
щ → shch ; # CYRILLIC SMALL LETTER SHCHA

Requirements incompatible with your PHP version

Package ausi/slug-generator has requirements incompatible with your PHP version, PHP extensions and Composer version:
  - ausi/slug-generator v1.1.1 requires ext-intl * but it is not present.
  - ausi/slug-generator v1.1.1 requires lib-icu >=4.2.1 but it is not present.

Hello, when I run composer require ausi/slug-generator I get the message above. I have been googling all day and still can't find a way around it. Could you please help me?

Ability to reverse slug

It would be really nice if there were a function to reverse a slug. Could you add this feature, or does it already exist?
