slugify's Introduction

Slugify Core

This is a fork of the original project here: https://github.com/fcingolani/Slugify. This has been updated for .NET Standard 2.0 support (older versions support .NET Standard down to 1.3).

Released under the MIT license.

Simple Slug / Clean URL generator helper for Microsoft .NET.

With default settings, you will get a hyphenated, lowercase, alphanumeric version of any string you please, with diacritics removed, whitespace and dashes collapsed, and leading and trailing whitespace trimmed.

For example, having:

a ambição cerra o coração

You'll get:

a-ambicao-cerra-o-coracao

Installation

You can get the Slugify NuGet package by running the following command in the Package Manager Console:

PM> Install-Package Slugify.Core

Alternatively, run dotnet add package Slugify.Core from the command line.

Upgrading from 2.x to 3.x

  • 3.0 is a significantly faster and less memory-intensive version of the Slugifier. Whilst effort has been made to maintain backwards compatibility, there may be some breaking changes.
  • The SlugHelper.Config nested class has been renamed to just SlugHelperConfiguration.

Basic Usage

It's really simple! Just instantiate SlugHelper and call its GenerateSlug method with the string you want to convert; it'll return the slugified version:

using System;
using Slugify;

public class MyApp
{
    public static void Main()
    {
        SlugHelper helper = new SlugHelper();

        String title = "OLA ke ase!";

        String slug = helper.GenerateSlug(title);

        Console.WriteLine(slug); // "ola-ke-ase"
    }
}

Supporting Non-ASCII Characters

If you want to support non-ASCII characters, you can use the SlugHelperForNonAsciiLanguages class instead of SlugHelper. This is a derived class which will translate the characters provided into something "equivalent" in ASCII.
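A minimal sketch of swapping in the derived class. It assumes SlugHelperForNonAsciiLanguages offers a parameterless constructor like SlugHelper does, which the text above does not confirm. The resulting slug depends on the library's transliteration tables, so no expected output is shown.

```csharp
using System;
using Slugify;

public static class NonAsciiExample
{
    public static void Main()
    {
        // Assumption: a parameterless constructor exists, mirroring SlugHelper.
        SlugHelper helper = new SlugHelperForNonAsciiLanguages();

        // GenerateSlug is inherited from SlugHelper, so the call site is unchanged.
        string slug = helper.GenerateSlug("نوشتار فارسی");

        Console.WriteLine(slug);
    }
}
```

Because the class derives from SlugHelper, existing code that accepts a SlugHelper should keep working when handed this instance.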

Configuration

The default configuration of SlugHelper will make the following changes to the passed input in order to generate a slug:

  • Transform all characters to lower-case, to produce a lower-case slug.
  • Trim all leading and trailing whitespace.
  • Collapse all consecutive whitespace into a single space.
  • Replace spaces with a dash.
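  • Remove all characters that are not allowed. By default, only alphanumerical ASCII characters, the full stop, the dash, and the underscore are kept.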
  • Collapse all consecutive dashes into a single one.
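The steps above can be sketched using only the .NET base class library. This is an illustrative re-implementation, not the library's actual code: SlugSketch and its Slugify method are hypothetical names, and the diacritic stripping (mentioned in the introduction) is done here via Unicode decomposition, which is an assumption about the approach.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper that mirrors the documented default steps.
public static class SlugSketch
{
    public static string Slugify(string input)
    {
        // Lower-case with the invariant culture, then trim.
        string s = input.ToLowerInvariant().Trim();

        // Collapse consecutive whitespace into a single space.
        s = Regex.Replace(s, @"\s+", " ");

        // Replace spaces with a dash.
        s = s.Replace(' ', '-');

        // Decompose characters, drop combining marks (diacritics),
        // and keep only ASCII letters, digits, '.', '-' and '_'.
        var sb = new StringBuilder(s.Length);
        foreach (char c in s.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
            {
                continue;
            }

            bool allowed = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '.' || c == '-' || c == '_';
            if (allowed)
            {
                sb.Append(c);
            }
        }

        // Collapse consecutive dashes into a single one.
        return Regex.Replace(sb.ToString(), "-{2,}", "-");
    }
}
```

Calling SlugSketch.Slugify("a ambição cerra o coração") yields "a-ambicao-cerra-o-coracao", matching the example from the introduction.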

You can customize most of this behavior by passing a SlugHelperConfiguration object to the SlugHelper constructor. The following example keeps upper-case characters from the input and adds custom handling for ampersands and commas:

// Creating a configuration object
var config = new SlugHelperConfiguration();

// Add individual replacement rules
config.StringReplacements.Add("&", "-");
config.StringReplacements.Add(",", "-");

// Keep the casing of the input string
config.ForceLowerCase = false;

// Create a helper instance with our new configuration
var helper = new SlugHelper(config);

var result = helper.GenerateSlug("Simple,short&quick Example");
Console.WriteLine(result); // Simple-short-quick-Example

The following options can be configured with the SlugHelperConfiguration:

ForceLowerCase

This specifies whether the output string should be converted to lower-case. If set to false, the original casing will be preserved. The lower-case conversion happens before any other character replacements are made.

  • Default value: true

CollapseWhiteSpace

This specifies whether consecutive whitespace should be replaced by just one space (" "). The whitespace will be collapsed before any other character replacements are made.

  • Default value: true

TrimWhitespace

This specifies whether leading and trailing whitespace should be removed from the input string. The whitespace will be trimmed before any other character replacements are made.

  • Default value: true

CollapseDashes

This specifies whether consecutive dashes ("-") should be collapsed into a single dash. This is useful to avoid scenarios like "foo & bar" becoming "foo--bar". Dashes will be collapsed after all other string replacements have been made, just before the final result string is returned.

  • Default value: true
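The four boolean options described so far can be set together with an object initializer. This is a minimal sketch that uses only the property names documented above; the class name OptionsExample and the input string are illustrative, and no expected output is shown because it depends on the installed library version.

```csharp
using System;
using Slugify;

public static class OptionsExample
{
    public static void Main()
    {
        var config = new SlugHelperConfiguration
        {
            ForceLowerCase = false,    // keep the casing of the input
            CollapseWhiteSpace = true, // default: runs of whitespace become one space
            TrimWhitespace = true,     // default: strip leading and trailing whitespace
            CollapseDashes = false,    // leave consecutive dashes in the result
        };

        var helper = new SlugHelper(config);

        Console.WriteLine(helper.GenerateSlug("  Foo & Bar  "));
    }
}
```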

StringReplacements

This is a dictionary of strings that should be replaced individually before the translation happens. By default, this will replace space characters with a hyphen.

String replacements are made after whitespace has been trimmed and collapsed and after the input string has been converted to lower-case, but before any characters are removed. This allows replacing characters that would otherwise simply be removed.

  • Default value:

    new Dictionary<string, string> {
       [" "] = "-", // replace space with a hyphen
    }
  • Examples:

    var config = new SlugHelperConfiguration();
    
    // replace the dictionary completely
    config.StringReplacements = new() {
        ["ä"] = "ae",
        ["ö"] = "oe",
        ["ü"] = "ue",
    };
    
    // or add individual replacements to it
    config.StringReplacements.Add("ß", "ss");

AllowedChars

Set of characters that are allowed in the slug, which will be kept when the input string is being processed. By default, this contains all alphanumerical ASCII characters, the full stop, the dash and the underscore. This is the preferred way of controlling which characters should be removed when generating the slug.

Characters that are not allowed will be replaced after string replacements are completed.

  • Default value: Alphanumerical ASCII characters, the full stop (.), the dash (-), and the underscore (_), i.e. abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._

  • Examples:

    var config = new SlugHelperConfiguration();
    
    // add individual characters to the list of allowed characters
    config.AllowedChars.Add('!');
    
    // remove previously added or default characters
    config.AllowedChars.Remove('.');

DeniedCharactersRegex

Alternative method of specifying which characters will be allowed in the slug, which will replace the functionality of the AllowedChars set. The value must be a valid regular expression that specifies which characters are to be removed. Every match of this regular expression in the input string will be removed. The removal happens after string replacements are completed.

This functionality is kept in place for legacy compatibility reasons. Since it relies on regular expressions, it will perform worse than specifying the characters through AllowedChars.

Specifying the DeniedCharactersRegex option will disable the character removal behavior from the AllowedChars option.

  • Default value: null

  • Examples:

    var helper = new SlugHelper(new SlugHelperConfiguration
    {
        // this is equivalent to the default behavior from `AllowedChars`
        DeniedCharactersRegex = "[^a-zA-Z0-9._-]"
    });
    Console.WriteLine(helper.GenerateSlug("OLA ke ase!")); // "ola-ke-ase"
    
    helper = new SlugHelper(new SlugHelperConfiguration
    {
        // remove certain characters explicitly
        DeniedCharactersRegex = @"[abcdef]"
    });
    Console.WriteLine(helper.GenerateSlug("abcdefghijk")); // "ghijk"
    
    helper = new SlugHelper(new SlugHelperConfiguration
    {
        // remove more complex matches
        DeniedCharactersRegex = @"foo|bar"
    });
    Console.WriteLine(helper.GenerateSlug("this is an foo example")); // "this-is-an-example"

slugify's People

Contributors

ctolkien, davidwengier, dependabot[bot], eerrecart, fcingolani, jcharlesworthuk, poke, purekrome

slugify's Issues

Persian encoding

It doesn't work for Unicode characters such as "نوشتار فارسی".
How should I handle this?

[documentation] Clarify AllowedChars vs. DeniedCharactersRegex in README

The README states that the default config is DeniedCharactersRegex = @"[^a-zA-Z0-9\-\._]"; but looking at the source code, this is not true. The default setting is a list of AllowedChars. It matches the specified regex, but there is a difference, which I discovered while looking into #22. I think this should be clarified.

Also, I noticed that in the unit tests the DeniedCharactersRegex is labeled as "Legacy".
Is AllowedChars now the preferred way of configuration?
The README doesn't mention AllowedChars at all.

Ignores capital 'I'

?slugHelper.GenerateSlug("FIFA 18")
"ffa-18"
"FIFA 18".ToLower()
"fıfa 18"

caused by Turkish encoding.
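The behaviour in this report can be reproduced without Slugify. The sketch below assumes the Turkish culture data (tr-TR) is available on the machine; TurkishCasingDemo is a hypothetical name. Under that culture, 'I' lower-cases to the dotless 'ı' (U+0131), which is not an ASCII character and has no decomposition into 'i', so a later "keep only allowed characters" step would drop it. Culture-invariant lower-casing avoids this.

```csharp
using System.Globalization;

// Hypothetical demo class; it uses only the .NET base class library.
public static class TurkishCasingDemo
{
    // Culture-sensitive lower-casing with the Turkish culture.
    // With culture data available, "FIFA 18" becomes "fıfa 18" (dotless i).
    public static string LowerTurkish(string s) => s.ToLower(new CultureInfo("tr-TR"));

    // Culture-invariant lower-casing: "FIFA 18" becomes "fifa 18"
    // regardless of the current thread culture.
    public static string LowerInvariant(string s) => s.ToLowerInvariant();
}
```

Whether a given Slugify release uses culture-sensitive or invariant casing is not stated here, so treat this as an explanation of the symptom rather than a description of the library's code.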

Cannot handle umlauts

I am using version 3.0.0

the text
"Zwischenboden für Arbeitstische ATK - 600 Tief"
becomes
"zwischenboden-fur-arbeitstische-atk-600-tief"

Most often, German umlauts are transliterated as two characters. For example, "ö" becomes "oe".

I've manually tried to configure the umlauts, but that does not work either.

This is my current configuration:

SlugHelperConfiguration slugHelperConfiguration = new SlugHelperConfiguration();
slugHelperConfiguration.ForceLowerCase = true;
slugHelperConfiguration.CollapseDashes = true;
slugHelperConfiguration.TrimWhitespace = true;
slugHelperConfiguration.CollapseWhiteSpace = true;
services.AddSingleton(new SlugHelper(slugHelperConfiguration));

Create an ISlugHelper interface

It would be nice if we could set up the SlugHelper once and then inject it around using something a bit easier to mock. I'd be up for creating an interface for it that we can use in unit tests.
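A minimal sketch of what such an interface could look like. Everything here is hypothetical: the interface shape, FakeSlugHelper, and ArticleService are assumptions made for illustration and are not taken from the library.

```csharp
// Hypothetical interface exposing the one method used throughout the README.
public interface ISlugHelper
{
    string GenerateSlug(string inputString);
}

// Hypothetical test double that always returns the same slug.
public sealed class FakeSlugHelper : ISlugHelper
{
    public string GenerateSlug(string inputString) => "fixed-slug";
}

// Hypothetical consumer that receives the helper through its constructor,
// so tests can pass a fake instead of the real implementation.
public sealed class ArticleService
{
    private readonly ISlugHelper _slugHelper;

    public ArticleService(ISlugHelper slugHelper) => _slugHelper = slugHelper;

    public string BuildUrl(string title) => "/articles/" + _slugHelper.GenerateSlug(title);
}
```

With this shape, a unit test can construct new ArticleService(new FakeSlugHelper()) and assert on the URL without depending on slug generation rules.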

StringReplacements for umlauts not working in 3.0.0

When I specify replacements for umlauts (e.g. "ae" for "ä" which is the common replacement) these are not working anymore after moving to 3.0.0

For this test:

[Fact]
public void TestUmlauts()
{
    const string original = "äöüÄÖÜß";
    const string expected = "aeoeueaeoeuess";

    var helper = new SlugHelper(new SlugHelperConfiguration
    {
        StringReplacements = new Dictionary<string, string>
        {
            {" ", "-" },
            {"Ä", "Ae" },
            {"Ö", "Oe" },
            {"Ü", "Ue" },
            {"ä", "ae" },
            {"ö", "oe" },
            {"ü", "ue" },
            {"ß", "ss" }
        },
    });

    Assert.Equal(expected, helper.GenerateSlug(original));
}

This is the result:
image

The reason is that the string normalisation changes the single umlaut char to multiple chars and therefore the replacement char is not found in the input string.
image
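The effect described above can be observed with the base class library alone. This sketch assumes the normalisation in question is canonical decomposition (NormalizationForm.FormD), which the report implies but does not name; NormalizationDemo is a hypothetical helper.

```csharp
using System.Text;

// Hypothetical demo class; it uses only the .NET base class library.
public static class NormalizationDemo
{
    // Number of UTF-16 code units after canonical decomposition.
    public static int DecomposedLength(string s) =>
        s.Normalize(NormalizationForm.FormD).Length;

    // Ordinal search for a replacement key in the decomposed input.
    public static bool FoundAfterDecomposition(string input, string key) =>
        input.Normalize(NormalizationForm.FormD).Contains(key);
}
```

DecomposedLength("\u00E4") returns 2, because "ä" becomes "a" followed by a combining diaeresis, so FoundAfterDecomposition("\u00E4", "\u00E4") returns false. In contrast, "ß" (U+00DF) has no canonical decomposition and stays a single character, so a replacement keyed on it would still match.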

DeniedCharactersRegex broken in current NuGet build

(tl;dr: You need to release fixes to NuGet)

So, this was a “fun” one. When attempting to use the library from NuGet, DeniedCharactersRegex doesn’t work and the library will always return an empty string, regardless of what you pass as input string.

To reproduce

Simple example, create a new console application and add the dependency:

dotnet new console
dotnet add package Slugify.Core

Edit the Program.cs to contain the following:

using Slugify;

var config = new SlugHelperConfiguration();
config.DeniedCharactersRegex = "[^abc]";

var helper = new SlugHelper(config);
Console.WriteLine($"'{helper.GenerateSlug("abcdef")}'");

Run the project:

PS > dotnet run
''

Analysis

As it turns out, the following code is the problem:

var currentValue = sb.ToString();
sb.Clear();
sb.Append(DeleteCharacters(currentValue, deniedCharactersRegex));

In the released NuGet version, this appears to be compiled to the following:

sb.Clear();
sb.Append(DeleteCharacters(sb.ToString(), deniedCharactersRegex));

Originally, I thought this was some compiler optimization bug where it would inline the ToString() call here. But looking further, I realized that the released NuGet version is actually a bit older and this is actually the real code there:

sb.Clear();
sb.Append(DeleteCharacters(sb.ToString(), deniedCharactersRegex));

Sooo, this was fixed in #22, and apparently that change never made it onto NuGet (along with some of the other fixes that happened last year). So can you publish a new version to NuGet soon? 😁
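The ordering problem from the analysis can be shown in isolation. In this sketch, Regex.Replace stands in for the library's DeleteCharacters method, whose implementation is not shown in the report; ClearOrderDemo is a hypothetical name.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical demo class contrasting the two orderings from the report.
public static class ClearOrderDemo
{
    // Clears the builder first, so ToString() only ever sees an empty buffer.
    public static string Buggy(string input, string deniedCharactersRegex)
    {
        var sb = new StringBuilder(input);
        sb.Clear();
        sb.Append(Regex.Replace(sb.ToString(), deniedCharactersRegex, ""));
        return sb.ToString();
    }

    // Captures the current value before clearing, as in the fixed code.
    public static string Fixed(string input, string deniedCharactersRegex)
    {
        var sb = new StringBuilder(input);
        var currentValue = sb.ToString();
        sb.Clear();
        sb.Append(Regex.Replace(currentValue, deniedCharactersRegex, ""));
        return sb.ToString();
    }
}
```

With the input "abcdef" and the pattern "[^abc]", Buggy returns an empty string and Fixed returns "abc", matching the empty output seen in the reproduction.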

Benchmark / Perf Improvements

As per the title, we should probably look at the allocations here as there is undoubtedly stacks of low hanging fruit.

Problem with the letter 'Å'

I have a problem with the string "Å å Æ æ Ø ø" - the slug produced is "a-a-ae-ae-oe-oe" when it should be "aa-aa-ae-ae-oe-oe"

Here is the code:

var config = new SlugHelperConfiguration();
config.StringReplacements.Add("Å", "AA");
config.StringReplacements.Add("å", "aa");
config.StringReplacements.Add("Æ", "AE");
config.StringReplacements.Add("æ", "ae");
config.StringReplacements.Add("Ø", "OE");
config.StringReplacements.Add("ø", "oe");

SlugHelper helper = new SlugHelper(config);
var slug = helper.GenerateSlug("Å å Æ æ Ø ø");

Any solutions? :)

Migration from version 3.0.0

Hi! Thanks for maintaining this project!

I saw there was an update available today and wanted to read up on what has changed, but the releases here on github are a bit... cryptic. The release notes for https://github.com/ctolkien/Slugify/releases/tag/Release-v4 mention multiple versions. I guess these are changes that have not been included in a release yet, so GitHub automatically put them there.

Release https://github.com/ctolkien/Slugify/releases/tag/Release-v4.0.1 seems to just update some dependencies.

Is there any migration guide for updating from 3.0.0 to 4.0.1?

Also, I cannot find any tag that points to the rev which represents version 3.0.0. Can you add it?
