ietf-language-tags's Introduction

ietf-language-tags 🇺🇳

Tools for working with IETF language tags as specified by BCP 47 / RFC 5646.

  • Aims to implement the full RFC spec
  • Validates given tag strings like zh-Hant-CN
  • Interprets complicated (but valid) tags like zh-yue-Latn-CN-pinyin-a-extend1-x-foobar-private1
  • Helps you with getting rid of redundant or deprecated tags
  • Optionally checks given tags against a local copy of the central IETF language tag registry, so that unregistered subtags don't slip through

Terminology

The terminology of this library follows RFC 5646. To quote the specification:

  • Tag refers to a complete language tag, such as sr-Latn-RS or az-Arab-IR.
  • Subtag refers to a specific section of a tag, delimited by a hyphen, such as the subtags zh, Hant, and CN in the tag zh-Hant-CN.
  • Code refers to values defined in external standards (and that are used as subtags in this document). For example, Hant is an ISO 15924 script code that was used to define the Hant script subtag for use in a language tag.
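
To illustrate the distinction between a tag and its subtags, here is a minimal TypeScript sketch that splits a tag at its hyphens and labels each subtag by its shape. The `classifySubtags` helper and the `SubtagKind` type are hypothetical names made up for this illustration and are not part of this library. The shape rules only cover simple language-script-region tags and ignore extlangs, variants, extensions, and private-use subtags.

```typescript
// Hypothetical illustration only; not part of this library's API.
type SubtagKind = 'language' | 'script' | 'region' | 'other';

function classifySubtags(tag: string): Array<{ subtag: string; kind: SubtagKind }> {
  // A tag is the whole string; subtags are the hyphen-delimited sections.
  return tag.split('-').map((subtag, index) => {
    let kind: SubtagKind = 'other';
    if (index === 0 && /^[A-Za-z]{2,3}$/.test(subtag)) {
      kind = 'language'; // 2 or 3 letters at the start of the tag
    } else if (index > 0 && /^[A-Za-z]{4}$/.test(subtag)) {
      kind = 'script'; // 4 letters, e.g. an ISO 15924 code such as Hant
    } else if (index > 0 && /^([A-Za-z]{2}|[0-9]{3})$/.test(subtag)) {
      kind = 'region'; // 2 letters or 3 digits, e.g. CN or 419
    }
    return { subtag, kind };
  });
}

const kinds = classifySubtags('zh-Hant-CN').map((entry) => entry.kind);
console.log(kinds.join(', ')); // language, script, region
```

A shape-based guess like this says nothing about whether a subtag is actually registered; that is what the registry check is for.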

How is this related to Unicode CLDR language tags and POSIX locales?

  • Unicode CLDR uses IETF language tags as its base standard and adds its own extensions: it allows underscores in tags and specifies how to add information about cultural attributes such as currencies, measurement units, or collations. Most operating systems, programming languages, and browsers support the CLDR specifications. What is still missing in many CLDR libraries is correct support for matching user-preferred language tags against app-supported ones. The Unicode CLDR specification lists 2 matching algorithms that are more sophisticated than the one proposed in IETF's RFC 4647, but many libraries implement only the simplest one. Rafael Xavier de Souza from the CLDR.js project has an explanation of the issue.
  • POSIX locales are used in Unix-based systems to determine how apps should handle character sets and string formatting. Even after replacing their underscore separators with hyphens, they are not guaranteed to be valid IETF language tags, so you shouldn't use them with this library without a proper conversion. For example, C is a valid POSIX locale, but not a valid IETF language tag.
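
As a rough illustration of what such a conversion involves, the following TypeScript sketch turns a POSIX locale of the form `language[_territory][.codeset][@modifier]` into a candidate string. The `posixLocaleToTagCandidate` function is a hypothetical helper and not part of this library. It assumes that dropping the codeset and modifier is acceptable, which loses information (for example, a `@latin` modifier that implies a script), and the result should still be validated before use.

```typescript
// Hypothetical helper; a minimal sketch, not a complete POSIX-to-BCP 47 mapping.
function posixLocaleToTagCandidate(locale: string): string | undefined {
  // Drop the codeset (".UTF-8") and the modifier ("@euro"), if present.
  const base = locale.replace(/[.@].*$/, '');
  // "C" and "POSIX" name the default locale and have no language tag equivalent.
  if (base === '' || base === 'C' || base === 'POSIX') {
    return undefined;
  }
  // Replace the POSIX underscore separator with the hyphen used by IETF tags.
  return base.replace(/_/g, '-');
}

console.log(posixLocaleToTagCandidate('de_DE.UTF-8')); // de-DE
console.log(posixLocaleToTagCandidate('C')); // undefined
```

The returned string is only a candidate: it could then be passed to a validating parser such as `parseLanguageTag` from the usage examples.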

Installation

npm install --save @sozialhelden/ietf-language-tags
# or
yarn add @sozialhelden/ietf-language-tags

Usage examples

  • Parse a given IETF language tag to get access to its parts:

    import { parseLanguageTag } from '@sozialhelden/ietf-language-tags';
    const tag = parseLanguageTag(
      'sl-rozaj-biske',
      // Set to `true` for returning `undefined` for invalid tags,
      // outputting errors to the console.
      // Set to `false` to throw an error if a given tag is invalid.
      // The library tries to give helpful feedback for typical errors in tags.
      true,
      // Allows you to use your own logging function. Supply `null` to suppress console output.
      console.log
    );

    This returns the parsed parts of the tag for Slovenian in the Resian variant, San Giorgio dialect:

    {
      "langtag": "sl-rozaj-biske",
      "language": "sl",
      "variants": ["rozaj", "biske"]
    }
  • Get all information about a given language tag, including descriptions and registry meta infos:

    const tagMetaInfo = getTag('zh-yue-Latn-CN-pinyin-a-extend1-x-foobar-private1');
      {
        extlang: {
          Added: '2009-07-29',
          Description: ['Yue Chinese', 'Cantonese'],
          Macrolanguage: 'zh',
          'Preferred-Value': 'yue',
          Prefix: ['zh'],
          Subtag: 'yue',
          Type: 'extlang',
        },
        parts: {
          extensions: {
            a: 'extend1',
          },
          extlang: 'yue',
          langtag: 'zh-yue-Latn-CN-pinyin-a-extend1-x-foobar-private1',
          language: 'zh-yue',
          privateuse: 'x-foobar-private1',
          region: 'CN',
          script: 'Latn',
          variants: ['pinyin'],
        },
        privateuse: 'x-foobar-private1',
        region: {
          Added: '2005-10-16',
          Description: ['China'],
          Subtag: 'CN',
          Type: 'region',
        },
        script: {
          Added: '2005-10-16',
          Description: ['Latin'],
          Subtag: 'Latn',
          Type: 'script',
        },
        variants: [
          {
            Added: '2008-10-14',
            Description: ['Pinyin romanization'],
            Prefix: ['zh-Latn', 'bo-Latn'],
            Subtag: 'pinyin',
            Type: 'variant',
          },
        ],
      }
  • Return a plain English description of a given tag

    describeIETFLanguageTag('zh-Hans'); // → 'Chinese, written in Han (Simplified variant) script'
    describeIETFLanguageTag('yue-HK'); // → 'Yue Chinese / Cantonese, as used in Hong Kong'
    describeIETFLanguageTag('es-419'); // → 'Spanish / Castilian, as used in Latin America and the Caribbean'
  • Beautify tags to make them more readable

    normalizeLanguageTagCasing('sGn-Be-fR'); // → 'sgn-BE-FR'
  • Get a language tag the IETF language tag registry prefers over the given tag

    getPreferredLanguageTag('zh-yue'); // → 'yue'
    getPreferredLanguageTag('i-klingon'); // → 'tlh'
  • Get a specific, single subtag from the IETF language tag registry:

    getSubTag('extlang', 'hsn');
    {
      Type: 'language',
      Subtag: 'hsn',
      Description: ['Xiang Chinese'],
      Added: '2009-07-29',
      Macrolanguage: 'zh'
    }
  • Match against a RegExp mimicking the RFC specs, without further semantic checks:

    const regexp = createRFC5646Regexp();
    const match = 'zh-yue-Latn-CN-pinyin-a-extend1-x-foobar-private1'.match(regexp);
    {
      region: "CN",
      script: "Latn",
      extlang: "yue",
      language: "zh-yue",
      variants: "-pinyin", // can contain one or more variants in one string
      extensions: "-a-extend1", // can contain one or more extensions in one string
      privateuse: "x-foobar-private1",
      privateuse2: undefined, // For tags that consist of nothing more than a private-use subtag
      langtag: "zh-yue-Latn-CN-pinyin-a-extend1-x-foobar-private1",
    }
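
The casing shown in the beautify example above follows the formatting convention from RFC 5646, section 2.1.1: subtags are lowercase, except that two-letter subtags are uppercase and four-letter subtags are title case when they are neither at the start of the tag nor after a singleton. The following TypeScript sketch illustrates that rule. `formatTagCasing` is a hypothetical name, and this is not necessarily how the library's `normalizeLanguageTagCasing` is implemented.

```typescript
// Hypothetical helper illustrating the RFC 5646 casing convention;
// not the library's own implementation.
function formatTagCasing(tag: string): string {
  let afterSingleton = false;
  return tag
    .split('-')
    .map((subtag, index) => {
      const lower = subtag.toLowerCase();
      // The first subtag, singletons, and everything after a singleton stay lowercase.
      const keepLower = index === 0 || afterSingleton || lower.length === 1;
      if (lower.length === 1) {
        afterSingleton = true;
      }
      if (keepLower) {
        return lower;
      }
      if (lower.length === 2) {
        return lower.toUpperCase(); // e.g. region "cn" becomes "CN"
      }
      if (lower.length === 4) {
        return lower.charAt(0).toUpperCase() + lower.slice(1); // e.g. "hant" becomes "Hant"
      }
      return lower;
    })
    .join('-');
}

console.log(formatTagCasing('sGn-Be-fR')); // sgn-BE-FR
console.log(formatTagCasing('EN-ca-X-CA')); // en-CA-x-ca
```

The rule is purely positional, so it needs no registry lookup; it only changes how a tag is displayed, since language tags are case-insensitive.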

Credits / License

Contributors to this package:

Thanks to Matthew Caruana Galizia for maintaining the language-subtag-registry NPM package, which this package is based on.

Update scripts copyright (c) 2013, Matthew Caruana Galizia and licensed under an MIT license.

The JSON database is licensed under the Open Data Commons Attribution License (ODC-BY).

ietf-language-tags's Issues

Fails to install

npm install --save @sozialhelden/ietf-language-tags

> @sozialhelden/[email protected] postinstall /Users/project/functions/node_modules/@sozialhelden/ietf-language-tags
> cp -r ./node_modules/language-subtag-registry/data/json ./src

cp: ./node_modules/language-subtag-registry/data/json: No such file or directory
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! @sozialhelden/[email protected] postinstall: `cp -r ./node_modules/language-subtag-registry/data/json ./src`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the @sozialhelden/[email protected] postinstall script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

It still fails even if I install language-subtag-registry separately in the same directory.

package.json has broken links for `module` and `types` property

TypeScript currently fails to work properly with this package because the module and types properties of the distributed package.json point to non-existent files.

(PR to come later)

Workarounds

Add a (temporary) path binding to the proper file

When you upgrade your packages later on, this shouldn't break anything, and requires no modification of your code or the installed package.

{
  "compilerOptions": {
    "module": "commonjs",
    "target": "es6",
    "outDir": "dist",
    "rootDir": ".",
    "sourceMap": true,
    "strict": false,
    "esModuleInterop": true,
    "paths": {
      "@sozialhelden/ietf-language-tags": [
        "./node_modules/@sozialhelden/ietf-language-tags/dist/esm/index.js"
        ]
    }
  }
}

Reference the esm module directly from your code

This has the undesirable side effect of adding many ../../ to your code...

import {
	normalizeLanguageTagCasing,
	parseLanguageTag,
} from '../../node_modules/@sozialhelden/ietf-language-tags/dist/esm/';

Update the installed package

If you update node_modules/@sozialhelden/ietf-language-tags/package.json to use the esm outputs, TypeScript works properly. However, this interferes with npm install, so it may not be preferable, especially in CI/CD environments.

{
   "module": "dist/esm/index.js",
   "types": "./dist/esm/index.d.ts"
}
