cyrillic-transliteration's Introduction

What is CyrTranslit?

A Python package for bi-directional transliteration of Cyrillic script to Latin script and vice versa.

By default, CyrTranslit transliterates Serbian. Pass a language code to transliterate to and from Bulgarian, Montenegrin, Macedonian, Russian, Serbian, Tajik, or Ukrainian.

What is transliteration?

Transliteration is the conversion of a text from one script to another. For instance, a Latin alphabet transliteration of the Serbian phrase "Мој ховеркрафт је пун јегуља" is "Moj hoverkraft je pun jegulja".
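
As a rough illustration of the idea, the sketch below replaces each character using a lookup table. It is not CyrTranslit's implementation: the names SR_CYR_TO_LAT and to_latin_sketch are hypothetical, and the table covers only the letters needed for the phrase above.

```python
# Hypothetical, partial Serbian Cyrillic-to-Latin table (illustration only,
# not the mapping shipped with CyrTranslit).
SR_CYR_TO_LAT = {
    "М": "M", "о": "o", "ј": "j", "х": "h", "в": "v", "е": "e",
    "р": "r", "к": "k", "а": "a", "ф": "f", "т": "t", "п": "p",
    "у": "u", "н": "n", "г": "g", "љ": "lj",
}


def to_latin_sketch(text, mapping=SR_CYR_TO_LAT):
    """Replace every character found in the mapping; pass the rest through."""
    return "".join(mapping.get(ch, ch) for ch in text)


print(to_latin_sketch("Мој ховеркрафт је пун јегуља"))
```

Note that a single Cyrillic letter such as "љ" can map to two Latin letters ("lj"), which is why the reverse direction needs more care than a plain per-character lookup.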

How do I install this?

CyrTranslit is hosted on the Python Package Index (PyPI), so it can be installed using pip:

python -m pip install cyrtranslit          # latest version
python -m pip install "cyrtranslit==1.0"   # specific version
python -m pip install "cyrtranslit>=1.0"   # minimum version (quoted so the shell does not treat > as a redirect)

What languages are supported?

CyrTranslit currently supports bi-directional transliteration of Bulgarian, Montenegrin, Macedonian, Russian, Serbian, Tajik, and Ukrainian:

>>> import cyrtranslit
>>> cyrtranslit.supported()
['bg', 'me', 'mk', 'ru', 'sr', 'tj', 'ua']

How do I use this?

Bulgarian

>>> import cyrtranslit
>>> cyrtranslit.to_latin("Съединението прави силата!", "bg")
"Săedinenieto pravi silata!"
>>> cyrtranslit.to_cyrillic("Săedinenieto pravi silata!", "bg")
"Съединението прави силата!"

Montenegrin

>>> import cyrtranslit
>>> cyrtranslit.to_latin("Република", "me")
"Republika"
>>> cyrtranslit.to_cyrillic("Republika", "me")
"Република"

Macedonian

>>> import cyrtranslit
>>> cyrtranslit.to_latin("Моето летачко возило е полно со јагули", "mk")
"Moeto letačko vozilo e polno so jaguli"
>>> cyrtranslit.to_cyrillic("Moeto letačko vozilo e polno so jaguli", "mk")
"Моето летачко возило е полно со јагули"

Russian

>>> import cyrtranslit
>>> cyrtranslit.to_latin("Моё судно на воздушной подушке полно угрей", "ru")
"Moyo sudno na vozdushnoj podushke polno ugrej"
>>> cyrtranslit.to_cyrillic("Moyo sudno na vozdushnoj podushke polno ugrej", "ru")
"Моё судно на воздушной подушке полно угрей"

Serbian

>>> import cyrtranslit
>>> cyrtranslit.to_latin("Мој ховеркрафт је пун јегуља")
"Moj hoverkraft je pun jegulja"
>>> cyrtranslit.to_cyrillic("Moj hoverkraft je pun jegulja")
"Мој ховеркрафт је пун јегуља"

Tajik

>>> import cyrtranslit
>>> cyrtranslit.to_latin("Ман мактуб навишта истодам", "tj")
"Man maktub navišta istodam"
>>> cyrtranslit.to_cyrillic("Man maktub navišta istodam", "tj")
"Ман мактуб навишта истодам"

Ukrainian

>>> import cyrtranslit
>>> cyrtranslit.to_latin("Під лежачий камінь вода не тече", "ua")
"Pid ležačyj kamin' voda ne teče"
>>> cyrtranslit.to_cyrillic("Pid ležačyj kamin' voda ne teče", "ua")
"Під лежачий камінь вода не тече"

How can I contribute?

You can include support for other Cyrillic script alphabets. Follow these steps in order to do so:

  1. Create a new transliteration dictionary in the mapping.py file and reference it in the TRANSLIT_DICT dictionary.
  2. Watch out for cases where two consecutive Latin alphabet letters are meant to transliterate into a single Cyrillic script letter. These cases need to be explicitly checked for inside the to_cyrillic() function in __init__.py.
  3. Add test cases inside of tests.py.
  4. Update the documentation in the README.md and in the doc directory.
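
The two-letter case in step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in __init__.py: the names LAT_TO_CYR, DIGRAPHS, and to_cyrillic_sketch are hypothetical, the tables cover only a few lowercase Serbian letters, and case handling is left out.

```python
# Hypothetical, partial tables for illustration; the real ones live in mapping.py.
LAT_TO_CYR = {
    "a": "а", "e": "е", "g": "г", "j": "ј", "k": "к",
    "l": "л", "n": "н", "o": "о", "u": "у",
}
# Two Latin letters that must become ONE Cyrillic letter.
DIGRAPHS = {"lj": "љ", "nj": "њ", "dž": "џ"}


def to_cyrillic_sketch(text):
    """Convert lowercase Latin text, checking two-letter sequences first."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            # Consume both Latin letters and emit a single Cyrillic letter.
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            # Fall back to the single-letter table; unknown characters pass through.
            out.append(LAT_TO_CYR.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(to_cyrillic_sketch("jegulja"))
```

Checking the two-letter sequences before the single-letter table matters: without that order, "lj" would come out as "лј" rather than "љ".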

A big thank you to everyone who contributed.

Citation

A citation would be much appreciated if you use CyrTranslit in a research publication:

Georges Labrèche. (2021, March 29). CyrTranslit (Version v1.0). Zenodo. http://doi.org/10.5281/zenodo.4643047

BibTeX entry:

@software{georges_labreche_2021_4643047,
  author       = {Georges Labrèche},
  title        = {CyrTranslit},
  month        = mar,
  year         = 2021,
  note         = {{A Python package for bi-directional 
                   transliteration of Cyrillic script to Latin script
                   and vice versa. Supports Bulgarian, Montenegrin,
                   Macedonian, Russian, Serbian, Tajik, and
                   Ukrainian.}},
  publisher    = {Zenodo},
  version      = {v1.0},
  doi          = {10.5281/zenodo.4643047},
  url          = {https://doi.org/10.5281/zenodo.4643047}
}

cyrillic-transliteration's People

Contributors

anonymousvoice1, georgeslabreche, ratijas, savagej, shurph, syndamia
