genshin-langdata

This repository contains the translation dataset for Genshin Dictionary (GitHub) and Genshin Machine Translation (GitHub).

Just want to access the translation data programmatically?

Use the API: https://dataset.genshin-dictionary.com/words.json

API document (currently available only in Japanese; an English version is planned.)
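
As a rough sketch of consuming this endpoint from Node.js 18+ (which ships a global fetch), the code below downloads words.json and looks up an entry by its English name. The response shape is an assumption made for illustration (an array of objects with an en field); check the API document for the actual schema. fetchWords and findByEnglish are hypothetical helper names, not part of this repository.

```javascript
// URL taken from this README.
const WORDS_URL = "https://dataset.genshin-dictionary.com/words.json";

// Download and parse the dataset. Requires Node.js 18+ for the global fetch().
async function fetchWords(url = WORDS_URL) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}

// ASSUMPTION: the payload is an array of entries, each with an `en` string.
// Entries without an `en` string are skipped rather than treated as errors.
function findByEnglish(words, term) {
  const needle = term.toLowerCase();
  return words.filter(
    (word) => typeof word.en === "string" && word.en.toLowerCase() === needle
  );
}

// Usage (performs a network request, so it is left commented out):
// fetchWords().then((words) => {
//   console.log(findByEnglish(words, "Fishing For Jade"));
// });
```

Keeping the lookup separate from the download makes it easy to try the filter on a small local array before hitting the network.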

Development

The translation dataset for Genshin Dictionary is in the dataset/ directory. The dataset is written in JSON5.

Directory structure

dataset/
 ├ dictionary/ ― Dataset for Genshin Dictionary. Also used for Genshin Machine Translate.
 │ ├ artifacts.json5
 │ ├ characters.json5
 │ ︙
 │
 ├ translator/ ― Additional translation dataset for Genshin Machine Translate. This is not used for Genshin Dictionary.
 │ ├ characters.json5
 │ ├ domains.json5
 │ ︙
 │
 └ tags.json ― list of tags attached to each word in Genshin Dictionary.

JSON5 format

See the API document (currently available only in Japanese; an English version is planned).

pinyins

When you add a Chinese pronunciation in pinyin, you can write it with tone numbers (e.g. qia3) in the source JSON5 files. The tone numbers are converted to tone marks (e.g. qiǎ) at build time.

e.g. this entry in a source JSON5 file:

  {
    // ...
    zhCN: "天云峠",
    pinyins: [{ char: "峠", pron: "qia3" }],
    // ...
  },

is built into:

  {
    // ...
    "zhCN": "天云峠",
    "pinyins": [{ "char": "峠", "pron": "qiǎ" }],
    // ...
  },
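
The conversion itself can be sketched as follows. This is not the repository's actual build code, only a minimal illustration of the usual pinyin tone-placement rules: a or e takes the mark, in ou the o takes it, and otherwise the last vowel does. The function name toneNumberToMark is hypothetical, and treating v as ü and 5 as the neutral tone are assumptions of this sketch.

```javascript
// Tone-marked forms of each vowel, ordered by tone 1 to 4.
const TONE_MARKS = {
  "a": ["ā", "á", "ǎ", "à"],
  "e": ["ē", "é", "ě", "è"],
  "i": ["ī", "í", "ǐ", "ì"],
  "o": ["ō", "ó", "ǒ", "ò"],
  "u": ["ū", "ú", "ǔ", "ù"],
  "ü": ["ǖ", "ǘ", "ǚ", "ǜ"],
};

// Convert one lowercase syllable such as "qia3" to "qiǎ".
// Input that does not end in a tone digit is returned unchanged.
function toneNumberToMark(syllable) {
  const match = /^([a-zü]+)([1-5])$/.exec(syllable);
  if (!match) return syllable;

  const base = match[1].replace(/v/g, "ü"); // ASSUMPTION: "v" stands for "ü"
  const tone = Number(match[2]);
  if (tone === 5) return base; // ASSUMPTION: 5 means neutral tone, no mark

  // Rule 1: "a" or "e" takes the mark. Rule 2: in "ou", the "o" takes it.
  let index = base.search(/[ae]/);
  if (index === -1) index = base.indexOf("ou");

  // Rule 3: otherwise the last vowel takes the mark.
  if (index === -1) {
    for (let i = base.length - 1; i >= 0; i--) {
      if (Object.hasOwn(TONE_MARKS, base[i])) {
        index = i;
        break;
      }
    }
  }
  if (index === -1) return base; // no vowel found; leave as-is

  const marked = TONE_MARKS[base[index]][tone - 1];
  return base.slice(0, index) + marked + base.slice(index + 1);
}

console.log(toneNumberToMark("qia3"));
```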

Validation

You do not have to validate the JSON5 files yourself, because validation runs automatically on GitHub Actions when you open a pull request. However, if you want to validate them on your local machine, follow the instructions below.

You need the following:

  • Node.js: The latest LTS version recommended
  • npm: The latest version recommended
  • (Windows only) PowerShell 7+
    • Some npm scripts need && support

To run validation:

$ cd /path/to/genshin-langdata
$ npm ci
$ npm test
$ npm run lint

Utility scripts

npm run todo lists the words that have no Chinese translation yet.

Example:

$ npm run todo

> todo
> node scripts/todo.js

# Words without Chinese translation

  ## characters.json5
    - Snezhevna (シュナイツェフナ)
    - Snezhevich (シュナイツェビッチ)
    ...

  ## quests.json5
    - Break the Sword Cemetery Seal (剣塚封印を探索)
    - Fishing For Jade (海上拾玉)
    ...
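
The idea behind this listing can be sketched as below. This is an illustration only, not the contents of scripts/todo.js. It assumes entries have already been parsed from the JSON5 files into plain objects, and that each entry carries en, ja and zhCN fields; zhCN appears in the example above, while en and ja are assumed field names. listMissingChinese is a hypothetical helper.

```javascript
// Build a report of entries that have no zhCN value, grouped by file.
// `entriesByFile` maps a file name to an array of already-parsed entries.
function listMissingChinese(entriesByFile) {
  const lines = ["# Words without Chinese translation", ""];

  for (const [fileName, entries] of Object.entries(entriesByFile)) {
    // Treat a missing or empty zhCN as "not translated yet".
    const missing = entries.filter((entry) => !entry.zhCN);
    if (missing.length === 0) continue; // skip fully translated files

    lines.push(`  ## ${fileName}`);
    for (const entry of missing) {
      // ASSUMPTION: `en` and `ja` hold the English and Japanese names.
      lines.push(`    - ${entry.en} (${entry.ja})`);
    }
    lines.push("");
  }

  return lines.join("\n");
}

// Sample data for illustration; the zhCN value comes from the pinyin example.
const sampleEntries = {
  "characters.json5": [
    { en: "Snezhevna", ja: "シュナイツェフナ" },
  ],
  "locations.json5": [
    { en: "Sample place", ja: "サンプル", zhCN: "天云峠" },
  ],
};

console.log(listMissingChinese(sampleEntries));
```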
