Aligned Texts

Here you can find the aligned texts we use on the Bilinguator.com website to create bilingual books!

This repository consists of literary works, each stored as a folder of aligned TXT files in different languages. Each literary work folder is located in the text/ folder and has an ID. The full list of IDs and their meanings can be found in the Texts list chapter. Each TXT file is named according to the pattern {ID}_{lang}.txt, where lang is an ISO 639 language code. The aligned text files are not just plain TXT files: they follow a simple specification, described below. All TXT files are UTF-8 encoded.
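As a minimal sketch of the naming pattern, the following Python helper splits a file name into its ID and language code. The function name, the sample paths and the restriction of lang to two or three lowercase letters are assumptions made for illustration; they are not taken from the repository's scripts.

```python
import re
from pathlib import PurePosixPath

# Assumption: the language code is 2 or 3 lowercase letters.
# The ID is everything before the last underscore.
FILENAME_RE = re.compile(r"^(?P<id>.+)_(?P<lang>[a-z]{2,3})\.txt$")


def parse_aligned_filename(path):
    """Return (ID, lang) for a name like {ID}_{lang}.txt, or None if it does not match."""
    name = PurePosixPath(path).name
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    return match.group("id"), match.group("lang")
```

For a hypothetical file text/42/42_en.txt, the helper returns the pair ("42", "en"); names that do not follow the pattern return None.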

Aligned text files specification

First two lines

Source files are plain text files, usually (but not necessarily) with the TXT extension.

The first two lines are reserved for information about the book. Line 1 holds the author. Line 2 contains the title in <h1></h1> tags. If additional information about the translator, publishing house, legal notice, etc. is needed, add the <delimiter> tag after the </h1> tag, followed by that information. In the scripts, this additional information is called $titleRest1 and $titleRest2 for the two files respectively.

Example of the first two lines of a source file:

Antoine de Saint-Exupéry
<h1>Der Kleine Prinz</h1><delimiter>Ins Deutsche übertragen von Grete und Josef Leitgeb

If no information on the author and/or the book title is needed, put the <delimiter> tag alone in line 1 and/or line 2. These two lines are not included in the book body, which always starts at line 3.

<delimiter>
<delimiter>

Do not leave these lines empty, because every empty line is removed from the result file! This may lead to an unexpected shift of paragraphs.
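The header rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's own script: the function name parse_header and the returned dictionary keys are hypothetical, and the sketch rejects empty header lines instead of silently dropping them.

```python
import re

DELIMITER = "<delimiter>"
H1_RE = re.compile(r"<h1>(.*?)</h1>")


def parse_header(text):
    """Split an aligned text into author, title, extra title info and body lines."""
    lines = text.split("\n")
    if len(lines) < 2:
        raise ValueError("an aligned text file needs at least two header lines")
    author_line, title_line = lines[0].strip(), lines[1].strip()
    if not author_line or not title_line:
        # The specification asks for <delimiter> instead of an empty line.
        raise ValueError("header lines must not be empty; use <delimiter>")

    # A lone <delimiter> means "no author" or "no title".
    author = None if author_line == DELIMITER else author_line
    title, title_rest = None, None
    if title_line != DELIMITER:
        head, _, rest = title_line.partition(DELIMITER)
        match = H1_RE.search(head)
        title = match.group(1) if match else head
        title_rest = rest.strip() or None

    # The book body always starts at line 3.
    return {"author": author, "title": title,
            "title_rest": title_rest, "body": lines[2:]}
```

Applied to the German example above, the sketch yields the author from line 1, the title "Der Kleine Prinz" and the translator note as the extra title information.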

Book body

The book body consists of paragraphs (called articles in the code) separated by line breaks (\n). The <delimiter> tag is used where a line break is needed inside an article but the alignment should not be disturbed. In addition, there are HTML-like tags: <h1></h1>, <b></b> and <i></i>, which stand for headers, bold and italic styles respectively.
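A minimal Python sketch of how a body could be split into articles is shown below. All function names are hypothetical. The sketch assumes that empty lines are dropped before the two header lines are skipped, and that two files count as aligned when they contain the same number of articles; the project's scripts may do this differently.

```python
DELIMITER = "<delimiter>"


def split_articles(text):
    """Return the body articles: non-empty lines after the two header lines."""
    # Assumption: empty lines are removed first, as the specification warns.
    non_empty = [line.strip() for line in text.split("\n") if line.strip()]
    return non_empty[2:]


def render_article(article):
    """Turn <delimiter> into a line break inside a single article."""
    return article.replace(DELIMITER, "\n")


def is_aligned(text_a, text_b):
    """Check that two texts have the same number of articles."""
    return len(split_articles(text_a)) == len(split_articles(text_b))
```

Because <delimiter> keeps a line break inside one article, an article with an internal break still counts as a single unit when the two files are compared.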

Illustrations can be added while creating the FB2 and EPUB files with the help of the Bilingual Formats scripts. To do this, move all the illustrations to one directory and name them with natural Arabic numbers. We do not guarantee that the script works correctly if other symbols are used in the file names. Add <imgℕ> tags to your source files, where ℕ is the natural Arabic number of the illustration. The article must contain only this tag and nothing else, for example, <img1>. If two corresponding articles contain the <imgℕ> tag with the same number, the illustration will be added only once.
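The illustration rules can be sketched in Python as follows. This is not the Bilingual Formats implementation: the function names are hypothetical, and the sketch assumes that numbering starts at 1 with no leading zeros.

```python
import re

# Assumption: a valid tag is <img> plus a positive integer without leading zeros,
# and the tag is the only content of the article.
IMG_RE = re.compile(r"^<img([1-9][0-9]*)>$")


def image_number(article):
    """Return the illustration number if the article is a lone <imgN> tag, else None."""
    match = IMG_RE.match(article.strip())
    return int(match.group(1)) if match else None


def images_for_pair(article_a, article_b):
    """Collect illustration numbers for two corresponding articles, without repeats."""
    numbers = []
    for article in (article_a, article_b):
        number = image_number(article)
        if number is not None and number not in numbers:
            numbers.append(number)
    return numbers
```

When both corresponding articles are <img3>, the helper returns a single entry for illustration 3, matching the rule that the illustration is added only once.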
