greek-reader's Introduction

greek-reader

Python 3 tool for generating (initially Biblical) Greek readers

Example from John 18:1-11. The steps to produce this are listed below under A More Extended Example.

Background

MorphGNT and my Morphological Lexicon aren't yet rich enough to produce the kind of readers I've long wanted to make, much less the larger vision of a New Kind of Graded Reader. But Brian Renshaw's (presumably manually produced) Greek Readers (e.g. A Good Friday Greek Reader) have inspired me to at least put together a tool that shows what's possible now and then build on it.

Of course, this isn't the first time I've written code to generate documents from my Greek New Testament databases. This year marks the 20th anniversary of my Index to the Greek New Testament which was the first major project I undertook based on MorphGNT.

What I'm initially putting together here is a Python 3 library and command-line script driven by text files. Eventually, I'll make a website out of this so the majority of the target audience can actually use it :-)

For other Greek projects of mine, see http://jktauber.com/.

Requirements

As well as Python 3, you'll need to install the packages in requirements.txt via pip (preferably in a virtualenv).

XeTeX is required as the current output of my scripts is LaTeX with Unicode (although I do plan to support other backends eventually). On OS X, I use the MacTeX distribution.

How to Use

Quick Start

Assuming you've installed the requirements, you can just type:

./reader.py "John 18:1-11" > reader.tex

You can then run:

xelatex reader.tex
xelatex reader.tex

The two rendering passes ensure that the footnotes are properly numbered.

Note that the resulting reader.pdf will footnote every word with its lemma from MorphGNT and, in the case of verbs, will include parsing codes. No glosses will be included.

Excluding Words

If you want to exclude certain words (for example, very common words) from being annotated, you can pass an --exclude option to reader.py, giving the name of a file which simply lists the lemmas to exclude. For example:

αὐτός
καί
ὁ

You can easily generate such a file for any words occurring at least N times by running frequency_exclusion.py with N as an argument. For example, to create an exclusion file with any words occurring 31 times or more, run:

./frequency_exclusion.py 31 > exclude31.txt

and then run ./reader.py with --exclude exclude31.txt.

Note that you can make edits to the file after running frequency_exclusion.py to tailor the exclusion list to your needs.
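
The idea behind a frequency-based exclusion list can be sketched in a few lines of Python. This is only an illustration, not the code in frequency_exclusion.py: the function name frequent_lemmas and the toy lemma list are assumptions, and a real run would count lemmas across the whole of MorphGNT.

```python
from collections import Counter


def frequent_lemmas(lemmas, min_count):
    """Return the lemmas occurring at least min_count times, most frequent first."""
    counts = Counter(lemmas)
    return [lemma for lemma, count in counts.most_common() if count >= min_count]


# Toy data standing in for the lemma column of MorphGNT (hypothetical).
sample = ["ὁ", "καί", "ὁ", "θήκη", "ὁ", "καί"]
exclusions = set(frequent_lemmas(sample, 2))

# Only words whose lemma is not excluded would receive a footnote.
annotated = [lemma for lemma in sample if lemma not in exclusions]
print(annotated)
```

Holding the exclusions in a set keeps the per-word membership check cheap, which matters once the list covers hundreds of lemmas.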

Adding Glosses

If you want to provide glosses, you can pass a --glosses option to reader.py with the name of a YAML file that maps each lemma to a default gloss and possibly per-verse overrides. A file with just default (i.e. global) glosses might look like this:

ἀποκόπτω:
    default: cut off
ἕλκω:
    default: draw
θήκη:
    default: sheath
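
As a rough sketch of how such a mapping might be consulted, the snippet below works on the parsed form of a glosses file written out as a plain Python dict. The helper gloss_for is hypothetical, and so is the "John 18:10" override key: the format of per-verse override keys is not shown here, so check a file produced by make_glosses.py before relying on it.

```python
def gloss_for(glosses, lemma, verse=None):
    """Return the verse-specific gloss if present, else the default, else None."""
    entry = glosses.get(lemma)
    if entry is None:
        return None
    return entry.get(verse, entry.get("default"))


# ASSUMPTION: the "John 18:10" key and its gloss are invented for illustration;
# the real per-verse override format may differ.
glosses = {
    "ἀποκόπτω": {"default": "cut off"},
    "ἕλκω": {"default": "draw", "John 18:10": "draw (a sword)"},
}

print(gloss_for(glosses, "ἕλκω", "John 18:10"))  # verse-specific override
print(gloss_for(glosses, "ἕλκω", "John 21:6"))   # falls back to the default
print(gloss_for(glosses, "θήκη"))                # lemma not in the mapping
```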

You can auto-generate an initial gloss file based on John Jeffrey Dodson's public domain lexicon (via lexemes.yaml in this repo) using make_glosses.py, which takes a verse range argument just like reader.py, as well as an --exclude option.

If you want to extend an existing glosses file you can pass its name in using the --existing option. This is useful if you've already made edits to the file and you don't want to lose them when expanding the coverage of the file to more verses (or fewer exclusions).
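
The effect of --existing can be pictured as a merge in which entries already in the file take precedence over freshly generated ones. The following is a minimal sketch of that idea under the assumption that both files are lemma-keyed mappings; extend_entries is a hypothetical name and the scripts themselves may do the merge differently.

```python
def extend_entries(existing, generated):
    """Combine two lemma-keyed mappings without overwriting existing entries."""
    merged = dict(generated)
    merged.update(existing)  # hand-edited entries override generated ones
    return merged


# Hypothetical data: one hand-edited entry and two freshly generated ones.
existing = {"ἕλκω": {"default": "draw (a sword)"}}
generated = {"ἕλκω": {"default": "draw"}, "θήκη": {"default": "sheath"}}

merged = extend_entries(existing, generated)
print(merged["ἕλκω"]["default"])
print(sorted(merged))
```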

When typesetting a reader with glosses, specify the gloss language when you generate the reader file. If no value is given it defaults to English, but if your glosses are in another language, specify it with the --language option. Languages are given as three-letter ISO 639-3 codes (e.g. --language rus for Russian or --language spa for Spanish).

Overriding Headwords

If you want to provide more detailed headwords (such as the article or adjective endings) you can pass a --headwords option to reader.py with the name of a YAML file that maps each lemma you want to override with the full headword you want to use instead. For example:

θήκη: θήκη, ης, ἡ
Κεδρών: Κεδρών, ὁ
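
In Python terms, an override file like this amounts to a dictionary lookup that falls back to the bare lemma. The sketch below assumes the file has already been parsed into a dict; headword_for is a hypothetical helper, not a function from this repo.

```python
def headword_for(lemma, overrides):
    """Return the overriding headword for a lemma, or the lemma itself."""
    return overrides.get(lemma, lemma)


# The two overrides from the example file, as a parsed mapping.
overrides = {
    "θήκη": "θήκη, ης, ἡ",
    "Κεδρών": "Κεδρών, ὁ",
}

print(headword_for("θήκη", overrides))  # overridden headword
print(headword_for("ἕλκω", overrides))  # no override, so the lemma is used
```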

You can run make_headwords.py to generate headword overrides for nouns and adjectives based on Danker's Concise Lexicon (via the lexemes.yaml file). make_headwords.py takes a verse range argument just like reader.py as well as an --exclude option.

If you want to extend an existing headword file you can pass its name in using the --existing option. This is useful if you've already made edits to the file and you don't want to lose them when expanding the coverage of the file to more verses (or fewer exclusions).

Changing Typeface

The default typeface is now Times New Roman but you can change this by passing a --typeface option to reader.py.

A More Extended Example

Here is how you might typically use the tools:

./frequency_exclusion.py 31 > example/exclude.txt
./make_glosses.py \
    --exclude example/exclude.txt \
    "John 18:1-11" > example/glosses.yaml
# edit example/glosses.yaml to your liking
./make_headwords.py \
    --exclude example/exclude.txt \
    "John 18:1-11" > example/headwords.yaml
./reader.py \
    --headwords example/headwords.yaml \
    --glosses example/glosses.yaml \
    --language eng \
    --exclude example/exclude.txt \
    --typeface "Skolar PE" \
    --backend backends.LaTeX \
    "John 18:1-11" > example/reader.tex
cd example
xelatex reader.tex
xelatex reader.tex
open reader.pdf

You can see the results of this in the example directory.

Alternative Backends

A --backend option can be provided to reader.py to use an alternative backend. This option takes a module-qualified Python class name. As well as the default backends.LaTeX, there is an experimental backends.SILE for the SILE Typesetter and a backends.MARKDOWN for Markdown, which is most useful with Markdown processors that support footnotes, such as GitHub.

./reader.py --backend backends.SILE "John 18:1-11" > reader.sil
sile reader.sil
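
A module-qualified class name such as backends.LaTeX can be turned into a class object with the standard library's importlib. The snippet below is a generic sketch of that technique, not necessarily how reader.py does it; load_class is a hypothetical helper, and it is demonstrated on a standard-library class so that it runs outside this repo.

```python
import importlib


def load_class(qualified_name):
    """Resolve a dotted "module.ClassName" string to the class object."""
    module_name, _, class_name = qualified_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Inside this repo the equivalent call would be load_class("backends.LaTeX");
# a standard-library class is used here so the example is self-contained.
cls = load_class("collections.OrderedDict")
print(cls.__name__)
```

Resolving the class by name is what allows a custom backend to be used without changing reader.py, as long as its module is importable.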

greek-reader's People

Contributors

alerque, jtauber, lucafavatella, ryderwishart, thejoshmuller, willf

greek-reader's Issues

py-sblgnt version?

In make_headwords, I am getting:

File "./make_headwords.py", line 39, in <module>
  for entry in get_morphgnt(verses, args.sblgnt_dir):
TypeError: get_morphgnt() takes 1 positional argument but 2 were given

If I remove the second argument I get:

File "./make_headwords.py", line 41, in <module>
  lexeme = entry[8]
IndexError: tuple index out of range

which makes me suspect that this is referring to a different (unreleased?) sblgnt version than the one which pip just installed for me.

Deduplicate footnote entries

It would be super nice if the footnotes didn't contain duplicates. Is that something that can be done easily?

If you'd like, I can have a look at this.
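
One possible approach, sketched here purely as an illustration, is to footnote a lemma only the first time it appears. The function name and the (form, lemma) pairs are hypothetical, and this deduplicates across the whole passage rather than per page, since page breaks are not known before typesetting.

```python
def first_occurrences(words):
    """Yield (form, needs_footnote), flagging each lemma only on first sight."""
    seen = set()
    for form, lemma in words:
        yield form, lemma not in seen
        seen.add(lemma)


# Hypothetical (surface form, lemma) pairs.
words = [("τὴν", "ὁ"), ("μάχαιραν", "μάχαιρα"), ("τὴν", "ὁ"), ("θήκην", "θήκη")]
flags = [needs_footnote for _, needs_footnote in first_occurrences(words)]
print(flags)
```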

error at "lexeme = entry[8]" (line 66 in reader.py)

When I ran

./reader.py --sblgnt ../sblgnt-tisch-merge "Mark 1:1"

(in a clean virtualenv with python3), I got this error:

Traceback (most recent call last):
  File "./reader.py", line 66, in <module>
    lexeme = entry[8]
IndexError: tuple index out of range

Looking at the sblgnt lines, each entry isn't long enough; changing the line to

lexeme = entry[7]

seems to work: after installing MacTeX, I generated a PDF of John 18:1 that matches the SBLGNT content.

Excited to get this working! Thanks for making it available.

Single column footnotes

Is this something that's easily done? I tried looking at the tex output but I'm not smart enough.

The text is all over the place. If not single column, add more space in between?

frequency_exclusion.py out of sync with utils.py

Traceback (most recent call last):
  File "frequency_exclusion.py", line 7, in <module>
    from utils import morphgnt_filename, print_status
ImportError: cannot import name 'morphgnt_filename'

I see it was in the previous version of utils, but not in the July 24th version. Maybe utils just needs to import morphgnt_filename from pysblgnt as well?

package up the headwords and glosses from morphological-lexicon for independent use

At the moment if you want to generate glosses and headwords automatically, you need the morphological-lexicon in an adjacent directory. py-sblgnt got rid of the need to have sblgnt in an adjacent directory; we need something similar for the headwords and glosses (doesn't have to be any other part of the morphological-lexicon mega-project)

Option to exclude proper nouns

Of the words that show up fewer than 30 times in the GNT, 520 are proper nouns (around half of which are hapax legomena). An option to exclude these would save space in exported readers, and could be implemented easily by referencing a pre-existing list (like this one, for example).
