jtauber / greek-inflexion

Python library for generating (and analyzing) Ancient Greek inflectional paradigms

License: MIT License

Python 99.11% Shell 0.51% CSS 0.38%

greek-new-testament ancient-greek morphological-analysis

greek-inflexion's Introduction

greek-inflexion

A Python 3 library for generating (and analyzing) Ancient Greek inflectional paradigms.

greek-inflexion builds on my generic inflexion library, adding a stem database and ending rules for Ancient Greek along with accentuation logic built on top of my greek-accentuation library.

It can precisely generate (i.e. without over-generation) all the forms in the verbal paradigms in Louise Pratt's The Essentials of Greek Grammar, Helma Dik's Nifty Greek Handouts, and Keller and Russell's Learn to Read Greek. It can also generate the nouns in Pratt.

For each generated form, it can show the stem, ending, and morphophonological (sandhi) rule applied.

Entire paradigms can be generated at once in the same YAML format used for tests.

The library can also parse forms whose information is in the given lexicon or conjecture possible stem information if not.

For more of my work on linguistics and Ancient Greek, see http://jktauber.com/.

Documentation

To run the full data tests from Pratt, Dik, and Keller and Russell, just run ./data_test.py.

For the noun data tests, run ./noun_data_test.py.

See examples.rst for individual usage examples of the library.

TODO

Most of these are partially done elsewhere and I'm in the process of cleaning them up and moving them into this repo.

reduction of repetition in ending rules
better tools for analysis of forms
better stem shape heuristics when conjecturing stems
better stem conjecture when multiple forms available
richer stem database from principal parts lists
support for more nominal forms

greek-inflexion's Issues

some verb conjugations that seem to give incorrect or incomplete results

The following code seems to produce incorrect or incomplete results:

from greek_inflexion import GreekInflexion
inflexion = GreekInflexion('stemming.yaml', 'STEM_DATA/homer_lexicon.yaml')
print(inflexion.generate('λύω', 'AAI.3S').keys())
# ... fails
print(inflexion.generate('βάλλω', 'AAI.3S').keys())
# ... fails
print(inflexion.generate('βαίνω', 'AAD.2S').keys())
# ... only gives βῆθι, should also give βῆσον (Iliad 8.285)

stemming.yaml: PAP.DPF rule and δίδωμι

I am using this excellent library to generate paradigms for various verbs. I am using the lexicon in STEM_DATA/morphgnt_lexicon.yaml, and stemming.yaml from the root of the repo.
Generating the form of δίδωμι with the parse PAP.DPF produces this:

διδο{athematic}ούσαις

It would obviously generate the wrong form if I just removed the {athematic} substring.

Looking in stemming.yaml, I find this rule commented out for PAP.DPF:

"|ο{athematic}>ού<_|σαις"

If I uncomment this rule, then the form becomes:

διδούσαις

which is, of course, correct.
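
One way to read the rule above can be sketched in Python. This assumes the rule string has the shape A|B>C<D|E, where a stem ending in A+B has B replaced by C before the rest of the ending, E, is appended. That reading, the helper name apply_rule, and leaving the D part (here `_`) uninterpreted are all assumptions made for illustration, not the library's actual implementation.

```python
def apply_rule(stem, rule):
    """Apply a rule of the assumed shape A|B>C<D|E to a stem.

    Returns the surface form, or None if the stem does not end in A+B.
    The D part is parsed but deliberately not interpreted in this sketch.
    """
    a, bcd, e = rule.split("|")
    b, cd = bcd.split(">")
    c, _d = cd.split("<")
    if not stem.endswith(a + b):
        return None
    # Drop B from the end of the stem, then add the replacement and ending.
    base = stem[: len(stem) - len(b)] if b else stem
    return base + c + e


print(apply_rule("διδο{athematic}", "|ο{athematic}>ού<_|σαις"))
```

Under these assumptions the {athematic} marker is consumed as part of B, which is why simply deleting the marker from the output would not give the same result.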

I am not sure what other verbs would be affected by this change -- surely some others. I have not yet checked whether this can be applied without negative effects.

Hope this helps, and many thanks to all contributors for all your hard work.

Best wishes,

Ulrik Sandborg-Petersen

morphgnt_lexicon.yaml: unattested 3- stems of τίθημι

Hi,

In STEM_DATA/morphgnt_lexicon.yaml, I found this line:

3-: θησ/θε{athematic}

I am guessing this is intended to say that there are two stems for certain aorist forms, delimited by the slash.
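
If that guess is right, the entry can be read as a key followed by slash-separated alternative stems. A minimal sketch of that reading follows; the helper name split_stems and the interpretation of the slash come from the guess above, not from confirmed library behaviour.

```python
def split_stems(entry):
    """Split a raw entry like 'KEY: stem1/stem2' into (key, [stem1, stem2])."""
    key, _, value = entry.partition(":")
    # Assumption: '/' separates alternative stems for the same key.
    stems = [s.strip() for s in value.split("/") if s.strip()]
    return key.strip(), stems


key, stems = split_stems("3-: θησ/θε{athematic}")
print(key, stems)
```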

I haven't found any aorist forms in the Greek New Testament (Nestle 1904, to be exact) where either stem is attested. Am I misinterpreting this line?

If not, should it be deleted from the morphgnt_lexicon.yaml?

Many thanks.

Best wishes,

Ulrik Sandborg-Petersen

documentation suggestion: basic instructions for getting it running

This looks really nice -- thanks for making it open source!

I would suggest adding the following basic installation instructions to the README:

git clone https://github.com/jtauber/greek-inflexion
pipenv install inflexion [... or something like pipenv --python /bin/python3 if that doesn't work]
pipenv shell
./data_test.py
exit

If I'm understanding correctly, then the inflexion and greek-accentuation libraries are automagically located and pulled in from git or something (although it's not obvious to me how it knows) -- it might be helpful to say this in the README, because from reading it, I was under the impression that I would have to go to some other repo and separately install these dependencies.

inflexion.conjugate(): {root} in output for unattested forms

For some forms, inflexion.conjugate() includes the string "{root}". I'm not sure, but this looks like it may happen when there is no attested root in the lexicon.

For instance:

inflexion.conjugate("ἀνίστημι", "AAN", "PAN", "PAD", "AMD", "PAI", "AAI", "PAP", "AAP", tags={"final-nu-aai.3s"})

is correct for most forms, but has {root} for some of the AMD forms:

AMD.2S: ἀνάστα{root}αι/ἀνάστησαι
AMD.3S: ἀναστα{root}άσθω/ἀναστησάσθω
AMD.2P: ἀναστά{root}ασθε/ἀναστήσασθε
AMD.3P: ἀναστα{root}άσθων/ἀναστησάσθων
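
A quick way to spot such forms is to scan the output for any leftover {...} marker. The sketch below works on the printed shape shown above (a parse key mapped to '/'-separated alternatives); that data shape and the helper name unresolved_forms are assumptions, since the actual return type of conjugate() is not shown here.

```python
import re

# Matches any leftover marker such as {root} or {athematic}.
MARKER = re.compile(r"\{[^{}]*\}")


def unresolved_forms(paradigm):
    """Return {key: [forms]} for forms that still contain a {...} marker.

    `paradigm` maps a parse key to a string of '/'-separated alternatives.
    """
    flagged = {}
    for key, forms in paradigm.items():
        bad = [form for form in forms.split("/") if MARKER.search(form)]
        if bad:
            flagged[key] = bad
    return flagged


paradigm = {
    "AMD.2S": "ἀνάστα{root}αι/ἀνάστησαι",
    "AMD.3S": "ἀναστα{root}άσθω/ἀναστησάσθω",
    "AMD.2P": "ἀναστά{root}ασθε/ἀναστήσασθε",
    "AMD.3P": "ἀναστα{root}άσθων/ἀναστησάσθων",
}
for key, forms in unresolved_forms(paradigm).items():
    print(key, forms)
```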

Usage Question

I suspect that I'm simply not using the library correctly, but when I run

inflexion.parse("ἀπεχόμενος")

I get:

set()

I thought it was a normalisation issue but it doesn't seem to be. What am I doing wrong?
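
To rule normalisation in or out, one option is to try the form in both NFC and NFD, since precomposed and decomposed accents look identical on screen. The sketch below uses only the standard library; normalization_variants is a hypothetical helper, and which form the lexicon expects is not stated here.

```python
import unicodedata


def normalization_variants(form):
    """Return the NFC and NFD versions of a form, keyed by name."""
    return {name: unicodedata.normalize(name, form) for name in ("NFC", "NFD")}


word = "ἀπεχόμενος"
for name, variant in normalization_variants(word).items():
    # Show whether the input was already in this form, and its code points.
    codepoints = " ".join(f"U+{ord(ch):04X}" for ch in variant)
    print(name, variant == word, codepoints)
```

Each variant could then be passed to inflexion.parse() in turn. If both return an empty set, the cause is more likely missing stem data than encoding.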

keyerror, line 109 fileformat.py

Thanks a lot for writing this software. I was getting ready to write something similar, and I'm glad you beat me to the punch. However, so far I've noticed one bug, in line 109 of fileformat.py: when the partnum is noun, there is no key called noun in partnum_to_key_regex.

I want to complete this project

I've been reading over this code for about 5 hours now and I can't figure out what I'm supposed to put in for the second argument

inflexion.conjugate("λύω", "PAI", "AAI", tags={"final-nu-aai.3s"})

It goes into this function

def conjugate(self, lemma, *TVMs, tags=None):

Where is a list of all the possible TVMs?

You never state what that is.
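
Judging from the keys that appear in these issues (AAI.3S, AAD.2S, PAP.DPF), a TVM looks like a three-letter tense-voice-mood code. The sketch below decodes such codes assuming MorphGNT-style letters. The tables and the helper name describe_tvm are assumptions, not taken from the library source; the keys that can actually be generated are presumably those that stemming.yaml has rules for.

```python
# Assumed MorphGNT-style code letters; not taken from the library itself.
TENSE = {"P": "present", "I": "imperfect", "F": "future",
         "A": "aorist", "X": "perfect", "Y": "pluperfect"}
VOICE = {"A": "active", "M": "middle", "P": "passive"}
MOOD = {"I": "indicative", "S": "subjunctive", "O": "optative",
        "D": "imperative", "N": "infinitive", "P": "participle"}


def describe_tvm(tvm):
    """Spell out a three-letter tense-voice-mood code such as 'AAI'."""
    if len(tvm) != 3:
        raise ValueError(f"expected a three-letter code, got {tvm!r}")
    tense, voice, mood = tvm
    return f"{TENSE[tense]} {VOICE[voice]} {MOOD[mood]}"


for code in ("PAI", "AAI", "AAN", "PAD", "AMD", "PAP"):
    print(code, "=", describe_tvm(code))
```

Not every combination of letters corresponds to a real paradigm (there is no future imperative, for example), so this decodes codes rather than listing valid ones.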