jtauber / greek-accentuation


Python 3 library for accenting (and analyzing the accentuation of) Ancient Greek words

License: MIT License

Python 100.00%

greek-new-testament ancient-greek

greek-accentuation

Python 3 library for accenting (and analyzing the accentuation of) Ancient Greek words.

For more of my work on Ancient Greek, see http://jktauber.com/.

Installation

pip install greek-accentuation==1.2.0

Documentation

see docs.rst

Change Log

Changed in 1.2.0

on_penult will now return an oxytone rather than None if input only has one syllable

Fixed in 1.1.1

handle VVι cluster better

Added in 1.1.0

accentuation.persistent now supports default_short parameter

Fixed in 1.0.5

fixed calculation of coda when syllable is vowel+macron+smooth

Fixed in 1.0.4

syllabify.is_diphthong now works with uppercase letters (fixes a syllabification bug when capitalized word begins with diphthong)
syllabify.add_necessary_breathing now returns a NFKC normalized form (improving rebreath/debreath roundtripping)

Fixed in 1.0.3

possible_accentuations now correctly gives paroxytone as a possible accentuation when penult is long and length of ultima is indeterminate

Fixed in 1.0.2

fixed persistent accent placement where original accent needs to change from circumflex to acute

Fixed in 1.0.1

syllabify.add_necessary_breathing now works with uppercase initial vowels

New in 1.0.0

syllabify.debreath
syllabify.rebreath
syllabify.add_necessary_breathing can optionally add rough breathing
characters.add_breathing properly handles macrons
modules moved into greek_accentuation package
universal wheel build

0.9.9 removed some unnecessary code

0.9.8 add_necessary_breathing now properly handles initial vowels with iota subscripts

0.9.7 fixed another bug in macron + breathing + accent case

0.9.6 fixed a bug in macron + breathing + accent case

0.9.5 breathing is now considered part of the onset and syllabification now works on words containing macron + breathing + accent on the same vowel

0.9.4 fixed syllabification of words containing macron and acute on same vowel

0.9.3 improved onset, nucleus, coda, and syllabify in cases where input has no vowels.

0.9.2 fixed some edge-case bugs in syllable_morae and contonation and got doctest coverage to 100%.

0.9.1 slightly improved persistent accent calculation by falling back to recessive if out of syllables (rather than raising an exception).

New in 0.9

initial documentation
accentuation.display_accent_type
accentuation.get_accent_type
accentuation.on_penult
syllabify.contonation
syllabify.add_necessary_breathing
characters.strip_breathing
characters.strip_accents
characters.remove_redundant_macron
allow ~ to be used for unspecified vowel
allow | to be used as a wall the accent can't cross
allow treatment of final AI/OI length to be settable
added option to treat unmarked vowels as short by default

Previous Versions

0.8 fixed bug in nucleus/coda calculation
0.7 added make_proparoxytone function
0.6 fixed another bug where possible_accentuations wouldn't work with single syllable words
0.5 fixed bug where possible_accentuations wouldn't work with single syllable words
0.4.1 added classifiers for PyPI
0.4 handle explicit length markers on vowels
0.3 attempts to make a word perispomenon or properispomenon will fall back to oxytone and paroxytone respectively if first attempt fails
0.2 better handling of final αι/οι
0.1 initial release


greek-accentuation's Issues

dealing with an omega that is short for the purposes of accentuation

syllabify('ω̆ος') --> ['ω', '̆ος']

but

syllabify('εω̆ν') --> ['ε', 'ω̆ν']

ah! the short marker does not attach to the omega!
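
For context: Unicode has no precomposed omega with breve, so ω̆ is always two code points, ω followed by U+0306 COMBINING BREVE. A minimal standard-library sketch of keeping each base letter together with the marks that follow it is shown here. It is not the library's own code, and the helper name `base_with_marks` is hypothetical.

```python
import unicodedata


def base_with_marks(word):
    """Group each base character with the combining marks that follow it.

    Hypothetical helper for illustration; not part of greek-accentuation.
    """
    groups = []
    for ch in unicodedata.normalize("NFD", word):
        # combining() is non-zero for marks such as breve, accents, breathings
        if unicodedata.combining(ch) and groups:
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups


# omega + combining breve, then omicron and final sigma
units = base_with_marks("\u03c9\u0306\u03bf\u03c2")
```

With this grouping the breve stays in the same unit as the omega, so a later syllable split cannot fall between them.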

syllabify lines for poetry

I love this library: apart from being awesomely written, it is very useful in plenty of applications and teaching activities!
With a slight extension, the syllabify module could be used to teach Greek meter as well. I describe the suggested functionality and give the full explanation below. Apologies for the length of the post, but I think a bit of context might help.

In Greek poetry, syllables are scanned in a continuum called synapheia. In other words, the whole line, not the single word, is the string to syllabify. So for example Eur. Medea 1:

Εἴθ᾽ ὤφελ᾽ Ἀργοῦς μὴ διαπτάσθαι σκάφος

Becomes:

ειθωφελαργοῦςμηδιαπτασθαισκαφος

If syllabified with the current method, this long string in synapheia yields:

['ει', 'θω', 'φε', 'λαρ', 'γοῦς', 'μη', 'δι', 'α', 'πτα', 'σθαι', 'σκα', 'φος.']

It should be:

['ει', 'θω', 'φε', 'λαρ', 'γοῦς', 'μη', 'δι', 'απ', 'τασ', 'θαισ', 'κα', 'φος.']

The source of the problem is that the current module works (nicely!) with the ordinary scansion rules for single words: every consonant cluster that can occur at the onset of a word (e.g. 'σκ' in 'σκα') is grouped together. This doesn't apply in poetic scansion. The following is the list of valid consonant clusters in poetry:

        "δρ",
        "θλ", "θν", "θρ", "θμ",
        "κλ", "κν", "κρ",
        "πλ", "πν", "πρ",
        "τρ", "τμ", "τν",
        "φλ", "φρ",
        "χλ", "χρ"

By the way, Homer and the Attic tragedians use notoriously different rules for the consonant clusters muta cum liquida ("κλ", "κρ" etc.). Thus, it might be a sound design choice to turn this list into an argument that users may pass to the function... But that's an extra!
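
The proposal above could be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions, not the library's syllabifier: the names `normalize_line`, `syllabify_line` and `POETIC_CLUSTERS` are hypothetical, diacritics and punctuation are dropped before splitting (so the output has 'γους' and 'φος' rather than 'γοῦς' and 'φος.'), and a diaeresis is not honoured when detecting diphthongs.

```python
import unicodedata

VOWELS = set("αεηιουω")
DIPHTHONGS = {"αι", "ει", "οι", "υι", "αυ", "ευ", "ου", "ηυ"}

# Cluster list taken from the issue text; passed as an argument so that
# callers can swap in a different set (e.g. Homeric vs. Attic practice).
POETIC_CLUSTERS = frozenset({
    "δρ", "θλ", "θν", "θρ", "θμ", "κλ", "κν", "κρ", "πλ", "πν", "πρ",
    "τρ", "τμ", "τν", "φλ", "φρ", "χλ", "χρ",
})


def normalize_line(line):
    """Lowercase, then keep only the lowercase letters (synapheia)."""
    decomposed = unicodedata.normalize("NFD", line.lower())
    return "".join(c for c in decomposed if unicodedata.category(c) == "Ll")


def syllabify_line(line, clusters=POETIC_CLUSTERS):
    text = normalize_line(line)
    # Pass 1: locate every nucleus (single vowel or diphthong).
    nuclei = []
    i = 0
    while i < len(text):
        if text[i] in VOWELS:
            end = i + 2 if text[i:i + 2] in DIPHTHONGS else i + 1
            nuclei.append((i, end))
            i = end
        else:
            i += 1
    if not nuclei:
        return [text] if text else []
    # Pass 2: between two nuclei, the next syllable takes the last two
    # consonants if they form an allowed cluster, otherwise only the last.
    starts = [0]
    for (_, prev_end), (next_start, _) in zip(nuclei, nuclei[1:]):
        run = text[prev_end:next_start]
        if len(run) >= 2 and run[-2:] in clusters:
            onset_len = 2
        else:
            onset_len = min(len(run), 1)
        starts.append(next_start - onset_len)
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


medea_1 = syllabify_line("ειθωφελαργουςμηδιαπτασθαισκαφος")
```

Passing `clusters=frozenset()` makes every cluster split across the syllable boundary, which is one way to model an author who treats muta cum liquida as always making position.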

Grave accent

I needed functionality that changes the accent on an oxytone to grave (or adds a grave if the word has no accent) and that checks whether a word has a grave accent, so I added both to your code. I called the new function in the accentuation module make_varia. You can check the code on my fork of greek-accentuation and maybe add it to the library.
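
As an illustration of the idea only (this is not the code from the fork, and the function names here are hypothetical), a standard-library sketch of the acute-to-grave change and the grave check might look like this. It does not cover adding a grave to an unaccented word.

```python
import unicodedata

ACUTE = "\u0301"
GRAVE = "\u0300"
VOWELS = set("αεηιουω")


def has_grave(word):
    """True if any vowel in the word carries a grave accent."""
    return GRAVE in unicodedata.normalize("NFD", word)


def acute_to_grave_on_ultima(word):
    """Turn an acute on the final syllable into a grave.

    The accent counts as being on the ultima when no vowel letter follows
    the accented vowel. Words that are not oxytone are returned unchanged.
    """
    nfd = unicodedata.normalize("NFD", word)
    pos = nfd.rfind(ACUTE)
    if pos == -1:
        return word
    if any(ch in VOWELS for ch in nfd[pos + 1:].lower()):
        return word
    return unicodedata.normalize("NFC", nfd[:pos] + GRAVE + nfd[pos + 1:])


barytone = acute_to_grave_on_ultima("τη\u0301ν")
```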

Accenting Ἰάννης

I was testing out how this word would be accented, by adding this to docs.rst:

>>> persistent('Ἰάννης', 'Ἰάννης')
'Ἰάννης'

It returned something I didn't expect:

File "docs.rst", line 660, in docs.rst
Failed example:
    persistent('Ἰάννης', 'Ἰάννης')
Expected:
    'Ἰάννης'
Got:
    'Ἰά́ννης'

My initial impression is that this is a bug, but perhaps I am doing something wrong? 😀

Interestingly, the Byzantine Text accents this word as Ἰαννῆς. I'm not sure why.

How do I get rid of extra accents from enclitics: e.g., Μαῖράν

I don't see a strip/simplify accent routine. Am I missing this?
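
One possible approach, sketched with the standard library only: decompose the word and drop every accent mark after the first, on the assumption that the first accent is the word's own and any later one comes from the enclitic. The helper name `keep_first_accent` is hypothetical and not part of the library.

```python
import unicodedata

# combining grave, acute, and Greek perispomeni (circumflex)
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def keep_first_accent(word):
    """Remove every accent mark after the first one; keep breathings etc."""
    kept = []
    seen_accent = False
    for ch in unicodedata.normalize("NFD", word):
        if ch in ACCENTS:
            if seen_accent:
                continue  # extra accent, e.g. one thrown back by an enclitic
            seen_accent = True
        kept.append(ch)
    return unicodedata.normalize("NFC", "".join(kept))


simplified = keep_first_accent("Μαι\u0342ρα\u0301ν")
```

Here `simplified` keeps the circumflex on the penult and loses the acute on the ultima.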

accenting ὕβρεως rather than *ὑβρέως

The omega is specifically marked as short: εω̆ς -- so the accentuation routine should pick up on that.

consider containing package

I've often regretted that the three modules in greek-accentuation don't have a containing package.

What should it be called though? greek_accentuation?

This would mean

characters would become greek_accentuation.characters
accentuation would become greek_accentuation.accentuation
syllabify would become greek_accentuation.syllabify

Any other name suggestions? The second seems a little awkward.

Might be a good final step before declaring 1.0.0 😄

Grave accent (again)

I am getting error messages when I try to ask for the accent on a word with a grave accent.

>>> from greek_accentuation.characters import *
>>> from greek_accentuation.syllabify import *
>>> from greek_accentuation.accentuation import *
>>> display_accentuation(get_accentuation('τὴν'))
Traceback (most recent call last):
  File "", line 1, in
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/greek_accentuation/accentuation.py", line 68, in display_accentuation
    return accentuation.name.lower()
AttributeError: 'NoneType' object has no attribute 'name'
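
A possible workaround, sketched with the standard library: since a grave stands where an acute would otherwise be, the grave can be converted to an acute before the word is classified. The helper name `grave_to_acute` is hypothetical, and whether get_accentuation then reports the expected type for the converted word is an assumption that is not verified here.

```python
import unicodedata


def grave_to_acute(word):
    """Replace every combining grave with a combining acute."""
    nfd = unicodedata.normalize("NFD", word)
    # NFC recomposes the acute to the tonos code points (e.g. U+03AE for eta)
    return unicodedata.normalize("NFC", nfd.replace("\u0300", "\u0301"))


converted = grave_to_acute("τη\u0300ν")
```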

Syllabify rules

I think you are missing some values in your is_diphthong and is_valid_consonant_cluster to give better results.
Edit: I just realized this is for ancient Greek and my use case is modern Greek, so I am not sure my changes will apply.
I also wrote a bad test to make sure I follow the consensus on syllabifying.

Let me know if you want a pull request, or I can just paste them here.

a "strip accent" function?

I want to create a pedagogical drill where I strip accents and then ask students to reaccent the words.

I could not find a "strip accent" function. Am I missing something? It's not hard to write, but I want to make sure I am not overlooking one that already exists!
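
The change log lists characters.strip_accents under 0.9, which may already cover this. Independently of the library, the drill could be sketched with the standard library alone. The names `strip_accent_marks` and `is_correct` below are hypothetical, and the sketch removes only acute, grave and circumflex while keeping breathings, diaeresis and iota subscript.

```python
import unicodedata

# combining grave, acute, and Greek perispomeni (circumflex)
ACCENT_MARKS = {"\u0300", "\u0301", "\u0342"}


def strip_accent_marks(word):
    """Return the word without accents, leaving other diacritics in place."""
    nfd = unicodedata.normalize("NFD", word)
    bare = "".join(ch for ch in nfd if ch not in ACCENT_MARKS)
    return unicodedata.normalize("NFC", bare)


def is_correct(answer, original):
    """Compare a student's answer with the original, ignoring encoding form.

    Normalizing both sides makes tonos and oxia code points compare equal.
    """
    nfc = unicodedata.normalize
    return nfc("NFC", answer) == nfc("NFC", original)


prompt = strip_accent_marks("α\u0313\u0301νθρωπος")
```

The drill would show `prompt` to the student and pass the typed answer together with the original word to `is_correct`.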

document the core API

The library has a lot of functions which are obscure, experimental or mostly intended for internal use.

I'll establish what I think (with input from others if they so choose) the "core" API is and document that.

add_necessary_breathing can get UnboundLocalError

Calling add_necessary_breathing on input with no vowels, such as 'h', gives

UnboundLocalError: local variable 'last_vowel' referenced before assignment

Incorrect syllabification of αυῖ (e.g. μεμαυῖα)

It appears the syllabifier sees αυ and decides it's a diphthong without looking ahead: αυῖ is treated as a single syllable instead of α-υῖ. There may be a few instances where αυ is indeed a diphthong in this position, but if so they are rare.

Further info:
Example code:
In [3]: word="μεμαυῖα"
In [4]: syllabify(word)
Out[4]: ['με', 'μαυῖ', 'α']
(Should be: ['με', 'μα', 'υῖ', 'α'])

Same problem with rarer αυί
Same problem with ηυῖ (e.g. πεπτηυῖαν)
Same behaviour with ευῖ, ουῖ, but there the fix is different (almost always should be ευ-ῖ, ου-ῖ).

Thanks for the tool, hopefully this will be a simple fix :-)
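
The heuristic described in this report can be written down as a small standard-library sketch. It only encodes the rule as stated above (αυ/ηυ before accented ι split as α-υι, ευ/ου before accented ι split as ευ-ι), it is not how the library resolves the case, and the function name `split_v_upsilon_iota` is hypothetical.

```python
import unicodedata

# combining grave, acute, and Greek perispomeni (circumflex)
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def split_v_upsilon_iota(run):
    """Split a vowel + bare υ + accented ι run per the reported heuristic.

    Any other input is returned unsplit as a one-element list.
    """
    # Group base letters with their combining marks.
    units = []
    for ch in unicodedata.normalize("NFD", run):
        if unicodedata.combining(ch) and units:
            units[-1] += ch
        else:
            units.append(ch)
    is_target = (
        len(units) == 3
        and units[1] == "υ"  # upsilon without any diacritic
        and units[2][0] == "ι"
        and any(mark in ACCENTS for mark in units[2][1:])
    )
    if not is_target:
        return [run]
    first = units[0][0]
    if first in "αη":
        parts = [units[0], units[1] + units[2]]  # α-υῖ, η-υῖ
    elif first in "εο":
        parts = [units[0] + units[1], units[2]]  # ευ-ῖ, ου-ῖ
    else:
        return [run]
    return [unicodedata.normalize("NFC", part) for part in parts]


alpha_case = split_v_upsilon_iota("αυι\u0342")
```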

display_accentuation(get_accentuation('ἣ')) -- eta rough breathing and grave -- throws an error

>>> display_accentuation(get_accentuation('ἣ'))
Traceback (most recent call last):
  File "", line 1, in
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/greek_accentuation/accentuation.py", line 68, in display_accentuation
    return accentuation.name.lower()
AttributeError: 'NoneType' object has no attribute 'name'

Test suite

Would be cool to have a test suite.

(This is a selfish feature request, as I'm working on something similar in JavaScript and I want to validate it also.)