anna-hope / phonemes

Jason Riggle's chart of phonological features in JSON format + extras

License: MIT License

Python 100.00%

linguistics phonetics phonology phonemes phonological-features ipa-symbols computational-linguistics

phonemes's Introduction

This repository hosts a representation of Jason Riggle's chart of phonological features version 12.12 in a machine-readable JSON format.

The keys in the JSON file are the phonemes' IPA symbols. The values are their English-language name and the binary features from the chart linked above (see the JSON file for an example).
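As a rough illustration of this layout, here is a minimal sketch of reading one entry. The "name" and "features" keys follow the entry shown in the ARPABET issue further down; the numeric encoding (1 for +, -1 for -, 0 for unspecified) and the abbreviated feature set are assumptions made for this example, not copied from the JSON file.

```python
import json

# Hypothetical excerpt of the JSON file. The numeric value encoding
# (1 = +, -1 = -, 0 = unspecified) is an assumption for illustration.
SAMPLE = """
{
  "ð": {
    "name": "voiced dental fricative",
    "features": {"cons": 1, "son": -1, "syl": -1, "coronal": 1,
                 "voice": 1, "cont": 1, "nasal": -1}
  }
}
"""


def positive_features(entry):
    """Return the names of the features marked positive in one phoneme entry."""
    return [name for name, value in entry["features"].items() if value > 0]


phonemes = json.loads(SAMPLE)
entry = phonemes["ð"]
print(entry["name"])
print(positive_features(entry))
```

If the real file encodes values differently (for example as "+" and "-" strings), only the comparison in `positive_features` would need to change.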

This repository also provides a script (phonemeviewer.py) that lets you view the features of a phoneme from the provided JSON file and see which positive (+ or ±) features it has. When given a list of phonemes, the script also calculates the 'similarity' between them and lists the features that every phoneme in the list shares (if any).

Example usage:

> ð
voiced dental fricative
+cons
-son
-syl
-labial
+coronal
+ant
-dist
+dorsal
-pharyngeal
+voice
-SG
-CG
+cont
-strident
-lateral
-del_rel
-nasal

> ð ʃ
0.782608695652174
-del_rel
+coronal
-son
-syl
-pharyngeal
-CG
-labial
-SG
-lateral
+cont
-nasal
+cons
['coronal', 'cont', 'cons']
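The formula behind the similarity score is not spelled out here. One plausible reading, sketched below strictly as an assumption and not as the script's actual method, is the fraction of features on which all the given phonemes agree. `shared_features` and `similarity` are hypothetical helper names, and the feature values are a small illustrative subset rather than the full chart.

```python
def shared_features(feature_sets):
    """Return {feature: value} for features that have the same value in every set."""
    first, *rest = feature_sets
    return {
        name: value
        for name, value in first.items()
        if all(other.get(name) == value for other in rest)
    }


def similarity(feature_sets):
    """Fraction of all listed features on which every phoneme agrees (assumed metric)."""
    all_names = set().union(*(fs.keys() for fs in feature_sets))
    return len(shared_features(feature_sets)) / len(all_names)


# Illustrative subset of features; 1 = +, -1 = - (assumed encoding).
eth = {"cons": 1, "son": -1, "coronal": 1, "cont": 1, "voice": 1, "ant": 1}
esh = {"cons": 1, "son": -1, "coronal": 1, "cont": 1, "voice": -1, "ant": -1}

common = shared_features([eth, esh])
print(similarity([eth, esh]))
for name, value in common.items():
    print(("+" if value > 0 else "-") + name)
# Names of the features that are shared and positive
print([name for name, value in common.items() if value > 0])
```

With this toy subset the two phonemes agree on four of six features. The real script works over the full feature set, so its numbers will differ.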

The script was written for Python 3.4+, but will probably run on Python 3.3 if the backported enum package from 3.4 (enum34 on PyPI) is installed. You may use http://ipa.typeit.org/full/ to type the IPA symbols into the script.

Please report any inconsistencies you may find between the JSON file and Jason Riggle's chart, or, better yet, please fix them and create pull requests. Please do not report any errors you may find in Jason Riggle's chart here. Instead, send your comments directly to the author; there is a non-zero chance of him responding to them. Additionally, please let me know if this repository references an outdated version of Jason Riggle's chart.

Big thanks to Rafael Abramovitz for helping me with the creation of a machine-readable version of the chart.


phonemes's Issues

/z/ shouldn’t be +dorsal

From what I gather from the PDF, /z/ shouldn’t be dorsal, as /s/ also isn’t.

Thanks, btw :)

/ə/ should be neither +tense nor -tense

Currently, /ə/ (mid central unrounded vowel) collides with /ɛ/ (low-mid front unrounded vowel). I think there's a couple of things going on here:

- There should probably be an explicit front feature, so that the /ɛ/, /ɜ/, /ʌ/ trio would be +front, -back; -front, -back; and -front, +back, respectively. But the file doesn't currently have /ɜ/ anyway, and this seems like a bigger change.
- As I read Riggle's chart, /ɘ/, /ə/, and /ɜ/ are distinguished in that /ɘ/ is +tense, /ɜ/ is -tense, and /ə/ is neither.

(The file doesn't currently have /ɘ/ either, but again, bigger change.)

I propose setting /ə/ to "tense": 0, which will distinguish it from /ɛ/ and perhaps make it easier to add more vowels later.

/œ/ (low-mid front rounded vowel) should be +round

/p b bʰ f v ɸ β/ shouldn’t be +coronal

Likewise, from what I gather from the PDF, /p b bʰ f v ɸ β/ shouldn’t be coronal.

Chart linkrot

The phonological chart is no longer at https://dl.dropboxusercontent.com/u/5956329/Riggle/PhonChart_v1212.pdf, and https://dl.dropboxusercontent.com/u/5956329/Riggle/_Riggle_CV.pdf is gone as well :/

palatals, alveolopalatals, and palatalized alveolars

/ɕ/ and /ʃʲ/ are both given as “voiceless alveolo-palatal sibilant” with the same features. From /ʃ/, it seems as though /ʃʲ/ should be “palatalized voiceless palato-alveolar sibilant”, although I admit I'm not sure how I'd hear (or produce) the difference. (/ʃ/ is also -strident — maybe /ʃʲ/ should be as well?)

It looks like you're modeling palatalization as +dorsal, +high, is that correct? This leads to a few pairs that can't be distinguished by features:

1. /dʲ/ (palatalised voiced alveolar stop) and /ɾʲ/ (palatalised alveolar flap)
   - /d/ and /ɾ/ differ only in that /d/ is +dorsal
2. /ç/ (voiceless palatal fricative) and /ɕ/ (voiceless alveolo-palatal sibilant)
   - plus also /ʃʲ/, although as noted above I suspect it should be -strident
3. /ʝ/ (voiced palatal fricative) and /ʑ/ (voiced alveolo-palatal sibilant)
   - voiced counterparts of /ç/ and /ɕ/

For (1), I'm not finding a good canonical explanation in distinctive feature terms of /d/ vs /ɾ/. I found some lecture notes from Michigan State that give /ɾ/ as +cont, although I find that a little suspect. This undergraduate paper by Julianna Sarolta Pándi suggests it's a fool's errand and that none of the schemes the author's found are satisfactory, for various compelling reasons. (The +cont analysis, for instance, might make sense for analyzing the tap vs. trill contrast in Spanish but isn't much use when looking at the flapping of alveolar stops in American English.)

For (2) and (3), it doesn't seem like as hopeless a cause, but I'm still not able to come up with a canonical answer. The distinction between /ɕ/ and /ç/ isn't clear in Jason Riggle's chart -- neither in place (alveolo-palatal vs. palatal) nor in manner (sibilant vs. non-sibilant). Olga Arnaudova has palatals as -coronal, contra Riggle, and has palato-alveolars as +coronal, but doesn't address the coronality of alveolo-palatals. (She has palato-alveolars as +strident and alveolo-palatals as -strident.) She also doesn't address sibilance. Daniel Recasens cites Ladefoged to the effect that “segments … are expected to be either coronal or dorsal, and the corresponding places of articulation alveolar or postalveolar if the segment is coronal and palatal if the segment is dorsal”. This would also suggest /ç/ should be -coronal (cf. Arnaudova above). (Recasens, while arguing for IPA to incorporate a clear distinction between palatal and alveolopalatal, also says that /ɕ/ is “realized more often with a postalveolar articulation than with an alveolopalatal one.”) Maybe we should ask Riggle what he thinks.

So apart from the idea that /ʃʲ/ should be renamed and -strident, I don't have any concrete suggestions. (I can submit a pull request for that if you agree.) Am I right in thinking /dʲ/–/ɾʲ/, /ç/–/ɕ/ and /ʝ/–/ʑ/ are more a matter of taste/convenience, and just more or less intractable in terms of features?
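Pairs like these can be found mechanically by grouping phonemes whose feature specifications are identical. The sketch below is only an illustration: `find_collisions` is a hypothetical helper, the entry layout mirrors the "name"/"features" shape shown in the ARPABET issue, and the feature values are toy data chosen to reproduce the collisions discussed above, not values read from the JSON file.

```python
from collections import defaultdict


def find_collisions(phonemes):
    """Group phoneme symbols whose feature specifications are identical."""
    groups = defaultdict(list)
    for symbol, entry in phonemes.items():
        # Sort the items so the key does not depend on feature order.
        key = tuple(sorted(entry["features"].items()))
        groups[key].append(symbol)
    return [symbols for symbols in groups.values() if len(symbols) > 1]


# Toy data (assumed encoding: 1 = +, -1 = -); not the real chart values.
toy = {
    "ç": {"features": {"coronal": 1, "dorsal": 1, "high": 1, "voice": -1}},
    "ɕ": {"features": {"coronal": 1, "dorsal": 1, "high": 1, "voice": -1}},
    "ʝ": {"features": {"coronal": 1, "dorsal": 1, "high": 1, "voice": 1}},
    "ʑ": {"features": {"coronal": 1, "dorsal": 1, "high": 1, "voice": 1}},
    "t": {"features": {"coronal": 1, "dorsal": -1, "high": -1, "voice": -1}},
}

for group in find_collisions(toy):
    print(" ".join(group))
```

Running a check like this over the full file would list every set of symbols that the current features cannot tell apart, which could help decide which distinctions are worth adding a feature for.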

/ə/ should be -round, -back

ARPABET Encoding

Awesome resource!

I think a useful annotation to add might be ARPABET encodings for each phoneme. I suspect a lot of people who are interested in a machine-readable format would also be using the CMU Dictionary and other ARPABET-encoded speech recognition software.

    "t": {
        "name": "voiceless alveolar stop",
        "arpabet": "T",
        "features": {...}
    }
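With such a field in place, going from ARPABET back to IPA would be a simple reverse lookup. A minimal sketch, assuming the "arpabet" key proposed above (it does not exist in the current file); `arpabet_index` is a hypothetical helper, and the features are left empty for brevity:

```python
def arpabet_index(phonemes):
    """Map each ARPABET code to its IPA symbol, skipping entries without one."""
    return {
        entry["arpabet"]: symbol
        for symbol, entry in phonemes.items()
        if "arpabet" in entry
    }


# Hypothetical entries; "features" is left empty here for brevity.
phonemes = {
    "t": {"name": "voiceless alveolar stop", "arpabet": "T", "features": {}},
    "ð": {"name": "voiced dental fricative", "arpabet": "DH", "features": {}},
    # Phonemes outside the ARPABET inventory simply carry no code.
    "ɕ": {"name": "voiceless alveolo-palatal sibilant", "features": {}},
}

index = arpabet_index(phonemes)
print([index[code] for code in "DH T".split()])
```

Making the field optional keeps the file valid for the many IPA symbols that have no ARPABET equivalent.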
