phonemes's Introduction

This repository hosts a representation of Jason Riggle's chart of phonological features version 12.12 in a machine-readable JSON format.

The keys in the JSON file are the phonemes' IPA symbols. The values are their English-language name and the binary features from the chart linked above (see the JSON file for an example).
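
A minimal sketch of reading data in this shape from Python. It assumes each value is an object with a `name` string and a `features` mapping, and that feature values are stored as 1 (+), -1 (-) and 0 (unspecified). That encoding and the sample entry are illustrative assumptions, not copied from the repository's file, and the sketch does not cover how ± is stored; check the JSON file for the real layout.

```python
import json

# Hypothetical excerpt in the assumed layout; the real file lists more features.
SAMPLE = """
{
  "t": {
    "name": "voiceless alveolar stop",
    "features": {"cons": 1, "son": -1, "voice": -1, "high": 0}
  }
}
"""

def load_phonemes(text):
    """Parse JSON text into a dict keyed by IPA symbol."""
    return json.loads(text)

def positive_features(entry):
    """Return the names of features with a positive value, sorted by name."""
    return sorted(name for name, value in entry["features"].items() if value > 0)

phonemes = load_phonemes(SAMPLE)
print(phonemes["t"]["name"])
print(positive_features(phonemes["t"]))
```

To read the real file, pass its contents to `load_phonemes` instead of `SAMPLE`.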

This repository also provides a script (phonemeviewer.py) that displays a phoneme's features from the provided JSON file and shows which positive (+ or ±) features it has. When given a list of phonemes, the script calculates the 'similarity' between them and lists the features shared by every phoneme in the list (if any).

Example usage:

> ð
voiced dental fricative
+cons
-son
-syl
-labial
+coronal
+ant
-dist
+dorsal
-pharyngeal
+voice
-SG
-CG
+cont
-strident
-lateral
-del_rel
-nasal

> ð ʃ
0.782608695652174
-del_rel
+coronal
-son
-syl
-pharyngeal
-CG
-labial
-SG
-lateral
+cont
-nasal
+cons
['coronal', 'cont', 'cons']
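
The exact similarity formula used by phonemeviewer.py is not described here. One plausible reading, sketched below as an assumption, is the fraction of features on which every phoneme in the list has the same value. The function names and the toy feature sets are hypothetical and use the assumed 1 / -1 / 0 encoding.

```python
def shared_features(entries):
    """Return the features whose value is identical across all entries."""
    first, *rest = entries
    return {
        name: value
        for name, value in first.items()
        if all(other.get(name) == value for other in rest)
    }

def similarity(entries):
    """Fraction of the first entry's features that all entries agree on."""
    return len(shared_features(entries)) / len(entries[0])

# Toy feature sets, not taken from the JSON file.
x = {"cons": 1, "son": -1, "voice": 1, "strident": -1}
y = {"cons": 1, "son": -1, "voice": -1, "strident": -1}

common = shared_features([x, y])
score = similarity([x, y])
# Shared features with a positive value, like the final list in the output above.
positive = [name for name, value in common.items() if value > 0]
print(score, common, positive)
```

With these toy inputs, three of the four features match, so the score is 0.75.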

The script was written for Python 3.4+, but will probably run on Python 3.3 if the backported enum package from 3.4 is installed. You may use http://ipa.typeit.org/full/ to type the IPA symbols into the script.

Please report any inconsistencies you may find between the JSON file and Jason Riggle's chart, or, better yet, please fix them and create pull requests. Please do not report any errors you may find in Jason Riggle's chart here. Instead, send your comments directly to the author; there is a non-zero chance of him responding to them. Additionally, please let me know if this repository references an outdated version of Jason Riggle's chart.

Big thanks to Rafael Abramovitz for helping me with the creation of a machine-readable version of the chart.

phonemes's People

Contributors

anna-hope, chronodm, philcombiths

phonemes's Issues

/ə/ should be neither +tense nor -tense

Currently, /ə/ (mid central unrounded vowel) collides with /ɛ/ (low-mid front unrounded vowel). I think there's a couple of things going on here:

  1. there should probably be an explicit front feature, so that in the /ɛ/, /ɜ/, /ʌ/ trio, /ɛ/ would be +front, -back; /ɜ/ would be -front, -back; and /ʌ/ would be -front, +back. But the file doesn't currently have /ɜ/ anyway, and this seems like a bigger change.

  2. as I read Riggle's chart, /ɘ/, /ə/, and /ɜ/ are distinguished in that /ɘ/ is +tense, /ɜ/ is -tense, and
    /ə/ is neither:

    [Screenshot of Riggle's chart omitted]

    (The file doesn't currently have /ɘ/ either, but again, bigger change.)

I propose setting /ə/ to "tense": 0, which will distinguish it from /ɛ/ and perhaps make it easier to add more vowels later.
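
A rough sketch of how collisions like this could be detected automatically, assuming entries carry a `features` mapping with 1 / -1 / 0 values. The function name and the toy feature values are hypothetical and stand in for the real data.

```python
from collections import defaultdict

def find_collisions(phonemes):
    """Group symbols whose feature mappings are identical."""
    groups = defaultdict(list)
    for symbol, entry in phonemes.items():
        # A sorted tuple of (feature, value) pairs is hashable and order-independent.
        signature = tuple(sorted(entry["features"].items()))
        groups[signature].append(symbol)
    return [symbols for symbols in groups.values() if len(symbols) > 1]

# Toy data, not the real feature values.
toy = {
    "ə": {"features": {"syl": 1, "back": -1, "tense": -1}},
    "ɛ": {"features": {"syl": 1, "back": -1, "tense": -1}},
}

before = find_collisions(toy)
toy["ə"]["features"]["tense"] = 0  # the proposed change
after = find_collisions(toy)
print(before, after)
```

In this toy setup the two vowels collide before the change and no longer collide once "tense" is 0 for /ə/.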

palatals, alveolopalatals, and palatalized alveolars

/ɕ/ and /ʃʲ/ are both given as “voiceless alveolo-palatal sibilant” with the same features. From /ʃ/, it seems as though /ʃʲ/ should be “palatalized voiceless palato-alveolar sibilant”, although I admit I'm not sure how I'd hear (or produce) the difference. (/ʃ/ is also -strident — maybe /ʃʲ/ should be as well?)

It looks like you're modeling palatalization as +dorsal, +high, is that correct? This leads to a few pairs that can't be distinguished by features:

  1. /dʲ/ (palatalised voiced alveolar stop) and /ɾʲ/ (palatalised alveolar flap)
    • /d/ and /ɾ/ differ only in that /d/ is +dorsal
  2. /ç/ (voiceless palatal fricative) and /ɕ/ (voiceless alveolo-palatal sibilant)
    • plus also /ʃʲ/, although as noted above I suspect it should be -strident
  3. /ʝ/ (voiced palatal fricative) and /ʑ/ (voiced alveolo-palatal sibilant)
    • voiced counterparts of /ç/ and /ɕ/
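
To make the /dʲ/ vs /ɾʲ/ case concrete, here is a small sketch under the reading above, namely that palatalization is modeled as +dorsal, +high. The helper names are hypothetical and the feature sets are toy values reduced to what the comparison needs, using 1 / -1 / 0 for + / - / unspecified.

```python
def feature_diff(a, b):
    """Map each feature on which a and b disagree to its (a, b) values."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}

def palatalize(features):
    """Assumed model of palatalization: force +dorsal and +high."""
    return {**features, "dorsal": 1, "high": 1}

# Toy feature sets: the plain segments differ only in dorsal.
d = {"cons": 1, "coronal": 1, "dorsal": 1, "high": 0}
flap = {"cons": 1, "coronal": 1, "dorsal": -1, "high": 0}

plain_diff = feature_diff(d, flap)
palatal_diff = feature_diff(palatalize(d), palatalize(flap))
print(plain_diff, palatal_diff)
```

Because palatalization overwrites the only feature that separated the plain pair, the palatalized pair ends up with an empty diff.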

For (1), I'm not finding a good canonical explanation in distinctive feature terms of /d/ vs /ɾ/. I found some lecture notes from Michigan State that give /ɾ/ as +cont, although I find that a little suspect. This undergraduate paper by Julianna Sarolta Pándi suggests it's a fool's errand and that none of the schemes the author's found are satisfactory, for various compelling reasons. (The +cont analysis, for instance, might make sense analyzing the tap vs. trill contrast in Spanish but isn't much use when looking at the flapping of alveolar stops in American English.)

For (2) and (3), it doesn't seem like as hopeless a cause, but I'm still not able to come up with a canonical answer. The distinction between /ɕ/ and /ç/ isn't clear in Jason Riggle's chart -- neither in place (alveolo-palatal vs. palatal) nor in manner (sibilant vs. non-sibilant). Olga Arnaudova has palatals as -coronal, contra Riggle, and has palato-alveolars as +coronal, but doesn't address the coronality of alveolo-palatals. (She has palato-alveolars as +strident and alveolo-palatals as -strident.) She also doesn't address sibilance. Daniel Recasens cites Ladefoged to the effect that “segments … are expected to be either coronal or dorsal, and the corresponding places of articulation alveolar or postalveolar if the segment is coronal and palatal if the segment is dorsal”. This would also suggest /ç/ should be -coronal (cf. Arnaudova above). (Recasens, while arguing for IPA to incorporate a clear distinction between palatal and alveolopalatal, also says that /ɕ/ is “realized more often with a postalveolar articulation than with an alveolopalatal one.”) Maybe we should ask Riggle what he thinks.

So apart from the idea that /ʃʲ/ should be renamed and -strident, I don't have any concrete suggestions. (I can submit a pull request for that if you agree.) Am I right in thinking /dʲ/–/ɾʲ/, /ç/–/ɕ/ and /ʝ/–/ʑ/ are more a matter of taste/convenience, and just more or less intractable in terms of features?

ARPABET Encoding

Awesome resource!

I think a useful annotation to add might be ARPABET encodings for each phoneme. I suspect a lot of people who are interested in a machine-readable format would also be using the CMU Dictionary and other ARPABET-encoded speech recognition software.

So

    "t": {
        "name": "voiceless alveolar stop",
        "arpabet": "T",
        "features": {...}
    }
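
For illustration, a sketch of how that field could be used once present. The `arpabet` key is the proposal above, not something the file has today, and the function names and toy entries are hypothetical.

```python
def arpabet_index(phonemes):
    """Map ARPABET codes to IPA symbols, skipping entries without a code."""
    return {
        entry["arpabet"]: symbol
        for symbol, entry in phonemes.items()
        if "arpabet" in entry
    }

def to_ipa(codes, index):
    """Translate a list of ARPABET codes; unknown codes are left unchanged."""
    return [index.get(code, code) for code in codes]

# Toy entries with the proposed field; features omitted for brevity.
toy = {
    "t": {"name": "voiceless alveolar stop", "arpabet": "T"},
    "ð": {"name": "voiced dental fricative", "arpabet": "DH"},
    "ç": {"name": "voiceless palatal fricative"},  # no ARPABET equivalent
}

index = arpabet_index(toy)
print(to_ipa(["DH", "T"], index))
```

Phonemes with no ARPABET equivalent would simply omit the key, which the index builder skips.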
