Alignment Measures

This tool loads two sets of time-aligned sequences (a reference and a hypothesis segmentation) and outputs several measures of accuracy between them.

This project was initially based on the following paper:

Okko Johannes Räsänen, Unto Kalervo Laine, and Toomas Altosaar: An Improved Speech Segmentation Quality Measure: the R-value

It can be viewed under this link: http://legacy.spa.aalto.fi/research/stt/papers/r_value.pdf

Caveat

This tool makes one minor addition to the paper above. The segmentation measure papers seem to treat boundaries as entities described only by the time at which they occur. The contents of the segments between the boundaries are not checked.

While this may be fine for most uses, this tool assigns a name to each boundary based on the segments on either side of it. For example, if a boundary exists between segments titled 'a' and 'b', the boundary will be named "a_b". A hit is counted if and only if both the time AND the name of the reference boundary match those of the hypothesis boundary.
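The naming and matching rule above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names `named_boundaries` and `count_hits`, the `(start, end, label)` segment tuples, the greedy one-to-one matching, and the 20 ms tolerance are all assumptions made for the example.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start time, end time, label), hypothetical layout
Boundary = Tuple[float, str]        # (time, name)


def named_boundaries(segments: List[Segment]) -> List[Boundary]:
    """Name each internal boundary after the two segments it separates."""
    ordered = sorted(segments, key=lambda s: s[0])
    return [(left[1], f"{left[2]}_{right[2]}")
            for left, right in zip(ordered, ordered[1:])]


def count_hits(ref: List[Boundary], hyp: List[Boundary],
               tolerance: float = 0.02) -> int:
    """Count reference boundaries matched by a hypothesis boundary.

    A match requires the same name AND a time difference within
    `tolerance` seconds (assumed value). Each hypothesis boundary
    may be used for at most one hit.
    """
    used = set()
    hits = 0
    for r_time, r_name in ref:
        for i, (h_time, h_name) in enumerate(hyp):
            if i in used:
                continue
            if h_name == r_name and abs(h_time - r_time) <= tolerance:
                used.add(i)
                hits += 1
                break
    return hits


ref = named_boundaries([(0.0, 0.5, "a"), (0.5, 1.0, "b"), (1.0, 1.5, "c")])
# Same labels, first boundary 10 ms off, second boundary 200 ms off.
hyp_shifted = named_boundaries([(0.0, 0.51, "a"), (0.51, 1.2, "b"), (1.2, 1.5, "c")])
# Boundary times line up, but the middle segment has a different label.
hyp_relabeled = named_boundaries([(0.0, 0.5, "a"), (0.5, 1.0, "x"), (1.0, 1.5, "c")])

print(count_hits(ref, hyp_shifted))
print(count_hits(ref, hyp_relabeled))
```

In the second case no hit is counted even though the boundary times coincide, because the names ("a_x", "x_c") differ from the reference names ("a_b", "b_c").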

Supported input formats

  • CTM
  • TextGrid

Computed measures

  • Hit rate
  • Over-segmentation rate
  • Precision
  • Recall
  • F-measure
  • r1 (computed as a part of R-value)
  • r2 (computed as a part of R-value)
  • R-value
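As a rough guide to how these measures relate, the sketch below computes them from three counts using the definitions in the R-value paper. The function name `alignment_measures` and the dictionary keys are made up for this illustration, and it is assumed that the tool follows the paper's formulas; with the counts from the example output in this README (6 reference boundaries, 7 hypothesis boundaries, 4 hits) the formulas do reproduce the values listed there.

```python
import math


def alignment_measures(n_ref: int, n_hyp: int, n_hits: int) -> dict:
    """Compute segmentation measures (in percent) from boundary counts.

    Assumes n_ref > 0 and n_hyp > 0.
    """
    hit_rate = 100.0 * n_hits / n_ref
    # Over-segmentation: how many more (or fewer) boundaries than the reference.
    over_seg = 100.0 * (n_hyp / n_ref - 1.0)
    precision = 100.0 * n_hits / n_hyp
    recall = hit_rate  # recall is the hit rate under these definitions
    if precision + recall > 0:
        f_measure = 2.0 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # r1: distance from the ideal point (hit rate 100, over-segmentation 0).
    r1 = math.sqrt((100.0 - hit_rate) ** 2 + over_seg ** 2)
    # r2: signed distance from the line of "free" hits gained by over-segmenting.
    r2 = (-over_seg + hit_rate - 100.0) / math.sqrt(2.0)
    r_value = 100.0 * (1.0 - (abs(r1) + abs(r2)) / 200.0)
    return {
        "hit_rate": hit_rate,
        "over_segmentation": over_seg,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "r1": r1,
        "r2": r2,
        "r_value": r_value,
    }


measures = alignment_measures(n_ref=6, n_hyp=7, n_hits=4)
for name, value in measures.items():
    print(f"{name}: {value:.6f}")
```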

Requirements

TextGrid input requires the TextGrid package, which is described in the following repo: https://github.com/kylebgorman/textgrid

It can easily be installed using pip:

pip install TextGrid

Usage

usage: AlignMeasure.py [-h] [--ref-tier REFTIER] [--hyp-tier HYPTIER] ref hyp

Calculate various alignment accuracy measures.

positional arguments:
  ref                   reference segmentation (CTM or TextGrid)
  hyp                   studied segmentation (CTM or TextGrid)

optional arguments:
  -h, --help            show this help message and exit
  --ref-tier REFTIER, -rt REFTIER
                        for TextGrid, use which tier for reference (default:0)
  --hyp-tier HYPTIER, -ht HYPTIER
                        for TextGrid, use which tier for hypothesis
                        (default:0)

Note

The input file type is determined based on the extension only!
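In practice this means a file must carry the right extension to be read correctly. A minimal sketch of extension-based detection is shown below; the function name `detect_format`, the exact extensions (`.ctm`, `.textgrid`), and the case-insensitive comparison are assumptions for illustration and may differ from what the tool actually does.

```python
import os


def detect_format(path: str) -> str:
    """Guess the input format from the file extension alone (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".ctm":
        return "ctm"
    if ext == ".textgrid":
        return "textgrid"
    # The file contents are never inspected, so an unknown extension is an error.
    raise ValueError(f"Unsupported file extension: {ext!r}")


print(detect_format("reference.ctm"))
print(detect_format("hypothesis.TextGrid"))
```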

Example output

Number of boundaries in reference segmentation: 6
Number of boundaries in studied segmentation: 7
Number of hits: 4
Hit rate (higher=>better): 66.666667%
Over-segmentation rate (closer-zero=>better): 16.6666666667
Precision (higher=>better): 57.142857%
Recall (higher=>better): 66.666667%
F-measure (higher=>better): 61.538462%
r1 (closer-zero=>better): 37.267799625
r2 (closer-zero=>better): -35.3553390593
R-value (higher=>better): 63.688431%
