Giter Club home page Giter Club logo

ud_latin-perseus's Introduction

Summary

This Universal Dependencies Latin Treebank consists of an automatic conversion of a selection of passages from the Ancient Greek and Latin Dependency Treebank 2.1

Introduction

The current UD treebank derives from texts taken from the Ancient Greek and Latin Dependency Treebank 2.1 available at

The original data have been semi-automatically annotated. More precisely, morphological annotation and lemmatization have been performed with the help of the Morpheus morphological analyzer, while syntactic annotation has been done manually. The following guidelines have been followed:

Further details can be found at:

This UD release contains parts of the following works:

author work
Augustus Res Gestae
Cicero In Catilinam
Jerome Vulgata
Vergil Aeneid
Ovid Metamorphoses
Petronius Satyricon
Phaerus Fabulae
Propertius Elegies
Sallust Bellum Catilinae
Suetonius Life of Augustus
Tacitus Historiae

Acknowledgement

The current UD data have been converted by Giuseppe G. A. Celano.

The Ancient Greek and Latin treebank is a result of a joint effort between Tufts University and Leipzig University (DH) under the supervision of Prof. Gregory Crane. Current editors of the treebank are Giuseppe G. A. Celano, Gregory Crane, and Bridget Almas.

Authors of the annotations are (in alphabetical order):

Giuseppe G. A. Celano, J. F. Gentile, Robert Gorman, Vanessa Gorman, Jordan Hawkesworth, Yoana Ivanova, Tovah Keynton, Florin Leonte, Alex Lessie, Daniel Lim Libatique, Meg Luthin, Francesco Mambrini, George Matthews, Jack Mitchell, Molly Miller, Jessica Nord, Sean Stewart, Anthony D. Yates, Polina Yordanova, and Sam Zukoff.

Further details can be found at:

Basic statistics

Tree count: 2273 Word count: 29138 Token count: 29138 Dep. relations: 24 of which 0 language specific POS tags: 12 Category=value feature pairs: 34

References:

Bamman, David and Gregory Crane. 2011. The Ancient Greek and Latin Dependency Treebanks. 2011. In Caroline Sporleder, Antal van den Bosch, Kalliopi Zervanou (eds.), Language Technology for Cultural Heritage, 79-98.

Celano, Giuseppe G. A., Gregory Crane, and Bridget Almas. 2014. The Ancient Greek and Latin Dependency treebank 2.0. https://github.com/PerseusDL/treebank_data

Changelog

  • 2023-11-15 v2.13

    • Orthographic univerbation of the conjunctional clitics que and ue, introducing multi-word tokens.
  • 2023-05-15 v2.12

    • Introduced missing UPOS tags: DET for determiners, PART for particles; modified deprels accordingly (e.g., det). Generalised the use of PROPN for proper nouns, and AUX for auxiiaries (mostly sum); modified deprels accordingly (e.g. flat/flat:name for proper nouns).
    • Reannotated compound numerals (e.g. viginti quattuor) from nummod to flat.
    • Replaced iobj with obl/obl:arg.
    • Added :relcl subtype to acl relative clauses.
    • Added :abs subtype for so-called ablativus absolutus, now annotated as advcl:abs.
    • Replaced parataxis to conj when used for coordination.
    • Restored correct order of conjunctions neque and nec, previously found in the text as que ne and c ne.
    • Reattached conjunctions to the second conjunct instead of the first one.
    • Added :neg subtype for negations such as non, now annotated as advmod:neg.
    • Reannotated gerund and gerundive forms as participles (VerbForm=Part) with Aspect=Prosp, of supines as VerbForm=Conv with Aspect=Prosp; based on (Cecchini, 2021).
    • Sistematically added PronType to pronouns.
    • General improvement of morphological annotation. The following morphological features have been added if missing, corrected, or deleted if non relevant: Aspect, Tense, Voice for verbs; Gender; Person for pronouns.
    • Overall, implementation of syntactic and morphologic harmonisation interventions as described in (Gamba & Zeman 2023a) and (Gamba & Zeman 2023b).
  • 2022-11-15 v2.11

    • Fixed adverbially used nominals from advmod to obl.
    • Fixed adverbially used verbs from advmod to advcl.
    • Fixed interjections from advmod to discourse.
    • Fixed non-projective punctuation using Udapi ud.FixPunct.
    • Fixed UPOS tags of copulae: from VERB to AUX.
    • Degree=Sup has been renamed to Degree=Abs in the Latin treebanks.
  • 2018-04-15 v2.2

    • Repository renamed from UD_Latin to UD_Latin-Perseus.
    • Numerical suffixes of senses removed from lemmas and stored as LId attributes in the MISC column.
  • Since UD v1.2 new texts have been added. More information on the changes in the official github repository (see link above).

=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v1.2
License: CC BY-NC-SA 2.5
Includes text: yes
Genre: fiction nonfiction bible
Lemmas: converted from manual
UPOS: converted from manual
XPOS: manual native
Features: converted from manual
Relations: converted from manual
Contributors: Celano, Giuseppe G. A.; Zeman, Daniel; Gamba, Federica
Contributing: elsewhere
Contact: [email protected]
===============================================================================

ud_latin-perseus's People

Contributors

dan-zeman avatar gcelano avatar fginter avatar

Stargazers

Todd Cook avatar Riccardo Magliocchetti avatar Wei Cheng avatar m5w avatar Marius Jøhndal avatar

Watchers

Alexandre Rademaker avatar Robert (Munro) Monarch avatar  avatar Einar Freyr Sigurðsson avatar Bohdan Moskalevskyi avatar Kamen Bonov avatar Sampo Pyysalo avatar  avatar  avatar  avatar Martin Popel avatar James Cloos avatar Josiah Solomon avatar Nathan Schneider avatar Sarves avatar Leonel Figueiredo de Alencar avatar Andre Martins avatar Flammie Pirinen avatar Mohammad Sadegh Rasooli avatar Evpok avatar Slav Petrov avatar Amir Zeldes avatar alane suhr avatar Ronald Cardenas avatar Christopher Manning avatar Agata Savary avatar Per Erik Solberg avatar  avatar mai-om avatar Andrey Kutuzov avatar John Bauer avatar Yevgeni Berzak avatar PA avatar  avatar Mícheál John Ó Meachair avatar Djamé avatar Sara Tonelli avatar Olga Lyashevskaya avatar  avatar  avatar Wei Cheng avatar Oluokun Adedayo avatar Mehmet Oguz Derin avatar Daniel Swanson avatar Liesbeth A avatar Niko Partanen avatar  avatar  avatar Atul Kr. Ojha avatar Marie-Catherine de Marneffe avatar Joakim Nivre avatar Taraka Rama avatar Robert Pugh avatar Colin Batchelor avatar Antonis Anastasopoulos avatar So Miyagawa avatar  avatar Jan Hajic avatar Mayank Jobanputra avatar  avatar Cătălina Mărănduc avatar  avatar git2go avatar  avatar Nizar Habash avatar  avatar Elizabeth Davidson avatar  avatar Kiran Dhakal avatar Johan Hall avatar kim gerdes avatar Ludovica avatar  avatar Loïc Grobol avatar Jack Rueter avatar Koichi Yasuoka avatar  avatar Kira Droganova avatar  avatar  avatar Michael Hahn avatar  avatar  avatar  avatar Lena avatar Manon avatar Oliver Hellwig avatar Arianna Masciolini avatar Gosse Bouma avatar  avatar Slavomír /bulbul/ Čéplö avatar  avatar Mazhar Ali Dootio avatar Kristopher Kyle avatar Alexey Koshevoy avatar  avatar Wong Tak-sum 黃得森 avatar  avatar Flavio avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.