eltec-cze's Introduction

ELTeC-cze

This is the Czech Novel Corpus for the ELTeC, the European Literary Text Collection, produced by the COST Action Distant Reading for European Literary History (CA16204, https://distant-reading.net).

Contributors

This repository is maintained by the Institute of the Czech National Corpus, Charles University, Prague, https://ucnk.ff.cuni.cz/en/.

Licence

All texts included in this collection are in the public domain.

Citation suggestion

If you use this corpus in your research or teaching, please follow good scholarly practice and use the following citation suggestion to acknowledge your source:

  • Czech Novel Collection (ELTeC-cze), edited by the Institute of the Czech National Corpus. Version v1.0.0, April 2021. In: European Literary Text Collection (ELTeC). COST Action Distant Reading for European Literary History. DOI: https://doi.org/10.5281/zenodo.4662721.

Release notes

General information about ELTeC releases is available at https://github.com/COST-ELTeC/ELTeC.

  • v1.0.0, April 2021: This release includes 100 novels in level 1 encoding. The DOI of this release is: 10.5281/zenodo.4662721

eltec-cze's People

Contributors

annarehorkova, lb42, jeziorsky, carolinodebrecht, christofs, michkren

eltec-cze's Issues

Validity issue: incomplete idno

The TEI headers are currently not valid because they contain an incomplete idno element:

<idno type=""></idno>

An idno element like this, with an empty type attribute and no content, is not allowed.
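
One way to locate the affected headers is sketched below. It is a minimal example, not the project's validation procedure: it assumes the files use the standard TEI namespace, and the function name find_incomplete_idnos and the SAMPLE document are made up for illustration. Full validation against the ELTeC schema would still be needed to confirm that a header is valid.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace; assumed to be the one used by the corpus files.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def find_incomplete_idnos(xml_text):
    """Return idno elements with no text or with an empty type attribute.

    Hypothetical helper: it only flags the pattern described in the issue
    and does not replace schema validation.
    """
    root = ET.fromstring(xml_text)
    incomplete = []
    for idno in root.iterfind(".//tei:idno", TEI_NS):
        text_is_empty = not (idno.text or "").strip()
        type_attr = idno.get("type")
        type_is_empty = type_attr is not None and not type_attr.strip()
        if text_is_empty or type_is_empty:
            incomplete.append(idno)
    return incomplete


# Illustrative header fragment, not taken from the corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <publicationStmt>
        <idno type=""></idno>
        <idno type="doi">10.5281/zenodo.4662721</idno>
      </publicationStmt>
    </fileDesc>
  </teiHeader>
</TEI>"""

if __name__ == "__main__":
    for element in find_incomplete_idnos(SAMPLE):
        print("incomplete idno with attributes:", element.attrib)
```

Running the helper over every file in the repository would give a list of headers to fix by hand or by script.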

Inconsistent links to page images

Most titles include a link to a catalogue in the Czech National Library which provides links to digitized page images of the source (in DJVU format). This link appears in a <bibl type='digitalSource'>, which implies that the ELTeC source was derived directly from those digitized page images.
Five titles (CS0005, 8, 9, 10) provide links to Google Books digitizations, which are given in a <bibl type='unspecified'>. For consistency, these should probably be in a <bibl type='digitalSource'> as well. On the other hand, maybe the page image links should be supplied as a child of the <bibl type="printSource"> describing the printed source text?
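
To see how the links are currently distributed, the bibl elements can be grouped by their type attribute. The sketch below is only an illustration under assumptions: it supposes that each link is encoded as a ref element with a target attribute inside the bibl, which may not match the actual encoding, and the function name links_by_bibl_type, the SAMPLE document, and its example.org URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard TEI namespace; assumed to be the one used by the corpus files.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def links_by_bibl_type(xml_text):
    """Map each bibl type value to the link targets found inside it.

    Assumption: links are <ref target="..."> children of <bibl>.
    """
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for bibl in root.iterfind(".//tei:bibl", TEI_NS):
        bibl_type = bibl.get("type", "(no type)")
        for ref in bibl.iterfind(".//tei:ref", TEI_NS):
            target = ref.get("target")
            if target:
                grouped[bibl_type].append(target)
    return dict(grouped)


# Illustrative source description with placeholder URLs.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <sourceDesc>
        <bibl type="digitalSource">
          <ref target="https://example.org/catalogue/1"/>
        </bibl>
        <bibl type="unspecified">
          <ref target="https://example.org/books/2"/>
        </bibl>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""

if __name__ == "__main__":
    for bibl_type, targets in links_by_bibl_type(SAMPLE).items():
        print(bibl_type, targets)
```

Any link reported under a type other than digitalSource would be a candidate for the consistency fix discussed above.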
