tei-vanilla's People

Contributors

ottosmops, pascalessrq, tuurma, wolfgangmm

tei-vanilla's Issues

<unitDecl/>

In tei_vanilla.rng, <unitDecl/> (as a child of <encodingDesc/>) does not exist.

In my opinion it should be compulsory for all pre-modern texts: the units (volume, surface, linear, ...) used until the 19th century differed from today's units and, in addition, varied greatly from place to place. This was the case at least in what is today Switzerland.

E.g. 1 malter (a volume measure for crops) had a different size in Zürich, in Bern and in St. Gallen.
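
As a rough illustration of what this request would enable, the sketch below declares two local variants of the unit in <unitDecl/> and points to one of them from the transcription. The xml:id values and the descriptions are hypothetical, and no conversion factor is given because none is stated in this issue.

```xml
<!-- Header: one unitDef per local variant of the unit -->
<encodingDesc>
  <unitDecl>
    <unitDef xml:id="malter-sg" type="volume">
      <label>Malter</label>
      <placeName>St. Gallen</placeName>
      <desc>Volume measure for crops as used in St. Gallen.</desc>
    </unitDef>
    <unitDef xml:id="malter-zh" type="volume">
      <label>Malter</label>
      <placeName>Zürich</placeName>
      <desc>Volume measure for crops as used in Zürich.</desc>
    </unitDef>
  </unitDecl>
</encodingDesc>

<!-- Transcription: the measure points to the local definition -->
<measure unitRef="#malter-sg" quantity="3">3 Malter</measure>
```

With one definition per place, the same word in two sources can resolve to two different quantities without any change to the transcription itself.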

variantEncoding

Since we have encodingDesc, should we consider adding variantEncoding as well?
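
For orientation, a minimal sketch of how variantEncoding is typically paired with an apparatus entry in TEI P5. The witness identifiers #A and #B are hypothetical, and parallel segmentation is only one of the methods the Guidelines define.

```xml
<encodingDesc>
  <!-- Declares that variants are encoded inline, using parallel segmentation -->
  <variantEncoding method="parallel-segmentation" location="internal"/>
</encodingDesc>

<!-- Matching apparatus entry in the text; #A and #B are hypothetical witnesses -->
<app>
  <lem wit="#A">[reading of witness A]</lem>
  <rdg wit="#B">[reading of witness B]</rdg>
</app>
```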

<calendarDesc/>

In tei_vanilla.rng, <calendarDesc/> (as a child of <profileDesc/>) does not exist.

Dates and calendars have come up often lately. I would appreciate it if the calendar used in the source text could be declared in <calendarDesc/>.
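
A minimal sketch of what such a declaration could look like in TEI P5, assuming the source uses the Julian calendar; the xml:id and the placeholder date text are illustrative only.

```xml
<profileDesc>
  <calendarDesc>
    <calendar xml:id="julian">
      <p>Julian calendar, as used in the source text.</p>
    </calendar>
  </calendarDesc>
</profileDesc>

<!-- Transcription: the date as written follows the declared calendar -->
<date calendar="#julian">[date as written in the source]</date>
```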

<msIdentifier/> > <institution/>

<msIdentifier/> in tei_vanilla.rng does not allow <institution/> as its child.

It does allow <repository/>.

Question: If the shelfmark (Signatur) of the archival document I'm dealing with is

Stadtarchiv der Ortsbürgergemeinde St. Gallen, Altes Stadtarchiv, Tr. XVIII, 4b

what do I put in <repository/>? I would like to use <institution/> for "Stadtarchiv der Ortsbürgergemeinde St. Gallen" only, since "Altes Stadtarchiv" belongs in <repository/> (as far as I can see).
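
Assuming <institution/> were allowed, the shelfmark above could be split roughly as follows. This follows the reading proposed in this issue; whether "Altes Stadtarchiv" is better treated as a repository or as a collection remains an editorial decision.

```xml
<msIdentifier>
  <settlement>St. Gallen</settlement>
  <institution>Stadtarchiv der Ortsbürgergemeinde St. Gallen</institution>
  <repository>Altes Stadtarchiv</repository>
  <idno>Tr. XVIII, 4b</idno>
</msIdentifier>
```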

persList and placeList

Where in tei_vanilla should these lists go? At the moment we don't have <back/>.

[Tutorials]

  • using Git
  • facsimile/IIIF: preparing and deploying images, referencing in TEI encoding, integrating facsimile viewer
  • critical apparatus
  • different kinds of notes
  • Word-based transcriptions
  • HTR-based transcriptions (e.g. Transkribus)
  • NER
  • translations: encoding and alignment
  • searches
  • server setup
  • ...

publicationStmt > authority; > idno

<publicationStmt/> in tei_vanilla.rng does not allow <authority/> and <idno/>.

Question: Where do you write, where the archival document comes from, e.g. which archive possesses it? I would write it in .
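
For comparison, a sketch of a structured <publicationStmt/> as TEI P5 permits it; the bracketed text is placeholder content. Note that in TEI P5 the archive holding the original document is usually recorded in <sourceDesc/> (for manuscripts, in <msIdentifier/>), while <publicationStmt/> describes the publication of the electronic edition.

```xml
<publicationStmt>
  <authority>[body responsible for making the edition available]</authority>
  <idno type="URI">[identifier of the edition]</idno>
</publicationStmt>
```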

<standOff/>

In tei_vanilla.rng, <standOff/> does not exist.

In my opinion, a <standOff/> container is absolutely needed for linking entities and related information.

In a small edition with only one document, <standOff/> may not be needed; but as soon as there is more than one document, <standOff/> should be compulsory.

Linking is THE big advantage we have when editing texts today, compared to decades ago when editing was done on paper. So let's use it!
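
A minimal sketch of the kind of linking meant here, assuming <standOff/> were available: entities are described once and referenced from the text by pointer. All xml:id values are hypothetical and the header is elided.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><!-- ... --></teiHeader>
  <standOff>
    <listPerson>
      <person xml:id="pers-001">
        <persName>[name of the person]</persName>
      </person>
    </listPerson>
    <listPlace>
      <place xml:id="place-001">
        <placeName>St. Gallen</placeName>
      </place>
    </listPlace>
  </standOff>
  <text>
    <body>
      <!-- The reference in the running text points to the entry above -->
      <p>... <placeName ref="#place-001">St. Gallen</placeName> ...</p>
    </body>
  </text>
</TEI>
```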

[tutorial]: eltec corpus use case

ELTeC is a multilingual collection of literary texts from the public domain. The ELTeC core promises to contain "at least 10 linguistically annotated subcollections of 100 novels comparable in their internal structure in at least 10 different European languages, totalling at least 1,000 full-text novels. Novels have been chosen among major literary genres for availability and size."

An overview of the current state in ELTeC corpus building can be found here.

This use case study aims to walk the reader through:

  • major decisions on encoding and schema for currently available ELTeC Level1 corpora
  • gathering and organizing the data for publication
  • creating a TEI Publisher 7 application
  • specifying indexes required for search and filtering
  • extending the default TEI Publisher API to cover basic linguistic features, e.g. frequency lists
  • practical consequences of encoding decisions in further steps (e.g. encoding document structure and navigation, flat vs hierarchical taxonomies etc)
  • internationalization

Guidelines: document structure

  • singular documents

  • composite documents (diaries, copiaries, anthologies etc)

  • parallel translations

  • basic document structure

    • consequences for navigation

<physDesc/> > <handDesc/>

In tei_vanilla.rng, <handDesc/> is not allowed in <physDesc/>.

I would appreciate it if <handDesc/> were allowed, so that I can state whether my source was written by one hand or by several.
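
A short sketch of what this would look like in TEI P5 for a source written by two hands; the identifiers and the bracketed descriptions are placeholders.

```xml
<physDesc>
  <!-- @hands gives the number of distinct hands identified -->
  <handDesc hands="2">
    <handNote xml:id="hand-1" scope="major">[description of the main hand]</handNote>
    <handNote xml:id="hand-2" scope="minor">[description of the second hand]</handNote>
  </handDesc>
</physDesc>
```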

<editorialDecl/>

In tei_vanilla.rng, the content model of <editorialDecl/> reads:

<oneOrMore>
    <choice>
        <ref name="tei_model.pLike"/>
        <ref name="tei_model.editorialDeclPart"/>
    </choice>
</oneOrMore>

But:

<define name="tei_model.editorialDeclPart">
    <notAllowed/>
</define>

This doesn't make sense to me: if the class is not allowed anyway, why reference it in the content model at all?
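
For context: in RELAX NG, <notAllowed/> is a pattern that never matches, so a <choice> containing it reduces to its other branches. A definition like this is probably what the schema generator emits for a model class that has no members in the customization (an assumption, not checked against the tei_vanilla ODD). The first fragment shows the effective content model; the second sketches, hypothetically, how the class could be populated in the ODD.

```xml
<!-- Effective content model: the notAllowed branch can never match -->
<oneOrMore>
    <ref name="tei_model.pLike"/>
</oneOrMore>

<!-- Hypothetical ODD addition, placed inside the existing schemaSpec,
     that would give model.editorialDeclPart some members -->
<elementRef key="correction"/>
<elementRef key="normalization"/>
```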

<lg/> and others

One of the participants in our workshop is working with a source written in Persian; he wants to edit the text and add a translation.

Please allow this in the schema.
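
One possible shape for such an edition, assuming <lg/>, <l/> and @corresp are available in the schema: the original and the translation sit in parallel divisions and are aligned line by line. The identifiers and bracketed text are placeholders, and this is only one of several alignment strategies.

```xml
<div type="original" xml:lang="fa">
  <lg xml:id="lg1">
    <l xml:id="lg1.l1">[first verse line in Persian]</l>
    <l xml:id="lg1.l2">[second verse line in Persian]</l>
  </lg>
</div>
<div type="translation" xml:lang="en">
  <!-- @corresp points each translated unit back to its original -->
  <lg corresp="#lg1">
    <l corresp="#lg1.l1">[translation of the first line]</l>
    <l corresp="#lg1.l2">[translation of the second line]</l>
  </lg>
</div>
```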

Guidelines: entity encoding

Metadata

  • persons

    • names & variant names
      • different language versions
    • relationship with places
    • relationship with other people
    • dates (birth, death, other major events)
    • other biographical information and commentaries
    • linking with authority files
  • places

    • names & variant names
      • different language versions
    • geographical coordinates
    • relationships with other places
    • other information and commentaries
    • linking with authority files

References

  • references by name (persName, placeName, orgName, name type="person|place|org")
  • referring strings (rs)
  • identifying the entity
    • pointing to a registry (listPerson/person, listPlace/place, listOrg/org) in the same document (teiHeader/profileDesc)
    • pointing to an aggregated registry within the project (separate TEI document)
    • pointing to an external resource
      • by full URL
      • using private prefix and identifier
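
The pointing options listed above can be sketched as follows. The prefix "psn", the register path, the example.org URL and all identifiers are hypothetical; only the elements and attributes themselves come from TEI P5.

```xml
<!-- Header: declares how the hypothetical "psn:" prefix expands -->
<encodingDesc>
  <listPrefixDef>
    <prefixDef ident="psn" matchPattern="([a-z0-9-]+)"
               replacementPattern="registers/persons.xml#$1">
      <p>Pointers of the form psn:ID refer to person entries in the project register.</p>
    </prefixDef>
  </listPrefixDef>
</encodingDesc>

<!-- Text: the same person referenced in three ways -->
<!-- 1. registry in the same document -->
<persName ref="#pers-001">[name as written]</persName>
<!-- 2. private prefix, resolved via the prefixDef above -->
<persName ref="psn:pers-001">[name as written]</persName>
<!-- 3. full URL of an external resource, here on a referring string -->
<rs type="person" ref="https://example.org/persons/pers-001">[referring phrase]</rs>
```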

ODD Processing & HTML templates to handle different use case scenarios

  • retrieval from the same document
  • retrieval from a document in db
  • retrieval from external URI/API endpoint

Guidelines: i18n

  • which standard to use for language encoding
  • marking up the primary language of the document
  • encoding full information about languages occurring in the document
  • encoding of foreign words or passages
  • variant names in different languages
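
As a starting point for these guidelines, a minimal sketch of language encoding in TEI P5, where xml:lang and the ident attribute of <language/> take BCP 47 language tags. The choice of German and Latin is illustrative only.

```xml
<!-- Header: all languages occurring in the document -->
<profileDesc>
  <langUsage>
    <language ident="de">German</language>
    <language ident="la">Latin</language>
  </langUsage>
</profileDesc>

<!-- Text: primary language on the root of the text, exceptions marked locally -->
<text xml:lang="de">
  <body>
    <p>... <foreign xml:lang="la">[Latin passage]</foreign> ...</p>
  </body>
</text>
```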

locus in msItem

I think that the element locus is quite essential in the msItem.
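
For illustration, this is how <locus/> is typically used inside <msItem/> in TEI P5 to say where an item sits in the manuscript; the folio range and title are placeholders.

```xml
<msContents>
  <msItem>
    <locus from="1r" to="4v">fols. 1r-4v</locus>
    <title>[title of the item]</title>
  </msItem>
</msContents>
```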

Guidelines: dating

  • dates in transcription
    • date vs origDate
    • handling uncertainty
    • BC dates, and whether 0000 is a valid year
    • pre-gregorian dates
  • dating in the header
    • date of creation of original source document
    • date(s) of publication
    • period of coverage for the subject of the document
    • ...
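
A small sketch of the distinction between origDate and date, with one way of expressing uncertainty. All year and date values are invented for illustration and do not refer to a real source.

```xml
<!-- Header (inside msDesc/history): when the source document was created -->
<origin>
  <origDate notBefore="1540" notAfter="1550" cert="medium">[1540s]</origDate>
</origin>

<!-- Transcription: a date mentioned in the running text -->
<date when="1545-06-24">[date as written in the source]</date>
```

Here notBefore and notAfter bound an uncertain date, while cert records how confident the editor is in that range.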

<editorialDecl/>

In tei_vanilla.rng, <editorialDecl/> allows free text (<ref name="tei_model.pLike"/>).

This seems like too much freedom to me. I would prefer a more structured, more restrictive definition.

In my opinion, at least <correction/> and <normalization/> should be compulsory.

These two elements (<correction/> and <normalization/>) make the difference between a simple transcription of a source text and an edition (or scholarly edition).
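
A sketch of the more structured declaration proposed here, using the TEI P5 elements and their method attribute; the bracketed prose is placeholder text.

```xml
<editorialDecl>
  <!-- method="markup": corrections are visible in the encoding -->
  <correction method="markup">
    <p>[how scribal errors were corrected and marked]</p>
  </correction>
  <!-- method="silent": normalization is applied without markup -->
  <normalization method="silent">
    <p>[which spellings were normalized, and how]</p>
  </normalization>
</editorialDecl>
```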
