atomic's Introduction

Repository will be archived soon!
⚠️ Please note that Atomic is no longer developed. It serves as an architectural prototype for Hexatomic, which is currently under development at https://github.com/hexatomic/hexatomic.

Atomic

Software for multi-level annotation of linguistic corpora

Build

mvn install builds the core plugins for Atomic.

There are also three Maven profiles, each of which builds a specific version of Atomic:

  1. mvn install -P stable builds only stable features into repository/target/products/.
  2. mvn install -P preview builds stable features plus those that can be used productively with caution and may still contain bugs, into repository-preview/target/products/.
  3. mvn install -P experimental builds stable and preview features plus experimental ones that should not be used productively, into repository-experimental/target/products/.

Build documentation

Documentation is built separately from the product build. If you want to include up-to-date documentation in the product build, build the documentation before the product. The documentation has a separate README with details on how to build it.

atomic's People

Contributors

florianzipser, mgruebsch, sdruskat, thomaskrause

atomic's Issues

error when renaming the toplevel folder

If you rename the top-level folder, e.g. because you want to rename the corpus, the documents can no longer be opened afterwards.

The error message indicates that the SDocument node in the SaltProject file is not updated:

    <nodes xsi:type="sCorpusStructure:SCorpus">
      <labels xsi:type="saltCore:SFeature" namespace="salt" name="SNAME" value="ACED0005740006636F72707573" valueString="corpus"/>
      <labels xsi:type="saltCore:SElementId" namespace="graph" name="id" value="ACED000574000C73616C743A2F636F72707573" valueString="salt:/corpus"/>
    </nodes>
    <nodes xsi:type="sCorpusStructure:SDocument">
      <labels xsi:type="saltCore:SFeature" namespace="salt" name="SNAME" value="ACED000574000F636F727075735F646F63756D656E74" valueString="corpus_document"/>
      <labels xsi:type="saltCore:SElementId" namespace="graph" name="id" value="ACED000574001C73616C743A2F636F727075732F636F727075735F646F63756D656E74" valueString="salt:/corpus/corpus_document"/>
      <labels xsi:type="saltCore:SFeature" namespace="salt" name="SDOCUMENT_GRAPH_LOCATION" value="ACED000574004A66696C653A2F686F6D652F74686F6D61732F61746F6D69632D776F726B73706163652F53656365646765546573742F636F727075732F636F727075735F646F63756D656E742E73616C74" valueString="file:/home/thomas/atomic-workspace/SecedgeTest/corpus/corpus_document.salt"/>
    </nodes>
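
The value attributes in the snippet above start with ACED0005, which is the header of a Java object serialization stream; here each one wraps a serialized string. The following is a minimal sketch for decoding such a value with the Java standard library, for example to check whether SDOCUMENT_GRAPH_LOCATION still points to the old folder name. The class and method names are illustrative assumptions and not part of Atomic or Salt, and the sketch should only be run on files you trust, since it deserializes their content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Hypothetical helper, not part of Atomic or Salt.
final class SaltLabelValueDecoder {

    /** Turns a hex string such as "ACED0005..." into raw bytes. */
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }

    /** Deserializes the bytes and returns the string form of the stored object. */
    static String decode(String hex) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(hexToBytes(hex)))) {
            return String.valueOf(in.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        // SDOCUMENT_GRAPH_LOCATION value copied from the SDocument node above.
        String location = "ACED000574004A66696C653A2F686F6D652F74686F6D61732F61746F6D69632D776F726B73706163652F53656365646765546573742F636F727075732F636F727075735F646F63756D656E742E73616C74";
        System.out.println(decode(location));
    }
}
```

If the decoded path still contains the old folder name after a rename, that would be consistent with the node not having been updated.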

Atomic version: v0.1.9

Catch "Widget is disposed" errors

When the Annotation Perspective is open and either the Level View or the Sentence View is closed, click actions trigger the above exception...

Atomic crashes when trying to use the SaltXMLImporter on a folder without a saltProject.salt file

I tried to open a *.salt file (describing an SDocumentGraph) that I had generated with SaltNPepper. I pointed the Pepper Module Import/SaltXMLImporter to the directory that contained the *.salt file. This is what happened:

Exception in thread "PepperModuleController[SaltXMLImporter]" de.hu_berlin.german.korpling.saltnpepper.salt.saltCommon.exceptions.SaltResourceNotFoundException: Cannot load Object, because the file '/home/arne/saltProject.salt' does not exist.
    at de.hu_berlin.german.korpling.saltnpepper.salt.impl.SaltFactoryImpl.load(SaltFactoryImpl.java:148)
    at de.hu_berlin.german.korpling.saltnpepper.salt.impl.SaltFactoryImpl.loadSCorpusGraph(SaltFactoryImpl.java:273)
    at de.hu_berlin.german.korpling.saltnpepper.salt.impl.SaltFactoryImpl.loadSCorpusGraph(SaltFactoryImpl.java:257)
    at de.hu_berlin.german.korpling.saltnpepper.salt.saltCommon.sCorpusStructure.impl.SCorpusGraphImpl.load(SCorpusGraphImpl.java:542)
    at de.hu_berlin.german.korpling.saltnpepper.pepperModules.saltXML.SaltXMLImporter.importCorpusStructure(SaltXMLImporter.java:66)
    at de.hu_berlin.german.korpling.saltnpepper.pepper.pepperFW.impl.PepperModuleControllerImpl.realImportCorpusStructure(PepperModuleControllerImpl.java:572)
    at de.hu_berlin.german.korpling.saltnpepper.pepper.pepperFW.impl.PepperModuleControllerImpl.run(PepperModuleControllerImpl.java:427)
    at java.lang.Thread.run(Thread.java:744)

Kind regards,
Arne
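
One way to avoid the crash would be to check for saltProject.salt before handing the directory to the importer. The sketch below only illustrates that check with java.nio; the class and method names are hypothetical, and where such a check would be wired into Atomic or Pepper is left open.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check, not part of Atomic or Pepper.
final class SaltProjectCheck {

    // File name taken from the exception message above.
    static final String SALT_PROJECT_FILE = "saltProject.salt";

    /** Returns true if corpusDir is a directory that contains saltProject.salt. */
    static boolean hasSaltProjectFile(Path corpusDir) {
        return Files.isDirectory(corpusDir)
                && Files.isRegularFile(corpusDir.resolve(SALT_PROJECT_FILE));
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        if (!hasSaltProjectFile(dir)) {
            // Report the problem instead of letting the import thread fail.
            System.err.println("No " + SALT_PROJECT_FILE + " found in "
                    + dir.toAbsolutePath() + "; import skipped.");
            return;
        }
        System.out.println("Found " + SALT_PROJECT_FILE + "; the import can be started.");
    }
}
```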

Name relations, define tagsets, tidy up the graph

  1. Is it possible to name the relations?
  2. Is it possible to define annotation levels and a tagset, so that one does not have to type "xy:abc" repeatedly when naming nodes?
  3. Ideal, but not essential, would be some kind of clean-up function that arranges the annotated structures compactly when constituent structures are very complex.

Pointing relations are still not exported, or are exported incorrectly

In Atomic 0.1.9, SPAN and STRUCT are now exported to RelANNIS. However, pointing relations (from struct nodes to tokens, or between two span nodes) are not carried over.

All I can tell is that the following error still occurs in the PAULA export and is then also reported during conversion with SnP: a pointing relation from a structure node to a token is written like this:

<rel id="sPointingRel2" xlink:href="null#structure1" target="nolayer.corpus_document.tok.xml#sTok13"/>

After xlink:href=" the file name should appear, not null.

Pointing relations between span annotations, i.e. between two units that each consist of more than one token, are not exported at all.

Best regards

Easier way to create new project on first start [enhancement idea]

Currently, for creating a new project you have to right click on the empty space of the "Navigation" panel and select the right context menu entry. On the first startup this can be difficult to find (the user basically only sees an empty space). One could imagine to

to make the first startup easier (even if the user did not read the manual).

Add feedback to AtomicAL

E.g., after processing an annotation command such as
"e n1 t1 x:y",
print feedback to the command line:
"Created dominance relation with annotation x:y from node 1 to token 2".

Include assertions before feedback.

Implement colour codes for annotation namespaces

This would need:

  • a preference sheet where namespaces can be read into / manually defined
  • a specific colour on the annotation (also set via prefs page)
  • a view showing the namespaces in use and their colours

Errors when exporting to ANNIS

  1. Exporting to RelANNIS does not work for me. The exporter keeps running indefinitely without producing a result.
  2. When exporting syntax trees to PAULA, I get the following error:
    In the struct document, the child relations are written incorrectly. After the id value and its quotation marks, they contain an additional quotation mark (one too many).
  <struct id="structure1">
  <rel id="sDomRel1" " xlink:href="nolayer.corpus_document.tok.xml#sTok4"/>
  <rel id="sDomRel2" " xlink:href="nolayer.corpus_document.tok.xml#sTok3"/>
  </struct>
  3. I removed the superfluous quotation marks and then converted PAULA to RelANNIS with SaltNPepper. In ANNIS, however, the node labels were then not displayed.

Rename AtomicSaltEditor

Either to "Atomic Sentence Editor" or to "Atomic Span Editor" or to "Atomic Graph Editor"

Menus don't show keyboard shortcut text next to menu item

When running Atomic on Linux distributions that use the Unity GUI, menus do not include the keyboard shortcut text which should appear to the right of the menu items (e.g., Quit CTRL+Q).

This is due to the fact that Eclipse does not (fully) support the Unity Global Menu, where menus appear in the WM GUI rather than the application. Therefore, key bindings do work in Atomic on Unity Linux (i.e., Ubuntu), but the shortcut is not shown.

However, Unity can be forced to drop use of the Global Menu for applications. To do so, start the application with the prefix UBUNTU_MENUPROXY=0, as in UBUNTU_MENUPROXY=0 ./atomic.

You can start Atomic as above to see keyboard shortcuts in the menus.

This will be fixed by shipping a custom starter with Linux-targeted products, i.e., a shell script that can be used instead of the default launcher file atomic. This shell script will do nothing but call UBUNTU_MENUPROXY=0 ./atomic.

Annotation with the console: shifted edge labels

While using the console I repeatedly noticed an error: for example, I wanted to annotate edge D9 with x:y and entered "a d9 x:y". However, x:y was annotated on edge D10 instead. The error no longer occurs once the project has been saved and reopened. It may be caused by my having deleted one or more edges beforehand. The console seems to work with a wrong value or number that is corrected when the project is closed and reopened. If I run into the error again, I will look at everything more closely and pass it on.

atomic folder in zip

Please put all files into an atomic folder before zipping; otherwise unzipping can clutter the target folder.

Add sectioning and navigation capabilities to Annotation Graph Editor

Larger corpora may take a long time to load in the Annotation Graph Editor (AGE). For these, it should be possible to navigate smaller portions of the corpus.

Navigation should be possible by

  1. sentences
  2. selection of text

For 1., the corpus documents must be run through sentence detection. This should be possible a) in the New Project Wizard, b) on import (extra option "document contains sentence spans"), c) when opening a corpus document that already exists in the workspace and for which sentences haven't been detected yet.
Sentence detection should be possible via Apache OpenNLP ("official" & own models), regex, and third-party sentence detectors (via extension point).
Sentences are added to the model as ranges of indices, and as sorted lists of tokens via SProcessingAnnotations on the SDocumentGraph.
The following UI elements operate on these sentences.

  • A consecutively numbered list of sentences with checkboxes next to them. This view serves as a) access point to a sentence's ID (= consecutive number) and b) a way to select sentences to display in the AGE. The view also provides a Deselect all button.
  • A group of three menu elements (a button <, a textfield displaying the sentence ID, a button >), which makes it possible to navigate through the sentences by ID (use up and down button, enter sentence ID).
  • A set of key bindings (e.g., PgUp, PgDown to go to next, previous sentence).
  • A set of AtomicAL commands for navigation.

For 2., a view should be provided displaying the whole corpus text, which is selectable. The selection will be displayed in the AGE.
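
The regex variant of the sentence detection mentioned for option 1 could be sketched as follows. This is a deliberately naive illustration that treats ".", "!" or "?" followed by whitespace as a sentence boundary and returns sentences as ranges of character indices; the class names and the boundary pattern are assumptions, and attaching the ranges to the SDocumentGraph via SProcessingAnnotations is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based detector, not Atomic's actual implementation.
final class RegexSentenceDetector {

    /** Half-open character range [start, end) of one sentence in the text. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Assumed boundary: whitespace that directly follows '.', '!' or '?'.
    private static final Pattern BOUNDARY = Pattern.compile("(?<=[.!?])\\s+");

    /** Splits the text at the assumed boundaries and returns the sentence ranges in order. */
    static List<Range> detect(String text) {
        List<Range> ranges = new ArrayList<>();
        Matcher matcher = BOUNDARY.matcher(text);
        int start = 0;
        while (matcher.find()) {
            if (matcher.start() > start) {
                ranges.add(new Range(start, matcher.start()));
            }
            start = matcher.end();
        }
        if (start < text.length()) {
            ranges.add(new Range(start, text.length()));
        }
        return ranges;
    }

    public static void main(String[] args) {
        String text = "Hello world. How are you? Fine.";
        List<Range> ranges = detect(text);
        for (int i = 0; i < ranges.size(); i++) {
            Range r = ranges.get(i);
            // The consecutive number doubles as the sentence ID.
            System.out.println((i + 1) + ": " + text.substring(r.start, r.end));
        }
    }
}
```

A pattern like this will split on abbreviations such as "e.g. ", which is one reason the issue also lists model-based detectors.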

Refactor assignment of token IDs in Annotation Graph Editor to be in ascending order across the whole corpus

As requested by @Annotation-123.

As of ffc4185, token IDs are assigned in the order in which sentences are selected for display in the Sentence View. I.e., if sentence 5 with 10 tokens is selected, its tokens will get IDs T1-T10. If sentence 1 with 10 tokens is then selected, its tokens will be T11-T20. If both sentences are displayed simultaneously, the order will be T11, T12 .. T19, T20, T1, T2 .. T9, T10.

This is not intuitive and may lead to issues when creating annotations for the whole corpus text.

Hence, token IDs should be in natural ascending order based on their index in the text. This could be done, e.g., in the sentencing step.
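
The proposed numbering can be sketched as sorting all tokens by their position in the text and numbering them once, independent of the order in which sentences are selected. The Token class and its textStart field are hypothetical stand-ins for whatever Atomic uses internally; the sketch only illustrates the ordering rule.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Atomic's actual token model.
final class TokenIdAssigner {

    /** Minimal token stand-in: start offset in the corpus text and the covered text. */
    static final class Token {
        final int textStart;
        final String text;

        Token(int textStart, String text) {
            this.textStart = textStart;
            this.text = text;
        }
    }

    /** Assigns T1..Tn in ascending order of the tokens' start offsets. */
    static Map<Token, String> assignIds(List<Token> tokens) {
        List<Token> sorted = new ArrayList<>(tokens);
        sorted.sort(Comparator.comparingInt((Token t) -> t.textStart));
        Map<Token, String> ids = new LinkedHashMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            ids.put(sorted.get(i), "T" + (i + 1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Tokens arrive in selection order (a later sentence first), not in text order.
        List<Token> selected = new ArrayList<>();
        selected.add(new Token(40, "later"));
        selected.add(new Token(0, "first"));
        selected.add(new Token(6, "second"));
        for (Map.Entry<Token, String> entry : assignIds(selected).entrySet()) {
            System.out.println(entry.getValue() + " = " + entry.getKey().text);
        }
    }
}
```

Doing this once in the sentencing step, as suggested above, would keep the IDs stable regardless of which sentences are currently displayed.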
