collaborative-platform's Issues

Make setup instructions IDE-agnostic

As of now, the README contains instructions for PyCharm IDE.
These instructions should work the same for users who do not work with PyCharm.
I think the best thing to do is just to use shell commands.

Changes in annotator needed for Ingredient integration

We have to make some changes in the annotator to support the Recipes use case.

The plan is to use <listObject> elements as lists of ingredients, utensils, etc., and to annotate the text of the recipe only with the <objectName> tag. Here is an example:

<body>
            <div type="ingredient">
                <listObject>
                    <object type="ingredient" xml:id="egg">Egg</object>
                    <object type="ingredient" xml:id="sugar">Sugar</object>
                </listObject>
            </div>
            <div type="utensil">
                <listObject>
                    <object type="utensil" xml:id="spoon">Spoon</object>
                </listObject>
            </div>
            <div type="recipe">
                <p>
                    Take three <objectName ref="egg">eggs</objectName>, mix with three <objectName ref="spoon">spoons</objectName> of <objectName ref="sugar">sugar</objectName>. Bon Appetite!
                </p>
            </div>
</body>

The desired behaviour is that when a user annotates a new ingredient, utensil, etc., the backend automatically adds it to the appropriate list, so annotated recipes always start with proper lists of requisites.
To let users annotate in this manner, we need changes in the annotator.
@Janchorizo, we need changes in the annotator frontend. In the "Add TEI annotations" tab, when Ingredient, Utensil or ProductionMethod is chosen, we want a list to appear with the ingredients, utensils or production methods already available in the lists in the file. We can leave parsing those lists into options to you, or we can provide you with an API endpoint returning them. Another option in the list should be "Add new". If the user chooses it, a text field should appear, allowing the user to enter a name for the new ingredient, utensil, etc.
The request created by the annotator should consist of positions as before, ref set in the attribute_name parameter, and asserted_value containing the id of the object from the list if the user chose an object already available in the file, or the new name entered by the user otherwise.
@bug-rancher is already changing the annotator backend accordingly.
Also, we'd like @Janchorizo to render those lists in a more readable manner, possibly adding some headers and splitting them into separate lines.
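
A minimal sketch of how the choices for the proposed list could be derived from the file, assuming the un-namespaced <object> elements shown in the example above (a file in the TEI namespace would need qualified tag names). The function name list_options and the shape of the returned dictionaries are hypothetical, not part of the existing API.

```python
import xml.etree.ElementTree as ET

# Clark-notation key under which ElementTree exposes the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def list_options(xml_text, object_type):
    """Return the choices for one object type, followed by an "Add new" entry."""
    root = ET.fromstring(xml_text)
    options = [
        {"id": obj.get(XML_ID), "label": (obj.text or "").strip()}
        for obj in root.iter("object")
        if obj.get("type") == object_type
    ]
    # Hypothetical marker: an id of None means the user wants to enter a new name.
    options.append({"id": None, "label": "Add new"})
    return options


SAMPLE = """<body>
  <div type="ingredient">
    <listObject>
      <object type="ingredient" xml:id="egg">Egg</object>
      <object type="ingredient" xml:id="sugar">Sugar</object>
    </listObject>
  </div>
  <div type="utensil">
    <listObject>
      <object type="utensil" xml:id="spoon">Spoon</object>
    </listObject>
  </div>
</body>"""

ingredient_options = list_options(SAMPLE, "ingredient")
```

When the user picks an existing entry, its id would go into asserted_value; when "Add new" is picked, the entered name would go there instead.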

Models not being created for api_vis

During the migration to Docker for development, I have had some issues regarding the database.
For some reason, migrations were not being applied. When creating the migrations individually, Django raised the following exception for the api_vis app:

  File "/usr/local/lib/python3.7/site-packages/django/db/migrations/loader.py", line 49, in __init__
    self.build_graph()
  File "/usr/local/lib/python3.7/site-packages/django/db/migrations/loader.py", line 275, in build_graph
    self.graph.ensure_not_cyclic()
  File "/usr/local/lib/python3.7/site-packages/django/db/migrations/graph.py", line 274, in ensure_not_cyclic
    raise CircularDependencyError(", ".join("%s.%s" % n for n in cycle))
django.db.migrations.exceptions.CircularDependencyError: projects.0001_initial, api_vis.0001_initial

Seems to me like there is some kind of circular dependency. I would like to know if you have had any similar problems.
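
To illustrate what the error reports, here is a small sketch of cycle detection on a dependency graph. It is not Django's implementation, and the edges in the example are an assumption: the traceback only tells us that the two initial migrations are part of one cycle.

```python
def find_cycle(dependencies):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    visiting, done = set(), set()

    def visit(node, path):
        if node in done:
            return None
        if node in visiting:
            # node is an ancestor of itself: the cycle runs from its first
            # occurrence in the path back to node.
            return path[path.index(node):] + [node]
        visiting.add(node)
        for dep in dependencies.get(node, ()):
            cycle = visit(dep, path + [node])
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for node in dependencies:
        cycle = visit(node, [])
        if cycle:
            return cycle
    return None


# Assumed edges: each initial migration lists the other as a dependency.
graph = {
    "projects.0001_initial": ["api_vis.0001_initial"],
    "api_vis.0001_initial": ["projects.0001_initial"],
}
cycle = find_cycle(graph)
```

If the real migrations look like this, the cycle would have to be broken by removing one of the two dependencies, for example by moving the cross-app relation into a later migration.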

User should define names, icons and colours for the TEI entities.

Related to #32

Given that there is already a model for storing the preferences for displaying uncertainty,
we could also provide the ability to define what entities are to be displayed, along with the
colors and icons that should be used.

This could have other implications, for example that newly defined entities would need to
be stored and indexed.

This issue could cover just the styling, with the indexing done later.

Close reading history API call no longer retrieves the contributor.

In branch api-vis.

Calls to the API retrieve file versions with the contributor field empty.

{"url": "https://providedh.ehum.psnc.pl/files/6/version/1/", "version": 1, "ignorance": 0, 
"timestamp": "2019-12-17 15:38:46.520574+00:00", "variation": 0, "contributor": "", 
"credibility": 1, "imprecision": 1, "incompleteness": 2}, 

{"url": "https://providedh.ehum.psnc.pl/files/6/version/2/", "version": 2, "ignorance": 0, 
"timestamp": "2019-12-17 15:38:46.599874+00:00", "variation": 0, "contributor": "", 
"credibility": 1, "imprecision": 1, "incompleteness": 2}, 

{"url": "https://providedh.ehum.psnc.pl/files/6/version/3/", "version": 3, "ignorance": 0,
 "timestamp": "2019-12-17 15:57:55.359992+00:00", "variation": 0, "contributor": "", 
"credibility": 1, "imprecision": 1, "incompleteness": 2}, 

{"url": "https://providedh.ehum.psnc.pl/files/6/version/4/", "version": 4, "ignorance": 0, 
"timestamp": "2019-12-17 16:07:45.151583+00:00", "variation": 0, "contributor": "", 
"credibility": 1, "imprecision": 1, "incompleteness": 2}]}

Annotation creation fails for files containing non-valid xml:id values.

def __create_certainty_description(self, json, annotation_ids, user_uuid):
    target = " ".join(annotation_ids)
    xml_id = f"certainty_{self.__file.name}-{self.__certainty_xml_id_number}"
    categories = " ".join([get_ana_link(self.__file.project_id, cat) for cat in json["categories"]])
    certainty = f'<certainty ana="{categories}" locus="{json["locus"]}" cert="{json["certainty"]}" ' \
                f'resp="#{user_uuid}" target="{target}" xml:id="{xml_id}"/>'
    new_element = etree.fromstring(certainty)

def __create_certainty_description_for_attribute(self, json, annotation_ids, user_uuid):
    target = " ".join(annotation_ids)
    xml_id = f"certainty_{self.__file.name}-{self.__certainty_xml_id_number}"
    categories = " ".join([get_ana_link(self.__file.project_id, cat) for cat in json["categories"]])
    certainty = f'<certainty ana="{categories}" locus="{json["locus"]}" cert="{json["certainty"]}" ' \
                f'resp="#{user_uuid}" target="{target}" match="@{json["attribute_name"]}"' \
                f'assertedValue="{json["asserted_value"]}" xml:id="{xml_id}"/>'
    new_element = etree.fromstring(certainty)

Annotation fails when an XML element is created for a file whose name contains
characters that are not valid in an xml:id, such as spaces or semicolons.

As far as I could see, this error does not raise an exception and no response is sent. However,
subsequent calls fail because the websocket closes after a Timer Exception is raised.
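
One possible way to avoid the problem is to sanitize the file name before it is interpolated into the xml:id. The sketch below is a conservative, ASCII-only approximation of the xml:id rules rather than a full implementation of them (it also replaces dots, which are in fact allowed), and the helper name to_xml_id_fragment is hypothetical.

```python
import re


def to_xml_id_fragment(file_name):
    """Replace characters that are unsafe in an xml:id with underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", file_name)
    # An xml:id must not start with a digit or a hyphen.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


fragment = to_xml_id_fragment("my file; v2.xml")
xml_id = f"certainty_{fragment}-1"
```

Two different file names can map to the same fragment under this scheme, so the numeric suffix alone may not be enough to guarantee uniqueness across files.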

Wrong message after file save in Annotator

After saving a file, the Annotator should display the message returned in the message field of the response. Currently, even when the response has a 304 code, the Annotator displays the message "Changes successfully saved.".

Incorrect fragment positions in Annotator "create" request

When a user tries to add a second tag to an already tagged fragment, the fragment positions are incorrect.
This bug occurs on Firefox, but not on Chromium.

How to replicate the bug:
On branch "reference_to_element_of_the_list"
Tested on the "recipe_0.xml" file: https://raw.githubusercontent.com/providedh/ACDH_Salzburg_recipes/master/outputs/recipe_0.xml

  1. Add a first annotation to the "Guetten" fragment:
"tag_name": "Ingredient"
"attribute_name": "ref"
"asserted_value": "object_recipe_0_xml-1"

The positions in this request point to the "Guetten" fragment, so this is correct.

  2. Add a second annotation to the annotated "Guetten" fragment:
"tag_name": "Ingredient"
"attribute_name": "ref"
"asserted_value": "object_recipe_0_xml-2"

The positions in this request point to "objectName ref="#object_recipe_0_xml-1">Guetten". A selected fragment can't start or end in the middle of a tag.

Allow the category attribute to have none or multiple values

The platform should handle multiple values for the category attribute. This includes
indexing, visual hints and form options in the annotator, and back-end support for
such annotation requests.

  • Change TEI specification if needed.
  • Front-end support for the annotator.
  • Back-end support for the annotator.
  • Front-end support for the TEI stats app.
  • Back-end support for the TEI stats app.

Better color schemes and glyphs for representing uncertainty in annotator

As of now, we are using the scheme of underlining + icons in the annotator to indicate the different kinds/degrees of uncertainty in the text. However, now that we are supporting 5 levels + 2 categories, I've come to think this approach we have been using might be no longer valid.

I wonder if there's a better combination of colors/glyphs/other techniques to convey the 4 (with their respective 5 levels) + 1 categories in the annotator. For example, we could map categories to different text underline styles as illustrated here. For selecting color maps we could fall back to VSUPs: explanation and d3 code (or better, a modification of it).

Annotator integration

Needed changes to finish migration of previous annotation app to the new platform.

  • Create stream-graph on timeline to show uncertainty evolution.
  • Apply previous changes to API.
  • Fix save and history AJAX calls (dependent on previous point).
  • Add alerts for document load, reload, save, annotation, etc.
  • Solve branch for automatic merge.

Creating uncertainty annotations creates ingredient tags.

If the asserted value for locus=name or the tag name is ingredient, utensil,
or productionMethod, then a tag with that name is created. This is not the desired
behavior if we are to use lists of objects and the corresponding objectName tag in
the body.

Update markups and side view in Annotator with unifications from database

Because we now keep all entity unifications, and the certainties added to these unifications, in the database instead of in the XML file, we need to render these elements in the Annotator as well. The message sent over WebSockets was extended with one more field, certainties_from_db, containing a list of unifications and certainties. Every unification/certainty has an extra boolean field, committed. Committed unifications and their certainties are sent to all connected users; uncommitted unifications and their certainties are sent only to their author.

Because there are many approaches to converting XML to JSON, I used this convention: https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html (the same as in the "Retrieve metadata from header of a TEI file" point of this file: https://docs.google.com/document/d/102pYWR1t7Ve2mjuFC5juithyVGAPVqFAPa54qq_yn-M/edit#heading=h.uqg41mc3u89)

Example content of certainties_from_db field:

"certainties_from_db":
[
  {
    "certainty": 
    {
      "@ana": "", 
      "@locus": "value", 
      "@cert": "high", 
      "@resp": "#person2", 
      "@target": "#person_dep_833105r082_tei_depositions_plus_original-1", 
      "@match": "@sameAs", 
      "@assertedValue": "#person_dep_833105r082_tei_depositions_plus_original-2 #person_dep_833105r082_tei_depositions_plus_original-3 Project_1/dep_834165r133_tei_depositions_plus_original.xml#person_dep_834165r133_tei_depositions_plus_original-1 Project_1/dep_834165r133_tei_depositions_plus_original.xml#person_dep_834165r133_tei_depositions_plus_original-2 Project_1/dep_834165r133_tei_depositions_plus_original.xml#person_dep_834165r133_tei_depositions_plus_original-3", 
      "@xml:id": "certainty_dep_833105r082_tei_depositions_plus_original.xml-1"
    }, 
    "committed": true
  }, 
  {
    "certainty": 
    {
      "@ana": "https://providedh.ehum.psnc.pl/api/projects/1/taxonomy/#ignorance", 
      "@locus": "value", 
      "@cert": "medium", 
      "@resp": "#person2", 
      "@target": "#certainty_dep_833105r082_tei_depositions_plus_original.xml-1", 
      "@match": "@sameAs", 
      "@xml:id": "certainty_dep_833105r082_tei_depositions_plus_original.xml-64", 
      "@assertedValue": "some asserted value", 
      "desc": "awesome description"
    }, 
    "committed": true
  }, 
  ...
]
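
The visibility rule described above can be sketched as a simple filter. This is only an illustration: it assumes the author of an entry can be identified by its "@resp" value, which the description does not state, and the function name and shortened ids are hypothetical.

```python
def visible_certainties(entries, user_ref):
    """Keep committed entries, plus uncommitted ones authored by user_ref."""
    return [
        entry
        for entry in entries
        if entry["committed"] or entry["certainty"].get("@resp") == user_ref
    ]


# Shortened, made-up entries in the shape of the certainties_from_db field.
entries = [
    {"certainty": {"@resp": "#person2", "@xml:id": "c-1"}, "committed": True},
    {"certainty": {"@resp": "#person2", "@xml:id": "c-2"}, "committed": False},
    {"certainty": {"@resp": "#person3", "@xml:id": "c-3"}, "committed": False},
]
for_person2 = visible_certainties(entries, "#person2")
```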

Targets in new xml:id-based annotations don't start with #.

The following excerpt is the result of creating one annotation from a text selection and a
second one by id using the sidebar.

<classCode scheme="http://providedh.eu/uncertainty/ns/1.0">
          <certainty 
                  ana="https://example.com/api/projects/1/taxonomy/#ignorance https://example.com/api/projects/1/taxonomy/#credibility https://example.com/api/projects/1/taxonomy/#imprecision https://example.com/api/projects/1/taxonomy/#incompleteness" 
                  locus="value" 
                  cert="unknown"                   
                  resp="#person1" 
                  target="#date_dep_821026r012_tei__1__xml-1" 
                  xml:id="certainty_dep_821026r012_tei__1__xml-2" 
                  assertedValue="asd"/>
          <certainty 
                  ana="https://example.com/api/projects/1/taxonomy/#credibility https://example.com/api/projects/1/taxonomy/#imprecision" 
                  locus="name" 
                  cert="very high" 
                  resp="#person1" 
                  target="date_dep_821026r012_tei__1__xml-1" 
                  xml:id="certainty_dep_821026r012_tei__1__xml-4"/>
</classCode>
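
A possible fix is to normalize the target before the element is written. The sketch below prefixes every bare id with "#" and leaves tokens that already contain "#" untouched, which covers cross-file references of the form path#id. The function name normalize_target is hypothetical.

```python
def normalize_target(target):
    """Prefix each bare id in a space-separated target with '#'."""
    tokens = target.split()
    return " ".join(tok if "#" in tok else "#" + tok for tok in tokens)


fixed = normalize_target("date_dep_821026r012_tei__1__xml-1")
```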

Support 5 different levels of uncertainty in user/machine annotations

Decide on the way to approach and implement the necessary changes to support
specifying a floating number for the certainty level of an annotation. This includes
changing the used attribute (possibly to deg), changing the annotator to process
each specific tag and assign styles based on the numerical value, and adding
back-end support for annotations made using this new approach.
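
As a starting point for the discussion, a floating-point certainty could be mapped onto five levels as sketched below. Everything here is an assumption: the value range of 0 to 1, the equal-width bins, and the level names ("very low" and "low" are guesses by analogy with the values already in use).

```python
# Assumed level names, ordered from least to most certain.
LEVELS = ["very low", "low", "medium", "high", "very high"]


def level_for_degree(degree):
    """Map a degree in [0, 1] onto one of five equal-width levels."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree must be between 0 and 1")
    # A degree of exactly 1.0 falls into the top bin.
    index = min(int(degree * len(LEVELS)), len(LEVELS) - 1)
    return LEVELS[index]
```

The annotator could then choose styles per level while the stored attribute keeps the exact number.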

Plan supporting persistent filters in the collaborative

Design and discuss how persistent filters can be integrated into the platform.
The design should take into account the following common use case:

  • A user is asking for an entity search response but had previously filtered out some
    documents in another app using the locations occurring in the text.

User should define names, descriptions and colours of uncertainty categories

When a project is created, the user should be able to change the names, descriptions and colors of the user-recognized uncertainty categories. The default values for names and descriptions are in the following template TEI file with the default taxonomy:

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">	
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Uncertainty Taxonomy for project @ProjectName</title>
      </titleStmt>
    </fileDesc>
    <encodingDesc>
      <classDecl>
        <taxonomy>
          <category>
            <catDesc>User recognized uncertainty</catDesc>
            <category xml:id="ignorance">
              <catDesc>Ignorance</catDesc>
              <desc>Ignorance is related to the fact that information could have been incorrectly assessed by the person gathering or organizing the data. It is also possible that people, not fully sure about how to deal with data, ignore some information and generate uncertainty during the evaluation and decision processes.</desc>
            </category>
            <category xml:id="credibility">
              <catDesc>Credibility</catDesc>
              <desc>Credibility concerns the weight that an agent can attach to its judgment. This concept can be linked to that of biased opinions, which are related to personal visions of the landscape, which can make for significant variations between different groups and individuals, given their backgrounds.</desc>
            </category>
            <category xml:id="imprecision">
              <catDesc>Imprecision</catDesc>
              <desc>Imprecision corresponds to the inability to express the true value because the absence of experimental values does not allow the definition of a probability distribution or because it is difficult to obtain the exact value of a measure.</desc>
            </category>
            <category xml:id="incompleteness">
              <catDesc>Incompleteness</catDesc>
              <desc>Incompleteness corresponds to the fact that not all situations are covered. Often it is impossible to know every possible option available.</desc>
            </category>
          </category>
          <category>
            <catDesc>Machine generated uncertainty</catDesc>
            <category xml:id="algorithmic">
              <catDesc>Algorithmic</catDesc>
            </category>
          </category>
        </taxonomy>
      </classDecl>
    </encodingDesc>
  </teiHeader>
</TEI>

When a user defines these values for a project, the colors should be stored in the database, and the names and descriptions in the generated TEI file with the taxonomy for this project. Identifiers for categories should be generated from the names by lowercasing them and replacing whitespace with a dash.
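
The identifier rule can be sketched as follows. Collapsing a run of several whitespace characters into a single dash and trimming the ends are assumptions that go beyond the rule as stated, and the function name category_id is hypothetical.

```python
import re


def category_id(name):
    """Lowercase the name and replace each run of whitespace with one dash."""
    return re.sub(r"\s+", "-", name.strip().lower())


example_id = category_id("Machine Generated")
```

Names containing characters that are not allowed in an xml:id would still need extra handling before being used as identifiers.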

After these values have been defined, only the colors should be changeable in the Project Settings view.

The generated TEI file should be exposed at a URL within the scope of the project, so that the ana attribute can refer to the nodes of this XML. This is connected with task #11.

An API is needed that returns the names and colors of all uncertainty categories, for instance so that the annotator can handle them properly.

Support “machine-generated” uncertainty category

Decide on the way to approach and implement the changes needed to
support a new machine-generated category. This would include handling this
new value in the annotator.

A possible approach would be to rely on the resp attribute to associate
the annotation with a specific algorithm. This would allow further information about the algorithm to be specified. This approach would probably require the category
attribute to be optional.
