providedh / collaborative-platform
Collaboration made easy
License: GNU Affero General Public License v3.0
As of now, the README contains instructions for PyCharm IDE.
These instructions should work the same for users who do not work with PyCharm.
I think the best thing to do is just to use shell commands.
Prepare a branch with a simplified registration system just for the workshops, as the Austrian party requested.
We have to make some changes in the annotator to incorporate the Recipes use case.
The plan is to use <listObject> elements as lists of ingredients, utensils, etc., and to annotate the text of the recipe only with the <objectName> tag. Here is an example:
<body>
<div type="ingredient">
<listObject>
<object type="ingredient" xml:id="egg">Egg</object>
<object type="ingredient" xml:id="sugar">Sugar</object>
</listObject>
</div>
<div type="utensil">
<listObject>
<object type="utensil" xml:id="spoon">Spoon</object>
</listObject>
</div>
<div type="recipe">
<p>
Take three <objectName ref="egg">eggs</objectName>, mix with three <objectName ref="spoon">spoons</objectName> of <objectName ref="sugar">sugar</objectName>. Bon Appetite!
</p>
</div>
</body>
The desired behaviour is that when a user annotates a new ingredient/utensil/etc., the backend automatically adds it to the appropriate list, so annotated recipes always start with proper lists of requisites.
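The automatic list update described above could be sketched roughly as follows. This is only an illustration using the standard library `xml.etree.ElementTree`, not the platform's actual backend code; the function name `ensure_object` and the id scheme (lowercased name with underscores) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def ensure_object(body, object_type, name):
    """Return the xml:id of the list entry for `name`, creating it if needed.

    `ensure_object` and the id scheme below are hypothetical.
    """
    object_id = name.strip().lower().replace(" ", "_")

    # Find (or create) the <div type="..."> holding the list for this type.
    div = body.find(f"./div[@type='{object_type}']")
    if div is None:
        div = ET.Element("div", {"type": object_type})
        body.insert(0, div)  # keep the lists ahead of the recipe text
    list_object = div.find("listObject")
    if list_object is None:
        list_object = ET.SubElement(div, "listObject")

    # Reuse an existing entry instead of adding a duplicate.
    for obj in list_object.findall("object"):
        if obj.get(XML_ID) == object_id:
            return object_id

    new_obj = ET.SubElement(
        list_object, "object", {"type": object_type, XML_ID: object_id}
    )
    new_obj.text = name
    return object_id
```

The returned id is what an `<objectName ref="...">` annotation in the recipe text would then point to.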
To enable users to annotate in this manner, we need changes in the annotator.
@Janchorizo, we need changes in the annotator frontend. In the "Add TEI annotations" tab, when Ingredient, Utensil or ProductionMethod is chosen, we want a list to appear with the ingredients, utensils or production methods already available in the lists in the file. We can leave parsing those lists into options to you, or we can provide you with an API endpoint returning them. Another option in the list should be "Add new". If the user chooses it, a text field should appear, allowing the user to enter a name for the new ingredient/utensil/etc.
The request created by the annotator should consist of positions like before, ref set in the attribute_name parameter, and asserted_value containing the id of the object from the list if the user chose an object already available in the file, or the new name entered by the user otherwise.
@bug-rancher is already changing the annotator backend accordingly.
Also, we'd like @Janchorizo to render those lists in a more readable manner, possibly adding some headers and splitting them into separate lines.
Provide the user with the ability to convert plain-text documents to the TEI format.
During the migration to using Docker for development, I have had some issues regarding the database.
For some reason migrations were not being applied. When creating the migrations individually, Django raised the following exception for the api_vis model:
File "/usr/local/lib/python3.7/site-packages/django/db/migrations/loader.py", line 49, in __init__
self.build_graph()
File "/usr/local/lib/python3.7/site-packages/django/db/migrations/loader.py", line 275, in build_graph
self.graph.ensure_not_cyclic()
File "/usr/local/lib/python3.7/site-packages/django/db/migrations/graph.py", line 274, in ensure_not_cyclic
raise CircularDependencyError(", ".join("%s.%s" % n for n in cycle))
django.db.migrations.exceptions.CircularDependencyError: projects.0001_initial, api_vis.0001_initial
Seems to me like there is some kind of circular dependency. I would like to know if you have had any similar problems.
At least divide text to divisions (div elements), headers (head elements) and paragraphs (p elements).
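A minimal sketch of such a conversion is shown below, assuming that blocks are separated by blank lines and that a short block without closing punctuation is a header. Both heuristics, the 60-character limit, and the function name `text_to_tei_body` are assumptions for illustration only.

```python
import re
import xml.etree.ElementTree as ET


def text_to_tei_body(text):
    """Build a <body> with <div>, <head> and <p> elements from plain text."""
    body = ET.Element("body")
    div = None
    for block in re.split(r"\n\s*\n", text):
        block = " ".join(block.split())  # collapse line breaks inside a block
        if not block:
            continue
        # Heuristic (assumption): short blocks without final punctuation
        # are treated as headers that open a new division.
        is_heading = len(block) < 60 and not block.endswith((".", "!", "?", ":"))
        if is_heading or div is None:
            div = ET.SubElement(body, "div")
        ET.SubElement(div, "head" if is_heading else "p").text = block
    return body
```

The resulting element can be serialized with `ET.tostring(body, encoding="unicode")` and placed inside a TEI `<text>` element.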
Add a descriptive landing page for newcomers to get an understanding of the
available functionality.
The relational database already holds the definition of ingredients in one of its tables. This can be used to have ingredients (and other entities) already annotated in the TEI documents.
Implement a decorator for Django views that enables A/B testing by providing a set of static files to be sorted.
As the title
Related to #32
Given that there is already a model for storing the preferences for displaying uncertainty,
we could also provide the ability to define which entities are to be displayed, along with the
colors and icons that should be used.
This could have other implications, such as the fact that new entities would need to
be stored and indexed.
This issue could cover only the styling, with the indexing done at a later time.
As the title.
Dependent on #8
One of the needed changes related to #14
Add the option to modify the current list of entity colours and icons. This would require making the choices permanent.
In branch api-vis.
Calls to the API retrieve file versions with the contributor field empty.
{"url": "https://providedh.ehum.psnc.pl/files/6/version/1/", "version": 1, "ignorance": 0,
"timestamp": "2019-12-17 15:38:46.520574+00:00", "variation": 0, "contributor": "",
"credibility": 1, "imprecision": 1, "incompleteness": 2},
{"url": "https://providedh.ehum.psnc.pl/files/6/version/2/", "version": 2, "ignorance": 0,
"timestamp": "2019-12-17 15:38:46.599874+00:00", "variation": 0, "contributor": "",
"credibility": 1, "imprecision": 1, "incompleteness": 2},
{"url": "https://providedh.ehum.psnc.pl/files/6/version/3/", "version": 3, "ignorance": 0,
"timestamp": "2019-12-17 15:57:55.359992+00:00", "variation": 0, "contributor": "",
"credibility": 1, "imprecision": 1, "incompleteness": 2},
{"url": "https://providedh.ehum.psnc.pl/files/6/version/4/", "version": 4, "ignorance": 0,
"timestamp": "2019-12-17 16:07:45.151583+00:00", "variation": 0, "contributor": "",
"credibility": 1, "imprecision": 1, "incompleteness": 2}]}
During unification in the Annotator (Attribute name set to sameAs), there is no attribute_name field in the request sent to the WebSocket.
If the user first writes "sameAs" in the Attribute name field and then changes Tag name to Person, the References field will not show up.
Annotation fails when an XML element is to be created for a file whose name contains
characters that are not valid in xml:id values, such as spaces, semicolons, or others.
This error does not raise an exception that I could see, and no response is sent. However,
consecutive calls will fail, as the websocket closes after a Timer Exception is raised.
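One possible way to avoid the problem is to sanitize the file name before using it as part of an xml:id. The sketch below is an assumption about how this could be done, not the platform's code: the name `xml_id_prefix` is hypothetical, and the allowed character set is a deliberately strict ASCII subset of what XML permits in an id.

```python
import re


def xml_id_prefix(file_name):
    """Turn a file name into a string that is safe to use inside an xml:id."""
    # Replace every character outside a conservative ASCII subset.
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", file_name)
    # An xml:id must not start with a digit, a dot or a hyphen.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned
```

Note that two different file names can map to the same prefix (for example `a b.xml` and `a;b.xml`), so a real implementation would also need to guarantee uniqueness.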
After saving a file, the Annotator should display the message returned in the message field of the response. Currently, even when the response has a 304 code, the Annotator displays the message "Changes successfully saved.".
As in title.
When a user tries to add a second tag to an already tagged fragment, the fragment positions are incorrect.
This bug occurs on Firefox, but not on Chromium.
How to replicate bug:
On branch "reference_to_element_of_the_list"
Tested on "recipe_0.xml" file: https://raw.githubusercontent.com/providedh/ACDH_Salzburg_recipes/master/outputs/recipe_0.xml
"tag_name": "Ingredient"
"attribute_name": "ref"
"asserted_value": "object_recipe_0_xml-1"
Positions in this request point to the "Guetten" fragment, so it's OK.
"tag_name": "Ingredient"
"attribute_name": "ref"
"asserted_value": "object_recipe_0_xml-2"
Positions in this request point to "objectName ref="#object_recipe_0_xml-1">Guetten". A selected fragment can't start or end in the middle of a tag.
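The rule that a selection must not start or end inside a tag can be checked on the server side with a small guard like the one below. This is a simplified sketch: the function names are hypothetical, positions are assumed to be character offsets into the raw XML string, and `>` characters inside attribute values, comments, or CDATA sections are not handled.

```python
def inside_tag(text, position):
    """True if the offset `position` lies between a '<' and its closing '>'."""
    last_open = text.rfind("<", 0, position)
    last_close = text.rfind(">", 0, position)
    return last_open > last_close


def valid_selection(text, start, end):
    """A selection is valid if neither boundary falls inside a tag."""
    return start < end and not inside_tag(text, start) and not inside_tag(text, end)
```

For the failing request above, the start offset lands just after the `<` of the existing `objectName` tag, so `valid_selection` would reject it before any XML is modified.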
The platform should handle multiple values for the category attribute. This includes
indexing, visual hints and form options in the annotator, and back-end support for
such annotation requests.
As of now, we are using the scheme of underlining + icons in the annotator to indicate the different kinds/degrees of uncertainty in the text. However, now that we are supporting 5 levels + 2 categories, I've come to think this approach we have been using might be no longer valid.
I wonder if there's a better combination of colors/glyphs/other techniques to convey the 4 (with their respective 5 levels) + 1 categories in the annotator. For example, we could map categories to different text underline styles as illustrated here. For selecting color maps we could fall back to VSUPs: explanation and d3 code (or better, a modification of it).
Needed changes to finish migration of previous annotation app to the new platform.
If the asserted value for locus=name or the tag name is either ingredient, utensil,
or productionMethod, then a tag with that name is created. This is not the desired
behavior if we were to use lists of objects and the corresponding objectName tag in
the body.
Because we now keep all entity unifications, and the certainties added to these unifications, in the database instead of the XML file, we need to render these elements in the Annotator additionally. The message sent over WebSockets was extended by one more field, certainties_from_db, with a list of unifications and certainties inside. Every unification/certainty has an extra boolean field, committed. Committed unifications and their certainties are sent to all connected users, while uncommitted unifications and their certainties are sent only to their author.
Because there are many approaches to converting XML to JSON, I used this standard: https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html (the same as in the "Retrieve metadata from header of a TEI file" point of the file: https://docs.google.com/document/d/102pYWR1t7Ve2mjuFC5juithyVGAPVqFAPa54qq_yn-M/edit#heading=h.uqg41mc3u89)
Example content of the certainties_from_db field:
"certainties_from_db":
[
{
"certainty":
{
"@ana": "",
"@locus": "value",
"@cert": "high",
"@resp": "#person2",
"@target": "#person_dep_833105r082_tei_depositions_plus_original-1",
"@match": "@sameAs",
"@assertedValue": "#person_dep_833105r082_tei_depositions_plus_original-2 #person_dep_833105r082_tei_depositions_plus_original-3 Project_1/dep_834165r133_tei_depositions_plus_original.xml#person_dep_834165r133_tei_depositions_plus_original-1 Project_1/dep_834165r133_tei_depositions_plus_original.xml#person_dep_834165r133_tei_depositions_plus_original-2 Project_1/dep_834165r133_tei_depositions_plus_original.xml#person_dep_834165r133_tei_depositions_plus_original-3",
"@xml:id": "certainty_dep_833105r082_tei_depositions_plus_original.xml-1"
},
"committed": true
},
{
"certainty":
{
"@ana": "https://providedh.ehum.psnc.pl/api/projects/1/taxonomy/#ignorance",
"@locus": "value",
"@cert": "medium",
"@resp": "#person2",
"@target": "#certainty_dep_833105r082_tei_depositions_plus_original.xml-1",
"@match": "@sameAs",
"@xml:id": "certainty_dep_833105r082_tei_depositions_plus_original.xml-64",
"@assertedValue": "some asserted value",
"desc": "awesome description"
},
"committed": true
},
...
]
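The visibility rule for committed and uncommitted entries described above could look roughly like this. It is only a sketch: the function name is hypothetical, and it assumes that the author of an entry can be identified by the "@resp" value of its certainty, which the message format above suggests but does not confirm.

```python
def visible_certainties(entries, user_ref):
    """Select the entries of certainties_from_db that a given user should see.

    Committed entries go to everyone; uncommitted ones only to their author
    (assumed here to be the user referenced in "@resp").
    """
    return [
        entry
        for entry in entries
        if entry["committed"] or entry["certainty"].get("@resp") == user_ref
    ]
```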
The following excerpt is the result of creating a text selection annotation, and a second one using
the sidebar to add one by the id.
<classCode scheme="http://providedh.eu/uncertainty/ns/1.0">
<certainty
ana="https://example.com/api/projects/1/taxonomy/#ignorance https://example.com/api/projects/1/taxonomy/#credibility https://example.com/api/projects/1/taxonomy/#imprecision https://example.com/api/projects/1/taxonomy/#incompleteness"
locus="value"
cert="unknown"
resp="#person1"
target="#date_dep_821026r012_tei__1__xml-1"
xml:id="certainty_dep_821026r012_tei__1__xml-2"
assertedValue="asd"/>
<certainty
ana="https://example.com/api/projects/1/taxonomy/#credibility https://example.com/api/projects/1/taxonomy/#imprecision"
locus="name"
cert="very high"
resp="#person1"
target="date_dep_821026r012_tei__1__xml-1"
xml:id="certainty_dep_821026r012_tei__1__xml-4"/>
</classCode>
Decide on the way to approach and implement the necessary changes to support
specifying a floating number for the certainty level of an annotation. This includes
changing the used attribute (possibly to deg), changing the annotator to process
each specific tag and assign styles based on the numerical value, and adding
back-end support for annotations made using this new approach.
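As a starting point for the discussion, a numeric certainty could be mapped back to discrete levels for styling. The sketch below assumes the value lies in the range 0 to 1 and is split into five equal bands; the level names, the bands, and the function name are all assumptions rather than decisions already made for the platform.

```python
# Hypothetical level names, ordered from least to most certain.
LEVELS = ["very low", "low", "medium", "high", "very high"]


def level_for_degree(deg):
    """Map a certainty degree in [0, 1] to one of five equally wide bands."""
    if not 0.0 <= deg <= 1.0:
        raise ValueError("degree must be between 0 and 1")
    # A degree of exactly 1.0 would index past the end, so clamp it.
    index = min(int(deg * len(LEVELS)), len(LEVELS) - 1)
    return LEVELS[index]
```

The annotator could then keep its existing per-level styles while the stored annotation carries the exact number.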
Dependent on #9
When the browser window is big enough (I haven't checked exactly, but I use a 4K screen), the breadcrumb navigation is no longer clickable, and a double-click on "file" results in the first word in the recipe being selected.
Design and discuss how persistent filters can be integrated in the platform.
Such should take into account that a common use case to be supported is:
When a project is created, a user should be able to change the names, descriptions and colors for user recognized uncertainties. The default values for names and descriptions are in the following template TEI file with the default taxonomy:
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Uncertainty Taxonomy for project @ProjectName</title>
</titleStmt>
</fileDesc>
<encodingDesc>
<classDecl>
<taxonomy>
<category>
<catDesc>User recognized uncertainty</catDesc>
<category xml:id="ignorance">
<catDesc>Ignorance</catDesc>
<desc>Ignorance is related to the fact that information could have been incorrectly assessed by the person gathering or organizing the data. It is also possible that people, not fully sure about how to deal with data, ignore some information and generate uncertainty during the evaluation and decision processes.</desc>
</category>
<category xml:id="credibility">
<catDesc>Credibility</catDesc>
<desc>Credibility concerns the weight that an agent can attach to its judgment. This concept can be linked to that of biased opinions, which are related to personal visions of the landscape, which can make for significant variations between different groups and individuals, given their backgrounds.</desc>
</category>
<category xml:id="imprecision">
<catDesc>Imprecision</catDesc>
<desc>Imprecision corresponds to the inability to express the true value because the absence of experimental values does not allow the definition of a probability distribution or because it is difficult to obtain the exact value of a measure.</desc>
</category>
<category xml:id="incompleteness">
<catDesc>Incompleteness</catDesc>
<desc>Incompleteness corresponds to the fact that not all situations are covered. Often it is impossible to know every possible option available.</desc>
</category>
</category>
<category>
<catDesc>Machine generated uncertainty</catDesc>
<category xml:id="algorithmic">
<catDesc>Algorithmic</catDesc>
</category>
</category>
</taxonomy>
</classDecl>
</encodingDesc>
</teiHeader>
</TEI>
When a user defines these values for a project, the colors should be stored in the database, and the names and descriptions in the generated TEI file with a taxonomy for this project. Identifiers for categories should be generated from names by applying a lowercase method and replacing whitespace with a dash.
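The identifier rule above is small enough to show directly. In this sketch the function name `category_id` is hypothetical, and collapsing a run of several whitespace characters into a single dash (as well as trimming the ends) is an assumption, since the rule does not say how repeated whitespace should be handled.

```python
import re


def category_id(name):
    """Derive a category identifier: lowercase, whitespace replaced by dashes."""
    return re.sub(r"\s+", "-", name.strip().lower())
```

For example, a category named "Source Reliability" would get the identifier `source-reliability`, while the default "Ignorance" stays `ignorance`, matching the xml:id used in the template.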
After defining these values, only colors should be changeable in the Project Settings View.
The generated TEI file should be exposed at some URL in the scope of the project so that the ana attribute can refer to the nodes of this XML. It's connected with task #11.
There is a need for an API that returns the names and colors of all uncertainty categories, for instance so that the annotator can handle them properly.
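The data behind such an API could be assembled by reading the names from the taxonomy file and joining them with the stored colors. The following is a sketch under assumptions: the function name, the shape of the returned dictionaries, and the `colors` mapping (standing in for the database lookup) are all hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def list_categories(taxonomy_xml, colors):
    """Return id, name and color for every identified taxonomy category.

    `colors` maps category ids to color strings and stands in for the
    values stored in the database.
    """
    root = ET.fromstring(taxonomy_xml)
    result = []
    for category in root.iter(f"{TEI}category"):
        category_id = category.get(XML_ID)
        if category_id is None:
            continue  # grouping category without an identifier
        name = category.findtext(f"{TEI}catDesc", default="")
        result.append(
            {"id": category_id, "name": name, "color": colors.get(category_id)}
        )
    return result
```

A Django view could serialize this list as JSON; categories without a stored color are returned with `None` so the client can fall back to a default.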
Implement the presented distant-reading vis for date entities.
Add the ability to toggle the visibility for specific certainty levels and categories, and
specific entity types.
Add back-end code for adding ids to certainty tags. Needed for
creating annotations based on previously made ones.
Ease backward navigating in the annotator with a nav-bar and possibly breadcrumbs.
To be done at all levels (including annotator, indexing, etc).
Decide on the way to approach and implement the necessary changes to
support a new machine-generated category. This would include handling this
new value in the annotator.
A possible approach would be relying on the resp attribute to associate
the annotation with a specific algorithm. This would allow specifying further information regarding the algorithm. This approach would probably require the category
attribute to be optional.