openbiodiv's Introduction

OpenBiodiv PhD Project

Hi! Welcome to the OpenBiodiv PhD project by Viktor Senderov at Pensoft as part of the Marie-Curie training network for genomics, bioinformatics, and systematics of the BIG 4 insect groups.

What is OpenBiodiv

OpenBiodiv is a knowledge management system of biodiversity data. Its components are

More information on what OpenBiodiv is can be found in the draft text of a chapter of my dissertation (Chapter 3).

Papers that have been published on OpenBiodiv

When citing OpenBiodiv, please use reference 2.

  1. Senderov, Viktor, Kiril Simov, Nico Franz, Pavel Stoev, Terry Catapano, Donat Agosti, Guido Sautter, Robert A. Morris, and Lyubomir Penev. “OpenBiodiv-O: Ontology of the OpenBiodiv Knowledge Management System.” Journal of Biomedical Semantics 9, no. 1 (January 2018). https://doi.org/10.1186/s13326-017-0174-5.
  2. Senderov, Viktor, and Lyubomir Penev. “The Open Biodiversity Knowledge Management System in Scholarly Publishing.” Research Ideas and Outcomes 2 (January 11, 2016): e7757. https://doi.org/10.3897/rio.2.e7757.
  3. Senderov, Viktor, Teodor Georgiev, and Lyubomir Penev. “Online Direct Import of Specimen Records into Manuscripts and Automatic Creation of Data Papers from Biological Databases.” Research Ideas and Outcomes 2 (September 23, 2016): e10617. https://doi.org/10.3897/rio.2.e10617.
  4. Cardoso, Pedro, Pavel Stoev, Teodor Georgiev, Viktor Senderov, and Lyubomir Penev. “Species Conservation Profiles Compliant with the IUCN Red List of Threatened Species.” Biodiversity Data Journal 4 (September 1, 2016): e10356. https://doi.org/10.3897/BDJ.4.e10356.
  5. Arriaga-Varela, Emmanuel, Matthias Seidel, Albert Deler-Hernández, Viktor Senderov, and Martin Fikáček. “A Review of the Cercyon Leach (Coleoptera, Hydrophilidae, Sphaeridiinae) of the Greater Antilles.” ZooKeys 681 (June 21, 2017): 39–93. https://doi.org/10.3897/zookeys.681.12522.
  6. Penev, Lyubomir, Teodor Georgiev, Peter Geshev, Seyhan Demirov, Viktor Senderov, Iliyana Kuzmova, Iva Kostadinova, Slavena Peneva, and Pavel Stoev. “ARPHA-BioDiv: A Toolbox for Scholarly Publication and Dissemination of Biodiversity Data Based on the ARPHA Publishing Platform.” Research Ideas and Outcomes 3 (April 5, 2017): e13088. https://doi.org/10.3897/rio.3.e13088.
  7. Penev, Lyubomir, Daniel Mietchen, Vishwas Chavan, Gregor Hagedorn, Vincent Smith, David Shotton, Éamonn Ó Tuama, et al. “Strategies and Guidelines for Scholarly Publishing of Biodiversity Data.” Research Ideas and Outcomes 3 (February 28, 2017): e12431. https://doi.org/10.3897/rio.3.e12431.

Presentations

Select presentations given as part of the OpenBiodiv project are to be found in the presentations directory.

Visualisation OpenBiodiv Ontology

here

Social Media

Note

The system is called OpenBiodiv or OBKMS. OBKMS is the older name; we now prefer OpenBiodiv.

openbiodiv's People

Contributors

daniel-mietchen, howkins, kirilsimov, mdmtrv, pdatascience, tcatapano, vsenderov

openbiodiv's Issues

Selecting a name and domain for the OBKMS

After TDWG, we are discussing registering a domain for OBKMS and possibly a new short mnemonic name for it (Open Biodiversity Knowledge Management System will still be the full official long name). Here are some ideas.

Domain Name              New Name for OBKMS
openbiodiv.net           OpenBiodiv
openbiodiv.org           OpenBiodiv
biolod.net               BioLOD (org is taken)
biodiversitylod.net      Biodiversity LOD
biodiversitylod.org      Biodiversity LOD
biodiversitygraph.net    Biodiversity Graph or Biodiversity Knowledge Graph
biodiversitygraph.org    Biodiversity Graph or Biodiversity Knowledge Graph
openbkms.net             OpenBKMS
openbkms.org             OpenBKMS
obkms.com                OBKMS (net and org are taken)
bioepiphany.org          BioEpiphany
bioepiphany.net          BioEpiphany

Please everyone comment on your preference!
@myrmoteras @lyubomirpenev @tcatapano @teodorgeorgiev @eotuama

discussion of material cited/specimens in RDF Guide

@pdatascience Aside from type designation, the RDF guide OpenBiodiv/ontology/RDF_Guide.md does not discuss citations of materials/specimens from taxon treatments specifically, or the relationships between taxon concepts and specimens more generally. Is this work still to be done?

GraphDB Behavior of `owl:inverseOf` Investigation

Sometimes it is more convenient to use an inverse property rather than the direct property. For example, the realization of a fabio:ResearchPaper is a fabio:JournalArticle, but it may be more convenient to talk about the article as a realization of the paper. This is why we explicitly define frbr:realization as the inverse of frbr:realizationOf.

Question

Why doesn't GraphDB materialize the relation

frbr:realization owl:inverseOf frbr:realizationOf

Database: obkms_i6
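
To make the expected behavior concrete, here is a minimal Python sketch of what "materializing" an owl:inverseOf declaration means. It works on a toy in-memory set of triples and does not talk to GraphDB; the identifiers :article1 and :paper1 are hypothetical. It only illustrates the entailment the question is asking about, not how GraphDB implements or configures inference.

```python
INVERSE_OF = "owl:inverseOf"


def materialize_inverses(triples):
    """Return the input triples plus every triple entailed by owl:inverseOf."""
    triples = set(triples)

    # Collect the declared inverse pairs; owl:inverseOf is symmetric,
    # so each declaration is recorded in both directions.
    inverses = {}
    for s, p, o in triples:
        if p == INVERSE_OF:
            inverses.setdefault(s, set()).add(o)
            inverses.setdefault(o, set()).add(s)

    # For every statement "s p o" where p has an inverse q, add "o q s".
    inferred = set()
    for s, p, o in triples:
        for q in inverses.get(p, ()):
            inferred.add((o, q, s))
    return triples | inferred


triples = {
    ("frbr:realization", INVERSE_OF, "frbr:realizationOf"),
    (":article1", "frbr:realizationOf", ":paper1"),  # hypothetical data
}
closure = materialize_inverses(triples)
```

If the inverse were materialized, a query for `:paper1 frbr:realization ?x` would return `:article1` even though only the frbr:realizationOf statement was inserted.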

Process to Dump XML Files in an OpenBiodivDirectory

Also create a script that keeps adding new content to this directory. The directory must contain a file with the following information (for example, as a YAML file):

# one top-level entry per dump date
- dump_date1:
    # for every date, the list of files that were dumped
    - file1:
        # for every file, the versions of the RDF-izers that were run on it
        - version1
        - version2
    - file2: []   # this file has not been RDF-ized yet
- dump_date2
...
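
The bookkeeping described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names, the file names, and the date are hypothetical, and the manifest is stored as JSON using the standard library instead of YAML so that the sketch has no third-party dependencies.

```python
import json
from pathlib import Path


def record_dump(manifest, dump_date, files):
    """Register newly dumped files under a dump date, with no RDF-izer runs yet."""
    entry = manifest.setdefault(dump_date, {})
    for name in files:
        entry.setdefault(name, [])
    return manifest


def record_rdfization(manifest, dump_date, filename, rdfizer_version):
    """Note that a given RDF-izer version was run on an already dumped file."""
    versions = manifest[dump_date][filename]
    if rdfizer_version not in versions:
        versions.append(rdfizer_version)
    return manifest


def save_manifest(manifest, path):
    """Write the manifest to disk as JSON."""
    Path(path).write_text(json.dumps(manifest, indent=2, sort_keys=True))


def load_manifest(path):
    """Read the manifest from disk, or start with an empty one."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


# Hypothetical usage: two files dumped on one date, one of them RDF-ized twice.
manifest = {}
record_dump(manifest, "2017-01-15", ["file1.xml", "file2.xml"])
record_rdfization(manifest, "2017-01-15", "file1.xml", "version1")
record_rdfization(manifest, "2017-01-15", "file1.xml", "version2")
```

A file with an empty version list is one that has been dumped but not yet RDF-ized, which makes it easy to find the files a new RDF-izer run still has to process.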

Get Taxonomic Name Usages Right

It is important to get this functionality of the Search-and-Browse interface right. Taxonomic name usages are part of the Taxonomic Concept template at the moment. They need to be moved to a new Taxonomic Name template.

Currently there are some issues:

  • the database doesn't seem to have full information on taxonomic name usages
  • we are not sure whether the information that is in the database is displayed correctly
  • it should be possible to click on the individual taxonomic name usages

For example, try "Coleoptera".

After #26 is resolved, we will be able to troubleshoot this better.

Taxonomic Name Template

New templates needed:
Taxonomic name template (Different from taxonomic concept template)

Some TODO's for Iteration 2

  • add rule to match authors with affiliation string (not only institution ID)
  • figure out why processing of BDJ stopped
  • expand to all Pensoft journals
  • expand to Plazi
  • improve logger with eventType and eventDate
  • zookeys email
  • matching of keywords - lookup problem -> look at the lookup problems below
  • NS troubleshooting when you load both fabio and foaf
  • commas, such as in "University of California, Berkeley"
  • matching statistics from the log
  • SPARQL: authors that have multiple papers
  • SPARQL: authors that have the same name, but different IDs
  • SPARQL: specific authors such as Lyubomir Penev
  • SPARQL: authors that are not members of any organization or have no emails or both

Ontology IRI does not resolve to RDF/XML

I thought http://openbiodiv.net/ontology/ is the canonical IRI for the OpenBiodiv-O ontology, but if so, it only resolves to HTML, even if application/rdf+xml is requested:

$ curl -D - -s -o /dev/null -H "Accept: application/rdf+xml" http://openbiodiv.net/ontology/
HTTP/1.1 200 OK
Date: Tue, 11 Dec 2018 22:03:16 GMT
Server: Apache/2.4.25 (Debian)
X-Powered-By: PHP/7.0.30-0+deb9u1
Set-Cookie: PHPSESSID=2953qaofang4jsun7a6fh7rh75; path=/; domain=.openbiodiv.net
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Cache-Control: no-cache, private
Set-Cookie: laravel_session=eyJpdiI6IjdRZFZld040a2t4YWR6N1Z5SEF4SkE9PSIsInZhbHVlIjoiNklyR1A5NG9pVzNQbld2VU0wYmkxaVlKRVwvWnhqK2FzbGhOaVpIeko0eFRDc2xTSXJmK3ZXbzE3K2RkTWhJZlB1NFJPSDZDaHpmRTZ6QTdTeDkzWHFRPT0iLCJtYWMiOiI0MDdjZDc2ODNiM2RiZWM1MmI4MDRlZGVkYWJhOGVhM2M2N2YwZTFlMWQ1NjVjYjcyMjhkMjIwY2ZkZTVmZTczIn0%3D; path=/; httponly
Vary: Accept-Encoding
Content-Length: 1513
Content-Type: text/html; charset=UTF-8

(It isn't lying about what is returned -- it really does return an HTML page.)

If the ontology/openbiodiv-ontology.ttl file in the repo is the authoritative version of the ontology (is it?), it seems to give http://openbiodiv.net/openbiodiv-ontology/ as the ontology's IRI, not the one above. However, dereferencing it has the same result.
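
What the request above expects is HTTP content negotiation: the server inspects the Accept header and returns RDF/XML when it is asked for. The following Python sketch shows the selection logic in isolation. It is not how openbiodiv.net is implemented; the mapping of media types to files is hypothetical (only openbiodiv-ontology.ttl is named in this issue), and the parsing is simplified (no type wildcards such as text/*, and a fallback to HTML where a stricter server might answer 406).

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


# Hypothetical mapping from media type to the file served for the ontology IRI.
AVAILABLE = {
    "application/rdf+xml": "openbiodiv-ontology.rdf",
    "text/turtle": "openbiodiv-ontology.ttl",
    "text/html": "index.html",
}


def choose_representation(accept_header, available=AVAILABLE, default="text/html"):
    """Pick the media type to serve for a given Accept header."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # the client explicitly refuses this type
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    return default
```

With this logic, the curl request shown above (Accept: application/rdf+xml) would select the RDF/XML representation, while a browser's typical Accept header would still select HTML.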

What are use cases for our OBKMS? What are questions, services we can “only” provide because we have the OBKMS?

From: Donat Agosti
Sent: Friday, October 28, 2016 11:07 AM
To: Viktor Senderov; Bob Morris; Eamonn O Tuama; Guido Sautter; Kiril Simov; Lyubomir Penev; Nico Franz; Pavel Stoev; Stefan Daume; Teodor Georgiev; Terry Catapano
Subject: RE: #obkms_biweekly next week on Tuesday?

Dear Viktor

I will try to be available.

A related issue: What are use cases for our OBKMS? What are questions, services we can “only” provide because we have the OBKMS?

We had some discussions here in regards of writing a blog to promote the idea of LOD in general, but making use of our work.

Thanks for some insights

Donat

Beautiful Not Found Icon

If you type a nonsense string such as "dfsasfasdfasdfds", we should get a beautiful "not found" icon. Should talk to Slavena.

Map abbreviated scientific names

For example, map

pensoft:5da2129b-8adb-43db-bef4-df28d6057ee8 rdf:type nomen:ScientificName ;
    skos:prefLabel "H. aradensis" ;
    dwc:rank "species" ;
    dwc:species "aradensis" ;
    dwc:genus "H." .

to "Heser aradensis".
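
One possible heuristic for this mapping is sketched below in Python. It assumes, as an illustration only, that the full genus names spelled out elsewhere in the same article are available as a list (`known_genera` is a hypothetical input), and it expands an abbreviation only when exactly one known genus starts with the abbreviated prefix. This is not necessarily how the mapping will be implemented in OpenBiodiv.

```python
def expand_genus(abbreviated_name, known_genera):
    """Expand e.g. 'H. aradensis' to 'Heser aradensis'.

    Returns the name unchanged if the genus is not abbreviated, and None
    if the abbreviation is unknown or matches more than one genus.
    """
    genus, _, epithet = abbreviated_name.partition(" ")
    if not genus.endswith(".") or not epithet:
        return abbreviated_name  # nothing to expand

    prefix = genus[:-1]  # drop the trailing period
    candidates = sorted({g for g in known_genera if g.startswith(prefix)})
    if len(candidates) == 1:
        return f"{candidates[0]} {epithet}"
    return None  # ambiguous or unknown: leave for manual review
```

For example, `expand_genus("H. aradensis", ["Heser", "Trimerina"])` gives "Heser aradensis", while a genus list that also contains another name starting with "H" makes the abbreviation ambiguous and returns None.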

Make GraphDB Visible from Outside

Background

For the JBS paper we need to share the database with the world. Security has been implemented on the test database (192.168.83.196:7777).

Steps

  • (1) Security must also be implemented on the production DB.
  • (2) The production DB must be made available to the outside.
  • (3) A link to the externally visible production DB must be put on the web page.
  • (4) API calls need to be adapted to fit security, also see #30

Usage of NOMEN.

I am putting here the discussion that I have with the NOMEN developers about how to use it for OBKMS:

I’ve been trying to use NOMEN to describe a relatively simple BDJ article:

http://biodiversitydatajournal.com/articles.php?id=4701&display_type=element&element_type=4&element_id=466019&element_name=

To make things easier for you, I am pasting the nomenclatural section here:

Trimerina microchaeta Hendel, 1932
Nomenclature
Trimerina microchaeta Hendel 1932: 11
= Trimerina indistincta Krivosheina 2004: 631 (syn. nov.)

We have a very simple case of heterotypic synonymy. This ought to be relatively simple to model with NOMEN!

I am attaching my attempt at this as two files. bdj4701.xml is the XML of the article in TaxPub format, while bdj4701.ttl is a manually created RDF in Turtle describing the nomenclature and a few other things. It is not clean of errors yet! Here’s an explanation of the basic idea and some questions to Dmitry and Matt:

In short, the article, http://dx.doi.org/10.3897/BDJ.3.e4701, contains a treatment section, :t-microchaeta-treatment. The treatment section contains a nomenclature section, :nomenclature-section. The nomenclature section contains nomenclatural citations and realizes the information content of a nomenclature act, :trimerina-microchaeta-act. The nomenclature act is linked via DwC terms to a literal name, and via dwciri: terms to instances of what I conveniently call nomen:ICZN_name. The links themselves are either dwciri:scientificName (for the most recent name) or nomen:ICZN_synonym (again a convenience name; I will probably create those classes with owl:sameAs to correspond to the cryptic IDs that NOMEN uses).

Is that more or less the idea, or am I missing something fundamental here? Thank you so much!

Documentation and Examples

Background

The database is now visible from the outside (#31 , #30).

Objectives

  1. Documentation and examples on how to work with the database directly (SPARQL queries, text).
  2. How to find the same information from the search interface. This may require expanding the templates.

Person Template: Graph is too fat if the person has only 1 article

The graph should not rescale with the data.

e.g. : http://openbiodiv.net/e953281a-613b-41ec-91c8-736da3411693
Name: Andreas Müller

The ultimate solution is to fix the length of either the X or Y axis.

For example we can fix the X axis to 2010 to 2017 (but this is not good as we might have more years)

Or, if we flip the graph so that the X axis is the number of articles, then we need to fix it to 0..20 (and ignore the case where someone has more than 20 articles, which is highly unlikely).

@teodorgeorgiev @howkins what do you think?

Best

Adapting the API Calls to Work With Security

Background: For the JBS paper we need to make the database accessible from outside, so we are implementing security on it. I have already enabled security on the test database (192.168.83.196:7777).

Solution: Adapt the search-API calls to use a username and password.
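
As a rough illustration of what such an adapted call could look like, the Python sketch below builds a SPARQL query request that carries a username and password. It assumes the database accepts HTTP Basic authentication and a standard SPARQL Protocol POST; both are assumptions about the deployment, and the endpoint URL and credentials in the usage line are placeholders. The request is only constructed here, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_sparql_request(endpoint, query, username, password):
    """Build (but do not send) an authenticated SPARQL query request."""
    # HTTP Basic authentication: base64("username:password").
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    # SPARQL Protocol: the query is sent as a URL-encoded form parameter.
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")

    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder endpoint and credentials, for illustration only.
request = build_sparql_request(
    "http://example.org/repositories/example",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
    "user",
    "secret",
)
```

Passing the returned object to `urllib.request.urlopen` would send it. Basic authentication transmits the password in an easily decoded form, so it should only be used over HTTPS once the database is reachable from outside.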

Openbiodiv.net front-end

SPARQL

Taxonomic Concept / Usage statistics

PREFIX po: <http://www.essepuntato.it/2008/12/pattern#>
PREFIX pkm: <http://proton.semanticweb.org/protonkm#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX openbiodiv: <http://openbiodiv.net/>
PREFIX fabio: <http://purl.org/spar/fabio/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX dwciri: <http://rs.tdwg.org/dwc/iri/>
PREFIX doco: <http://purl.org/spar/doco/>
PREFIX deo: <http://purl.org/spar/deo/>
PREFIX sro: <http://salt.semanticauthoring.org/ontologies/sro#>
PREFIX prism: <http://prismstandard.org/namespaces/basic/2.0/>
PREFIX c4o: <http://purl.org/spar/c4o/>
SELECT ?component (count (?tnu) as ?mentions_in_component) ?class ?doi ?label

WHERE {
 BIND(URI("http://openbiodiv.net/559eebc1-ae14-4434-ba57-c03b1aae67b5") as ?name) 
 
 ?tnu pkm:mentions ?name.
 ?component po:contains ?tnu.
 ?component rdf:type ?class.
 VALUES ?class {fabio:JournalArticle openbiodiv:Treatment doco:Figure deo:Introduction sro:Abstract openbiodiv:KeywordGroup  doco:Title }
 ?component (prism:doi)|(po:isContainedBy/prism:doi) ?doi.
 ?component (rdfs:label)|(po:isContainedBy/rdfs:label) ?label_
  
 OPTIONAL {?component rdf:type doco:Figure. ?component c4o:hasContent ?figure_content.}
 BIND (COALESCE(?figure_content, ?label_) as ?label)
}
GROUP BY  ?doi ?class ?component ?label
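
If the query above is requested with the standard SPARQL JSON results format, the response can be turned into plain rows with a few lines of Python. The sketch below uses only the standard library; the sample response is made-up illustrative data (not real OpenBiodiv output) that merely mirrors the variable names of the query.

```python
import json


def bindings_to_rows(results_json):
    """Convert a SPARQL JSON results document into a list of dicts.

    Every selected variable becomes a key; unbound variables map to None.
    Values are kept as strings, as they appear in the JSON.
    """
    data = json.loads(results_json)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        rows.append(
            {var: binding[var]["value"] if var in binding else None for var in variables}
        )
    return rows


# Made-up sample response, for illustration only.
sample = json.dumps({
    "head": {"vars": ["component", "mentions_in_component", "class", "doi", "label"]},
    "results": {"bindings": [{
        "component": {"type": "uri", "value": "http://example.org/component/1"},
        "mentions_in_component": {
            "type": "literal",
            "datatype": "http://www.w3.org/2001/XMLSchema#integer",
            "value": "3",
        },
        "class": {"type": "uri", "value": "http://purl.org/spar/doco/Title"},
        "doi": {"type": "literal", "value": "10.0000/example"},
    }]},
})
rows = bindings_to_rows(sample)
```

Note that counts such as ?mentions_in_component arrive as strings and need an explicit `int(...)` conversion before they can be summed or sorted numerically.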

Automatic YAML file generation

After the mapping has been established, comments as well as possible domains for properties and classes can be extracted from the ontology automatically

Misinterpretation of certain queries

Certain queries are misinterpreted. For example, when the user types "Page" meaning a last name, he or she gets an empty page because the ontology element foaf:page is matched. In this case it would be better to

(a) display the Alternative Implementation component
(b) display foaf:page in the generic template header (even if it is still empty)
