
Comments (4)

seebi commented on July 4, 2024

If you plan to release and version the ontology core and its extensions separately, you should not put it in one repository.

from dataid-ontology.

jimkont commented on July 4, 2024

DMP is mostly stable now, and we can align the versioning if needed.
Keeping them in separate repositories would add more overhead for publishing.


chile12 commented on July 4, 2024

DMP is quite closely connected with the purposes of (core) dataid. This has
manifested itself in properties seeping from DMP into dataid (e.g. similar
data, software requirement). DMP is not stable yet, since there are still
details to figure out (like repository descriptions). See the draft version
for more.
@dimitris: I'm not sure about your question, since DMP already has a branch in this repo.


chile12 commented on July 4, 2024

Hello everyone,

As you might have noticed, we had some troubling issues with abstract files
in general and English abstracts in particular.

We have remedied those issues by rerunning the full abstract extractions
for the 10 languages most affected by these issues
(de,en,es,fr,it,ja,ko,nl,pl,pt).

Secondarily, we used this as an opportunity to test the NLP Interchange
Format (NIF)
http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html
extraction on the abstracts of those languages, producing three new datasets
in the process:

  • nif-context: the full text of a page as context (including begin and
    end index)
  • nif-page-structure: the structure of the page in sections and
    paragraphs (titles, subsections etc.)
  • nif-text-links: all in-text links to other DBpedia resources as well
    as external references

While for this test run we only include the first section (the abstract) of
every page in the context, we are trying (hopefully by the next release) to
extend the context to the full text of all Wikipedia pages, portraying its
structure and providing the foundation for future NLP fact extraction tasks.
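
As a rough illustration of how the nif-context and nif-text-links datasets relate to each other, here is a minimal Python sketch. The sample sentence, the resource URIs, and the helper names (Context, Link, link_matches_context) are made up for illustration and are not taken from the released files. It also assumes that the begin and end indexes are zero-based character offsets into the context string with an exclusive end, which should be checked against the actual data.

```python
from dataclasses import dataclass

# NIF core namespace; the property names in the comments below come from it.
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"


@dataclass
class Context:
    """One nif-context entry: the text of a page plus its index range."""
    uri: str
    text: str   # nif:isString
    begin: int  # nif:beginIndex
    end: int    # nif:endIndex


@dataclass
class Link:
    """One nif-text-links entry: an in-text link inside a context."""
    begin: int   # nif:beginIndex, relative to the context string
    end: int     # nif:endIndex, assumed exclusive
    anchor: str  # nif:anchorOf, the linked surface text
    target: str  # the resource the link points to


def link_matches_context(ctx: Context, link: Link) -> bool:
    """Check that the link's offsets select its anchor text in the context."""
    return ctx.text[link.begin:link.end] == link.anchor


# Hypothetical sample data, not copied from the real datasets.
text = "Leipzig is a city in Saxony, Germany."
ctx = Context(
    uri="http://example.org/Leipzig?nif=context",
    text=text,
    begin=0,
    end=len(text),
)
link = Link(
    begin=21,
    end=27,
    anchor="Saxony",
    target="http://dbpedia.org/resource/Saxony",
)

print(link_matches_context(ctx, link))
```

A check like this can serve as a quick sanity test when loading the files: if the offsets and anchor texts disagree, the offset convention assumed here probably does not match the data.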

You can download these files from here
http://wiki.dbpedia.org/nif-abstract-datasets or directly here
http://downloads.dbpedia.org/2016-04/ext/nif-abstracts/.

Furthermore, Magnus discovered that all Wikidata-normalized files
(wkd_uris) for the English language edition had faulty predicates, so we
regenerated these as well.

We hope these measures cover all shortcomings of the last release.

Please note: Patrick from Open Link is still in the process of updating the
public endpoint of DBpedia with the new abstracts while I'm writing this
message.

Markus Freudenberg

Release Manager, DBpedia http://wiki.dbpedia.org

