krr-oxford / deeponto
A package for ontology engineering with deep learning and language models.
Home Page: https://krr-oxford.github.io/DeepOnto/
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
Using the ELK reasoner prints a lot of console messages.
Describe the solution you'd like
Disable INFO-level messages.
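If the messages are routed through Python's logging module, raising the level of the relevant loggers would hide them; a minimal sketch (the logger names "deeponto" and "elk" are assumptions, and note that if ELK logs directly from the JVM side this will not help):

```python
import logging

# Hypothetical logger names: adjust to whichever logger actually emits
# the ELK messages in your environment (assumption, not DeepOnto's API).
for name in ("deeponto", "elk"):
    logging.getLogger(name).setLevel(logging.WARNING)  # hide INFO messages

# INFO records on these loggers are now filtered out.
logging.getLogger("elk").info("this will not be shown")
```
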
Dear all,
Thank you for DeepOnto.
I was wondering whether there is example code for consistency checking, e.g.
from deeponto.onto import Ontology
onto = Ontology("path_to_ontology.owl", "hermit")
assert onto.consistent()
How can I obtain ontology embeddings for my own dataset?
Describe the bug
Under some circumstances during the mapping extension stage, the tokenizer throws the error IndexError: list index out of range.
The error originates at bert_classifier.py line 185.
This is the same error at the same location inside the tokenizer as huggingface/tokenizers#993, which was caused by the data passed to the tokenizer.
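Since the linked tokenizers issue was triggered by malformed input (e.g. empty strings), one hedged workaround is to sanitise annotation texts before they reach the tokenizer. A minimal sketch (filter_texts is a hypothetical helper, not part of DeepOnto):

```python
def filter_texts(texts):
    """Drop None and empty/whitespace-only strings before tokenisation.

    Hypothetical helper: DeepOnto does not ship this function; it only
    illustrates the kind of input sanitisation that avoids the tokenizer
    error discussed in huggingface/tokenizers#993.
    """
    return [t for t in texts if isinstance(t, str) and t.strip()]

annotations = ["Piano", "", "  ", None, "Keyboard instrument"]
clean = filter_texts(annotations)
print(clean)  # ['Piano', 'Keyboard instrument']
```
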
To Reproduce
I have reproduced this error with these settings:
| Logs & stack trace | max_length_for_input | batch_size_for_training | Source ontology | Target ontology |
|---|---|---|---|---|
| link | 256 | 16 | music-representation.owl | musicClasses.owl @ 2ebb641 |
| link | 128 | 8 | core.owl | musicClasses.owl @ ebc2d09 |
Expected behavior
The stage and the pipeline should complete successfully.
I have tried to run BERTMap, but got the following error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
In fact, this is a bug that was introduced in Transformers 4.12.3 and has been fixed in 4.13.0. In short, the output of the tokenizer is a BatchEncoding, but the Trainer only transfers Union[torch.Tensor, Tuple, List, Dict] objects to the GPU. (Please refer to this link for more details: when running the Trainer cell, it found two devices, cuda:0 and CPU.)
I think this bug was introduced in commit 086a25cae945d496765cbbb09b36f9780d676ac7. Please consider pinning the Transformers version.
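Until the dependency is pinned upstream, forcing a fixed Transformers release in your own environment sidesteps the device-mismatch bug; for example, a requirements pin (the exact bound is an assumption, adjust to what your setup tolerates):

```text
# requirements.txt — avoid the 4.12.x regression described above
transformers>=4.13.0
```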
This is Shriram; I emailed you recently about my interest in using DeepOnto. I am currently working with two different autonomous-vehicle ontologies and am unable to run the BERTMap model due to "ValueError: evaluation strategy steps requires either non-zero --eval_steps or --logging_steps". I am unsure where this error arises from.
/usr/local/lib/python3.10/dist-packages/transformers/training_args.py in __post_init__(self)
1301 self.eval_steps = self.logging_steps
1302 else:
-> 1303 raise ValueError(
1304 f"evaluation strategy {self.evaluation_strategy} requires either non-zero --eval_steps or"
1305 " --logging_steps"
ValueError: evaluation strategy steps requires either non-zero --eval_steps or --logging_steps
This is the entire error I am getting.
Could the number of instances in my ontology be a reason for this error? I have tried multiple value changes in my config YAML file, but none of them work. Kindly help me with this.
Thanks in advance!
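The error comes from Transformers' TrainingArguments validation: with evaluation_strategy="steps", at least one of eval_steps or logging_steps must be non-zero, so the fix is in the trainer configuration rather than the ontology size. A minimal stdlib sketch of that check (a simplified paraphrase of the library's logic, not its actual code):

```python
def check_eval_args(evaluation_strategy, eval_steps=0, logging_steps=0):
    """Mimic, in simplified form, the TrainingArguments validation that
    raises the reported ValueError."""
    if evaluation_strategy == "steps" and not (eval_steps or logging_steps):
        raise ValueError(
            "evaluation strategy steps requires either non-zero "
            "--eval_steps or --logging_steps"
        )
    return True

check_eval_args("steps", eval_steps=500)  # passes: eval_steps is non-zero
try:
    check_eval_args("steps")  # neither step count set -> error
except ValueError as e:
    print(e)
```
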
Describe the bug
While running BERTMap I'm receiving the error "ZeroDivisionError: division by zero".
To Reproduce
Launch BERTMap with these input files:
- configuration: bertmap.yaml
- source ontology: ontology-network.ttl
- target ontology: music.owl
Expected behavior
Mapping search between the ontologies should work normally
Actual output
[Time: 00:18:47] - [PID: 172] - [Model: bertmap]
Load the following configurations:
{
"model": "bertmap",
"output_path": "/content",
"annotation_property_iris": [
"http://www.w3.org/2000/01/rdf-schema#label",
"http://www.geneontology.org/formats/oboInOwl#hasSynonym",
"http://www.geneontology.org/formats/oboInOwl#hasExactSynonym",
"http://www.w3.org/2004/02/skos/core#exactMatch",
"http://www.ebi.ac.uk/efo/alternative_term",
"http://www.orpha.net/ORDO/Orphanet_#symbol",
"http://purl.org/sig/ont/fma/synonym",
"http://www.w3.org/2004/02/skos/core#prefLabel",
"http://www.w3.org/2004/02/skos/core#altLabel",
"http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#P108",
"http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#P90"
],
"known_mappings": null,
"auxiliary_ontos": [],
"bert": {
"pretrained_path": "bert-base-uncased",
"max_length_for_input": 128,
"num_epochs_for_training": 3.0,
"batch_size_for_training": 16,
"batch_size_for_prediction": 128,
"resume_training": null
},
"global_matching": {
"enabled": true,
"num_raw_candidates": 200,
"num_best_predictions": 10,
"mapping_extension_threshold": 0.8,
"mapping_filtered_threshold": 0.9
}
}
[Time: 00:18:47] - [PID: 172] - [Model: bertmap]
Save the configuration file at /content/bertmap/config.yaml.
[Time: 00:18:47] - [PID: 172] - [Model: bertmap]
Construct new text semantics corpora and save at /content/bertmap/data/text-semantics.corpora.json.
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-12-a888744a31b2> in <cell line: 1>()
----> 1 bertmap = BERTMapPipeline(src_onto, tgt_onto, config)
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in __init__(self, src_onto, tgt_onto, config)
119 # load or construct the corpora
120 self.corpora_path = os.path.join(self.data_path, "text-semantics.corpora.json")
--> 121 self.corpora = self.load_text_semantics_corpora()
122
123 # load or construct fine-tune data
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in load_text_semantics_corpora(self)
251 corpora.save(self.data_path)
252
--> 253 return self.load_or_construct(self.corpora_path, data_name, construct)
254
255 self.logger.info(f"No training needed; skip the construction of {data_name}.")
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in load_or_construct(self, data_file, data_name, construct_func, *args, **kwargs)
227 else:
228 self.logger.info(f"Construct new {data_name} and save at {data_file}.")
--> 229 construct_func(*args, **kwargs)
230 # load the data file that is supposed to be saved locally
231 return FileUtils.load_file(data_file)
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in construct()
241
242 def construct():
--> 243 corpora = TextSemanticsCorpora(
244 src_onto=self.src_onto,
245 tgt_onto=self.tgt_onto,
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/text_semantics.py in __init__(self, src_onto, tgt_onto, annotation_property_iris, class_mappings, auxiliary_ontos)
517 # build intra-ontology corpora
518 # negative sample ratios are by default
--> 519 self.intra_src_onto_corpus = IntraOntologyTextSemanticsCorpus(src_onto, annotation_property_iris)
520 self.add_samples_from_sub_corpus(self.intra_src_onto_corpus)
521 self.intra_tgt_onto_corpus = IntraOntologyTextSemanticsCorpus(tgt_onto, annotation_property_iris)
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/text_semantics.py in __init__(self, onto, annotation_property_iris, soft_negative_ratio, hard_negative_ratio)
310 self.onto = onto
311 # $\textsf{BERTMap}$ does not apply synonym transitivity
--> 312 self.thesaurus = AnnotationThesaurus(onto, annotation_property_iris, apply_transitivity=False)
313
314 self.synonyms = self.thesaurus.synonym_sampling()
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/text_semantics.py in __init__(self, onto, annotation_property_iris, apply_transitivity)
74 self.annotation_property_iris = iris
75 total_number_of_annotations = sum([len(v) for v in self.annotation_index.values()])
---> 76 self.average_number_of_annotations_per_class = total_number_of_annotations / len(self.annotation_index)
77
78 # synonym groups
ZeroDivisionError: division by zero
Following the stack trace, I see that the code uses the length of self.annotation_index as the denominator, but apparently this length is zero. This dictionary is built by Ontology::build_annotation_index() based on annotation_property_iris, which, as can be seen above, is correctly populated and not empty. So I suspect the bug is located somewhere in this function, but I wasn't able to pinpoint exactly where.
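The crash means the annotation index came back empty, i.e. none of the configured annotation_property_iris matched any annotation in the loaded ontology (plausible for a .ttl input whose labels use other properties). A minimal sketch of the failing computation with a guard for the empty case (mock data, not DeepOnto's code):

```python
def average_annotations_per_class(annotation_index):
    """Average number of annotations per class, guarding the empty-index
    case that triggers the reported ZeroDivisionError."""
    if not annotation_index:
        raise ValueError(
            "annotation index is empty: none of the configured "
            "annotation_property_iris matched any entity annotations"
        )
    total = sum(len(v) for v in annotation_index.values())
    return total / len(annotation_index)

# Mock index: class IRI -> set of annotation strings
index = {"ex:ClassA": {"label a"}, "ex:ClassB": {"label b", "synonym b"}}
print(average_annotations_per_class(index))  # 1.5
```
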
While exploring the documentation I came across a small typo in one of the example code snippets: a quotation mark is missing in a line of code (see below). Nothing concerning, but it may cause an unexpected error for those who copy-paste the code :)
onto.get_subsumption_axioms(entity_type="Classes)
--> onto.get_subsumption_axioms(entity_type="Classes")
Describe the bug
The BERTMap model got stuck at the mapping extension phase.
To Reproduce
Steps to reproduce the behavior:
Run BERTMap on SNOMED-FMA (Body) task.
Hi, I am not able to reproduce the exact H@1 and MRR for EditSim on the FMA-SNOMED task as reported in Table 4 of https://arxiv.org/pdf/2205.03447.pdf.
This is the command used:
python om_eval.py --saved_path './om_results' --pred_path './onto_match_experiment2/edit_sim/global_match/src2tgt' --ref_anchor_path 'data/equiv_match/refs/snomed2fma.body/unsupervised/src2tgt.rank/for_eval' --hits_at 1
These are the generated numbers: H@1: .841 and MRR: .89
Reported numbers in the paper: H@1: .869 and MRR: .895
I am not sure why the numbers are not consistent.
Is there anything that needs to be modified in the code to get the reported numbers?
Bug Description
I'm trying to verbalise a class expression. The code I'm executing is as follows:
from deeponto.onto import Ontology, OntologyVerbaliser, OntologySyntaxParser
onto = Ontology("ontology.owl")
verbaliser = OntologyVerbaliser(onto)
complex_concepts = list(onto.get_asserted_complex_classes())
v_concept = verbaliser.verbalise_class_expression(complex_concepts[0])
Here ontology.owl is a simple ontology in RDF/XML syntax that contains an atomic concept, a datatype property, and a complex concept. The whole ontology is provided in Additional Context.
I get the following error:
Traceback (most recent call last):
File "/home/pg-xai2/sampling/examples/prova_deeponto.py", line 42, in <module>
v_concept = verbaliser.verbalise_class_expression(complex_concepts[0])
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 227, in verbalise_class_expression
return self._verbalise_junction(parsed_class_expression)
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 334, in _verbalise_junction
other_children.append(self.verbalise_class_expression(child))
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 214, in verbalise_class_expression
return self._verbalise_iri(parsed_class_expression)
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 254, in _verbalise_iri
verbal = self.vocab[iri] if not self.keep_iri else iri_node.text
KeyError: 'http://dl-learner.org/mutagenesis#Compound'
This is the printed complex concept (maybe you can just try to manually construct this concept and test it out):
ObjectIntersectionOf(<http://dl-learner.org/mutagenesis#Compound> DataSomeValuesFrom(<http://dl-learner.org/mutagenesis#act> DatatypeRestriction(xsd:decimal facetRestriction(minInclusive "0.04"^^xsd:decimal))))
To Reproduce
Execute the code described above using the given ontology.
Additional context
OS: Linux
content of ontology.owl
:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xml:base="http://dl-learner.org/mutagenesis"
xmlns="http://dl-learner.org/mutagenesis#">
<owl:Ontology rdf:about="http://dl-learner.org/mutagenesis"/>
<owl:DatatypeProperty rdf:about="#act">
<rdfs:domain rdf:resource="#Compound"/>
<rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#double"/>
</owl:DatatypeProperty>
<owl:Class rdf:about="#Compound"/>
<owl:Class rdf:about="http://dl-learner.org/Pred_1">
<rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<rdf:Description rdf:about="#Compound"/>
<owl:Restriction>
<owl:onProperty rdf:resource="#act"/>
<owl:someValuesFrom>
<rdfs:Datatype>
<owl:onDatatype rdf:resource="http://www.w3.org/2001/XMLSchema#decimal"/>
<owl:withRestrictions>
<rdf:Description>
<rdf:first>
<rdf:Description>
<xsd:minInclusive rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">0.04</xsd:minInclusive>
</rdf:Description>
</rdf:first>
<rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
</rdf:Description>
</owl:withRestrictions>
</rdfs:Datatype>
</owl:someValuesFrom>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
</owl:Class>
</rdf:RDF>
I tried other ontologies as well, including Carcinogenesis and the whole Mutagenesis ontology, which you can find here. Since they do not contain complex concepts, I tried to verbalise a subclass axiom like the following:
# get subsumption axioms from the ontology
subsumption_axioms = onto.get_subsumption_axioms(entity_type="Classes")
# verbalise the first subsumption axiom
v_sub, v_super = verbaliser.verbalise_class_subsumption_axiom(subsumption_axioms[0])
The same kind of error as mentioned earlier occurred.
In addition to a library, consider also providing a Dockerfile that uses FastAPI to serve web APIs. For instance, instead of having to import the library, I could deploy a Docker container and call the APIs, supplying all the necessary inputs.
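A hedged sketch of what such a container could look like (the app module app.py and any endpoints it exposes are hypothetical; DeepOnto ships no FastAPI server):

```dockerfile
# Sketch only: assumes a hypothetical app.py exposing a FastAPI instance
# named `app` that wraps DeepOnto calls (e.g. a /match endpoint taking
# two ontology files and returning mappings).
FROM python:3.10-slim

# DeepOnto needs a JVM for its OWLAPI bindings
RUN apt-get update && apt-get install -y --no-install-recommends \
    default-jre && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir deeponto fastapi uvicorn

COPY app.py /app/app.py
WORKDIR /app

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```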