oxiqle-tests

Repository for all the tests and supplementary code of the OXIQLE project.

Other used repositories:

Setup data

The patient data RDF file is already provided as patients.ttl. Alternatively, it can be built using the rdf_builder.py script. The resulting RDF file should be uploaded through the web interface of Apache Jena Fuseki (https://jena.apache.org/documentation/fuseki2/) into an in-memory database called Patient_FHIR.

For the pangenome graphs, a pangenome is downloaded from https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=pangenomes/freeze/freeze1/pggb/chroms/ (either chrY.hprc-v1.0-pggb.gfa.gz or chr19.hprc-v1.0-pggb.gfa.gz) and extracted. The script pathname_changer.sh is then used to replace all occurrences of # with / (otherwise OXIQLE will give errors).

For the pregenerated triples, one database per graph is necessary. To generate them, use the following commands (oxigraph-gfa should be available on the PATH as oxigraph):

oxigraph load -l db_chrY -f chrY.hprc-v1.0-pggb.gfa
oxigraph load -l db_chr19 -f chr19.hprc-v1.0-pggb.gfa   # Be careful with this, as the resulting databases will be very large

Please note that the databases are generated from the original pangenomes, not from the ones in which # was replaced with /. Also note that generating these databases takes a lot of time and disk space.
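
The two load commands above can also be wrapped in a small Python helper. This is only a sketch: the function names build_load_command and load_graph are hypothetical and not part of this repository, and it assumes, as stated above, that oxigraph-gfa is reachable on the PATH as oxigraph.

```python
import shutil
import subprocess


def build_load_command(db_dir, gfa_file):
    """Return the argv for the `oxigraph load` call shown above."""
    return ["oxigraph", "load", "-l", db_dir, "-f", gfa_file]


def load_graph(db_dir, gfa_file):
    """Run the load command; fail early if `oxigraph` is not on the PATH."""
    if shutil.which("oxigraph") is None:
        raise FileNotFoundError("oxigraph (oxigraph-gfa) was not found on the PATH")
    # check=True raises CalledProcessError if the load exits with an error.
    subprocess.run(build_load_command(db_dir, gfa_file), check=True)
```

Calling load_graph("db_chrY", "chrY.hprc-v1.0-pggb.gfa") then corresponds to the first command above.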

rdf_builder.py

This script generates an RDF file from the Uni Leipzig Kerndatensatzkonforme FHIR Testdaten. It requires a running HAPI FHIR server (https://github.com/hapifhir/hapi-fhir-jpaserver-starter?tab=readme-ov-file), which is best started using the provided Docker container.

Additionally, it needs a file named paths.txt that contains all sample names, one per line, in the form:

HG00621
HG00673
HG01106
HG01109
HG01243
HG01258
HG01358
HG01928
HG01952
HG02055

Then it can be run inside the folder containing all the JSON files. For the OXIQLE project, this was done inside the extracted zip folder of the first data set (https://github.com/medizininformatik-initiative/kerndatensatz-testdaten/blob/master/Test_Data/POLAR_WP_1.1_v2/POLAR_WP_1.1_v2-POLAR_WP1.1_00001-POLAR_WP1.1_01650.json.zip).

python3 rdf_builder.py

The resulting RDF file will be written to a file called out.ttl.
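
Before starting a long run, the inputs can be checked with a few lines of Python. The following is a minimal sketch and not part of rdf_builder.py: the helper names read_sample_names and check_inputs are hypothetical, and it assumes that paths.txt lies in the same folder as the JSON files, which the description above does not state explicitly.

```python
from pathlib import Path


def read_sample_names(paths_file):
    """Return the non-empty, stripped lines of paths.txt (one sample name per line)."""
    text = Path(paths_file).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_inputs(folder):
    """Return (sample names, JSON files) found in `folder`; raise if either is missing."""
    folder = Path(folder)
    paths_file = folder / "paths.txt"  # assumed location, see the note above
    if not paths_file.is_file():
        raise FileNotFoundError(f"{paths_file} is missing")
    names = read_sample_names(paths_file)
    json_files = sorted(folder.glob("*.json"))
    if not names:
        raise ValueError("paths.txt contains no sample names")
    if not json_files:
        raise ValueError(f"no JSON files found in {folder}")
    return names, json_files
```

For example, check_inputs(".") can be run in the extracted data folder before python3 rdf_builder.py is started.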

pathname_changer.sh

This script changes all the path names in a GFA file from using # as the separator to using /.

./pathname_changer.sh <INPUT_GFA> <OUTPUT_GFA>
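
The effect of the script can be illustrated in Python. This is a sketch of the replacement described above (every # becomes /), not the actual contents of pathname_changer.sh; the function names replace_separators and convert_file are hypothetical.

```python
def replace_separators(lines):
    """Yield each GFA line with every '#' replaced by '/'."""
    for line in lines:
        yield line.replace("#", "/")


def convert_file(input_gfa, output_gfa):
    """Stream INPUT_GFA to OUTPUT_GFA line by line, so large graphs are not held in memory."""
    with open(input_gfa, encoding="utf-8") as src, \
            open(output_gfa, "w", encoding="utf-8") as dst:
        dst.writelines(replace_separators(src))
```

A made-up path line such as "P<TAB>HG00621#1#chrY<TAB>1+,2+<TAB>*" would come out with the path name HG00621/1/chrY; all other fields are unchanged as long as they contain no #.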

Benchmarking

Before benchmarking, make sure that all programs are compiled and on the PATH, that the data is in this directory, and that the Apache Jena Fuseki server is running with the dataset loaded.

Programs needed on path:

The four benchmarking scripts take two arguments: the path of the sapfhir-cli jar file and the path of the generated oxigraph database (if this guide was followed, db_chrY or db_chr19, depending on the benchmark).

The four scripts are

runtime_chry.sh <SAPFHIR_CLI_JAR> <OXIGRAPH_DB>
runtime_chr19.sh <SAPFHIR_CLI_JAR> <OXIGRAPH_DB>
memory_chry.sh <SAPFHIR_CLI_JAR> <OXIGRAPH_DB>
memory_chr19.sh <SAPFHIR_CLI_JAR> <OXIGRAPH_DB>

All of them can take quite a while to execute. All commands that ran into errors or could not be executed are commented out.
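
The pairing of scripts and databases can be written down explicitly. The sketch below only builds the four command lines and assumes the database names db_chrY and db_chr19 from the setup section; the names BENCHMARKS and build_benchmark_commands are hypothetical, as is the jar path passed in.

```python
import os

# Each script is paired with the database of the matching chromosome.
BENCHMARKS = {
    "runtime_chry.sh": "db_chrY",
    "runtime_chr19.sh": "db_chr19",
    "memory_chry.sh": "db_chrY",
    "memory_chr19.sh": "db_chr19",
}


def build_benchmark_commands(sapfhir_cli_jar, script_dir="."):
    """Return one argv list per benchmark script: [script, jar, database]."""
    return [
        [os.path.join(script_dir, script), sapfhir_cli_jar, db]
        for script, db in BENCHMARKS.items()
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) to run the scripts one after another.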
