typedb / biograkn

BioGrakn Knowledge Graph

Home Page: https://blog.grakn.ai/biograkn-accelerating-biomedical-knowledge-discovery-with-a-grakn-knowledge-graph-84706768d7d4

biograkn bioinformatics biomedical knowledge-graph knowledge-discovery

biograkn's Introduction

BioGrakn

BioGrakn is a collection of knowledge graphs of biomedical data demonstrating the following use-cases:

Use Case	Keyspace name	Datasets
1. Precision Medicine	precision_medicine	ClinicalTrials.gov, ClinVar, CTDBase, DisGeNET, Drugs@FDA, HGNC, and PharmGKB
2. Text Mining	text_mining	PubMed
3. BLAST	blast	N/A
4. Disease Network	disease_network	UniProt, Reactome, DGIdb, DisGeNET, HPA-Tissue, EBI IntAct, Kaneko, Gene Expression Omnibus, and TissueNet

BioGrakn provides an intuitive way to query interconnected and heterogeneous biomedical data in a single place. The schema that models the underlying knowledge graph, together with the Graql query language, makes complex queries straightforward to write. Grakn's automated reasoning also lets BioGrakn act as an intelligent database of biomedical data that infers implicit knowledge from the explicitly stored data. BioGrakn can represent biological facts, draw inferences from new findings, and enforce research constraints, all at query time.

Quickstart

Download BioGrakn
Unzip the downloaded file.
In a terminal or command prompt, cd into the unzipped folder.
Run ./grakn server start
Download Grakn Workbase 1.2.2 (note that, at the moment, newer versions of Grakn Workbase are not yet compatible with BioGrakn)

Interacting With BioGrakn

Queries can be run over BioGrakn via the Graql Console, the Grakn Clients, and Grakn Workbase.

Via Graql Console

While inside the unzipped folder, via terminal or command prompt, run: ./grakn console -k keyspace_name. The console is now ready to answer your queries.

Via Grakn Clients

Grakn Clients are available for Java, Node.js and Python. Using these clients, you will be able to perform read and write operations over BioGrakn.

See an example of how this is done in the Grakn <> BLAST integration example, using the Python client.
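
As a rough illustration of a read query through the Python client, the sketch below assumes the 1.5.x client (installed as the grakn-client package) and a server on localhost:48555. The helper names `build_match_query` and `fetch_ids` are hypothetical, and the client calls may differ in other client versions.

```python
def build_match_query(type_label):
    """Return a Graql query that fetches every instance of the given type."""
    return "match $x isa {}; get;".format(type_label)


def fetch_ids(keyspace, type_label, uri="localhost:48555"):
    """Run the query in a read transaction and return the matched concept ids.

    Assumption: the 1.5.x Python client is installed and a Grakn server
    is listening on `uri`. The import is done lazily so that the query
    builder above can be used without the client installed.
    """
    from grakn.client import GraknClient  # third-party, assumed 1.5.x API

    with GraknClient(uri=uri) as client:
        with client.session(keyspace=keyspace) as session:
            with session.transaction().read() as tx:
                answers = tx.query(build_match_query(type_label))
                return [answer.get("x").id for answer in answers]
```

For example, `fetch_ids("precision_medicine", "person")` would list the ids of all person instances in that keyspace, provided the keyspace defines a person type.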

Via Grakn Workbase

Download Grakn Workbase (version 1.2.2; see the compatibility note in the Quickstart), then install and run it.

Read the documentation on Workbase or watch a short series of videos about using Workbase with the Grakn <> BLAST integration example.

biograkn's Issues

BioGrakn DN has duplicate genes

Description

Remove BioGrakn DN duplicate genes

Environment

N/A

Reproducible Steps

N/A

Expected Output

N/A

Actual Output

N/A

Additional information

N/A

Instructions to compile from source on an Ubuntu machine

Hi,

I was just wondering if you have any instructions for compiling BioGrakn from source on an Ubuntu machine?

Thanks,
George

Make sure BioGrakn-nightly workflow is triggered only once after commit on master

Currently, the BioGrakn-nightly workflow on CircleCI is triggered every day according to the cron schedule. Configure it to trigger only once after a commit is made on master.

Unable to understand how to build for migrate.java to work

Description

This is for the text mining use case. I have loaded the schema into a keyspace, but when I try to migrate the data using migrate.java, I can't work out how to get it to run.

I initially tried javac migrate.java, but it failed because the repo uses Bazel for building.

So I went to the textmining directory and ran

bazel build //migrate/migrator-bin.jar

which gives the following error:

build aborted: no such package '@antlr4_tool//jar'

Traceback:

DEBUG: Rule 'io_bazel_rules_python' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1545256788 -0500"
DEBUG: Rule 'com_github_grpc_grpc' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1550231355 +0300"
DEBUG: Rule 'stackb_rules_proto' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1561665037 -0600"
DEBUG: Rule 'graknlabs_grakn_core' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1572870526 +0000"
DEBUG: Rule 'graknlabs_benchmark' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1571391871 +0100"
DEBUG: Rule 'graknlabs_graql' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1572870025 +0000"
DEBUG: Rule 'rules_antlr' indicated that a canonical reproducible form can be obtained by modifying arguments commit = "397361a4d252a7186bc33add33144f4ede2a3899", shallow_since = "1559662328 +0200" and dropping ["tag"]
DEBUG: Rule 'graknlabs_client_python' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1569623464 +0000"
DEBUG: Rule 'graknlabs_bazel_distribution' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1572869706 +0300"
DEBUG: Rule 'com_github_google_bazel_common' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1551104077 +0300"
DEBUG: Rule 'graknlabs_protocol' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1572870033 +0000"
INFO: Call stack for the definition of repository 'antlr4_tool' which is a http_jar (rule definition at /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/bazel_tools/tools/build_defs/repo/http.bzl:347:12):
 - /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/rules_antlr/antlr/deps.bzl:49:5
 - /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/rules_antlr/antlr/deps.bzl:27:9
 - /home/aditya/Projects/RD/biograkn/WORKSPACE:120:1
ERROR: An error occurred during the fetch of repository 'antlr4_tool':
   java.io.IOException: Error downloading [http://central.maven.org/maven2/org/antlr/antlr4/4.7.2/antlr4-4.7.2.jar] to /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/antlr4_tool/jar/downloaded.jar: Unknown host: central.maven.org
INFO: Call stack for the definition of repository 'remotejdk11_linux' which is a http_archive (rule definition at /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/bazel_tools/tools/build_defs/repo/http.bzl:229:16):
 - /DEFAULT.WORKSPACE.SUFFIX:199:1
INFO: Call stack for the definition of repository 'com_google_protobuf' which is a http_archive (rule definition at /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/bazel_tools/tools/build_defs/repo/http.bzl:229:16):
 - /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/com_github_grpc_grpc/bazel/grpc_deps.bzl:125:9
 - /home/aditya/Projects/RD/biograkn/WORKSPACE:83:1
INFO: Call stack for the definition of repository 'com-google-protobuf-protobuf-java' which is a jar_artifact (rule definition at /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/graknlabs_grakn_core/dependencies/maven/dependencies.bzl:40:16):
 - /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/graknlabs_grakn_core/dependencies/maven/dependencies.bzl:58:5
 - /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/graknlabs_grakn_core/dependencies/maven/dependencies.bzl:667:9
 - /home/aditya/Projects/RD/biograkn/WORKSPACE:103:1
INFO: Call stack for the definition of repository 'org-apache-lucene-lucene-analyzers-common' which is a jar_artifact (rule definition at /home/aditya/Projects/RD/biograkn/dependencies/maven/dependencies.bzl:40:16):
 - /home/aditya/Projects/RD/biograkn/dependencies/maven/dependencies.bzl:58:5
 - /home/aditya/Projects/RD/biograkn/dependencies/maven/dependencies.bzl:120:9
 - /home/aditya/Projects/RD/biograkn/WORKSPACE:72:1
INFO: Call stack for the definition of repository 'io-netty-netty-all' which is a jar_artifact (rule definition at /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/graknlabs_grakn_core/dependencies/maven/dependencies.bzl:40:16):
 - /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/graknlabs_grakn_core/dependencies/maven/dependencies.bzl:58:5
 - /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/graknlabs_grakn_core/dependencies/maven/dependencies.bzl:667:9
 - /home/aditya/Projects/RD/biograkn/WORKSPACE:103:1
INFO: Call stack for the definition of repository 'edu-stanford-nlp-stanford-corenlp-java-models' which is a jar_artifact (rule definition at /home/aditya/Projects/RD/biograkn/dependencies/maven/dependencies.bzl:40:16):
 - /home/aditya/Projects/RD/biograkn/dependencies/maven/dependencies.bzl:58:5
 - /home/aditya/Projects/RD/biograkn/dependencies/maven/dependencies.bzl:120:9
 - /home/aditya/Projects/RD/biograkn/WORKSPACE:72:1
ERROR: /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/graknlabs_graql/grammar/BUILD:24:1: every rule of type antlr4 implicitly depends upon the target '@antlr4_tool//jar:jar', but this target could not be found because of: no such package '@antlr4_tool//jar': java.io.IOException: Error downloading [http://central.maven.org/maven2/org/antlr/antlr4/4.7.2/antlr4-4.7.2.jar] to /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/antlr4_tool/jar/downloaded.jar: Unknown host: central.maven.org
Documentation for implicit attribute deps of rules of type antlr4:

The dependencies to use. Defaults to the official ANTLR 4 release, but if
you need to use a different version, you can specify the dependencies here.

DEBUG: Rule 'graknlabs_common' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1572870041 +0000"
DEBUG: Rule 'graknlabs_client_java' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1570795516 +0000"
ERROR: Analysis of target '//migrator:migrator-bin' failed; build aborted: no such package '@antlr4_tool//jar': java.io.IOException: Error downloading [http://central.maven.org/maven2/org/antlr/antlr4/4.7.2/antlr4-4.7.2.jar] to /home/aditya/.cache/bazel/_bazel_aditya/1271badea2afdf63b6fcb04c93a3b0e2/external/antlr4_tool/jar/downloaded.jar: Unknown host: central.maven.org
INFO: Elapsed time: 7.458s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (4 packages loaded, 0 targets configured)

I wanted to know how to properly build the project
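
The "Unknown host: central.maven.org" lines in the log point at the download URL rather than the build target. That hostname has been retired, and Maven Central is served over HTTPS from repo1.maven.org. Where the URL is pinned (this repository or one of its Bazel dependencies) is not shown above, so the sketch below only illustrates the rewrite. The function name is hypothetical.

```python
RETIRED_PREFIX = "http://central.maven.org/maven2/"
CURRENT_PREFIX = "https://repo1.maven.org/maven2/"


def rewrite_maven_url(url):
    """Point a retired central.maven.org URL at repo1.maven.org instead.

    URLs that do not start with the retired prefix are returned unchanged.
    """
    if url.startswith(RETIRED_PREFIX):
        return CURRENT_PREFIX + url[len(RETIRED_PREFIX):]
    return url
```

Applied to the URL from the log, this returns https://repo1.maven.org/maven2/org/antlr/antlr4/4.7.2/antlr4-4.7.2.jar.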

Environment

OS (where Grakn server runs): Ubuntu 18.04
Grakn version (and platform): grakn core 1.5.7

The elements in the BioGrakn DN Workspace screenshots look very small.

Description

The elements in the BioGrakn DN Workspace screenshots look very small.

Reproducible Steps

Steps to create the smallest reproducible scenario:

Visit the README of BioGrakn DN

Expected Output

To see the workbase screenshots without having my nose touch the screen.

Actual Output

My nose touches the screen.

Additional information

The screenshots need to be taken at a lower resolution.

Bazel RBE not working with bazel build //... and bazel test //...

$bazel build //... OR $bazel test //...

ERROR: The Build Event Protocol upload failed: INVALID_ARGUMENT: INVALID_ARGUMENT: Request contains an invalid argument.
INFO: Partial Build Event Protocol results may be available at https://source.cloud.google.com/results/invocations/61a83ee3-9b74-4518-b896-18cee2c2ea60
INFO: Build completed successfully, 333 total actions
Exited with code 38

Model the schema to represent the Precision Medicine use-case

BioGrakn: release version 0.1

Biograkn server start fails: Process exited with code '1': ''

Description

Starting BioGrakn on the command line fails with the following error:
Process exited with code '1': ''

The logs point to the following exception:

2020-02-18 11:57:33,417 [main] ERROR grakn.core.daemon.GraknDaemon - An error has occurred during boot-up. Please run 'grakn server status' or check the logs located under the 'logs' directory.
grakn.core.daemon.exception.GraknDaemonException: Process exited with code '1': ''
	at grakn.core.daemon.executor.Storage.start(Storage.java:222)
	at grakn.core.daemon.executor.Storage.startIfNotRunning(Storage.java:147)
	at grakn.core.daemon.GraknDaemon.serverStart(GraknDaemon.java:184)
	at grakn.core.daemon.GraknDaemon.run(GraknDaemon.java:136)
	at grakn.core.daemon.GraknDaemon.main(GraknDaemon.java:72)

Environment

OS (where Grakn server runs): MacOS 10.15.3
Grakn version (and platform): Grakn Core 1.5.7
Other environment details:

Reproducible Steps

Steps to create the smallest reproducible scenario:
Just following the steps on the README file

Keyspace "precision_medicine" has no person data

Not sure whether this is a bug or a deliberate omission. When querying "person" on the keyspace "precision_medicine" I got no results, so all the rule-based inferences shown in the YouTube tutorial https://www.youtube.com/watch?v=E__0XhGHXnI failed.

Missing CSV file: protein-protein interaction dataset (PPIs)

Not able to compile due to a missing CSV file in the dataset.

Environment

OS (where Grakn server runs): Mac OS 10 and Windows 10
Grakn version (and platform): Grakn Core 1.5.7

Reproducible Steps

Steps to create the smallest reproducible scenario:

Run diseasenetworks/migrate.py
The load stops with a missing file (the intact folder is missing)

Expected Output

Expected the migration to finish.

Actual Output

Should be resolved with the missing file.

Additional information

The file may come from a third party, but we didn't see a link.

Model the schema to represent the Text Mining use-case

Problem to Solve

Model CoreNLP output to Grakn Schema

Current Workaround

N/A

Proposed Solution

Write Schema to map CoreNLP output to grakn schema types

Additional Information

N/A

Implement the migration scripts in order to ingest the output of CoreNLP into Grakn

Problem to Solve

Migrating Core NLP output to Grakn

Current Workaround

N/A

Proposed Solution

Implement script to map output to schema and migrate using the Java client

Additional Information

N/A

BLAST: write the example code for importing BLAST results into the knowledge graph

The example code is to be written in Python using the Biopython library.

Circle CI build timeout reached when migrating BioGrakn

When trying to migrate BioGrakn over Circle CI, it times out after 5 hours since the migration takes longer than 5 hours.

Job: https://circleci.com/gh/graknlabs/biograkn/1637

Additional information: https://discuss.circleci.com/t/maximum-build-time-for-builds/18383

Suggestion 1: Split the migration into multiple Bazel targets and run them across multiple CircleCI jobs.
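
One way to picture the split is a round-robin assignment of datasets to jobs. The sketch below is illustrative only: it assumes each dataset can be migrated independently of the others, which the issue does not confirm, and `shard` is a hypothetical helper rather than part of the build.

```python
def shard(items, shard_count):
    """Distribute items round-robin into `shard_count` lists."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return [items[index::shard_count] for index in range(shard_count)]


# Precision medicine datasets as listed in the README table.
DATASETS = [
    "ClinicalTrials.gov", "ClinVar", "CTDBase", "DisGeNET",
    "Drugs@FDA", "HGNC", "PharmGKB",
]

# Each inner list would become the workload of one CI job.
JOBS = shard(DATASETS, 3)
```

Round-robin keeps the shard sizes within one item of each other, but it ignores dataset size; a split weighted by migration time would balance the jobs better.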

after typing ./grakn server start, the grakn server didn't set up properly

Description

PS E:\biograkn> + ./grakn server start
Program "grakn" failed to run: No application is associated with the specified file for this operation. At line:1 char:1

./grakn server start

At line:1 char:1

./grakn server start

  + CategoryInfo          : ResourceUnavailable: (:) [], ApplicationFailedException
  + FullyQualifiedErrorId : NativeCommandFailed

Environment

OS (where Grakn server runs): Windows 10
Grakn version (and platform): Grakn Core 1.5.7_biograkn_0.2
Other environment details:

Wrong content in download section?

Am I getting something wrong, or does the unzipped file from the download section contain only "grakn-core-all-mac" instead of BioGrakn? Correct me if I'm wrong!

Grakn needs Java 1.8 in order to run

I am getting the following error when running ./grakn server start inside the unzipped https://github.com/graknlabs/biograkn#download folder:

Unsupported Java version [12] found. Grakn needs Java 1.8 in order to run.

I currently have the newest Java version:

OpenJDK Runtime Environment (build 12.0.1+12)
OpenJDK 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)

Could you please update the application to work with the newest Java version?

Many thanks,

David
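
For context on the message above: Java releases up to 8 report their version as 1.x (for example 1.8.0_212), while later releases report the major number first (12.0.1). The sketch below shows one way to normalise both forms before comparing against the required version. The function names are hypothetical, and this is not Grakn's actual check.

```python
import re


def java_major_version(version_string):
    """Extract the major Java version from strings such as 1.8.0_212 or 12.0.1+12."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError("unrecognised Java version: {!r}".format(version_string))
    first, second = match.group(1), match.group(2)
    # Legacy scheme: "1.8" means Java 8.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def is_supported(version_string, required_major=8):
    """Return True when the version matches the required major release."""
    return java_major_version(version_string) == required_major
```

With the build string from the report, `is_supported("12.0.1+12")` is False, which matches the error shown.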

Improve conciseness by not having to declare 'test_class' attribute for java_test

document to run Java Text Mining code

Description

I am new to Grakn as well as Java. Can you please provide a document that helps novices like us run the Java code for text mining in Grakn? Or can the same be implemented in Python?
A request for step-by-step instructions.

Implement CoreNLP API

Problem to Solve

Need an NLP tool for the text-mining use case

Current Workaround

N/A

Proposed Solution

- Import Stanford CoreNLP
- Implement mining of information from a body of text

Additional Information

N/A

Release pipeline for end-to-end GitHub deployment

Architect and implement the release workflows for automating the testing and release of BioGrakn on GitHub

refer to:
https://docs.google.com/spreadsheets/d/1E0asVaOyiSm51TcdU4Pfp_6NZG_pKoby4vwRWnAKolc/edit#gid=1256724151

Import minimum viable datasets for the precision medicine use-case

Attempt to import the minimum number of datasets in order to create a proof of concept for the precision medicine use-case.

Error in textmining schema

https://github.com/graknlabs/biograkn/blob/997d1fb2e5c3b139ec7701c0b8ad3a796ca03e97/textmining/schema/text-mining-schema.gql#L110

"InvalidKBException-A structural validation error has occurred. Please correct the [`1`] errors found. 
Rule [entity-extraction-rule] asserts [$e] plays role [extracted-entity] that it can never play
. Please check server logs for the stack trace."

INVALID_ARGUMENT: GraqlQueryException-relation doesn't have an 'isa', a 'sub' or an 'id'. Please check server logs for the stack trace.

Description

I am using Grakn Core 1.4.2, and during the schema loading step I am getting this error:

Command: `../../grakn-core-1.4.2/graql console --keyspace xyz --file ./schema.gql`

Error: `INVALID_ARGUMENT: GraqlQueryException-relation doesn't have an 'isa', a 'sub' or an 'id'. Please check server logs for the stack trace.`

Environment

OS (where Grakn server runs): Ubuntu 18.04
Grakn version (and platform): grakn 1.4.2
Other environment details: java openjdk 1.8

Reproducible Steps

I am following this video's steps, have a look

Expected Output

I wasn't expecting this error to pop up.

Actual Output

The error: INVALID_ARGUMENT: GraqlQueryException-relation doesn't have an 'isa', a 'sub' or an 'id'. Please check server logs for the stack trace

Additional information

BLAST: define the schema in accordance with the response from BLAST

test the BLAST API for minimal response
test the BLAST API for maximal response
define the first draft of the schema
finalise the schema

resources:

https://unmc.edu/bsbc/docs/NCBI_blast.pdf

output:

<Hit>
	<Hit_num>100</Hit_num>
	<Hit_id>gi|472390071|ref|XP_004414351.1|</Hit_id>
	<Hit_def>PREDICTED: double-stranded RNA-specific editase B2 [Odobenus rosmarus divergens]</Hit_def>
	<Hit_accession>XP_004414351</Hit_accession>
	<Hit_len>745</Hit_len>
	<Hit_hsps>
		<Hsp>
			<Hsp_num>1</Hsp_num>
			<Hsp_bit-score>315.079</Hsp_bit-score>
			<Hsp_score>806</Hsp_score>
			<Hsp_evalue>1.34662e-99</Hsp_evalue>
			<Hsp_query-from>70</Hsp_query-from>
			<Hsp_query-to>248</Hsp_query-to>
			<Hsp_hit-from>567</Hsp_hit-from>
			<Hsp_hit-to>745</Hsp_hit-to>
			<Hsp_query-frame>0</Hsp_query-frame>
			<Hsp_hit-frame>0</Hsp_hit-frame>
			<Hsp_identity>150</Hsp_identity>
			<Hsp_positive>163</Hsp_positive>
			<Hsp_gaps>0</Hsp_gaps>
			<Hsp_align-len>179</Hsp_align-len>
			<Hsp_qseq>RWNVLGLQGALLSHFVEPVYLQSIVVGSLHHTGHLARVMSHRMEGVGQLPASYRHNRPLLSGVSDAEARQPGKSPPFSMNWVVGSADLEIINATTGRRSCGGPSRLCKHVLSARWARLYGRLSTRTPSPGDTPSMYCEAKLGAHTYQSVKQQLFKAFQKAGLGTWVRKPPEQQQFLLTL</Hsp_qseq>
			<Hsp_hseq>RWNVLGLQGALLCHFIEPVYLHSIIVGSLHHTGHLSRVMSLRTEDIGQLPASYRHNQPLLSGVSLAEARQPGKSPHFSVNWVMGNADVEVIDGTTGKRSCGGSSRLCKHVFSARWARLYGKLSTRIPSHGDTPSMYFEAKRGAGTYQSVKQQLFKAFQKAGLGTWVRKPPEQDQFLLTL</Hsp_hseq>
			<Hsp_midline>RWNVLGLQGALL HF+EPVYL SI+VGSLHHTGHL+RVMS R E +GQLPASYRHN+PLLSGVS AEARQPGKSP FS+NWV+G+AD+E+I+ TTG+RSCGG SRLCKHV SARWARLYG+LSTR PS GDTPSMY EAK GA TYQSVKQQLFKAFQKAGLGTWVRKPPEQ QFLLTL</Hsp_midline>
		</Hsp>
	</Hit_hsps>
</Hit>
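
To make the link between this output and a schema concrete, the sketch below reads the HSP counts from a trimmed copy of the hit above and normalises them by the alignment length. It uses the standard library's xml.etree rather than Biopython, and the choice to store identity, positives, and gaps as ratios is an assumption, not something the issue specifies. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <Hit> above: only the fields this sketch reads.
HIT_XML = """
<Hit>
  <Hit_accession>XP_004414351</Hit_accession>
  <Hit_hsps>
    <Hsp>
      <Hsp_identity>150</Hsp_identity>
      <Hsp_positive>163</Hsp_positive>
      <Hsp_gaps>0</Hsp_gaps>
      <Hsp_align-len>179</Hsp_align-len>
    </Hsp>
  </Hit_hsps>
</Hit>
"""


def alignment_ratios(hit_xml):
    """Return one dict per HSP with counts divided by the alignment length."""
    hit = ET.fromstring(hit_xml)
    accession = hit.findtext("Hit_accession")
    rows = []
    for hsp in hit.iter("Hsp"):
        length = int(hsp.findtext("Hsp_align-len"))
        rows.append({
            "accession": accession,
            "sequence-identicality": int(hsp.findtext("Hsp_identity")) / length,
            "sequence-positivity": int(hsp.findtext("Hsp_positive")) / length,
            "sequence-gaps": int(hsp.findtext("Hsp_gaps")) / length,
        })
    return rows
```

The dictionary keys mirror attribute names a schema could declare as doubles, so each row could be turned into an insert query for one alignment.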

schema:

define
name sub attribute datatype string;
identifier sub attribute datatype string;
gi sub attribute datatype string;
ref sub attribute datatype string;
sequence-identicality sub attribute datatype double;
sequence-positivity sub attribute datatype double;
sequence-gaps sub attribute datatype double;
####################################################################################
#
####################################################################################
sequence sub attribute datatype string,
    plays sourced-information,
    plays target-sequence,
    plays matched-sequence;
protein sub entity,
    has identifier,
    has sequence,
    has gi,
    has ref,
    plays owned-protein;
species sub entity,
    has name,
    plays species-owner;
####################################################################################
# Alignment
####################################################################################
alignment sub relationship,
    relates aligning-element;
protein-protein-alignment sub alignment,
    relates aligning-protein as aligning-element,
    relates target-protein as aligning-protein,
    relates matched-protein as aligning-protein;
sequence-sequence-alignment sub alignment,
    relates aligned-sequence as aligning-element,
    relates target-sequence as aligned-sequence,
    relates matched-sequence as aligned-sequence,
    has sequence-identicality,
    has sequence-positivity,
    has sequence-gaps;
sequence-alignment-implies-protein-alignment sub rule,
when {
    (target-sequence: $ts, matched-sequence: $ms) isa sequence-sequence-alignment;
    $tp isa protein, has sequence $ts;
    $mp isa protein, has sequence $ms;
} then {
    (target-protein: $tp, matched-protein: $mp) isa protein-protein-alignment;
};
####################################################################################
# BLAST Architecture
####################################################################################
# BLAST Itself is an instance
search-tool sub entity,
    has name,
    plays owner;
search-tool-programme sub entity,
    has name,
    plays owner,
    plays property;
search-algorithm sub entity,
    has name,
    plays property;
database sub entity,
    plays information-source,
    has name;
sourcing-of-information sub relationship,
    relates information-source,
    relates sourced-information;
####################################################################################
# Ownership
####################################################################################
ownership sub relationship,
    relates owner,
    relates property;
transitive-ownership sub rule,
when {
    (owner: $a, property: $b) isa ownership;
    (owner: $b, property: $c) isa ownership;
} then {
    (owner: $a, property: $c) isa ownership;
};
protein-ownership sub ownership,
    relates species-owner as owner,
    relates owned-protein as property;

Enable RBE for test-dists-ubuntu

Modify circle ci config to enable RBE for test-dists-ubuntu

disabled as enabling RBE produced the following error:

https://circleci.com/gh/graknlabs/biograkn/58

BioGrakn Blog Post: demo a simple chatbot application of BioGrakn with DialogFlow

obtain an overall understanding of how Dialogflow works
learn from example applications of Dialogflow
define the Dialogflow on BioGrakn demo
create the demo

Implement the migration scripts in order to ingest raw data into Grakn

Make sure the migration is compliant with the precision medicine schema

Release 1.5.0 compatible BioGrakn DN

Problem to Solve

BioGrakn DN, along with its README, will go out of sync with the 1.5 release.

Proposed Solution

Release and publish the README for BioGrakn DN compatible with Grakn Core 1.5.0

Compliance with Grakn v1.5.2

Make BioGrakn compliant with Grakn version 1.5.2 and respective clients

Build Biograkn distribution containing the precision medicine keyspace

Make sure the distribution is compliant with the 1.4.3 release of Grakn

The test 'assemble-mock-test' would output java.nio.file.NoSuchFileException

Reproducible steps

bazel build @graknlabs_grakn_core//:assemble-mac-zip
cd bazel-bin/external/graknlabs_grakn_core/
unzip grakn-core-all-mac.zip
cd grakn-core-all-mac
./grakn server start
bazel test --test_output=streamed //migrator:assemble-mock-test

Output:

...
10:02:41.830 [grpc-default-worker-ELG-1-2] DEBUG io.grpc.netty.NettyClientHandler - [id: 0x2e64619f, L:/127.0.0.1:54882 - R:/127.0.0.1:48555] OUTBOUND DATA: streamId=873 padding=0 endStream=true length=0 bytes=
10:02:41.831 [grpc-default-worker-ELG-1-2] DEBUG io.grpc.netty.NettyClientHandler - [id: 0x2e64619f, L:/127.0.0.1:54882 - R:/127.0.0.1:48555] INBOUND HEADERS: streamId=873 headers=GrpcHttp2ResponseHeaders[grpc-status: 0] streamDependency=0 weight=16 exclusive=false padding=0 endStream=true
java.nio.file.NoSuchFileException: precisionmedicine/dataset/mock/clinvar/gene_condition_source_id.csv
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
	at java.nio.file.Files.newByteChannel(Files.java:361)
	at java.nio.file.Files.newByteChannel(Files.java:407)
	at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
...

Require Detailed Step by step tutorial for every usecase in biograkn

Problem to Solve

Would like to have "Getting Started" articles/videos describing step by step how to get the use cases working on a user's PC

Current Workaround

Currently it is painstakingly difficult for beginners to get started with Grakn, especially beginners who want to learn from real-world problem implementations rather than toy problems.

Proposed Solution

Would like Grakn to help out with step-by-step guides for setting up each use case on a user's PC, maybe even with scripts to automate the setup.
The outline of the "Getting Started" guide could be:

Intro
Environment setup and libraries to install (irrespective of OS, hardware, etc.)
Getting the use case up and running (e.g. for diseasenetwork)
Explanation through examples of how to query the graph
[Optional] Integration with kglib

Refactor architecture to be usecase specific

Problem to Solve

Refactor BioGrakn architecture to be more modular pertaining to the domain/usecase

Current Workaround

Only one use case present

Proposed Solution

- biograkn
  - precisionMedicine
    - dataset
    - schema
    - migrator
  - textmining
    - dataset
    - schema
    - migrator

Additional Information

Make circleci use git lfs

Description

circle ci runs out of memory due

Environment

Circle CI

Reproducible Steps

trigger circle CI

Expected Output

Pass

Actual Output

java.lang.OutOfMemoryError: GC overhead limit exceeded
GC overhead limit exceeded

ERROR: bazel ran out of memory and crashed.

Additional information

https://discuss.circleci.com/t/is-git-lfs-supported-by-circleci-2-0/11283/8

Refactor migration to run concurrent insertions

Refactor the migration script to collect all insert queries and run them concurrently
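
A minimal sketch of the idea, assuming the insert queries are independent of one another and that `run_query` (a hypothetical callable standing in for whatever executes a single query against Grakn) is safe to call from several threads:

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(queries, run_query, max_workers=4):
    """Execute every query with `run_query` on a thread pool.

    Results come back in the same order as `queries`. Any exception
    raised by `run_query` is re-raised when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, queries))
```

Threads suit this workload if the time is dominated by waiting on the server. Queries that depend on earlier inserts, such as relations between entities that must already exist, would need to run in a later batch.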

Data files not available in `diseasenetwork` and `precisionmedicine`.

Description

I am trying to run migrate.py in diseasenetwork/migrate and I am getting the following error:

Traceback (most recent call last):
  File "migrate.py", line 41, in <module>
    data['protein-name'] = i[3]
IndexError: list index out of range

So I inspected the CSV files concerned and found that all of them contain a git-lfs message like the following:

version https://git-lfs.github.com/spec/v1
oid sha256:dd0a985f3547fa96f5a2e85f971d04a77e656324529e4a62baf3bb967c2a09da
size 3399920

Either the data is not right or I have missed something here, so I can't get it to work.
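
The three lines quoted above are a Git LFS pointer: a small text stand-in that is left in place of the real file when the LFS content has not been downloaded. A migration script could check for this before parsing, as in the sketch below. The function name is hypothetical and not part of migrate.py.

```python
LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts with the Git LFS pointer header."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_SIGNATURE)) == LFS_SIGNATURE
```

If the check returns True for a dataset file, running git lfs pull in the repository (with Git LFS installed) should replace the pointer with the actual CSV content.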

Environment

OS (where Grakn server runs): Ubuntu 18.04
Grakn version (and platform): grakn core 1.5.7
Other environment details:

Reproducible Steps

Steps to create the smallest reproducible scenario:

load the schema in keyspace
go to diseasenetwork/migrate
run migrate.py
Get error like

  File "migrate.py", line 41, in <module>
    data['protein-name'] = i[3]
IndexError: list index out of range

Release pipeline for BioGrakn

Circle CI pipeline to release BioGrakn distributions for mac and linux.

https://docs.google.com/spreadsheets/d/1E0asVaOyiSm51TcdU4Pfp_6NZG_pKoby4vwRWnAKolc/edit#gid=1256724151

Missing README.md images

The folder ./examples/biograkn-queries/ does not exist, so there are no pictures displayed in the README page here.

It should be fixed asap, because the outgoing paper refers to this repository.

Write README.md file to describe installation and usage of BioGrakn

Downloading distribution
Running distribution
Querying it including examples

Move big objects located at some of the commits to Git LFS

The git history contains large files, which slow down git operations. This reduces developer productivity, so we should move these big objects into Git LFS.

First, we'll need to identify what those objects are and in which commits they are located.

Second, remove them from the commit (maybe by using https://rtyley.github.io/bfg-repo-cleaner ?) and move them to git LFS.

Release pipeline for end-to-end GCP deployment

Architect and implement the release workflows for automating the testing and release of BioGrakn on GCP

README is incomplete

Description

I can't find the Precision Medicine Knowledge Graph project!

Reproducible Steps

Steps to create the smallest reproducible scenario:

visit https://github.com/graknlabs/biograkn

Expected Output

To see instructions to run:

BioGrakn DN
Precision Medicine
BLAST Integration

Actual Output

I only see instructions for running BioGrakn DN.

BLAST: Sequence structural/functional analysis

write the schema
come up with the outline of the blog post
write the blog post
publish
