legumeinfo / conektlegumes
Integrate CoNekT (Coexpression Network Toolkit) with legumes expression data into LIS.
A consequence of issue #8
Remove the LIS cicar1 dataset and use cicar2 instead. The cicar2 expression measures are based on mapping Desi (ICC) samples to the Kabuli (CDC) genome.
Refs:
https://legumeinfo.org/lis_expression/all
https://legumeinfo.org/feature/Cicer/arietinum_CDCFrontier/gene/cicar.CDCFrontier.gnm1.ann1.Ca_04638#pane=geneexpressionprofile
Keep in mind:
Many tables will be affected.
The lis_conekt and lis_dev_conekt dbs are no longer functional after MariaDB on wright was updated by Sam. To make the Conekt prod and dev sites work again, these two dbs need to be rebuilt from scratch.
Rename the old dbs, then use the Conekt Linux installation instructions to re-establish the MySQL schema.
The data for rebuilding the site should be populated through the admin interface of the Conekt site (raw loading files should exist with sdash for most large data; the smaller ones need to be redone manually).
Now that it is served to the outside via a VM (https://conekt.legumeinfo.org/):
Integrate it into the lis_expression module.
a. Have an iframed Drupal page/node serve as a container to be displayed via LIS pages.
b. Later, make links from lis_expression module genes to Conekt.
The link from lis_expression (Tripal gene page expression, THIS GENE) is always there, irrespective of whether the dataset is loaded into Conekt or not. So, for example, for Cajca and cicer1 the links will be unproductive and lead to an undesirable UX.
DO:
Add some code in the lis_expression tpl.php so that the link is not shown if the species data is not loaded.
(Will attempt it later: when relatively free)
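The check described above could look roughly like the following. This is only a sketch of the logic in Python, not the tpl.php code itself; the set of loaded datasets and the link pattern are placeholders assumed for illustration.

```python
from typing import Optional, Set

# Placeholder: dataset keys assumed to be loaded in Conekt (not the real list).
LOADED_DATASETS: Set[str] = {"cicar2"}

# Hypothetical URL pattern; the real Conekt route for a gene may differ.
LINK_TEMPLATE = "https://conekt.legumeinfo.org/search/keyword/{gene}"


def conekt_link(gene: str, dataset: str,
                loaded: Set[str] = LOADED_DATASETS) -> Optional[str]:
    """Return a Conekt link for the gene, or None if its dataset is not loaded."""
    if dataset not in loaded:
        return None  # the template would skip rendering the link
    return LINK_TEMPLATE.format(gene=gene)
```

The template would then render the link only when the function returns a value, so genes from datasets such as cicar1 show no link at all.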
Menu items: Species, Tools (dropdown), and Search
Species: stays unchanged.
Tools: Hide Coex Network sub-items.
Search: No dropdown. But haven't thought about it yet.
Refer: https://conekt.lis.ncgr.org/species/
Shows:
Transcripts 0 and profiles 0 for species id=5=Cicer arietinum.
This is not correct, because there is data for these and Conekt actually displays it. I haven't been able to figure it out, but haven't investigated thoroughly either. This also happened when Cicer was deleted and reloaded in lis-dev-conekt, before it happened here in production. (This is not a result of pointing production to the db in dev; it had happened with the production db.) This does not seem to affect other functionality.
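One way to narrow this down would be to compare the counts shown on the species page with the rows actually present in the database. The sketch below assumes the displayed numbers come from a cached counter column; that is an unconfirmed guess, and the table and column names (`species`, `sequences`, `sequence_count`) are hypothetical, so they would need to be adapted to the real schema. It runs against an in-memory SQLite database purely for illustration.

```python
import sqlite3


def stale_counts(conn):
    """Return (species_id, cached, actual) where the cached transcript count is wrong."""
    query = """
        SELECT s.id, s.sequence_count, COUNT(q.id) AS actual
        FROM species AS s
        LEFT JOIN sequences AS q ON q.species_id = s.id
        GROUP BY s.id, s.sequence_count
        HAVING s.sequence_count != COUNT(q.id)
    """
    return conn.execute(query).fetchall()


# Toy data: species 5 has rows but a cached count of 0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT, sequence_count INTEGER);
    CREATE TABLE sequences (id INTEGER PRIMARY KEY, species_id INTEGER);
    INSERT INTO species VALUES (1, 'Phaseolus vulgaris', 2), (5, 'Cicer arietinum', 0);
    INSERT INTO sequences (species_id) VALUES (1), (1), (5), (5), (5);
""")
print(stale_counts(conn))
```

If a query like this reports a mismatch for species id=5, the data is intact and only the stored counter is out of date.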
Very soon the VM will run out of space because of the data, as it did with the SQLite database. So the solution is to have the db in Wright, mount it to dev-conekt, and set up regular backups to Erdos.
OR
Have everything set up in lis servers for eventual integration to LIS
The front page has a search, news, and a block with moving panels of various tools and functionality. We are not using them at present. Find where they are coded and remove them.
Reproduce thus:
https://conekt.legumeinfo.org/tree/view/6975
Click on phavu.G19833.gnm1.ann1.Phvul.010G077300, 2nd gene from top to show the popup window titled Node.
Hover over the Taxonomy: Phaseolus vulgaris (phavu) to see the link = 'http://conekt/species/view/1'. This is a non-productive link. It should be https://conekt.legumeinfo.org/species/view/1
Well-formed in dev-conekt
In dev-conekt it is well-formed: http://localhost:5000/species/view/1
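The difference between the broken and the expected link is only the scheme and host. The sketch below shows that rewrite with the standard library; the function name is made up for illustration, and where such a fix would belong in Conekt or in the proxy configuration has not been determined.

```python
from urllib.parse import urlsplit, urlunsplit


def with_public_host(url, scheme="https", host="conekt.legumeinfo.org"):
    """Replace the scheme and host of a URL, keeping path, query and fragment."""
    parts = urlsplit(url)
    return urlunsplit((scheme, host, parts.path, parts.query, parts.fragment))


print(with_public_host("http://conekt/species/view/1"))
```

The same function applied to the dev link http://localhost:5000/species/view/1 gives the same public URL, which suggests the link is built from whatever host name the application sees at request time.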
Get LIS logos displayed in this production version. This was already done in dev-conekt VM: trace what was done; document in this issue and get it displayed.
Start loading gene expression data (subsequent to loading species, genes, gene annotation, gene families, trees data #5).
Get Conekt configured to use MySQL/MariaDB instead of SQLite. This is meant to be an umbrella issue until MySQL is functional with Conekt.
Background: Conekt recommends using MySQL for production. I have been using SQLite so far for convenience and as a proof of concept. I decided to use MySQL/MariaDB after the loss of the SQLite conekt.db file.
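For reference, a SQLAlchemy-style connection string for MySQL/MariaDB can be assembled as sketched below. It assumes Conekt reads its database location from a `SQLALCHEMY_DATABASE_URI` setting and that the PyMySQL driver is used; the user name and password are placeholders, and only the database name `lis_conekt` comes from these issues.

```python
from urllib.parse import quote_plus


def mysql_uri(user, password, host, database, driver="pymysql"):
    """Build a SQLAlchemy MySQL/MariaDB URI, escaping special characters in credentials."""
    return (f"mysql+{driver}://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}/{database}")


# Placeholder credentials; the real values belong in the site's config file.
SQLALCHEMY_DATABASE_URI = mysql_uri("conekt_user", "p@ss/word", "localhost", "lis_conekt")
print(SQLALCHEMY_DATABASE_URI)
```

Escaping the credentials matters because characters such as `@` or `/` in a password would otherwise be parsed as part of the host or path.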
"... ... it looks like the chickpea desi genes have been inappropriately assigned into the families/trees, probably through kabuli/desi identity switching at some point. In the legfed trees as represented in legumeinfo, only CDCFrontier was included in the family+tree building (Steven's choice). However, here's a tree in Conekt that seems to have a desi member:
https://conekt.legumeinfo.org/tree/view/6975
however, when you look at the descriptors, it seems clear that the CDCFrontier gene Ca_13056 is the one that is really part of this family, while
https://legumeinfo.org/chado_phylotree/legfed_v1_0.L_CSRBJ4?hilite_node=cicar.CDCFrontier.Ca_13056.1
the desi Ca_13056 has been assigned to a completely different family (after the families were constructed and without being included in the tree):
https://legumeinfo.org/feature/Cicer/arietinum_ICC4958/gene/cicar.ICC4958.gnm2.ann1.Ca_13056
So, I guess for purposes of Conekt it would probably be better to use kabuli, even if the samples are desi. Alternatively, we might be able to include both, but desi would not be in the trees (assuming we use the current legfed build anyway)
let me know if you think I'm confused or have confused you.
adf
@adf-ncgr
(Ref: email from adf: 2019-12-27::3:29 pm::LIS conekt VM)
Complete the gene and gene description data for all four species. Phavu is already done.
Install MySQL/MariaDB in Conekt as a first step.
It's possible I am misunderstanding what should happen in some cases, but other than the species and gene id restrictions, the only part of the form that I've been able to get to function in a useful/correct way is setting the "exact" option on description and providing a search term (and I'm actually not sure those results are complete). Almost everything else I try simply doesn't return anything. The one other thing that almost worked was giving a gene family id, but for the one I tried (legfed_v1_0.L_PLNLH7) it returned only the cowpea and soybean members, not common bean or chickpea (both of which do have members in the family). I wonder if there is some post-load indexing that hasn't been done uniformly? For example, using IPR012132 as a description term yielded hits for all species except Phaseolus, which seems wrong based on what I get from a similar legumeinfo.org gene query.
It is probably worth figuring out some of this, although we could also hide parts of the form that we don't intend to support (e.g. if we don't want to load GO/InterPro analyses as structured data for the time being; these are part of the descriptors and seem to be usable there as long as "exact" is set).