legumeinfo / arachispheno Goto Github PK

This project forked from 1001genomes/arapheno

AraPheno source code for http://arapheno.1001genomes.org

License: MIT License

Python 47.43% HTML 43.99% CSS 0.49% JavaScript 7.84% Dockerfile 0.11% Shell 0.15%

arachispheno's Issues

Update copyright notice in LICENSE

Phenotype detail map shows number of replicates, not accessions

Split off from issue #5:

Inconsistency in the number of accessions:
This file (study) has Unique acc = 794. The page with this link
http://dev.lis.ncgr.org:50007/phenotype/5/ shows 'Geographic
distribution of 876 accessions' above the world map.
But the table view in the same page has correct number of accessions: 793.

I already changed the map label count from replicates to accessions, but then determined that the map itself shows the number of replicates for each country, not accessions. Working on that now.

Authentication to access ArachisPheno site

On April 8, 2020, a week from today, there is a meeting with the researchers whose data we are using here. They have insisted on keeping the data private until it is published. But we need to share this utility with them. Can we have some kind of authenticated access ready for them before the meeting?
See: #2
From the schema, it looks like there is author, group, and public access for studies. This could work for our purpose too. Or we could go the quick, temporary way Andrew suggested in #2: "... use apache basic authentication (after we get it running via apache, that is)"

Prettify the phenotype value histogram ticks

The number of digits should be just right.
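
One way to read "just right" is that each tick label should carry only as many decimals as the tick spacing needs. The following is a minimal sketch of that idea, not the charting code the site actually uses; the function names `tick_decimals` and `format_tick` and the cap of 6 decimals are assumptions made for illustration.

```python
from decimal import Decimal


def tick_decimals(step: float, max_decimals: int = 6) -> int:
    """Return how many decimal places are needed to print multiples of `step`."""
    # Round to 6 significant digits first so float noise such as
    # 0.30000000000000004 does not inflate the digit count.
    cleaned = Decimal(f"{step:.6g}").normalize()
    exponent = cleaned.as_tuple().exponent  # negative for fractional steps
    return min(max_decimals, max(0, -exponent))


def format_tick(value: float, step: float) -> str:
    """Format one tick label with just enough decimals for the given step."""
    return f"{value:.{tick_decimals(step)}f}"


if __name__ == "__main__":
    # Hypothetical tick positions for two histograms with different spacing.
    print([format_tick(v, 2.5) for v in (0.0, 2.5, 5.0)])
    print([format_tick(v, 10) for v in (140.0, 150.0)])
```

With this approach, a spacing of 10 yields integer labels and a spacing of 0.05 yields two decimals, so labels stay short without losing information.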

Change phenotype DOI links from DataCite to PubMed

Each phenotype in AraPheno has a DOI link that goes to DataCite, for example
https://search.datacite.org/works/10.21958/phenotype:672
as well as a "Cite" link that can generate properly formatted citations in various formats (APA, BibTeX, etc). Do our phenotypes already have a DOI link that goes to PubMed (or elsewhere), or is this something we have to set up?

Also, does our study have an associated publication?

Fix # Values column

Missing or incorrect in the study-phenotype table, the accession-phenotype table, and also the REST API.

This problem existed in AraPheno, so do a pull request once fixed.

Restore Download Database feature

Requires updating the database.zip, using
python manage.py generate_database_dump
to keep it current.

Phenotype transformations

These replace each phenotype value v with a functional transformation, like log(v), sqr(v), sqrt(v).

Some of the resulting histograms look reasonable, like that for seed_weight. Note that the untransformed value with the highest frequency is around 150, and log(150) ~ 5 which has the highest frequency in the log-transformation histogram.
http://dev.lis.ncgr.org:50007/phenotype/20/transformation/

Others, like #1_kernel_weight, seem off. Here, note that the highest value is around 125, log(125) ~ 4.8 but the corresponding value in the log-transformation histogram is about 5.1, and sqr(125) = 15625 but the highest value in the sqr-transformation histogram is about 27000, etc.
http://dev.lis.ncgr.org:50007/phenotype/1/transformation/

On closer examination, AraPheno's default behavior for a transform f is not to use f(v) as expected, but to use
f(v - min(vv) + 0.1*var(vv))
where vv is the list of all values. This must be some kind of statistical correction (? I am still researching it). However, it is possible to tell it to not do this in ArachisPheno.
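
To make the difference concrete, here is a minimal sketch of both behaviors. It is not AraPheno's code: the function name `transform_values` and the `shift` flag are hypothetical, and the use of the sample variance (rather than the population variance) is an assumption.

```python
import math
import statistics
from typing import Callable, List, Sequence


def transform_values(
    values: Sequence[float],
    f: Callable[[float], float],
    shift: bool = True,
) -> List[float]:
    """Apply f to each value, optionally shifting as described above."""
    if not shift:
        # Plain transform: f(v)
        return [f(v) for v in values]
    # Shifted transform: f(v - min(vv) + 0.1 * var(vv)).
    # Assumption: sample variance; the original may use population variance.
    offset = -min(values) + 0.1 * statistics.variance(values)
    return [f(v + offset) for v in values]


if __name__ == "__main__":
    vv = [1.0, 2.0, 3.0]  # made-up values; sample variance is 1.0
    print(transform_values(vv, math.log))               # log of 0.1, 1.1, 2.1
    print(transform_values(vv, math.log, shift=False))  # log of 1, 2, 3
```

One visible effect of the shift is that the smallest value maps to 0.1*var(vv), which is positive whenever the values are not all equal, so log and sqrt are always defined. Whether that is the intended purpose is not confirmed here.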

ArachisPheno crashes if no Studies exist

This behavior existed in AraPheno: it throws an error instead of displaying the Home page if no Studies exist. I initially had to add a spurious Study to the database by hand to get it to run.

OperationalError at phenotype and study

Going to any phenotype or the study page shows:
OperationalError at /phenotype/19/
no such column: phenotypedb_submission.is_private
... ... ...
Error during template rendering
In template /home/svengato/ArachisPheno/html/base.html, error at line 0
no such column: phenotypedb_submission.is_private

... ... ...
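
The message suggests that the database schema lacks a column the code expects. A minimal sketch for confirming this against a SQLite database follows; the table and column names come from the error above, while the helper name `has_column` and the in-memory demo table are assumptions for illustration.

```python
import sqlite3


def has_column(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` in this SQLite database has a column named `column`."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return any(row[1] == column for row in rows)


if __name__ == "__main__":
    # Demo on an in-memory database; point sqlite3.connect() at the real
    # database file to check the actual schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE phenotypedb_submission (id INTEGER PRIMARY KEY)")
    print(has_column(conn, "phenotypedb_submission", "is_private"))
    conn.close()
```

If the column turns out to be missing, an unapplied Django migration is one likely cause, in which case running `python manage.py migrate` may resolve it.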

ArachisPheno: Customization

ArachisPheno doesn't have to be an exact replica of the Arabidopsis portal. We will need only certain features, at least in the beginning. This issue is to help us gradually settle on which features, UI included, we need. The feature list should evolve as we work through the site and the data we have, keeping in mind flexibility for future needs.

Peanut Core collection data into CSV format

Make a matrix format csv file from my summary spreadsheet of Roshan's trait data. Then, Sven can check its suitability for loading and other related issues.

Restore Take a Tour feature

Easy to do, whenever we are ready for it.

Change directory names and file names containing 'arapheno' to something more general

Maybe just 'pheno'?

Loading test with csv file

Document issues with test loading the data in csv format.

Finalize accession format

I will run through the migration process one more time when we finalize the [accession id, accession name, replicate id] format (with underscore or whatever).

I have been holding off on this in order to implement it in the next migration event (such as adding model fields for distinguishing private data).

Should I update these this morning, before people start looking at ArachisPheno? I can change them directly in the database for now.
As I understand it, the final format will involve underscores: accession id = PI_152111, accession name = PI_152111, replicate id = PI_152111_1 ?

Allow categorical (string) and binary (boolean) phenotypes

Among other things, this would require a categorical bar chart rather than a numerically binned histogram.
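
As a rough sketch of the data behind such a chart: instead of binning numbers into ranges, count how often each distinct label occurs. The function name `bar_chart_counts` and the example labels are hypothetical, and missing values are assumed to be stored as None.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Optional, Tuple


def bar_chart_counts(values: Iterable[Optional[Hashable]]) -> List[Tuple[Hashable, int]]:
    """Count occurrences of each category, skipping missing (None) values."""
    counts = Counter(v for v in values if v is not None)
    # Most frequent first; ties broken alphabetically by label.
    return sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))


if __name__ == "__main__":
    # Made-up categorical and binary phenotype values.
    print(bar_chart_counts(["red", "tan", "red", None, "pink"]))
    print(bar_chart_counts([True, False, True]))
```

Each (label, count) pair becomes one bar, so binary phenotypes are simply the two-category case.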

Enable/disable user authentication

We previously added user authentication (issue #6), but would like the ability to configure a public version that does not require it.

Allow HTML code in study description

Sudhansu requests the ability to put line breaks and URL links in the study description, so that the study detail and study list pages format them correctly.
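
A conservative alternative to rendering arbitrary HTML is to escape the description and then generate only the two things requested: line breaks and links. The sketch below assumes that approach; `render_description` and `URL_PATTERN` are hypothetical names, and the URL pattern is deliberately simple (it would, for instance, include a trailing period in the link).

```python
import html
import re

# Simple URL matcher: stops at whitespace, angle brackets, and quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def render_description(text: str) -> str:
    """Escape plain text, turn URLs into links, and newlines into <br>."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        # Escape the plain text before the URL.
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0))
        parts.append(f'<a href="{url}">{url}</a>')
        last = match.end()
    # Escape whatever follows the last URL.
    parts.append(html.escape(text[last:]))
    return "".join(parts).replace("\n", "<br>\n")


if __name__ == "__main__":
    print(render_description("Data from the mini core.\nSee https://example.org/a?b=1&c=2"))
```

In a Django template the result would still need to be marked as safe (for example with `mark_safe`) to avoid being escaped a second time.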

Update README.md now that we have public data

Private data: Can a dataset/study be kept private?

Issue description copied from email.
0327, 10.45 a.m. sd-adf-sg

There is a related issue in my mind, but I will spell it out in an issue in ArachisPheno: is it possible to have some datasets public and some datasets private in AraPheno? This becomes relevant when we serve the public minicore data. Maybe Sven already knows about public vs. private data.

I doubt that it has this ability; I haven't seen any sort of login mechanism. I could be wrong, though.

As far as I can tell, all of the AraPheno data are public. One of the database tables (auth_permission) lists various permission levels, but I do not think they are used for anything. I have no idea how hard it would be to add private data, as in the InterMines.

Do you mean the MyMine feature of the InterMines? That's a bit different, I think, in that only the lists and queries one makes are private; I don't think there's any private data per se.

Site access restriction:

I am in favour of the password protection that you have suggested. We should be able to share the presentation of the data among us insiders and the collaborators.

Like an initial login page? I will look into it.

If nothing else, we could just use apache basic authentication
(after we get it running via apache, that is)

Add phenotype metadata

This is part of the study curation process. Once a user submits a study, someone with (a) knowledge of the phenotypes, (b) admin access has to add ontology terms and other metadata for each phenotype. Once this is complete, the curator may publish the study.

For example, I set the unit ontology for yield_avg_kg_ha as it is clearly in units of kg/hectare.

Fix REST API

These bugs are all present in the original AraPheno. There may be others.

Phenotype lists: fields like num_values are often missing.
Do num_values and number_replicates mean the same thing? If so, we could eliminate the latter.
Phenotype lists use AraPheno's DOI (10.21958), as defined in arapheno/arapheno/settings/defaults.py. Do we have our own DOI?
Missing commas between ontology types in the header. (This is a simple oversight in arapheno/phenotypedb/renderer.py, easy to fix).
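
For the missing-comma item, one way to rule out this class of bug is to let Python's csv module join the fields instead of concatenating strings by hand. This is a sketch only: `render_csv` and the column names are hypothetical and are not taken from renderer.py.

```python
import csv
import io
from typing import Iterable, Sequence


def render_csv(header: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Render a header and data rows as CSV text with consistent delimiters."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)  # commas (and quoting) are handled by the writer
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical column names for three ontology types.
    header = ["name", "to_term", "eo_term", "uo_term"]
    print(render_csv(header, [["seed_weight", "", "", "g"]]))
```

The writer also quotes any field that itself contains a comma, which hand-built strings tend to miss.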

Another alternative is to hide the REST API for now.

(split off from issue #21)

Scrub and prepare minicore data (for public version of ArachisPheno)

from Sudhansu:

The peanut pheno public data is in our DS and below is the link to the two relevant files.

The relevant DS directory: https://v1.legumefederation.org/data/public/Arachis_hypogaea/minicore.trt.JWYM/

Descriptors file: https://v1.legumefederation.org/data/public/Arachis_hypogaea/minicore.trt.JWYM/arahy.mincore.trt.JWYM.descriptors.xlsx
Data file: https://v1.legumefederation.org/data/public/Arachis_hypogaea/minicore.trt.JWYM/arahy.mincore.trt.JWYM.observations.xlsx

The Data file (observations.xlsx) has several sheets, and the two sheets that have the pheno data are:

Obs-mini_core sheet

These phenotype data should have two observations for each accession, one for year 2013 and the other for year 2015. I think we should be able to treat them as replications.

protein_oil-mini_core sheet

The structure comes out after you sort on the basis of the 'accession' column. Each accession has mostly 2-3 replications.
Because we are able to store replicate data in our database, we should do that to preserve the original data as much as possible.
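
A minimal sketch of treating repeated observations as replicates: number the rows of each accession in the order they appear. The function name `assign_replicate_ids`, the (accession, value) row shape, and the example values are assumptions; the underscore-suffixed replicate id follows the format proposed in the "Finalize accession format" issue.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def assign_replicate_ids(
    rows: Iterable[Tuple[str, float]],
) -> List[Tuple[str, str, float]]:
    """Turn (accession, value) rows into (accession, replicate_id, value) rows."""
    seen = defaultdict(int)  # how many rows we have met so far per accession
    result = []
    for accession, value in rows:
        seen[accession] += 1
        replicate_id = f"{accession}_{seen[accession]}"
        result.append((accession, replicate_id, value))
    return result


if __name__ == "__main__":
    # Made-up observations, e.g. one per year for the first accession.
    rows = [("PI_152111", 41.2), ("PI_152111", 39.8), ("PI_155107", 44.0)]
    for row in assign_replicate_ids(rows):
        print(row)
```

Because numbering depends on row order, sorting the sheet consistently (for example by accession, then year) before loading would keep replicate ids stable between loads.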

Please let me know if we should meet to talk about the data after you go through the files.

Thanks for moving it forward quickly.

Sudhansu

Purge unused AraPheno institutional logos

Lurking in static/img, invoked in html/home/[about | links].html and maybe elsewhere.

The Flash-y Explorer: disable or give guidance?

My curiosity finally got the better of me enough to figure out the rather byzantine rigamarole required to enable Flash to work in Chrome on a specific site. The result (for AraPheno) is shown below, showing perhaps that stomata in Spanish Arabidopsis tend to be larger than those in Scandinavian countries (due to the falling of the rains mainly on the plains? or maybe another climate-related factor).

It also works on ArachisPheno, though I have yet to find a cute example. Seems like a reasonably flexible tool for display along various dimensions, could probably be reimplemented in something less universally deprecated than Flash but would take some time. Just wanted to note that it's actually possible to use this, if you are willing to be sufficiently sycophantic to your browser. I followed steps given here:
https://www.freecodecamp.org/news/how-to-enable-adobe-flash-player-in-google-chrome/
which worked as long as I followed them to the very end, rather than assuming that it would work once the basic setting to Allow Flash was enabled.

Accessions and Genotypes

The ones provided by AraPheno are for Arabidopsis, so I removed them. We can add peanut accessions and genotypes later if desired.

sqlite> delete from phenotypedb_genotype_accessions;
sqlite> delete from phenotypedb_genotype;
sqlite> delete from phenotypedb_accession;

Update or remove "Impressum" footer?

This legal disclaimer pops up when you click on Impressum at the bottom of the original AraPheno application, but is commented out in ArachisPheno. In issue #3, I speculated that it may be a European legal requirement, and wondered whether we need it.

Among other things, it mentions that the application uses Google Analytics and suggests how the user can prevent it by refusing to set cookies. Another option would be to remove that functionality, if possible.
