greenelab / tribe
An open-source webserver that allows for easy, reproducible genomics analyses between different webservers
License: Other
Here are some useful links to upgrade this project from Python 2.x to Python 3.x:
Here is a generalized DB schema that takes advantage of registries in identifiers.org. I added some comments to explain the purpose.
from django.contrib.auth.models import User
from django.db import models

# Organism is the project's existing organism model (import omitted here).


# Registry in identifiers.org
class Registry(models.Model):
    name = models.CharField()
    prefix = models.CharField()
    description = models.CharField()
    # and other attributes of a registry ...


# Entity includes common attributes of any entity (such as gene,
# publication, disease, tissue, etc)
class Entity(models.Model):
    accession = models.CharField(null=True)  # accession in identifiers.org
    registry = models.ForeignKey(Registry, null=True)
    # and other attributes shared by all entities ...


# "Gene" is one kind of entity
class Gene(models.Model):
    entity = models.OneToOneField(
        Entity,
        on_delete=models.CASCADE,
        primary_key=True,
    )
    # specific attributes for a gene
    scientific_name = models.CharField(max_length=32)
    systematic_name = models.CharField(max_length=32)
    organism = models.ForeignKey(Organism, ...)


# "Publication" is another kind of entity
class Publication(models.Model):
    entity = models.OneToOneField(
        Entity,
        on_delete=models.CASCADE,
        primary_key=True,
    )
    # specific attributes for a publication
    pmid = models.IntegerField(null=True, unique=True, db_index=True)
    title = models.TextField()
    authors = models.TextField()
    date = models.DateField()
    journal = models.TextField()
    volume = models.TextField(blank=True, null=True)
    pages = models.TextField(blank=True, null=True)
    issue = models.TextField(blank=True, null=True)


# "Disease" is another kind of entity
class Disease(models.Model):
    entity = models.OneToOneField(
        Entity,
        on_delete=models.CASCADE,
        primary_key=True,
    )
    # and specific attributes for a disease ...


# "Entityset" includes common attributes for any kind of entity set.
# It may include different types of entities.
class Entityset(models.Model):
    creator = models.ForeignKey(User)
    title = models.TextField()
    abstract = models.TextField(null=True)
    slug = models.SlugField(help_text="Slugified title field", max_length=75)
    public = models.BooleanField(default=False)
    deleted = models.BooleanField(default=False)
    fork_of = models.ForeignKey('self', editable=False, null=True)
    tip_item_count = models.IntegerField(null=True)


# "Geneset" is one kind of "Entityset"
class Geneset(models.Model):
    entityset = models.OneToOneField(
        Entityset,
        on_delete=models.CASCADE,
        primary_key=True,
    )
    organism = models.ForeignKey(Organism)
    # and other attributes for a geneset


# Similar models can be defined for "Publicationset" or "Diseaseset" ...
# Version of an Entityset
class Version(models.Model):
    entityset = models.ForeignKey(Entityset)
    creator = models.ForeignKey(User)
    ver_hash = models.CharField(db_index=True, max_length=40)
    description = models.TextField(null=True)
    commit_date = models.DateTimeField(auto_now_add=True)
    parent = models.ForeignKey('self', null=True)


# Annotations of entities
class Annotation(models.Model):
    version = models.ForeignKey(Version)
    # Two foreign keys to Entity need distinct reverse names;
    # the related_name values below are placeholders.
    # entity that is being annotated
    primary_entity = models.ForeignKey(Entity, related_name='annotations_as_primary')
    # entity that is the annotation
    annotator_entity = models.ForeignKey(Entity, related_name='annotations_as_annotator')
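To illustrate how a registry prefix and an entity accession could combine, here is a minimal sketch that builds an identifiers.org compact-identifier URL. The dataclasses are plain stand-ins for the Registry and Entity models in the schema (not Django models), identifiers_org_url is a hypothetical helper that does not exist in Tribe, and the prefix and accession values are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registry:
    """Plain stand-in for the Registry model (not a Django model)."""
    name: str
    prefix: str


@dataclass
class Entity:
    """Plain stand-in for the Entity model (not a Django model)."""
    accession: Optional[str] = None
    registry: Optional[Registry] = None


def identifiers_org_url(entity: Entity) -> Optional[str]:
    """Return https://identifiers.org/<prefix>:<accession>, or None.

    Both fields are nullable in the schema, so an entity without a
    registry or an accession has no resolvable URL.
    """
    if entity.registry is None or not entity.accession:
        return None
    return "https://identifiers.org/{}:{}".format(
        entity.registry.prefix, entity.accession
    )


pubmed = Registry(name="PubMed", prefix="pubmed")
print(identifiers_org_url(Entity(accession="12345", registry=pubmed)))
print(identifiers_org_url(Entity()))
```

Keeping the prefix on the registry and the accession on the entity means the URL can always be derived and never needs to be stored.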
I got a request:
Is there a way to share a Tribe link to a specific version of a geneset?
This should be possible, but is likely to require a bit of work updating our URLs to incorporate a version. That doesn't currently appear to be included.
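One possible shape for such a link is to append the version hash as an optional trailing path segment. The sketch below only shows the string construction; the route layout (base, creator, slug, optional hash) and the function name are hypothetical and would need to match whatever URL scheme Tribe ends up adopting.

```python
from typing import Optional
from urllib.parse import quote


def geneset_url(base: str, creator: str, slug: str,
                ver_hash: Optional[str] = None) -> str:
    """Build <base>/<creator>/<slug>[/<ver_hash>].

    The route layout is an assumption made for illustration; without a
    hash the link would point at the latest version of the geneset.
    """
    parts = [base.rstrip("/"), quote(creator, safe=""), quote(slug, safe="")]
    if ver_hash:
        parts.append(quote(ver_hash, safe=""))
    return "/".join(parts)


print(geneset_url("https://example.org/genesets", "alice", "my-geneset"))
print(geneset_url("https://example.org/genesets", "alice", "my-geneset", "abc123"))
```

Making the hash optional keeps existing links valid, since a URL without it still resolves to the geneset.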
Right now both a back end and a front end are required for a developer to work on this repo (even for any front-end-only issue). The installation of a local backend is not friendly to a front end developer.
To ease front end development, we can dockerize the backend services (such as web server and DB server). The adage-server repo is using Docker now:
https://github.com/greenelab/adage-server
We can tailor the Dockerfiles there for this project.
Some warnings showed up in the last deployment:
.../tribe/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication)
extension to TLS is not available on this platform. This may cause the server to present an incorrect
TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python
to solve this.
For more information, see https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
[tribe.greenelab.com] out: SNIMissingWarning
and
.../tribe/local/lib/python2.7/site-packages/celery_haystack/utils.py:2: RemovedInDjango19Warning:
django.utils.importlib will be removed in Django 1.9.
We should update some related packages to fix them.
Today, when doing a new deployment, I needed to run the following commands manually:
a) Once the initial_setup_and_check command in the fabfile created the symlink to the static folder (via the private method _make_static()), I needed to run (from inside the /tribe/tribe folder):
ln -s ../static/index.html templates/index.html
to create the symlink to the index.html file for Django to use.
b) Also, the deploy fabfile command asked me to pick between spin.js versions 2.0.0 and 2.1.0. I chose 2.1.0.
However, neither of these steps should be run manually; they should be automated in the deployment process.
Otherwise, we won't be able to deploy any of the changes we make in this repo to the public Tribe instance using our current fab commands.
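For step a), the link creation could be made idempotent so that repeated deployments do not fail when the link already exists. This is a minimal Python 3 sketch that runs on the machine where the link is needed; ensure_symlink is a hypothetical helper, and inside the fabfile the equivalent would be issuing the ln command remotely.

```python
import os


def ensure_symlink(target: str, link_path: str) -> bool:
    """Create link_path -> target unless it already points there.

    Returns True if the link was created or replaced, False if it was
    already correct. Refuses to overwrite a regular file or directory.
    """
    if os.path.islink(link_path):
        if os.readlink(link_path) == target:
            return False
        os.remove(link_path)  # stale link pointing elsewhere
    elif os.path.exists(link_path):
        raise FileExistsError(link_path + " exists and is not a symlink")
    os.symlink(target, link_path)
    return True
```

Run from inside the tribe/tribe folder, ensure_symlink("../static/index.html", "templates/index.html") would reproduce the manual command and do nothing on later runs.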
Due to the absence of /favicon.ico, there is a warning in the log file /var/log/supervisor/tribe-gunicorn-stderr-* (which was generated by the gunicorn process):
Not Found: /favicon.ico
and in /var/log/nginx/access.log:
"GET /favicon.ico HTTP/1.1" 404 261
grunt-ngmin is deprecated:
It is supposed to be replaced by ng-annotate, which is also deprecated now:
ng-annotate is supposed to be replaced by babel-plugin-angularjs-annotate:
grunt-recess is deprecated too:
https://www.npmjs.com/package/grunt-recess
No replacement was stated.
Could this be a memory issue? It seems that Tribe is having problems handling the large number of POST requests. During this time, even the load time of https://tribe.greenelab.com/#/use/list is very long.
Since Django 1.8, BaseCommand.option_list has been deprecated:
https://docs.djangoproject.com/en/1.8/howto/custom-management-commands/#django.core.management.BaseCommand.option_list
This affects the following management commands:
genesets/management/commands/genesets_update_tip_item_count_all.py
genesets/management/commands/genesets_load_kegg.py
genesets/management/commands/genesets_create_update_user.py
genesets/management/commands/genesets_load_disease.py
genesets/management/commands/genesets_add_geneset_tags.py
genesets/management/commands/genesets_load_go.py
Surprisingly, these management commands were only being used in this test file:
genesets/tests.py
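In Django 1.8 and later, a command declares its options in add_arguments(self, parser), where parser is an argparse-style parser. The sketch below shows the translation using only the standard library's argparse, so it runs without Django. The option names (--username, --dry-run) are hypothetical examples, not the actual options of the commands listed above.

```python
import argparse


def add_arguments(parser):
    """Body that would live in Command.add_arguments(self, parser).

    Replaces an optparse-style option_list such as
        make_option('--username', dest='username'),
        make_option('--dry-run', action='store_true', dest='dry_run'),
    The option names are hypothetical examples.
    """
    parser.add_argument('--username', dest='username', required=True,
                        help='User who will own the loaded genesets.')
    parser.add_argument('--dry-run', action='store_true', dest='dry_run',
                        default=False, help='Parse input without saving.')


# Stand-in for the parser Django would pass to add_arguments().
parser = argparse.ArgumentParser(prog='manage.py some_command')
add_arguments(parser)
options = vars(parser.parse_args(['--username', 'alice', '--dry-run']))
print(options)
```

Inside a real command, the parsed values arrive as keyword arguments in handle(self, *args, **options), so code that reads options['username'] can usually stay unchanged.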
In Elasticsearch 1.X and 2.X, snowball has been the default analyzer for the string data type in django-haystack:
But this analyzer is not mentioned in the official Elasticsearch documentation since 5.X:
According to:
https://stackoverflow.com/questions/41859821/why-snowball-analyser-was-removed-in-elasticsearch-5-1
it seems to be replaced by the english analyzer:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/analysis-lang-analyzer.html#english-analyzer
This affects Tribe, but probably doesn't affect the Adage web server, because the latter is using a customized adage_snowball as the default analyzer for strings.
Although snowball is not listed as an analyzer in the Elasticsearch 5.X and 6.X documentation, it seems to be still available. Confirmed by this command:
curl -XGET 'localhost:9200/dhutest/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
  "analyzer": "snowball",
  "text": "foo bart"
}
'
Output:
{
  "tokens" : [
    {
      "token" : "foo",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "bart",
      "start_offset" : 4,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 1
    }
  ]
}
This is probably why the latest django-haystack dev version still uses the snowball analyzer and claims that it supports Elasticsearch 5.X.
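To repeat the same check for several analyzers (for example snowball and english), the curl call can be scripted. This sketch builds the equivalent request with the standard library and extracts the token strings from a response; it does not send anything, and the function names are hypothetical. The index name and host are taken from the curl command above.

```python
import json
from urllib.request import Request


def build_analyze_request(index, analyzer, text,
                          host="http://localhost:9200"):
    """Build (but do not send) the same _analyze call as the curl command."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return Request(
        "{}/{}/_analyze?pretty".format(host, index),
        data=body,
        headers={"Content-Type": "application/json"},
        method="GET",
    )


def tokens(response_text):
    """Extract the token strings from an _analyze JSON response."""
    return [item["token"] for item in json.loads(response_text)["tokens"]]


# Abbreviated copy of the output shown above.
sample = '{"tokens": [{"token": "foo"}, {"token": "bart"}]}'
print(tokens(sample))
```

Sending the request (for instance with urllib.request.urlopen) requires a running Elasticsearch instance; comparing the token lists for two analyzers would show whether switching changes how existing text is indexed.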
The confirmation emails sent from Tribe for new account signup and password reset are controlled by the following two templates in allauth:
allauth/templates/account/email/password_reset_key_message.txt
allauth/templates/account/email/email_confirmation_message.txt
Some of the fields (such as {{ site_name }} and {{ site_domain }}) should be customized to replace defaults such as example.com. Here is an example of the confirmation email for a new account signup:
From: [email protected]
Subject: Tribe:Please Confirm Your E-mail Address
========================================
Hello from example.com!
You're receiving this e-mail because user xxx has given yours as an e-mail address to connect their account.
To confirm this is correct, go to http://tribe.greenelab.com/accounts/confirm-email/xyz.../
Thank you from example.com!
tribe.dartmouth.edu
Here is an example of email confirmation of password reset:
From: [email protected]
Subject: Tribe:Password Reset E-mail
=======================================
Hello from example.com!
You're receiving this e-mail because you or someone else has requested a password for your user account.
It can be safely ignored if you did not request a password reset. Click the link below to reset your password.
http://tribe.greenelab.com/accounts/password/reset/key/blah.../
Thank you for using example.com!
tribe.dartmouth.edu
Obviously example.com ({{ site_name }}) and tribe.dartmouth.edu ({{ site_domain }}) in both emails should be customized.
This bug has been fixed in a PR in adage-server:
https://github.com/greenelab/adage-server/pull/364/files#diff-de0262d92c061a11d7cdab94dd18eab8
The same change should work on Tribe too.
npm install output:
npm WARN deprecated [email protected]: use grunt-ng-annotate instead
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: Deprecated as RECESS is unmaintained
npm WARN deprecated [email protected]: Use the built-in module in node 9.0.0 or newer, instead
npm WARN deprecated [email protected]: use ng-annotate instead
npm WARN deprecated [email protected]: graceful-fs v3.0.0 and before will fail on node releases >= v7.0. Please update to graceful-fs@^4.0.0 as soon as possible. Use 'npm ls graceful-fs' to find it in the tree.
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: All versions below 4.0.1 of Nodemailer are deprecated. See https://nodemailer.com/status/
npm WARN deprecated [email protected]: stop using this version
npm WARN deprecated [email protected]: This project is unmaintained
npm WARN deprecated [email protected]: If using 2.x branch, please upgrade to at least 2.1.6 to avoid a serious bug with socket data flow and an import issue introduced in 2.1.0
npm WARN deprecated [email protected]: Use uuid module instead
npm WARN deprecated [email protected]: This project is unmaintained
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: graceful-fs v3.0.0 and before will fail on node releases >= v7.0. Please update to graceful-fs@^4.0.0 as soon as possible. Use 'npm ls graceful-fs' to find it in the tree.
> [email protected] install /home/dhu/github/tribe/interface/node_modules/coffeelint
> [ -e lib/commandline.js ] || npm run compile
> [email protected] install /home/dhu/github/tribe/interface/node_modules/uws
> node-gyp rebuild > build_log.txt 2>&1 || exit 0
> [email protected] postinstall /home/dhu/github/tribe/interface/node_modules/circular-json
> echo ''; echo "\x1B[1mCircularJSON\x1B[0m is in \x1B[4mmaintenance only\x1B[0m, \x1B[1mflatted\x1B[0m is its successor."; echo ''
\x1B[1mCircularJSON\x1B[0m is in \x1B[4mmaintenance only\x1B[0m, \x1B[1mflatted\x1B[0m is its successor.
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] requires a peer of jasmine-core@* but none is installed. You must install peer dependencies yourself.
npm WARN [email protected] requires a peer of karma@~0.9.4 || ~0.10 but none is installed. You must install peer dependencies yourself.
npm WARN [email protected] No repository field.
npm WARN [email protected] No license field.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})
added 570 packages from 766 contributors and audited 3005 packages in 22.861s
found 25 vulnerabilities (9 low, 6 moderate, 10 high)
run `npm audit fix` to fix them, or `npm audit` for details
bower install warning of package incompatibility:
Unable to find a suitable version for spin.js, please choose one by typing one of the numbers below:
1) spin.js#~2.0.0 which resolved to 2.0.2 and is required by angular-spinner#0.6.2
2) spin.js#~2.1.0 which resolved to 2.1.2 and is required by tribe-interface
https://github.com/greenelab/tribe/network/dependencies#interface%252Fpackage-lock.json
Issues in old repository here: https://bitbucket.org/greenelab/tribe/issues
Right now two configuration files are read in when tribe/settings.py configures Django:
a) settings/secrets.ini (includes confidential info, ignored by the repo)
b) settings/<included_file> (an extra config file that is included by secrets.ini in the [configfile] section, mainly for configurations that can be open to the public, such as database engine, 3rd party modules, etc)
But the sections that should be included in these two files are hard coded. If for some reason we want to move the database engine parameter from <included_file> to secrets.ini, we have to modify settings.py too.
We can make the configuration scheme more user-friendly. Here is the idea:
a) Rename the [configfile] section to [include].
b) Rename the settings/ dir to config/.
c) settings.py reads secrets.ini. If secrets.ini has an [include] section, the options in the included file will be treated as secondary configuration and merged with the options in secrets.ini. For example, if the included file is production.ini and it specifies DATABASE_PORT as 5432, but secrets.ini specifies the same parameter value as 5433, then 5433 will be the final option used in Django settings.
interface/README.md is still the one copied from an earlier version of the ngbp project:
https://github.com/ngbp/ngbp
The last update on this project was in 2015. The URL on the title (http://joshdmiller.github.io/ng-boilerplate) is not even working.
We should update README.md to remove the obsolete parts, and add our own customization.
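Returning to the secrets.ini and [include] proposal above, the merge rule (included file as secondary configuration, secrets.ini taking precedence) could look roughly like this with the standard library's configparser. This is a sketch under assumptions: the option name file inside [include], the [database] section name, and the load_settings function are all hypothetical.

```python
import configparser
import os


def load_settings(secrets_path):
    """Read secrets.ini and merge in the file named in its [include] section.

    Options in secrets.ini take precedence over the included file.
    The option name 'file' inside [include] is an assumption.
    """
    # interpolation=None so that secrets containing '%' are read verbatim.
    primary = configparser.ConfigParser(interpolation=None)
    with open(secrets_path) as handle:
        primary.read_file(handle)

    merged = configparser.ConfigParser(interpolation=None)
    if primary.has_option('include', 'file'):
        included = os.path.join(os.path.dirname(secrets_path),
                                primary.get('include', 'file'))
        with open(included) as handle:
            merged.read_file(handle)  # secondary configuration first
    merged.read_dict(primary)         # secrets.ini overrides it
    return merged
```

With production.ini setting DATABASE_PORT to 5432 in a [database] section and secrets.ini setting it to 5433, load_settings(...).get('database', 'DATABASE_PORT') would return '5433', while options that appear only in production.ini remain available. Because no section names are hard coded, an option can move between the two files without touching settings.py.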
Bower is going to be deprecated. bower install gives the following message:
npm WARN deprecated [email protected]: We don't recommend using Bower for new projects.
Please consider Yarn and Webpack or Parcel. You can read how to migrate legacy project here:
https://bower.io/blog/2017/how-to-migrate-away-from-bower/
/usr/local/bin/bower -> /usr/local/lib/node_modules/bower/bin/bower
Codeship doesn't seem to be working on Tribe. CircleCI or Travis may be a better choice for automatic testing.
When removing Codeship, the Codeship configuration in tribe/settings.py should be removed too.
Postgres has been improving full text search and trigram search a lot since version 9.6. I wonder whether it is possible to use them to replace Elasticsearch. If we can, the backend architecture (and deployment) can be greatly simplified. With GIN or GiST indexes on the search fields, we don't have to worry about the index updates (which invoke celery jobs right now).
Right now, Elasticsearch is being used to search genes and genesets. We have 312,983 genes and 408,237 genesets in Postgres backend database.
According to Django's official doc:
https://www.djangoproject.com/download/#supported-versions
Django 1.11.x seems to be a reasonable version to upgrade to for two reasons:
Once Django is upgraded to 1.11, many other Django-related packages should be upgraded too. I am going to use this issue to keep track of the upgrade info for all backend packages that depend upon Django.
This was brought up in #8, but was proving to be too involved for the scope of the pull request.