biolab / orange3-bioinformatics
🍊🔬 Bioinformatics add-on for Orange3
License: GNU General Public License v3.0
When Gene Set Enrichment is empty and there is no data on the input, any change in the filter entry box crashes the widget.
It would also help if, when there is no data on the input, the widget displayed "(no data on input)" in the Info box.
An __init__.py file is missing in orangecontrib/bioinformatics//widgets/utils.
This causes the widgets to be invisible on a fresh install.
Release: 3.0.3
3.2.0
3.17.dev
Database Updates downloads selected dataset.
Database Updates downloads all cached datasets.
Database Updates. Deselect all the pre-selected datasets. Then select only one (say, the smallest). Upon pressing Download all, old datasets get selected and the download begins. I think intuitively (and judging by documentation), only the selected datasets should be downloaded.
Download is very slow, perhaps since all datasets get downloaded.
Also, the Cancel button does not work.
Showcase workflows must be updated to use Preprocess/Batch Effect Removal widgets, etc.
3.2.0
GO Browser should work when the computer is offline. It should use the local, previously loaded information files or, if these are not available for the given organism, issue an error.
The widget crashes when the computer is offline.
Exception: | requests.exceptions.ConnectionError: HTTPConnectionPool(host='orange.biolab.si', port=80): Max retries exceeded with url: /serverfiles-bio2/__INFO__ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1881d3860>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
-- | --
Module: | requests.adapters:513
Widget Name: | GO Browser
Widget Module: | orangecontrib.bioinformatics.widgets.OWGOBrowser:326
Widget Scheme: | /var/folders/xs/sz7rs6w902g8gvtvhyt68w640000gn/T/ows-xlkov6mh.ows.xml
Version: | 3.16.0
Environment: | Python 3.6.1 on Darwin 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64
Installed Packages: | AnyQt==0.0.8, Bottleneck==1.2.0, CommonMark==0.7.3, Genesis-PyAPI==1.2.1, Orange3-Bioinformatics==3.2.0, Orange3-SingleCell==0.8.1, Orange3==3.16.0, PyQt5==5.9, astropy==3.0.4, certifi==2018.8.24, chardet==3.0.4, cycler==0.10.0, decorator==4.3.0, dill==0.2.6, docutils==0.13.1, fastdtw==0.3.2, future==0.16.0, h5py==2.8.0, idna==2.7, joblib==0.11, keyring==10.3.1, keyrings.alt==2.2, kiwisolver==1.0.1, loompy==2.0.12, matplotlib==3.0.0, networkx==2.1, numpy==1.12.1, pandas==0.23.4, pip==9.0.1, pyparsing==2.2.1, pyqtgraph==0.10.0, python-dateutil==2.7.3, python-louvain==0.11, pytz==2018.5, requests-cache==0.4.13, requests==2.19.1, scikit-learn==0.18.2, scipy==0.19.1, serverfiles==0.2.1, setuptools==40.4.1, sip==4.19.3, six==1.10.0, slumber==0.7.1, typing==3.6.6, urllib3==1.23, wheel==0.31.1, xlrd==1.0.0
Machine ID: | 154505275726980
Stack Trace: | Traceback (most recent call last): File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn (self._dns_host, self.port), self.timeout, **extra_kw) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/util/connection.py", line 56, in create_connection for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM): File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 743, in getaddrinfo for res in _socket.getaddrinfo(host, port, family, type, proto, flags):socket.gaierror: [Errno 8] nodename nor servname provided, or not knownDuring handling of the above exception, another exception occurred:Traceback (most recent call last): File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen chunked=chunked) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 354, in _make_request conn.request(method, url, **httplib_request_kw) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request self._send_request(method, url, body, headers, encode_chunked) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in 
_send_output self.send(msg) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send self.connect() File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connection.py", line 196, in connect conn = self._new_conn() File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connection.py", line 180, in _new_conn self, "Failed to establish a new connection: %s" % e)urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x1881d3860>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not knownDuring handling of the above exception, another exception occurred:Traceback (most recent call last): File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/adapters.py", line 445, in send timeout=timeout File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen **response_kw) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen **response_kw) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen _stacktrace=sys.exc_info()[2]) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/util/retry.py", line 398, in increment raise MaxRetryError(_pool, url, error or ResponseError(cause))urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='orange.biolab.si', port=80): Max retries exceeded with url: /serverfiles-bio2/__INFO__ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection 
object at 0x1881d3860>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))During handling of the above exception, another exception occurred:Traceback (most recent call last): File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/Orange/canvas/scheme/widgetsscheme.py", line 573, in create_widget_instance widget.__init__() File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/orangecontrib/bioinformatics/widgets/OWGOBrowser.py", line 326, in __init__ for _, annotation_file in set(serverfiles.ServerFiles().listfiles(DOMAIN) + serverfiles.listfiles(DOMAIN)) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/serverfiles/__init__.py", line 188, in listfiles self._download_server_info() File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/serverfiles/__init__.py", line 179, in _download_server_info t = self._open("__INFO__") File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/serverfiles/__init__.py", line 294, in _open return self._server_request(self.server, *args) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/serverfiles/__init__.py", line 291, in _server_request return self.req.get(root+"/".join(path), auth=auth, verify=False, timeout=TIMEOUT, stream=True) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/sessions.py", line 525, in get return self.request('GET', url, **kwargs) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/sessions.py", line 512, in request resp = self.send(prep, **send_kwargs) File 
"/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/sessions.py", line 622, in send r = adapter.send(request, **kwargs) File "/Applications/scOrange.app/Contents/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/adapters.py", line 513, in send raise ConnectionError(e, request=request)requests.exceptions.ConnectionError: HTTPConnectionPool(host='orange.biolab.si', port=80): Max retries exceeded with url: /serverfiles-bio2/__INFO__ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1881d3860>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
Local Variables: | OrderedDict([('cert', None), ('chunked', False), ('conn', <urllib3.connectionpool.HTTPConnectionPool object at 0x1881c8320>), ('proxies', OrderedDict()), ('request', <PreparedRequest [GET]>), ('self', <requests.adapters.HTTPAdapter object at 0x1881c8390>), ('stream', True), ('timeout', <urllib3.util.timeout.Timeout object at 0x1881c81d0>), ('url', '/serverfiles-bio2/__INFO__'), ('verify', False)])
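The offline behaviour requested above could be sketched roughly as follows. This is a minimal illustration, not the widget's code: `list_annotation_files`, `list_remote` and `list_local` are hypothetical names standing in for the remote and local listing calls seen in the stack trace. It relies on the fact that `requests.exceptions.ConnectionError` derives from `OSError`.

```python
from typing import Callable, Iterable, List


def list_annotation_files(
    list_remote: Callable[[], Iterable[str]],
    list_local: Callable[[], Iterable[str]],
) -> List[str]:
    """Merge remote and local file listings, tolerating a missing network."""
    local = set(list_local())
    try:
        remote = set(list_remote())
    except OSError:
        # requests' ConnectionError is an OSError subclass, so an unreachable
        # server ends up here instead of crashing the widget constructor.
        remote = set()
    files = sorted(local | remote)
    if not files:
        # Nothing cached and no server: report an error the GUI can display.
        raise FileNotFoundError("no annotation files available offline")
    return files


def unreachable_server() -> Iterable[str]:
    """Hypothetical stand-in for a listing call made while offline."""
    raise ConnectionError("server unreachable")


# Only the locally cached (hypothetical) file is listed when offline.
print(list_annotation_files(unreachable_server, lambda: ["local_annotations.tab"]))
```

With this shape the widget can still populate its organism list from the local cache and show an error only when nothing usable is found.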
The selection in Entity Sets should be remembered per selected organism. Either use context-dependent settings or simply remember the selection for a specific organism. For a new organism, select the first item (do not leave the selection blank).
Selection is blank for every new data set.
Example:
GO Annotations for 10090
should be
GO annotations for Mus musculus
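A minimal sketch of the requested label, assuming a small in-memory lookup. `TAXONOMY_NAMES` and `annotation_label` are hypothetical; the add-on would presumably resolve names through its own taxonomy module instead.

```python
# Hypothetical lookup table mapping NCBI taxonomy IDs to organism names.
TAXONOMY_NAMES = {
    "9606": "Homo sapiens",
    "10090": "Mus musculus",
}


def annotation_label(tax_id: str) -> str:
    """Build a title that shows the organism name, falling back to the raw ID."""
    name = TAXONOMY_NAMES.get(str(tax_id), str(tax_id))
    return f"GO annotations for {name}"


print(annotation_label("10090"))
```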
Orange version 3.4 to 3.9
Plot Volcano Plot
Using the caffeine data set from the Volcano Plot example, the error reads "negative values in input". This happens with any data set from GEO, and also with my own data, even if I get rid of all negatives in the data set.
Run the example GEO data set from the example documentation.
The widget should use gene id and organism information from the input data set, and no longer require a user to provide any of this information:
GEO Data Sets should attach information on the name of the gene_id_attribute ("Entrez ID"). Currently, this annotation is missing.
Add another output signal to the Cluster Analysis widget, that outputs results in a tabular form.
The table should have one row per result (gene-cluster pair) and all the relevant variables and metadata.
Add another output signal, which is similar, but lists GO terms per cluster in the same way.
The Gene Set Enrichment widget should output the data restricted to the selected genes. The output should follow the format of the input data: if genes are in rows on the input, they should be in rows on the output; if they are in columns, they should stay in columns. The output should also pass on the information on genes and organism.
Currently, upon selection of gene sets in this widget, the widget does not produce any output.
I was testing the widget with brown-selected data set (File -> Gene Name Matcher -> Gene Set Enrichment). Note that the selection of, say, biosynthetic process, should output the data containing 139 genes (in rows) and all the data (all columns). Note that selection of several gene sets should produce the data with union of genes from the sets.
Test also this functionality on single cell data, where the number of columns (genes) on the output should be reduced, while the number of rows (cells) should be retained.
Note: keep meta and class data, if in input, on the output as well.
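The expected output can be sketched in plain Python, assuming a table held as a list of dicts. All names here (`genes_in_selection`, `filter_rows`, `filter_columns`, the toy gene sets) are hypothetical and only illustrate the union-of-sets rule and the two orientations.

```python
def genes_in_selection(gene_sets, selected):
    """Union of genes across all selected gene sets."""
    genes = set()
    for name in selected:
        genes |= set(gene_sets[name])
    return genes


def filter_rows(rows, gene_key, genes):
    """Genes in rows: keep matching rows with all of their columns."""
    return [row for row in rows if row[gene_key] in genes]


def filter_columns(rows, genes, keep):
    """Genes in columns: keep matching gene columns plus meta/class columns."""
    return [
        {col: val for col, val in row.items() if col in genes or col in keep}
        for row in rows
    ]


# Toy gene sets and tables, invented for illustration only.
gene_sets = {"set A": ["g1", "g2"], "set B": ["g2", "g3"]}
genes = genes_in_selection(gene_sets, ["set A", "set B"])

by_rows = [{"gene": "g1", "t0": 0.1}, {"gene": "g9", "t0": 0.7}]
by_cols = [{"cell": "c1", "g1": 3, "g9": 0, "class": "x"}]

print(filter_rows(by_rows, "gene", genes))
print(filter_columns(by_cols, genes, keep={"cell", "class"}))
```

In the single-cell case the number of rows (cells) is unchanged and only gene columns are dropped, while the `keep` columns carry meta and class data through.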
Minor:
A lot has changed since the legacy orange-bio add-on. For example, #60 introduced GUI changes for some of the widgets.
Some widgets are not supported anymore and others were completely re-designed, e.g., #47
Documentation needs to be updated:
Documentation missing:
The default sorting of the terms in the GO Browser (before user changes anything) should be by the p-value, in both upper (hierarchical) and lower pane with results.
Add a Hypergeometric test for binary expression data.
score_hypergeom(a, b) method
Some experimentation will be needed on how to assign a score to display in the distribution plot.
See attached code snippet for a concrete example.
scratch-hypergeom.py.zip
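For orientation, a hypergeometric tail probability can be computed with the standard library alone. The sketch below is not the attached snippet, and the meaning of `a` and `b` in `score_hypergeom(a, b)` is not given here. The function `hypergeom_sf` and its parameters are assumptions chosen for illustration: `M` samples in total, `n` of them expressing the gene, `N` samples in the group of interest, `k` of those expressing the gene.

```python
from math import comb


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) when drawing N of M items, n of which are 'expressed'."""
    upper = min(n, N)
    total = comb(M, N)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum on their own.
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


# Gene expressed in 3 of 10 samples; all 3 fall into a group of 4.
print(hypergeom_sf(3, 10, 3, 4))
```

A low tail probability suggests the gene is expressed in the group more often than chance would predict; how to turn this into a score for the distribution plot is the open question raised above.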
Entity set names should be human-readable. They should start with upper case. Do not use underscores. For instance, instead of biological_process use Biological process.
This "bug" is probably not the feature of this widget but needs changes in a database system or on the server.
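If the names cannot be fixed at the source, a display-side conversion might look like the following sketch. `humanize` is a hypothetical helper. It upper-cases only the first character, so acronyms elsewhere in the name are left untouched.

```python
def humanize(name: str) -> str:
    """Turn an identifier such as 'biological_process' into a display label."""
    text = name.replace("_", " ").strip()
    return text[:1].upper() + text[1:]


print(humanize("biological_process"))
```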
Provide support for an additional input file that supplies custom-defined genes and gene sets:
GO Browser should present its icon as soon as the widget is introduced to canvas.
Choose some data set from GEO Data Sets, draw a connection, choose GO Browser. The widget takes a while to appear and display progress.
3.2.0
3.17.dev
When one clicks on a pathway, the widget should output selected data.
Widget outputs nothing, despite constant attempts. Nodes in the pathway seem clickable, but there is nothing on the output (or at least some instances with no features).
GEO Data Sets (GDS2914) - KEGG Pathways (AGE-RAGE pathway) - Data Table. Data Table is empty, regardless of what one clicks.
GeneMatcher module needs a GUI.
The widget implementation needs:
Code coverage should not be below (at least) 85%.
Here is a request for a revision of the user interface of the Gene Name Matcher widget that would better expose gene names and allow selection of a subset of the input data.
Widget has one output (see Data Table for similar design):
Data on the output of dictyExpress should be named. Use "%s (%s)" % (project, experiment)
Data sets are at present not named.
In case of preliminary annotation data, it would be great to load annotations from local files.
Include "Load from local file ..." button in the same row as "Update all" and other buttons, but push it all the way to the right. Enable loading GO and KEGG annotations for a start. Annotation file should include all other necessary information, like data source name.
The widget should output column (or attribute of an attribute) with the ID of the gene even if an object (column, or attribute's attribute) with the name "Gene ID" already exists. In this case, construct the name "Gene ID (1)" or "Gene ID (i)", where i is an integer and "Gene ID (i)" is the first occurrence of this name with lowest i that does not exist yet. Do this only if new ID's are different from the old ones.
When "Gene ID" exists, the widget does not add gene IDs.
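The naming rule described above can be sketched as follows. `unique_name` is a hypothetical helper, and the check that the new IDs actually differ from the old ones is left to the caller.

```python
def unique_name(base: str, existing) -> str:
    """Return base, or 'base (i)' with the lowest i >= 1 that is still free."""
    taken = set(existing)
    if base not in taken:
        return base
    i = 1
    while f"{base} ({i})" in taken:
        i += 1
    return f"{base} ({i})"


print(unique_name("Gene ID", ["Gene ID", "Gene ID (1)"]))
```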
In its current implementation, the Gene Sets widget allows filtering of input data sets according to a set of genes from some database, like GO or KEGG. Gene Sets reports how many genes from the input data set are present in some gene set (GO term, KEGG pathway). We could use this same widget on some gene selection, and turn it into enrichment analysis. This was already implemented in the previous version of the bioinformatics add-on, and the idea here is to reintroduce enrichment-based analytics in the current version. For reference, here is what the "old" Gene Set Enrichment widget looked like:
Gene Sets should output selected genes along with any metadata and class information.
Class information and metadata are not on the output if genes are in the columns.
Separate synonyms in Gene Info with a comma (","), rather than with the current bar sign ("|"). Also, remove the leading and trailing bar.
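A minimal sketch of the conversion, assuming synonyms arrive as one bar-delimited string. `format_synonyms` is a hypothetical helper, and the sample value is invented.

```python
def format_synonyms(raw: str) -> str:
    """Convert '|A|B|' into 'A, B', dropping empty leading/trailing fields."""
    return ", ".join(part.strip() for part in raw.split("|") if part.strip())


print(format_synonyms("|ABC|DEF|"))
```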
dictyExpress should output the data that includes gene id and data attributes that provide info on where this information is found (gene_id_attribute). This probably requires gene name - entrez matching just after loading of the data.
Use webhook-integrations for this repository to automatically build and deploy orange3-bioinformatics add-on documentation.
Differential expression is broken. Test it on GDS531 and compare it to the old implementation.
When Gene Name Matcher is empty and there is no data on the input, any change in the filter entry box crashes the widget.
It would also help that when there is no data on the input, the widget would display "(no data on input)" in the Info box.
Create a box "Output" with Genes in rows, Samples in rows (default genes in rows). See also GEO widget.
The DE widget should output a table with a score for each gene. The score depends on the currently selected DE scoring method.
Alternatively (perhaps using a checkbox), all scores can be appended as columns in the table.
The corresponding output signal always remains empty.
When clicking on a term in, say, a label from Molecular Function, I get a 404 Not Found warning. An example is the link for the molecular_function term, which points to
http://amigo.geneontology.org/cgi-bin/amigo/term-details.cgi?term=GO:0003674
a page that does not exist.
None of the Cytobands and Reactome terms work, but they seem not to have a link. E.g., 1q23.1 is printed in blue, but the link is empty (nothing happens).
On the other hand, KEGG links seem to work.
GO Browser is supposed to accept reference data, with the reference set of genes that would replace the "Entire genome" when computing the enrichment.
When provided the reference data set, the widget issues a warning: 'list' object has no attribute 'intersection'. It seems that this functionality was not ported from Python 2.
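The quoted message is what Python raises when `.intersection` is called on a list. Where exactly this happens in the widget is not shown here, so the following is only a sketch of the usual remedy, with a hypothetical helper name: convert both gene collections to sets before intersecting.

```python
def genes_in_reference(cluster_genes, reference_genes):
    """Intersect two gene collections regardless of their container type."""
    return set(cluster_genes) & set(reference_genes)


# Lists, tuples and sets are all accepted on either side.
print(sorted(genes_in_reference(["g1", "g2", "g3"], ["g2", "g3", "g4"])))
```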
3.2
3.16
Hi,
After installing the bioinformatics add-on through the GUI, I got only nine widgets (Databases Update, GEO, Dicty, Gene Name, Differential Expression, GO Browser, KEGG Pathways, Gene Set Enrichment and Cluster Analysis).
I uninstalled it and used the Orange command prompt. It asked for a Cython installation. I did that and reinstalled through the command prompt without any errors.
However, the problem is still there...
Thanks for your Help
Guillaume.
to-do:
Widgets should work with annotated tables seamlessly, without additional settings in control area.
Widget that needs change:
On my system (Python 3.4, Ubuntu 14.04), the taxonomy does not work:
Traceback (most recent call last):
  File "/home/marko/orange3-bioinformatics/orangecontrib/bioinformatics/ncbi/taxonomy/utils.py", line 359, in <module>
    main()
  File "/home/marko/orange3-bioinformatics/orangecontrib/bioinformatics/ncbi/taxonomy/utils.py", line 354, in main
    strains = test.get_all_strains("562")
  File "/home/marko/orange3-bioinformatics/orangecontrib/bioinformatics/ncbi/taxonomy/utils.py", line 112, in get_all_strains
    return self._tax.strains(tax_id)
  File "/home/marko/orange3-bioinformatics/orangecontrib/bioinformatics/ncbi/taxonomy/utils.py", line 218, in strains
    """, (tax_id, ))
sqlite3.OperationalError: near "WITH": syntax error
My sqlite3.sqlite_version is 3.8.2. The WITH statement is only supported since 3.8.3.
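One way to guard against this is to check the version of the linked SQLite library before issuing a WITH query. The sketch below uses only the standard `sqlite3` module. `supports_with_clause` and `count_to` are hypothetical helpers, and the query is a toy example rather than the taxonomy query from the traceback.

```python
import sqlite3


def supports_with_clause() -> bool:
    """Common table expressions (WITH) need SQLite 3.8.3 or newer."""
    return sqlite3.sqlite_version_info >= (3, 8, 3)


def count_to(n: int):
    """Toy recursive CTE; refuses to run on an SQLite that lacks WITH."""
    if not supports_with_clause():
        raise RuntimeError(
            "SQLite %s does not support WITH; 3.8.3 or newer is required"
            % sqlite3.sqlite_version
        )
    con = sqlite3.connect(":memory:")
    try:
        rows = con.execute(
            "WITH RECURSIVE cnt(x) AS ("
            "  SELECT 1 UNION ALL SELECT x + 1 FROM cnt WHERE x < ?"
            ") SELECT x FROM cnt",
            (n,),
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]


print(sqlite3.sqlite_version, supports_with_clause())
```

Failing early with a clear message is friendlier than the bare syntax error shown above; rewriting the query without WITH would be the alternative for older systems.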
The widget should use gene id and organism information from the input data set, and no longer require a user to provide any of this information:
Sorting of the list of GO terms should be by p-value in order from lowest p-value to the highest.
The list of GO terms is sorted, but in the opposite order, from high to low p-values.
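The intended order, sketched on hypothetical (term, p-value) pairs. `sort_terms_by_pvalue` is an illustrative name and the values are invented.

```python
def sort_terms_by_pvalue(terms):
    """Sort (term, p_value) pairs from the lowest p-value to the highest."""
    return sorted(terms, key=lambda item: item[1])


# Invented terms and p-values, for illustration only.
terms = [("term A", 0.20), ("term B", 0.001), ("term C", 0.05)]
print(sort_terms_by_pvalue(terms))
```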
The widget should use gene id and organism information from the input data set, and no longer require a user to provide any of this information:
The current gene ontology enrichment runs quite slowly and consumes a lot of system memory.
The proposal is to use the gotea module from the Genialis repository: https://github.com/genialis/gotea
Usage: https://github.com/genialis/resolwe-bio/blob/master/resolwe_bio/tools/goea.py#L64
This change will introduce a major redesign of the current GO module and the OWGOBrowser widget.
Sort organisms in Gene Info alphabetically. Currently they are unsorted, which makes it hard to find the right organism. Also, until we implement functionality that guesses the "right" organism, make Homo sapiens the default organism.
Replace "Use attribute names" in the GUI with "Use column names".
Cache hint should indicate which data files are local.
Cache hint is displayed only for most recently loaded file, other hints are set to False.
Open widget, load several files in sequence.
Some protocols output .mtx files that can contain gene ID in attributes-of-attributes.
Any chance Gene Matcher could look there too?
See example file: aplysia_cAMP_sample.pkl.gz
Genes are listed in rows of an input table ("Stored in data column").
The IDs in the output are wrong.
In this example, 17218 is the correct ID.