This is a QIIME 2 plugin. For details on QIIME 2, see https://qiime2.org.
qiime2 / q2-longitudinal: QIIME 2 plugin for paired sample comparisons
License: BSD 3-Clause "New" or "Revised" License
spaghetti is great, but (in the words of @gregcaporaso ):
users are going to want to know which subjects the outlier lines (or any lines, for that matter) in these plots are... For example, you might be able to achieve this with mouse-overs that highlight a specific line and give more information about it including the subject id.
$ qiime intervention paired-differences --m-metadata-file ecam_map_maturity.txt --m-metadata-file ecam_shannon.qza --p-metric shannon --p-group-column delivery --p-state-column month --p-state-1 12 --p-state-2 0 --p-individual-id-column not-a-column --o-visualization ecam-delivery-alpha --p-no-drop-duplicates --verbose
Traceback (most recent call last):
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/pandas/core/indexes/base.py", line 2442, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)
File "pandas/_libs/index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)
File "pandas/_libs/hashtable_class_helper.pxi", line 1210, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)
File "pandas/_libs/hashtable_class_helper.pxi", line 1218, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)
KeyError: 'not-a-column'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/q2cli/commands.py", line 222, in __call__
results = action(**arguments)
File "<decorator-gen-251>", line 2, in paired_differences
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/qiime2/sdk/action.py", line 201, in callable_wrapper
output_types, provenance)
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/qiime2/sdk/action.py", line 392, in _callable_executor_
ret_val = callable(output_dir=temp_dir, **view_args)
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/q2_intervention/_intervention.py", line 38, in paired_differences
drop_duplicates=drop_duplicates)
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/q2_intervention/_utilities.py", line 36, in _get_group_pairs
for individual_id in set(group_md[individual_id_column]):
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/pandas/core/frame.py", line 1964, in __getitem__
return self._getitem_column(key)
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/pandas/core/frame.py", line 1971, in _getitem_column
return self._get_item_cache(key)
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/pandas/core/generic.py", line 1645, in _get_item_cache
values = self._data.get(item)
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/pandas/core/internals.py", line 3590, in get
loc = self.items.get_loc(item)
File "/Users/gregcaporaso/miniconda3/envs/qiime2-2017.7/lib/python3.5/site-packages/pandas/core/indexes/base.py", line 2444, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)
File "pandas/_libs/index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)
File "pandas/_libs/hashtable_class_helper.pxi", line 1210, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)
File "pandas/_libs/hashtable_class_helper.pxi", line 1218, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)
KeyError: 'not-a-column'
Plugin error from intervention:
'not-a-column'
See above for debug info.
Should instead say something like: The individual column specified (not-a-column) is not a column name in the sample metadata. Available columns are: ...
Confirm that you catch these issues any time a category is passed as a type other than qiime2.MetadataCategory.
In the paired-differences boxplot, the y-axis label should be Difference in {metric} (state 2 - state 1), or you could get fancier and actually use the state_column, state1 and state2 variables, in which case it could be: Difference in {metric} ({state_column} {state2} - {state_column} {state1}) (e.g., Difference in shannon (month 12 - month 0)).
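The label suggested above could be built with a small formatting helper. This is a sketch under assumptions: `difference_label` is a hypothetical name, and the difference is taken as state 2 minus state 1.

```python
def difference_label(metric, state_column, state_1, state_2):
    """Build a y-axis label describing a state 2 minus state 1 difference."""
    # {1} (the state column name) is reused in front of both state values.
    return "Difference in {0} ({1} {2} - {1} {3})".format(
        metric, state_column, state_2, state_1)


label = difference_label("shannon", "month", state_1=0, state_2=12)
print(label)
```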
In the Paired difference tests table, can you include the test name and the test statistic name? Currently it just says stat, but you should be able to keep a dict mapping test name to test statistic name so that this label could be more informative. See here for an example. Also, please make the P column label say P value and FDR P -> FDR P-value.
Can the Multiple group tests table be transposed so it matches the others?
I'm confused about what the difference is between the Multiple group tests and Pairwise comparison tests tables when there are only two groups. It might help to have a brief description of what each test is (and including the test name in each would help with this). When there are only two groups, should the results of these tests be different (they are in the README example, so just confirming that that is expected).
These should also be applied to the pairwise-distances visualization.
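The dict mapping test name to test statistic name mentioned above might look like the following. It is a sketch only: `TEST_STATISTICS` and `label_results` are hypothetical names, the statistic symbols are the conventional ones for these tests, and the input column names (stat, P, FDR P) are taken from the description of the current table.

```python
import pandas as pd

# Conventional statistic symbols; the keys are illustrative test names.
TEST_STATISTICS = {
    "t-test": "t",
    "ANOVA": "F",
    "Wilcoxon signed-rank": "W",
    "Mann-Whitney U": "U",
    "Kruskal-Wallis": "H",
}


def label_results(results, test_name):
    """Return a copy of `results` with more informative column labels."""
    stat_label = "{0} ({1})".format(TEST_STATISTICS[test_name], test_name)
    return results.rename(columns={
        "stat": stat_label, "P": "P value", "FDR P": "FDR P-value"})


# Made-up numbers, used only to show the relabeling.
raw = pd.DataFrame({"stat": [3.0], "P": [0.04], "FDR P": [0.08]},
                   index=["Vaginal"])
labeled = label_results(raw, "Wilcoxon signed-rank")
```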
Improvement Description
Spaghetti color is defined by the group value at the initial state.
Ideally, spaghetti color should change dynamically. E.g., some metadata categories (like antibiotic use or other exposures) may change longitudinally for a subject. It would be nice to capture those.
Comments
That probably cannot be done easily... but it would be pretty cool if it could.
References
This beta-dispersion test might be a good candidate.
Proposed Behavior
Show N per group per state as a histogram or line plot sharing an axis with the main (volatility) plot,
and/or toggle the sample size in the x-axis label?
Comments
If that's difficult/ugly forget about it — but it might save folks from manually typing in this info for pub-ready figures.
Improvement Description
Just wondering if there are plans to implement the two methods below. They can also be used for longitudinal microbiome analysis with a given distance matrix.
References
This will be confusing for users if they accidentally specify a state value that doesn't correspond to something in their data...
You should probably throw an error if there are no paired samples being evaluated.
This also suggests that it's going to be important to tell the user how many samples were included in each test. Could you include n (number of paired samples per group) in all of the tables? See the pairwise table here for one example of where we do this.
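One way to both fail early and report n is to build the table of pairs explicitly before any test is run. The sketch below is an assumption-laden illustration, not the plugin's code: `paired_values` is a hypothetical helper and the column names come from the ECAM examples elsewhere on this page.

```python
import pandas as pd


def paired_values(metadata, metric, state_column, state_1, state_2,
                  individual_id_column):
    """Return one row per individual with the metric at both states."""
    # One row per individual, one column per state value.
    wide = metadata.pivot(index=individual_id_column, columns=state_column,
                          values=metric)
    missing = [s for s in (state_1, state_2) if s not in wide.columns]
    if missing:
        raise ValueError("State value(s) %s not found in column '%s'."
                         % (missing, state_column))
    # Keep only individuals that were sampled at both states.
    pairs = wide[[state_1, state_2]].dropna()
    if pairs.empty:
        raise ValueError("No paired samples were found for states %s and %s."
                         % (state_1, state_2))
    return pairs


md = pd.DataFrame({"studyid": [42, 42, 43, 43],
                   "month": [0, 12, 0, 12],
                   "shannon": [2.2, 3.1, 3.0, 3.4]})
pairs = paired_values(md, "shannon", "month", 0, 12, "studyid")
n = len(pairs)  # number of paired samples, to be reported in the tables
```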
When a single independent variable is used for LME, plots fail to generate because a single AxesSubplot object is generated, whereas the current code expects multiple variables/subplots and the ability to index them.
Key error is here:
File "/Users/nbokulich/miniconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/q2_longitudinal/_utilities.py", line 351, in _regplot_subplots_from_dataframe
ax=axes[num], lowess=lowess, ci=ci)
TypeError: 'AxesSubplot' object does not support indexing
This bug is noted in this forum post.
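A common way to avoid this class of error is to ask matplotlib never to squeeze the axes array, so that a single subplot is still indexable. The sketch below only illustrates that idea; `make_axes` is a hypothetical helper, not the function named in the traceback.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def make_axes(n_plots):
    """Return a figure and a 1-D array of axes, even when n_plots == 1."""
    # squeeze=False always yields a 2-D array of shape (n_plots, 1).
    fig, axes = plt.subplots(n_plots, 1, squeeze=False)
    return fig, axes[:, 0]


fig, axes = make_axes(1)
axes[0].plot([0, 1], [0, 1])  # indexing works with a single subplot
```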
Improvement Behavior
click to show/hide individuals/groups.
Current Behavior
currently, can click on the group legend to show single groups, but this only does one at a time.
Proposed Behavior
Would be very useful to, e.g., drop one or more groups to focus on specific groups for comparison.
Comment
same with individuals (spaghetti) but no such feature exists. Would be very helpful to hide all spaghetti but one, for example, to compare an individual's trajectory vs. the group mean.
these are the actual test result! The figures are ornamental.
I am exploring how to import taxa and mapping data in Python and develop new functions.
Could you show me a quick example of how to import both the taxa and the mapping file, and then write a function (following the _utility.py style) to calculate beta diversity at Week 0?
Below is my code to import the taxa information, but I am not sure what the standard way is to import mapping information with the QIIME 2 Artifact API.
import pandas as pd
from qiime2 import Artifact

# Load the feature table artifact and view it as a samples x taxa DataFrame.
taxa = Artifact.load("../tutorial_data/ecam-table-taxa.qza")
taxa_df = taxa.view(pd.DataFrame)
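As a rough answer to the question above: to my knowledge, sample metadata is usually loaded with `qiime2.Metadata.load(path)` and converted with `.to_dataframe()`, but treat that as an assumption to check against your QIIME 2 version. The sketch below sidesteps QIIME 2 entirely and uses toy pandas DataFrames so it can run anywhere. `beta_diversity_at_state` is a hypothetical function, the `week` column is made up, and Bray-Curtis via SciPy stands in for whichever metric you actually need.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform


def beta_diversity_at_state(table, metadata, state_column, state,
                            metric="braycurtis"):
    """Pairwise distances among the samples observed at one state."""
    # Sample IDs at the requested state that are also in the table.
    ids = metadata.index[metadata[state_column] == state]
    ids = ids.intersection(table.index)
    sub = table.loc[ids]
    dm = squareform(pdist(sub.values, metric=metric))
    return pd.DataFrame(dm, index=ids, columns=ids)


# Toy stand-ins for the real inputs (assumption: samples are rows).
table = pd.DataFrame([[10, 0], [5, 5], [0, 10]],
                     index=["s1", "s2", "s3"],
                     columns=["taxonA", "taxonB"])
metadata = pd.DataFrame({"week": [0, 0, 4]}, index=["s1", "s2", "s3"])

dm = beta_diversity_at_state(table, metadata, "week", 0)
```

Only `s1` and `s2` are sampled at week 0 in the toy data, so the result is a 2 x 2 distance matrix.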
The pairwise test visualizers should be renamed to pairwise-differences and pairwise-distances for clarity. @nbokulich and I discussed this offline.
It might be really useful for users to be able to download the raw distances/differences. This could look like a sample metadata file where the rows are:
sample-id <tab> {metric} difference <tab> group
or
sample-id <tab> {metric} distance <tab> group
palette parameter
alpha-rarefaction) could be useful, though this will require more structural changes to how these are handled at input.
Actually, many of the parameters for this action could be useful as interactive features, e.g., interactively setting x-tick intervals, yscale, and xscale, but these are less important than those listed above.
Example and idea provided by @elong0527:
X-axis = time (or other continuous metric)
y-axis = subject ID
points colored by group
category (should accept categorical or continuous metadata, infer type, and color-code accordingly)
Could also add a parameter to change size or shape of points based on other optional metadata category inputs???
You mention that this is possible, but it'd be better to just use that in your example since that's the preferred way to do this (since it retains provenance where exporting the alpha diversity data wouldn't).
--m-metadata-file ecam_map_maturity.txt
You could also link to the metadata tutorial, which has a good description of this.
Computes first differences (differences in Y between sequential samples across time X)
X    Y    FD
1    1    NA
2    3    2
3    7    4
4    11   4
Accepts metadata files, [optionally] distance matrix or feature table
Support value interpolation? Would be easy for series data but not for matrix!
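For series data, the first-differences calculation in the table above maps directly onto pandas `diff`. This is a minimal sketch, assuming the data are already in a DataFrame; `first_differences` and its arguments are hypothetical names, and the optional per-individual grouping is an assumption about how the method would be applied to longitudinal data.

```python
import pandas as pd


def first_differences(df, state_column, metric, individual_id_column=None):
    """Difference in `metric` between sequential samples along `state_column`."""
    df = df.sort_values(state_column)
    if individual_id_column is None:
        fd = df[metric].diff()
    else:
        # Difference within each individual's own time series.
        fd = df.groupby(individual_id_column)[metric].diff()
    return fd.rename("FD")


df = pd.DataFrame({"X": [1, 2, 3, 4], "Y": [1, 3, 7, 11]})
df["FD"] = first_differences(df, "X", "Y")
print(df)
```

The first sample has no predecessor, so its FD is missing (NaN), matching the empty cell in the table.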
CLARIFICATION: without a group-column selected, mean lines should still be drawn, but calculated across all samples rather than aggregating by group. (edited 4/23/18)
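The behavior described in the clarification can be expressed as a single groupby whose keys depend on whether a group column was given. This is a sketch with a hypothetical `mean_lines` helper and toy data, not the visualizer's implementation.

```python
import pandas as pd


def mean_lines(df, state_column, metric, group_column=None):
    """Mean of `metric` per state, optionally stratified by group."""
    if group_column is None:
        keys = [state_column]  # one mean line across all samples
    else:
        keys = [group_column, state_column]  # one mean line per group
    return df.groupby(keys)[metric].mean()


df = pd.DataFrame({"delivery": ["vaginal", "vaginal", "cesarean", "cesarean"],
                   "month": [0, 12, 0, 12],
                   "shannon": [2.0, 3.5, 3.0, 2.5]})
overall = mean_lines(df, "month", "shannon")
grouped = mean_lines(df, "month", "shannon", group_column="delivery")
```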
or other specified time point
ΔY_t = Y_t − Y_0
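The difference from baseline can be sketched per individual as follows. The helper name `difference_from_baseline` and the column names are assumptions for illustration, and each individual is assumed to have exactly one sample at the baseline state.

```python
import pandas as pd


def difference_from_baseline(df, state_column, metric, individual_id_column,
                             baseline=0):
    """Return metric(t) minus metric(baseline) for each sample's individual."""
    # Baseline value of the metric, indexed by individual.
    base = (df[df[state_column] == baseline]
            .set_index(individual_id_column)[metric])
    return df[metric] - df[individual_id_column].map(base)


df = pd.DataFrame({"studyid": [42, 42, 43, 43],
                   "month": [0, 12, 0, 12],
                   "shannon": [2.0, 3.5, 3.0, 2.5]})
df["delta"] = difference_from_baseline(df, "month", "shannon", "studyid")
```

Passing a different `baseline` value gives the difference relative to another specified time point.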
Should use the new citation API in qiime2/qiime2#387
see @gregcaporaso 's comment in #36
One more thought: Would it be worth adding a download link for the data used to generate these plots? Since we're not including any statistics, it could be useful to allow the user to get that data to do statistics on their own. If we did that, I think you'd want a tsv file that looks something like:
delivery <tab> month <tab> studyid <tab> shannon
vaginal <tab> 0 <tab> 42 <tab> 2.2
cesarian <tab> 0 <tab> 43 <tab> 3.0
(EDITED: to make the example file tab-separated text instead of comma-separated)
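Producing such a file from the plotting data is a one-liner with pandas. The sketch below writes to an in-memory buffer so it is self-contained; in a visualizer the same call would presumably write into the output directory, which is an assumption on my part.

```python
import io

import pandas as pd

# The example rows from the comment above.
data = pd.DataFrame({"delivery": ["vaginal", "cesarian"],
                     "month": [0, 0],
                     "studyid": [42, 43],
                     "shannon": [2.2, 3.0]})

buf = io.StringIO()
data.to_csv(buf, sep="\t", index=False)  # tab-separated, no index column
tsv = buf.getvalue()
```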
Hello,
I am using q2-longitudinal in a somewhat different way than it was originally conceived for, and some strange behavior has resulted. The issue arose when creating a volatility plot using this code:
qiime longitudinal volatility \
  --m-metadata-file ../EXMP_Sample_metadata_3_17_2018.tsv \
  --m-metadata-file EXMP-200-4562-single-core-metrics-results/shannon_vector.qza \
  --p-metric shannon \
  --p-group-column activity \
  --p-state-column sample_number \
  --p-individual-id-column redcap_survey_identifier \
  --p-spaghetti yes \
  --o-visualization EXMP-200-4562-single-shannon-volatility.qzv
I am introducing an intervention period to be compared to a baseline period both prior to and after the intervention. As you can see this creates a volatility plot that is unusual:
It causes some strange behavior that I am not so sure is a bug, but rather arising from the differing way I am trying to do things.
Thanks,
Arron Shiffer
As with the other plugins in QIIME 2, we provide official tutorials as part of https://github.com/qiime2/docs, and unofficial tutorials on the forum. We should clear out this README of the existing tutorial content and ensure that things get moved to the appropriate location (docs or forum).
It would be helpful to link to a key that would help with interpreting the Model summary and Model results sections. We're going to get a lot of questions about interpreting these (I'm not exactly sure how to interpret them myself). I would recommend including links in those sections of the visualization if possible and, if not, expanding on the interpretation in the tutorial.
How could I find the explanation of data in the folder "tutorial_data"?
I have few questions.
Question:
This can be fixed by specifying a {% block title %} in the base template and passing in the correct title via the context (it looks like the title is generally already available, for other parts of the HTML document).
i.e., between an individual at time t and baseline (or another specified time)
Some inputs to LME will result in a singular matrix error:
File "/Users/nbokulich/miniconda3/envs/qiime2-2017.8/lib/python3.5/site-packages/numpy/linalg/linalg.py", line 90, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
numpy.linalg.linalg.LinAlgError: Singular matrix
This is not a bug — it is due to improper inputs to LME — but should fail gracefully so users can respond appropriately.
The issue seems to be that if the independent variables passed to LME are covariates, this results in a singular matrix error, either due to the correlation between covariates or to the lack of variance between group subcategories.
This issue was reported in this forum post.
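Failing gracefully here could mean catching the NumPy error around the model fit and re-raising it with guidance for the user. The sketch below is only illustrative: `fit_or_explain` is a hypothetical wrapper, the message text is a suggestion, and `numpy.linalg.inv` stands in for the real model-fitting call so the example runs without statsmodels.

```python
import numpy as np


def fit_or_explain(fit, *args, **kwargs):
    """Call `fit` and turn a LinAlgError into an actionable ValueError."""
    try:
        return fit(*args, **kwargs)
    except np.linalg.LinAlgError as e:
        raise ValueError(
            "Linear mixed effects model could not be fit (%s). This often "
            "indicates correlated independent variables or a lack of "
            "variance within a group; try revising the model inputs." % e
        ) from e


# Stand-in for a successful fit: inverting a well-conditioned matrix.
result = fit_or_explain(np.linalg.inv, np.eye(2))
```

Passing a singular matrix such as `np.ones((2, 2))` to the same wrapper raises the `ValueError` with the explanatory message instead of the bare "Singular matrix" traceback.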
Blocking #79
see @gregcaporaso 's comment here
I think this must be left over from before we had optional artifact support (the feature table is now optional in this visualizer, this documentation is just out of date): A feature table artifact is required input, though whether "metric" is derived from the feature table or metadata is optional.
(unless MetadataCategories ever transpire).
related to this forum post
use optional artifacts and "metric" to determine behavior
Thus paired tests can be performed in a single group. E.g., does metric X change between states 1 and 2 in ALL samples (not stratified by group).
forum xref and xref
Comments
this would be a new visualizer. thanks to @antgonza for the suggestion.
References
See compare_trajectories.py
show all individuals in control charts, colored by group membership
Currently, linear-mixed-effects displays one plot for each independent factor input to the model. Instead of displaying all plots simultaneously, it could be useful to allow the user to choose which plot to display using a drop-down menu.
There are a lot of default metadata column names (eg. I would make these required parameters (without defaults) because it's unlikely that your defaults will be useful for most people, and if they're required it won't make users think that they need to rename the columns in their sample metadata.
Instead of using the term Metadata category, can you use Metadata column? We're switching our terminology since category doesn't necessarily make sense for continuous variables, so it'd be good to start making that change in documentation. For example: Metadata category on which to separate groups for comparison.
I recommend state-pre and state-post be renamed to state-1 and state-2, since pre/post aren't always relevant (e.g., you mention that "States" can also commonly be methodological).
I think non-parametric tests should always be the default:
--p-parametric / --p-no-parametric  [default: True]
    Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.