q2-fragment-insertion's Introduction

q2-fragment-insertion

This is a QIIME 2 plugin. For details on QIIME 2, see https://qiime2.org.

q2-fragment-insertion's People

Contributors

adswafford, colinbrislawn, colinvwood, david-rod, ebolyen, gregcaporaso, hagenjp, lizgehret, oddant1, q2d2, sjanssen2, thermokarst, turanoo, wasade

q2-fragment-insertion's Issues

SILVA reference

Improvement Description
It should be possible to download the QIIME compatible version of Silva and construct reference phylogeny and alignment for SEPP to enable 18S analyses.

Questions

  1. @josenavas @wasade do you know if release 128 is the latest?

  2. How and where would we host SEPP-compatible references? Within this plugin (which is already 130 MB), or in the GitHub repo?

check fragment names

According to Siavash, SEPP might fail if the fragments to be inserted have the same names as tips of the reference tree. Add a test function that aborts early if the user provides conflicting names.

How about internal node names?
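A minimal sketch of such an early check, assuming the fragment IDs and the reference tip names are already available as plain lists of strings. The function names find_name_conflicts and check_fragment_names are hypothetical and not part of the plugin; internal node names could be passed in the same way as tip names.

```python
def find_name_conflicts(fragment_ids, reference_names):
    """Return the sorted IDs that occur in both collections."""
    return sorted(set(fragment_ids) & set(reference_names))


def check_fragment_names(fragment_ids, reference_names):
    """Abort early (hypothetical helper) if any fragment ID collides
    with a name that already exists in the reference tree."""
    conflicts = find_name_conflicts(fragment_ids, reference_names)
    if conflicts:
        shown = ", ".join(conflicts[:5])
        raise ValueError(
            f"{len(conflicts)} fragment name(s) also occur in the "
            f"reference tree, e.g. {shown}"
        )


# Example: 'otu7' is both a fragment ID and a reference tip name.
print(find_name_conflicts(["frag1", "otu7"], ["otu7", "otu9"]))
```

Running the check before SEPP is called keeps the failure cheap, since no alignment or placement work has started yet.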

ITS / 18S

Improvement Description
Jake asked if it would be possible to compute insertion trees not only for 16S but also for 18S and ITS.

Comments
I think that would work in principle; however, we would need to create reference trees for the corresponding databases (Silva and Unite). Any comments?

Migration to bioconda

Improvement Description
I finally was able to clean up Siavash's source code and created a bioconda recipe for SEPP, producing the packages at https://anaconda.org/bioconda/sepp

Note that this package does NOT contain the default Greengenes 13.8 99% reference, which consists of three files: a) alignment, b) tree, c) info file. In the future, we also want to support alternative references like SILVA.

Proposed Behavior
I wonder how we best do this? I see the following options:

  1. create conda reference packages for GG and / or SILVA
    pro: no changes to current behaviour of qiime2
    con: where to host? would bioconda accept that?
  2. provide as qiime2 Data resources
    pro: smaller downloads when qiime2 gets installed, easy to host
    con: users need to do extra work when they a) install and b) execute, since file paths for all three files need to be provided

Questions
Any thoughts @thermokarst @antgonza ?

References

  1. recipe
  2. https://anaconda.org/bioconda/sepp

ENH: "Preserve" original node names

Improvement Description
"Preserve" original feature IDs by renaming with the rename-json.py output by SEPP.

Because SEPP renames nodes, the trees it produces don't play nicely with downstream tools like Empress that can color trees using feature metadata.

Current Behavior
The tree in the attached screenshot cannot be easily colored by taxonomy, because the node IDs do not map to the original feature IDs.

Proposed Behavior
Use the rename-json.py script output by SEPP to "preserve" original feature IDs, probably by exposing a new parameter so as to not impact runtimes.

Update readme

Update the readme with respect to later q2 versions and the newly optional reference inputs.

Installation does not work (channel-independent issue)

Hello,

I've run into an error when installing this package into a QIIME2 (2018.8) conda environment (Miniconda3-latest-Linux-x86_64) installed into my home directory on a computing cluster (i.e., barnacle).

This is the code I ran to install the package:

$ conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion

This is the error that prints to screen when running the install code:

BEGINNING OF ERROR PRINTED TO SCREEN

Solving environment: failed

>>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

Traceback (most recent call last):
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/exceptions.py", line 819, in __call__
    return func(*args, **kwargs)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/main.py", line 78, in _main
    exit_code = do_call(args, p)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 77, in do_call
    exit_code = getattr(module, func_name)(args, parser)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/main_install.py", line 11, in execute
    install(args, parser, 'install')
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/install.py", line 235, in install
    force_reinstall=context.force,
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 518, in solve_for_transaction
    force_remove, force_reinstall)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 451, in solve_for_diff
    final_precs = self.solve_final_state(deps_modifier, prune, ignore_pinned, force_remove)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 180, in solve_final_state
    index, r = self._prepare(prepared_specs)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 592, in _prepare
    self.subdirs, prepared_specs)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/index.py", line 215, in get_reduced_index
    new_records = query_all(spec)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/index.py", line 184, in query_all
    return tuple(concat(future.result() for future in as_completed(futures)))
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 95, in query
    self.load()
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 149, in load
    _internal_state = self._load()
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 246, in _load
    _internal_state = self._process_raw_repodata_str(raw_repodata_str)
  File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 369, in _process_raw_repodata_str
    info['fn'] = fn
TypeError: 'NoneType' object does not support item assignment

$ /home/jpshaffer/software/miniconda3/bin/conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion

environment variables:
CIO_TEST=
CONDA_DEFAULT_ENV=qiime2-2018.8
CONDA_EXE=/home/jpshaffer/software/miniconda3/bin/conda
CONDA_PREFIX=/home/jpshaffer/software/miniconda3/envs/qiime2-2018.8
CONDA_PROMPT_MODIFIER=(qiime2-2018.8)
CONDA_PYTHON_EXE=/home/jpshaffer/software/miniconda3/bin/python
CONDA_ROOT=/home/jpshaffer/software/miniconda3
CONDA_SHLVL=1
MANPATH=/opt/slurm-18.08.0/share/man:/opt/torque-4.2.8/man:
MODULEPATH=/opt/modules/Modules/versions:/opt/modules/Modules/$MODULE_VERSION/mod
ulefiles:/opt/modules/Modules/modulefiles
PATH=/home/jpshaffer/software/miniconda3/envs/qiime2-2018.8/bin:/home/jpsha
ffer/software/miniconda3/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin
:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/gold/2.2.0.5/sbin:/opt/
gold/2.2.0.5/bin:/opt/torque-4.2.8/bin:/opt/torque-4.2.8/sbin:/opt/mau
i-3.3.1/bin:/opt/slurm-18.08.0/bin:/opt/slurm-18.08.0/sbin
PYTHONNOUSERSITE=/home/jpshaffer/software/miniconda3/envs/qiime2-2018.8/lib/python*/sit
e-packages/
REQUESTS_CA_BUNDLE=
SSL_CERT_FILE=

 active environment : qiime2-2018.8
active env location : /home/jpshaffer/software/miniconda3/envs/qiime2-2018.8
        shell level : 1
   user config file : /home/jpshaffer/.condarc

populated config files :
conda version : 4.5.11
conda-build version : not installed
python version : 3.7.0.final.0
base environment : /home/jpshaffer/software/miniconda3 (writable)
channel URLs : https://conda.anaconda.org/anaconda/linux-64
https://conda.anaconda.org/anaconda/noarch
https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/free/linux-64
https://repo.anaconda.com/pkgs/free/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
https://repo.anaconda.com/pkgs/pro/linux-64
https://repo.anaconda.com/pkgs/pro/noarch
https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
https://conda.anaconda.org/bioconda/linux-64
https://conda.anaconda.org/bioconda/noarch
https://conda.anaconda.org/biocore/linux-64
https://conda.anaconda.org/biocore/noarch
package cache : /home/jpshaffer/software/miniconda3/pkgs
/home/jpshaffer/.conda/pkgs
envs directories : /home/jpshaffer/software/miniconda3/envs
/home/jpshaffer/.conda/envs
platform : linux-64
user-agent : conda/4.5.11 requests/2.19.1 CPython/3.7.0 Linux/2.6.32-573.26.1.el6.x86_64.debug centos/6.6 glibc/2.12
UID:GID : 420084:550
netrc file : None
offline mode : False

An unexpected error has occurred. Conda has prepared the above report.
If submitted, this report will be used by core maintainers to improve
future releases of conda.

END OF ERROR PRINTED TO SCREEN

I was able to reproduce the error after uninstalling and reinstalling both Miniconda and the QIIME2 environment.

Please let me know if you need additional information to troubleshoot this error.

Thanks in advance and best wishes,

Justin

Qiita integration

Hi @antgonza @josenavas ,
I hope that we will soon have completed the q2 plugin for SEPP. I wonder how we would integrate that into Qiita? Can you wrap general qiime2 plugins, or would we have to create our own Qiita plugin?
Would you consider SEPP a tool for data processing or (meta)-analysis?

Taxonomy

Improvement Description
I thought about the FeatureData[Taxonomy] artifact and Daniel's warnings about the quality of the assigned taxonomic labels, which depends on the quality of the placements of taxonomic labels in the reference phylogeny. Furthermore, fragment insertion is not unambiguous but results in a distribution of positions, and I remember Siavash suggesting his program TIPP for taxonomy assignment. Thus, I think we should organize creation of a FeatureData[Taxonomy] as a separate function instead of integrating it into the main function ("sepp").

Proposed Behavior
Currently, I am thinking about two alternatives to generate a FeatureData[Taxonomy]:

  1. classify-paths: the current method, which collects all taxonomic labels along the path from tip to root. The single input would be the Phylogeny[Rooted] artifact.

  2. classify-otus: For every inserted fragment, we traverse the tree from tip to root. In every step, we check if we can find any OTU nodes in the current sub-tree. If so, we stop; otherwise we continue the same procedure with the parent node. Once we have found one (or maybe several) OTUs, we look up their assigned taxonomy lineage in the Greengenes/Silva taxonomy table for the corresponding reference tree. In case of several OTUs, we report the longest common prefix. This would require two inputs: the Phylogeny[Rooted] artifact and the taxonomy table from Greengenes with two columns, OTU-ID and lineage-string. This is the more conservative method and should only produce results on par with current Greengenes-based taxonomy assignment algorithms.

  3. classify-tipp: A future development could use Siavash's TIPP to generate taxonomic lineages.
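The "longest common prefix" step of classify-otus can be sketched as follows. This is only an illustration, assuming lineages are semicolon-separated strings in the Greengenes style; the function name longest_common_lineage is hypothetical and this is not the plugin's actual implementation.

```python
def longest_common_lineage(lineages, sep=";"):
    """Return the ranks shared, from the root downwards, by all lineages."""
    # Split each lineage string into its ranks and trim whitespace.
    split = [[rank.strip() for rank in lineage.split(sep)]
             for lineage in lineages]
    prefix = []
    # zip(*) walks all lineages rank by rank and stops at the shortest one.
    for ranks in zip(*split):
        if len(set(ranks)) != 1:
            break  # first rank on which the OTUs disagree
        prefix.append(ranks[0])
    return (sep + " ").join(prefix)


otu_lineages = [
    "k__Bacteria; p__Firmicutes; c__Bacilli",
    "k__Bacteria; p__Firmicutes; c__Clostridia",
]
print(longest_common_lineage(otu_lineages))
# prints: k__Bacteria; p__Firmicutes
```

Comparing whole ranks rather than raw characters avoids reporting a partial label when two rank names merely share their first letters.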

Questions
@wasade what are your thoughts?

Plugin description would be nice

Improvement Description
This plugin doesn't have a top-level description registered for the user docs

Proposed Behavior
A plain-language description would make it easier for users to understand what q2-fragment-insertion can do, and could increase usage.

Installation does not work

Hello.

I am trying the following:

  conda config --add channels anaconda
  conda config --add channels conda-forge
  conda config --add channels defaults
  conda config --add channels r
  conda config --add channels bioconda
  

  conda install -c qiime2/label/r2018.6 qiime2
  conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion
  qiime dev refresh-cache

But, when trying to "Solve the environment", I am getting the PackagesNotFoundError:

conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - q2-fragment-insertion
  - q2cli[version='>=2017.12.*']
  - q2-fragment-insertion
  - q2-feature-table[version='>=2017.12.*']
  - q2-fragment-insertion
  - q2-types[version='>=2017.12.*']
  - q2-fragment-insertion
  - q2templates[version='>=2017.12.*']

Current channels:

  - https://conda.anaconda.org/anaconda/linux-64
  - https://conda.anaconda.org/anaconda/noarch
  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/linux-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/pro/linux-64
  - https://repo.anaconda.com/pkgs/pro/noarch
  - https://conda.anaconda.org/conda-forge/linux-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://conda.anaconda.org/bioconda/linux-64
  - https://conda.anaconda.org/bioconda/noarch
  - https://conda.anaconda.org/biocore/linux-64
  - https://conda.anaconda.org/biocore/noarch
  - https://conda.anaconda.org/r/linux-64
  - https://conda.anaconda.org/r/noarch

I am trying to create a Singularity container with qiime2 plus your extension.

Thank you.

Anders.

Trouble with Silva 128 in classify-otus-experimental ?

I ran fragment insertion as seen in the tutorial. I used the Silva 128 provided tree and alignment. My insertion tree was created, and I filtered my feature table. However, once I get to the classify-otus-experimental step and use the Silva 128 consensus 7 level taxonomy, I get the following error:

Not all OTUs in the provided insertion tree have mappings in the provided reference taxonomy.
I am attaching my insertion tree.
Any help would be appreciated!

insertion-tree.qza.gz

open ToDos

Improvement Description
PR #66 introduced major changes to the plugin and we have some open ToDos. Let us keep track of them here with this list:

  • Publish new QZAs on docs.qiime2.org for GreenGenes and SILVA, for the new database format defined above, using sepp-refs as source data.
  • Move readme information from this repo to library.qiime2.org
  • think about a mechanism to provide default values within the new reference set qzas for e.g. alignment_subset_size that can override plugin defaults, but also can be overwritten by user flags
  • find a way to check consistency between reference alignment/tree and raxml info file when creating reference sets
  • rough in method to merge database components
  • rough in method to destructure database components

References
PR #66

reference as input or parameter

Hi @wasade ,
testing is currently not very convenient, because of the long waiting times. Therefore, I think passing reference tree/alignment would be quite beneficial. I wonder how to design that.

Since both are Semantic Types (FeatureData[AlignedSequence] Phylogeny[Rooted]) they can only be "inputs" not "parameters" right? If so, do you know if it is possible to have optional inputs?

If not, the user always needs to pass the reference alignment and reference phylogeny as q2 artifacts. Do we really want to put that burden on users, or would we be fine with having two "parameters" (which can be optional) that point to filenames?

P.S. could you invite me to the slack channel for q2?

filter biom table

Add a qiime2 function feature-table -> phylogeny -> feature-table that removes those features not found in the phylogeny.
Maybe it could also report the ratio of lost reads?
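The idea can be sketched on plain Python data, assuming the feature table has been reduced to a mapping from feature ID to total read count and the phylogeny to a collection of tip names. The function name and the return shape are assumptions for illustration only; a real action would operate on the QIIME 2 artifacts.

```python
def filter_features(feature_counts, tree_tip_names):
    """Split features by presence in the tree and report the lost read ratio.

    feature_counts: dict mapping feature ID -> total read count (assumed shape)
    tree_tip_names: iterable of tip names of the insertion tree
    """
    tips = set(tree_tip_names)
    kept = {fid: n for fid, n in feature_counts.items() if fid in tips}
    removed = {fid: n for fid, n in feature_counts.items() if fid not in tips}
    total = sum(feature_counts.values())
    # Guard against an empty table to avoid dividing by zero.
    lost_ratio = sum(removed.values()) / total if total else 0.0
    return kept, removed, lost_ratio


kept, removed, lost_ratio = filter_features(
    {"f1": 90, "f2": 10}, ["f1", "some_reference_tip"]
)
print(kept, removed, lost_ratio)
```

Returning the removed features alongside the kept ones lets the user inspect what was dropped instead of only seeing a ratio.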

Plugin error from fragment-insertion: Command '['run-sepp.sh' returned non-zero exit status 1

Bug Description
Hello, I've been trying to use q2-fragment-insertion in order to use PICRUSt2, following the instructions from the original source. Unfortunately, I got an error from this plugin. I saw the same error in a forum, but there was no follow-up on it there. I followed the instructions from that thread using the command below, first with the files of my interest and then with the files provided in the tutorial:
qiime fragment-insertion sepp --i-representative-sequences mammal_seqs.qza --p-threads 12 --i-reference-alignment reference.fna.qza --i-reference-phylogeny reference.tre.qza --output-dir pruebapicrust2tutorial --p-debug --verbose 2> err.txt > out.txt

References
For more detailed information, here are the files:
err.txt
out.txt

Comments
I'm using an HP with the following hardware:
AMD® A12-9720p radeon r7, 12 compute cores 4c+8g × 4

I thought it might be a problem with the installation, so I removed qiime2 and reinstalled it. I updated Anaconda and conda to the latest version.
Thank you

qiime phylogeny align-to-tree-mafft-fasttree

Excuse me, how can I solve this problem?

Plugin error from phylogeny:

Command '['mafft', '--preservecase', '--inputorder', '--thread', '33', '/tmp/qiime2-archive-tspmm41w/4dd87431-bf1c-465f-8f38-2d4c3a9605cf/data/dna-sequences.fasta']' returned non-zero exit status 1.

Debug info has been saved to /tmp/qiime2-q2cli-err-6e_xupmi.log

renamed files

https://github.com/wasade/q2-fragment-insertion/blob/64d4b52847fef856ebcf01c6459395af0dcb5c7f/q2_fragment_insertion/_insertion.py#L50

Is there a reason why you chose not to use the tree and placement files (.relabelled) that have the restored internal node labels? As far as I understand the code, Siavash assigns every node a unique ID and prefixes the original label with this ID. In a postprocessing step (a generated python program), those IDs get trimmed from the labels to restore their original values.
Thus, users don't see those IDs in e.g. the taxonomy labels of the reference.

Classify - otus experimental had blank taxon

Bug Description
I ran classify-otus-experimental and got an error saying that one of the entries was a float and could not be parsed. After digging into the taxonomy file, it looks like one of the entries was blank; it was read as NaN, which broke the parsing. Once I deleted the line, everything ran fine.

Questions
Any chance something could be coded to avoid this issue in the future?
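One way to guard against this, sketched under the assumption that the taxonomy is a headerless, tab-separated file with two columns (OTU ID and lineage string). The function name read_taxonomy is hypothetical; the sketch collects blank entries and reports them instead of letting them turn into NaN later.

```python
import csv
import io


def read_taxonomy(handle):
    """Read 'OTU-ID<TAB>lineage' rows, separating out blank lineages."""
    taxonomy, blank_ids = {}, []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip completely empty lines
        otu_id = row[0].strip()
        lineage = row[1].strip() if len(row) > 1 else ""
        if lineage:
            taxonomy[otu_id] = lineage
        else:
            blank_ids.append(otu_id)  # blank or missing lineage column
    return taxonomy, blank_ids


example = io.StringIO("1\tk__Bacteria; p__Firmicutes\n2\t\n3\n")
taxonomy, blank_ids = read_taxonomy(example)
print(taxonomy)   # only OTU '1' has a usable lineage
print(blank_ids)  # OTUs '2' and '3' are reported as blank
```

Whether blank entries should be skipped, replaced by a placeholder such as "Unassigned", or raised as a clear error is a design decision left open here.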

p-threads

Double-check whether --p-threads is correctly passed to the executable.

verbosity

If SEPP fails, it should be more verbose, i.e. override Siavash's trap function, which removes the logs and thus hinders debugging.

merge placements

Improvement Description
There are increasing numbers of use cases where one wants to merge placements from different runs against the same reference phylogeny.

Questions

  1. Would it make sense to provide another "function" within the plugin which accepts a list of placement files and produces one phylogeny out of them, or would that be too much of an expert process, one that we would not want to expose to the "normal" QIIME2 user so as not to confuse them?
  2. @wasade what is your opinion on that?
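A rough sketch of what merging could look like, assuming each placement file is a jplace-style JSON document with "tree", "fields" and "placements" keys that has already been loaded into a dict. The function name merge_placements is hypothetical, and producing the final phylogeny from the merged placements is a separate step not shown here.

```python
def merge_placements(docs):
    """Concatenate placements from runs against the same reference tree."""
    if not docs:
        raise ValueError("need at least one placement document")
    first = docs[0]
    # Copy everything except the placements from the first document.
    merged = {key: value for key, value in first.items()
              if key != "placements"}
    merged["placements"] = []
    for doc in docs:
        # Placements refer to edges of the tree, so merging is only
        # meaningful if tree and field layout are identical.
        if doc["tree"] != first["tree"] or doc["fields"] != first["fields"]:
            raise ValueError("placements were computed against different references")
        merged["placements"].extend(doc["placements"])
    return merged


run_a = {"tree": "(a,b);", "fields": ["edge_num"], "placements": [{"n": ["x"]}]}
run_b = {"tree": "(a,b);", "fields": ["edge_num"], "placements": [{"n": ["y"]}]}
print(len(merge_placements([run_a, run_b])["placements"]))
# prints: 2
```

The sketch does not detect the same fragment appearing in more than one run; how to resolve such duplicates would need to be decided separately.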

fragment-insertion sepp to display # of inserted features upon completion.

It would be very useful if upon completion sepp would print out the # of successfully inserted features.

So far, working on human and mouse samples, I've never had a case where any features failed to be inserted into the tree, but I still do the filtering step, and each time my table is unchanged. The filtering step can also take a little time, depending on the number of features you have. It would be super convenient if sepp could just print how many features it inserted, so the user could compare that number to the features in their feature table and see whether a filtering step is needed or not.
Alternatively, the full insertion and filtering can be turned into a pipeline to do all in one go.

Thanks!

Migrate to QIIME 2 Org?

Following up on a months-old discussion regarding including this plugin in the QIIME 2 Core Distribution. Here are some options for us to proceed:

  1. Migrate this repo to the @qiime2 org, add contributors here as maintainers in the new org
  2. Keep the repo under biocore, give a handful of @qiime2 devs maintainer perms
  3. Something else entirely?

@sjanssen2, I think the easiest path is for us to go with 1 - since this will be the least friction for busywork to be wired up.

@antgonza, @sjanssen2, @ebolyen, @gregcaporaso, @nbokulich (and probably more, apologies if my list is incomplete) have discussed getting this into the "core" distribution of QIIME 2, and we would really like that to happen in time for the upcoming release of QIIME 2 (2018.11, scheduled for this Thursday). I don't expect there to be too much to get this rolled into the distro, but, it would be a lot simpler if we moved this over to @qiime2.

Thoughts, @sjanssen2?

"no action filter-features" qiime fragment-insertion filter-features

Hi there,

I stumbled on a weird behaviour with qiime fragment-insertion: when I run the following, I get an error that it cannot find the 'filter-features' action.

qiime fragment-insertion filter-features \
  --i-table $path2table \
  --i-tree insertion-tree.qza \
  --o-filtered-table filtered_table.qza \
  --o-removed-table removed_table.qza

Returns the following error:

Error: QIIME 2 plugin 'fragment-insertion' has no action 'filter-features'.

Further, if I look at qiime fragment-insertion --help, there are only two options: classify-otus-experimental and sepp.

I would be very grateful for any help you could provide. I'm an amateur bioinformatician and I have now exhausted my troubleshooting skills.

I am running QIIME 2 version 2018.4.0 with fragment-insertion 2018.2.0.dev0. See attached for my complete qiime info output.

Thank you very much for your help (and the easy-to-use software!!)

Courtney

qiime_version.txt
