viral-ngs-deploy's People

Contributors

biocyberman, dpark01, tomkinsc, yesimon

viral-ngs-deploy's Issues

gsutil in Docker image is not parallelizing

This message appears after dsub finishes a job and stages output (using --output-recursive) back to GCS:

==> NOTE: You are uploading one or more large file(s), which would run
significantly faster if you enable parallel composite uploads. This
feature can be enabled by editing the
"parallel_composite_upload_threshold" value in your .boto
configuration file. However, note that if you do this large files will
be uploaded as `composite objects
<https://cloud.google.com/storage/docs/composite-objects>`_,which
means that any user who downloads such objects will need to have a
compiled crcmod installed (see "gsutil help crcmod"). This is because
without a compiled crcmod, computing checksums on composite objects is
so slow that gsutil disables downloads of composite objects.
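
For reference, the setting named in that message lives in the `[GSUtil]` section of the boto configuration file. A minimal sketch, assuming a threshold of 150M (an example value, not one taken from this issue) and a hypothetical `BOTO_FILE` path; in the image the real file would more likely be `~/.boto` or `/etc/boto.cfg`:

```shell
# Write an example boto config fragment enabling parallel composite uploads.
# BOTO_FILE is a hypothetical path used for illustration only.
BOTO_FILE="${BOTO_FILE:-./boto.example}"

cat > "$BOTO_FILE" <<'EOF'
[GSUtil]
# Files larger than this are uploaded as composite objects (example value).
parallel_composite_upload_threshold = 150M
EOF

# The same option can be passed for a single invocation instead:
#   gsutil -o "GSUtil:parallel_composite_upload_threshold=150M" cp -r ./out gs://BUCKET/out
```

As the message warns, anyone who later downloads the resulting composite objects needs a compiled crcmod, so this trades upload speed for a requirement on the download side.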

Handling of write permission to Docker's /data volume

Currently I have to make the host user's data directory world-writable so that Docker can write to it via the data volume. I propose the following solution so that the container has the same read/write permissions as the user running Docker:

Aim: match the user ID and group ID of viral-ngs-user inside the container to those of the host's current user.

To do that, the viral-ngs installation inside the image should be moved to a system-wide location, for example /opt. We can then move the creation of viral-ngs-user into the ENTRYPOINT script env-wraper.sh, so that the user ID and group ID can be matched at container startup. The su-exec tool might be needed for this. Do you think this is worth doing, @tomkinsc?
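
The proposal above could look roughly like the following entrypoint fragment. This is a sketch, not the project's script: `HOST_UID` and `HOST_GID` are hypothetical variables passed with `docker run -e`, the fallback to the owner of the mounted directory is an added assumption, and the group name `viral-ngs-user` is assumed to match the user name.

```shell
# target_ids DIR: print "uid:gid" that the container user should take on.
# Prefers HOST_UID/HOST_GID (hypothetical env vars); otherwise uses the
# numeric owner and group of DIR, e.g. the mounted /data volume.
target_ids() {
    dir="$1"
    uid="${HOST_UID:-$(stat -c '%u' "$dir")}"
    gid="${HOST_GID:-$(stat -c '%g' "$dir")}"
    printf '%s:%s\n' "$uid" "$gid"
}

# In the entrypoint itself (requires root, so left commented out here):
# ids=$(target_ids /data)
# groupmod -o -g "${ids#*:}"  viral-ngs-user   # -o allows a non-unique GID
# usermod  -o -u "${ids%%:*}" viral-ngs-user   # -o allows a non-unique UID
# exec su-exec viral-ngs-user "$@"             # drop privileges, run the command
```

A possible invocation would be `docker run -e HOST_UID=$(id -u) -e HOST_GID=$(id -g) -v "$PWD/data:/data" IMAGE ...`, with `IMAGE` standing in for the actual image name.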

add option to easy-deploy script to install py2 version of viral-ngs

This will require first creating a conda environment with Python 2, then installing viral-ngs as a package (rather than simply creating an environment with viral-ngs as the base package). This should be an option that can be set via something like ./easy-deploy-viral-ngs.sh setup-py2
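
A rough sketch of what such a `setup-py2` step might run. The environment name `viral-ngs-env-py2`, the channel list, and the `run`/`DRY_RUN` helper are assumptions made for illustration, not part of the existing script:

```shell
# run CMD...: execute a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# setup_py2 [ENV_NAME]: create a Python 2 environment first, then install
# viral-ngs into it as a package rather than as the base of the environment.
setup_py2() {
    env_name="${1:-viral-ngs-env-py2}"
    run conda create -y -n "$env_name" python=2.7
    run conda install -y -n "$env_name" -c broad-viral -c bioconda -c conda-forge viral-ngs
}
```

Setting `DRY_RUN=1` before calling `setup_py2` prints the two conda commands without executing them, which is a cheap way to review them first.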

Adapt easy-install script to use local conda package

The easy-install script should be adapted to use a local conda package. For conda install, packages can be specified by path rather than name, so it will probably be necessary to call conda create and specify Python as the base package (this connects with #1), and then call conda install with the package path. This will make it possible to install a test build of the viral-ngs conda recipe.
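
The two-step flow described here might be sketched as follows. The function name, the default environment name, and the `CONDA` override are hypothetical. As far as I know, installing from a file path skips dependency resolution, so a real implementation may need to install the dependencies separately.

```shell
# install_local_pkg PKG_FILE [ENV_NAME]
# Create an environment with Python as the base package, then install a
# locally built conda package into it by path.
install_local_pkg() {
    pkg="$1"
    env_name="${2:-viral-ngs-env}"
    case "$pkg" in
        *.tar.bz2|*.conda) ;;
        *) echo "not a conda package file: $pkg" >&2; return 2 ;;
    esac
    [ -f "$pkg" ] || { echo "no such file: $pkg" >&2; return 1; }
    # CONDA can be overridden (e.g. CONDA=echo) to preview the commands.
    ${CONDA:-conda} create -y -n "$env_name" python
    ${CONDA:-conda} install -y -n "$env_name" "$pkg"
}
```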

Allow user to optionally specify branch when installing from git

Currently, the easy-deploy script has an option, setup-git, to install a copy of viral-ngs from git. It would be helpful for this to accept an argument for a specific branch to check out. Additionally, it could be helpful to add the viral-ngs directory to the path, either directly or by symlinking the viral-ngs *.py files to the bin/ directory of the viral-ngs conda environment.
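
Both suggestions could be sketched like this. The function names and argument order are hypothetical, and the symlink variant is only one of the two options mentioned above:

```shell
# clone_viral_ngs DEST [BRANCH]: clone viral-ngs, optionally at a branch.
clone_viral_ngs() {
    dest="$1"
    branch="${2:-}"
    if [ -n "$branch" ]; then
        git clone --branch "$branch" https://github.com/broadinstitute/viral-ngs.git "$dest"
    else
        git clone https://github.com/broadinstitute/viral-ngs.git "$dest"
    fi
}

# link_scripts SRC_DIR BIN_DIR: symlink every top-level *.py file from the
# checkout into BIN_DIR (e.g. the bin/ directory of the conda environment).
# SRC_DIR should be an absolute path so the links resolve from anywhere.
link_scripts() {
    src_dir="$1"
    bin_dir="$2"
    for f in "$src_dir"/*.py; do
        [ -e "$f" ] || continue          # glob matched nothing
        ln -sf "$f" "$bin_dir/$(basename "$f")"
    done
}
```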

gsutil broken within Docker image

I'm finding that I can't call gsutil from within the Docker image. I see that there are attempts to install the GCP SDK and add it to the path, but I get errors like this when I attempt to dsub the 1.18.0 image and use --input-recursive or --output-recursive. Apparently, that's the only time it really needs gsutil inside the container, and if it doesn't see it, it tries to auto-install it via apt-get, but apt-get fails if we blow away its "lists" directory. I am successfully able to use --input and --output and execute code, so I think everything else probably works.

Example error log:

E: command failed: /tmp/ggp-062979599: line 16: type: gsutil: not found
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
 (exit status 100)
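
The two conditions behind that log (gsutil missing from PATH, and the apt lists directory missing) can be checked from a shell inside the container. A small diagnostic sketch; the function name and output wording are made up for illustration:

```shell
# check_gsutil_env [APT_LISTS_DIR]
# Report whether gsutil is on PATH and whether apt's lists/partial directory
# exists, since apt-get needs it for the fallback install.
check_gsutil_env() {
    lists_dir="${1:-/var/lib/apt/lists}"
    if command -v gsutil >/dev/null 2>&1; then
        echo "gsutil: found at $(command -v gsutil)"
    else
        echo "gsutil: not on PATH"
    fi
    if [ -d "$lists_dir/partial" ]; then
        echo "apt lists: present"
    else
        echo "apt lists: missing $lists_dir/partial"
    fi
}

check_gsutil_env
```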

gatk install difficulty

I am having a problem installing GATK.

As you can tell, I am new to Linux.
Ubuntu 14 LTS.
I installed the viral-ngs pipeline using the easy-deploy shell script.
I can activate the viral-ngs environment, and I ran mkdir -p /path/to/gatk_dir.
The wget line fails. I recorded the terminal during the install and pasted that below.
I can see the gatk_dir, so that worked.
I also have several copies of the gatk.tar after a few different attempts, but I keep running into problems.

Any help appreciated.

Best,
James

Script started on Fri 08 Dec 2017 01:24:51 PM CST
jb013b@jb013b-OptiPlex-9020:/Documents$ cd ~
jb013b@jb013b-OptiPlex-9020:$ source activate viral-ngs-env
(viral-ngs-env) jb013b@jb013b-OptiPlex-9020:$ wget -O - 'https://software.broadinstitute.org/gatk/download/auth?package=GATK-archive&version=3.6-0-g89b7209' | tar -xjvC /path/to/gatk_dir
--2017-12-08 13:25:32-- https://software.broadinstitute.org/gatk/download/auth?package=GATK-archive&version=3.6-0-g89b7209
Resolving software.broadinstitute.org (software.broadinstitute.org)... 69.173.92.37
Connecting to software.broadinstitute.org (software.broadinstitute.org)|69.173.92.37|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12610798 (12M) [application/octet-stream]
Saving to: ‘STDOUT’

GenomeAnalysisTK.jar
100%[=============================================================>] 12,610,798 1.09MB/s in 15s

2017-12-08 13:25:47 (842 KB/s) - written to stdout [12610798/12610798]

tar: GenomeAnalysisTK.jar: Cannot open: Permission denied
resources/exampleFASTA.dict
tar: resources: Cannot mkdir: Permission denied
tar: resources/exampleFASTA.dict: Cannot open: No such file or directory
resources/exampleFASTA.fasta
tar: resources: Cannot mkdir: Permission denied
tar: resources/exampleFASTA.fasta: Cannot open: No such file or directory
resources/exampleBAM.bam
tar: resources: Cannot mkdir: Permission denied
tar: resources/exampleBAM.bam: Cannot open: No such file or directory
resources/exampleBAM.bam.bai
tar: resources: Cannot mkdir: Permission denied
tar: resources/exampleBAM.bam.bai: Cannot open: No such file or directory
resources/exampleFASTA.fasta.fai
tar: resources: Cannot mkdir: Permission denied
tar: resources/exampleFASTA.fasta.fai: Cannot open: No such file or directory
tar: Exiting with failure status due to previous errors
(viral-ngs-env) jb013b@jb013b-OptiPlex-9020:$ gatk-register /path/to/gatk_dir/GenomeAnalysisTK.jar
ENV_PREFIX /home/jb013b/miniconda3/envs/viral-ngs-env
Processing GenomeAnalysisTK.jar as *.jar
Error: Unable to access jarfile /path/to/gatk_dir/GenomeAnalysisTK.jar
The version of the jar specified, , does not match the version expected by conda: 3.6
(viral-ngs-env) jb013b@jb013b-OptiPlex-9020:$ exit
exit

Script done on Fri 08 Dec 2017 01:26:26 PM CST
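
The `Permission denied` lines in the log suggest that tar could not write into the target directory, and `/path/to/gatk_dir` looks like a placeholder from the instructions rather than a real location. A sketch of the same steps with a directory under `$HOME` follows. The `GATK_DIR` name and location are assumptions, and the download URL is the one from the log and may no longer be valid.

```shell
# Use a directory the current user owns instead of the literal placeholder.
GATK_DIR="${GATK_DIR:-$HOME/gatk_dir}"
mkdir -p "$GATK_DIR"

# Confirm the directory is writable before piping the download into tar.
if [ -w "$GATK_DIR" ]; then
    echo "writable: $GATK_DIR"
else
    echo "not writable: $GATK_DIR" >&2
fi

# Then, with the conda environment active (commented out: needs network access):
# wget -O - 'https://software.broadinstitute.org/gatk/download/auth?package=GATK-archive&version=3.6-0-g89b7209' | tar -xjvC "$GATK_DIR"
# gatk-register "$GATK_DIR/GenomeAnalysisTK.jar"
```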
