
dc_genomics_docker's Introduction

Genomics Data Carpentry Dockerfile

Container development for Data Carpentry Genomics lessons

Draft instructions

Setup config files

Enter usernames for your image users

  1. Clone this repo and place docker-persistant/ in a convenient location on your server. In docker-persistant/, edit usernames.txt; this file should contain one or more valid Linux usernames, one per line. An account will be created in the container for each user. The sample list contains dcuser, which the script assigns the password 'data4Carp'.

    Tip: You can change the password on lines 25 and 39 of initiate.sh.

    Note: Each user will have a home directory at /home/$user. This is a symbolic link to the folder docker-persistant/$user, which is created on the machine running the Docker container. This way, data and changes made by the user on the hub persist outside of the container.
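As a rough check before starting the container, the sketch below verifies that every non-empty line of usernames.txt looks like a valid Linux username. The function name validate_usernames and the pattern it uses (a lowercase letter or underscore first, then up to 31 lowercase letters, digits, underscores, or hyphens) are assumptions for illustration; they are not part of initiate.sh, and your distribution may accept a different set of names.

```shell
# Hypothetical helper, not part of this repo.
# Prints "ok: NAME" or "invalid: NAME" for each non-empty line of the file
# and returns non-zero if any name fails the (assumed) pattern.
validate_usernames() {
  file="$1"
  status=0
  while IFS= read -r name || [ -n "$name" ]; do
    # Skip blank lines
    [ -z "$name" ] && continue
    if printf '%s\n' "$name" | grep -Eq '^[a-z_][a-z0-9_-]{0,31}$'; then
      echo "ok: $name"
    else
      echo "invalid: $name"
      status=1
    fi
  done < "$file"
  return "$status"
}
```

For example, `validate_usernames SOMEPATH/docker-persistant/usernames.txt` lists each name and exits non-zero if any line would likely be rejected when accounts are created.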

Copy docker-persistant

  1. Place docker-persistant/ in a suitable location on the machine where Docker is hosted. The -v option used at execution will bind this folder.

  2. Make sure /docker-persistant/initiate.sh is executable:

     chmod +x SOMEPATH/docker-persistant/initiate.sh
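
If you script the setup, a small guard like the one below can confirm the file exists and set the executable bit only when it is missing. The function name ensure_executable is hypothetical; SOMEPATH is the same placeholder used in the command above.

```shell
# Hypothetical helper: make sure a script exists and is executable.
ensure_executable() {
  script="$1"
  if [ ! -f "$script" ]; then
    echo "missing: $script"
    return 1
  fi
  # Add the executable bit only if it is not already set
  [ -x "$script" ] || chmod +x "$script"
  echo "executable: $script"
}
```

Usage: `ensure_executable SOMEPATH/docker-persistant/initiate.sh`.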
    

Running the container

  1. Pull the image from Docker Hub:

     docker pull jasonjwilliamsny/dc_genomics:dev_1.8
    
  2. Start the container with this command (remember to edit the location of docker-persistant/)

     docker run -p 8787:8787 -p 22:22 --name dc_genomics -d -v SOMEPATH/docker-persistant:/docker-persistant jasonjwilliamsny/dc_genomics:dev_1.8
    
  3. RStudio will be available on port 8787 at the IP address of the machine. When running locally:

     127.0.0.1:8787
     localhost:8787
    
  4. SSH will be accessible on port 22 at the IP address of the machine. When running locally:

     127.0.0.1:22
     localhost:22
    

    Login

      username: dcuser
      password: data4Carp
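
The two endpoints above can be summarized for any host with a short helper. This is only a sketch: the function name connection_hints is hypothetical, and the http:// scheme for RStudio is an assumption (the instructions above give only the address and port).

```shell
# Hypothetical helper: print how to reach the container's services.
# $1 = host (default: localhost), $2 = SSH port on the host (default: 22)
connection_hints() {
  host="${1:-localhost}"
  ssh_port="${2:-22}"
  echo "RStudio: http://$host:8787"
  echo "SSH: ssh -p $ssh_port dcuser@$host"
}
```

Calling `connection_hints` with no arguments prints the RStudio address and the SSH command for a local container with the default port mapping.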
    

dc_genomics_docker's People

Contributors

jasonjwilliamsny

Forkers

naupaka kbieser

dc_genomics_docker's Issues

SSH port choice for container

Mapping the container's SSH port to host port 22 may conflict with the host's own SSH service, for example when you are configuring the host machine over an existing SSH session on port 22.
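
One possible way around this is to publish the container's port 22 on a different host port. The sketch below only prints the resulting docker run command so it can be reviewed before use. The function name dc_run_command and the default host port 2222 are assumptions; the image name, container name, and other flags are taken from the instructions on this page.

```shell
# Hypothetical helper: print a docker run command that maps a chosen
# host port to the container's SSH port (22).
# $1 = absolute path to docker-persistant/ on the host
# $2 = host port for SSH (assumed default: 2222)
dc_run_command() {
  persist_dir="$1"
  ssh_port="${2:-2222}"
  echo "docker run -p 8787:8787 -p $ssh_port:22 --name dc_genomics -d -v $persist_dir:/docker-persistant jasonjwilliamsny/dc_genomics:dev_1.8"
}
```

With this mapping, users would connect with `ssh -p 2222 dcuser@HOST` while the host's own SSH service keeps port 22.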

Enforce versions of R, R-Studio, and R packages

The Dockerfile contains several install steps that currently pull the latest versions. It needs to be updated to pin a specific R version, R-Studio version, and R package versions. It may be best to wait on this until all maintainers agree on the needed packages.

Path issue with Illumina adapters for Trimmomatic

This command from the lesson doesn't work:

cp ~/miniconda3/pkgs/trimmomatic-0.38-0/share/trimmomatic-0.38-0/adapters/NexteraPE-PE.fa .

because conda is installed under /opt/conda in the container rather than under ~/miniconda3. This works instead:

cp /opt/conda/pkgs/trimmomatic-0.38-0/share/trimmomatic-0.38-0/adapters/NexteraPE-PE.fa .
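
To avoid hard-coding either path, the adapter file could be located by name under whichever conda prefix is in use. This is a sketch: the function name find_adapter is hypothetical, and /opt/conda is used as the default search root only because it is the path reported above.

```shell
# Hypothetical helper: print the first file with the given name under a root.
# $1 = directory to search (default: /opt/conda)
# $2 = adapter file name (default: NexteraPE-PE.fa)
find_adapter() {
  root="${1:-/opt/conda}"
  name="${2:-NexteraPE-PE.fa}"
  find "$root" -type f -name "$name" 2>/dev/null | head -n 1
}
```

Usage: `cp "$(find_adapter)" .` inside the container, or `cp "$(find_adapter ~/miniconda3)" .` on a machine where conda lives in the home directory.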

Possible problem with Java and trimmomatic

This only happens with one set of FASTQ files:

SRR2584863_1.fastq.gz SRR2584863_2.fastq.gz

TrimmomaticPE: Started with arguments:
 SRR2584863_1.fastq.gz SRR2584863_2.fastq.gz SRR2584863_1.trim.fastq.gz SRR2584863_1un.trim.fastq.gz SRR2584863_2.trim.fastq.gz SRR2584863_2un.trim.fastq.gz SLIDINGWINDOW:4:20 MINLEN:25 ILLUMINACLIP:NexteraPE-PE.fa:2:40:15
Multiple cores found: Using 2 threads
Using PrefixPair: 'AGATGTGTATAAGAGACAG' and 'AGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTGACGCTGCCGACGA'
ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
#
[thread 897 also had an error]
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00007fe808491072, pid=871, tid=896
#
# JRE version: OpenJDK Runtime Environment (11.0.1+13) (build 11.0.1+13-LTS)
# Java VM: OpenJDK 64-Bit Server VM (11.0.1+13-LTS, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 413 c2 org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer$IlluminaLongClippingSeq.readsSeqCompare(Lorg/usadellab/trimmomatic/fastq/FastqRecord;)Ljava/lang/Integer; (306 bytes) @ 0x00007fe808491072 [0x00007fe80848f560+0x0000000000001b12]
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /docker-persistant/dcuser/dc_workshop/data/untrimmed_fastq/hs_err_pid871.log
Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
#
# If you would like to submit a bug report, please visit:
#   http://www.azulsystems.com/support/
#
/bin/trimmomatic: line 60:   871 Aborted                 /bin/java -Xms512m -Xmx1g -jar /opt/conda/pkgs/trimmomatic-0.38-0/share/trimmomatic-0.38-0/trimmomatic.jar PE SRR2584863_1.fastq.gz SRR2584863_2.fastq.gz SRR2584863_1.trim.fastq.gz SRR2584863_1un.trim.fastq.gz SRR2584863_2.trim.fastq.gz SRR2584863_2un.trim.fastq.gz SLIDINGWINDOW:4:20 MINLEN:25 ILLUMINACLIP:NexteraPE-PE.fa:2:40:15

Also, some Java info from the container:

java -XshowSettings:properties -version
Property settings:
    awt.toolkit = sun.awt.X11.XToolkit
    file.encoding = ANSI_X3.4-1968
    file.separator = /
    java.awt.graphicsenv = sun.awt.X11GraphicsEnvironment
    java.awt.printerjob = sun.print.PSPrinterJob
    java.class.path = 
    java.class.version = 55.0
    java.home = /opt/conda/pkgs/openjdk-11.0.1-h516909a_1015
    java.io.tmpdir = /tmp
    java.library.path = /usr/java/packages/lib
        /usr/lib64
        /lib64
        /lib
        /usr/lib
    java.runtime.name = OpenJDK Runtime Environment
    java.runtime.version = 11.0.1+13-LTS
    java.specification.name = Java Platform API Specification
    java.specification.vendor = Oracle Corporation
    java.specification.version = 11
    java.vendor = Azul Systems, Inc.
    java.vendor.url = http://www.azulsystems.com/
    java.vendor.url.bug = http://www.azulsystems.com/support/
    java.vendor.version = Zulu11.2+3
    java.version = 11.0.1
    java.version.date = 2018-10-16
    java.vm.compressedOopsMode = 32-bit
    java.vm.info = mixed mode
    java.vm.name = OpenJDK 64-Bit Server VM
    java.vm.specification.name = Java Virtual Machine Specification
    java.vm.specification.vendor = Oracle Corporation
    java.vm.specification.version = 11
    java.vm.vendor = Azul Systems, Inc.
    java.vm.version = 11.0.1+13-LTS
    jdk.debug = release
    line.separator = \n 
    os.arch = amd64
    os.name = Linux
    os.version = 4.9.125-linuxkit
    path.separator = :
    sun.arch.data.model = 64
    sun.boot.library.path = /opt/conda/pkgs/openjdk-11.0.1-h516909a_1015/lib
    sun.cpu.endian = little
    sun.cpu.isalist = 
    sun.io.unicode.encoding = UnicodeLittle
    sun.java.launcher = SUN_STANDARD
    sun.jnu.encoding = ANSI_X3.4-1968
    sun.management.compiler = HotSpot 64-Bit Tiered Compilers
    sun.os.patch.level = unknown
    user.country = US
    user.dir = /docker-persistant/dcuser/dc_workshop/data/untrimmed_fastq
    user.home = /home/dcuser
    user.language = en
    user.name = dcuser
    user.timezone = 

openjdk version "11.0.1" 2018-10-16 LTS
OpenJDK Runtime Environment Zulu11.2+3 (build 11.0.1+13-LTS)
OpenJDK 64-Bit Server VM Zulu11.2+3 (build 11.0.1+13-LTS, mixed mode)

The Java setup on the AWS instance is:

java -XshowSettings:properties -version
Property settings:
    awt.toolkit = sun.awt.X11.XToolkit
    file.encoding = UTF-8
    file.encoding.pkg = sun.io
    file.separator = /
    java.awt.graphicsenv = sun.awt.X11GraphicsEnvironment
    java.awt.printerjob = sun.print.PSPrinterJob
    java.class.path = .
    java.class.version = 52.0
    java.endorsed.dirs = /home/dcuser/.miniconda3/jre/lib/endorsed
    java.ext.dirs = /home/dcuser/.miniconda3/jre/lib/ext
        /usr/java/packages/lib/ext
    java.home = /home/dcuser/.miniconda3/jre
    java.io.tmpdir = /tmp
    java.library.path = /usr/java/packages/lib/amd64
        /usr/lib64
        /lib64
        /lib
        /usr/lib
    java.runtime.name = OpenJDK Runtime Environment
    java.runtime.version = 1.8.0_152-release-1056-b12
    java.specification.name = Java Platform API Specification
    java.specification.vendor = Oracle Corporation
    java.specification.version = 1.8
    java.vendor = JetBrains s.r.o
    java.vendor.url = https://www.jetbrains.com/
    java.vendor.url.bug = https://youtrack.jetbrains.com
    java.version = 1.8.0_152-release
    java.vm.info = mixed mode
    java.vm.name = OpenJDK 64-Bit Server VM
    java.vm.specification.name = Java Virtual Machine Specification
    java.vm.specification.vendor = Oracle Corporation
    java.vm.specification.version = 1.8
    java.vm.vendor = JetBrains s.r.o
    java.vm.version = 25.152-b12
    line.separator = \n 
    os.arch = amd64
    os.name = Linux
    os.version = 3.13.0-48-generic
    path.separator = :
    sun.arch.data.model = 64
    sun.boot.class.path = /home/dcuser/.miniconda3/jre/lib/resources.jar
        /home/dcuser/.miniconda3/jre/lib/rt.jar
        /home/dcuser/.miniconda3/jre/lib/sunrsasign.jar
        /home/dcuser/.miniconda3/jre/lib/jsse.jar
        /home/dcuser/.miniconda3/jre/lib/jce.jar
        /home/dcuser/.miniconda3/jre/lib/charsets.jar
        /home/dcuser/.miniconda3/jre/lib/jfr.jar
        /home/dcuser/.miniconda3/jre/classes
    sun.boot.library.path = /home/dcuser/.miniconda3/jre/lib/amd64
    sun.cpu.endian = little
    sun.cpu.isalist = 
    sun.io.unicode.encoding = UnicodeLittle
    sun.java.launcher = SUN_STANDARD
    sun.jnu.encoding = UTF-8
    sun.management.compiler = HotSpot 64-Bit Tiered Compilers
    sun.os.patch.level = unknown
    user.country = US
    user.dir = /home/dcuser/.backup/untrimmed_fastq
    user.home = /home/dcuser
    user.language = en
    user.name = dcuser
    user.timezone = 

openjdk version "1.8.0_152-release"
OpenJDK Runtime Environment (build 1.8.0_152-release-1056-b12)
OpenJDK 64-Bit Server VM (build 25.152-b12, mixed mode)
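
One visible difference between the two listings is the Java version (11.0.1 in the container, 1.8.0_152-release on AWS); whether that difference is related to the crash is not established here. When comparing environments in a script, a helper like the following reduces a java.version string to its major version. The function name java_major is hypothetical.

```shell
# Hypothetical helper: reduce a java.version string to its major version.
# Handles both the legacy "1.x" scheme (1.8.0_152 -> 8) and the
# newer scheme (11.0.1 -> 11).
java_major() {
  version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # drop the legacy "1." prefix
  esac
  # Keep only the leading run of digits
  echo "${version%%[!0-9]*}"
}
```

For the two setups above, `java_major 11.0.1` prints 11 and `java_major 1.8.0_152-release` prints 8.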

Rename the docker host machine

A cosmetic change: maybe name the computer something other than a hash, such as 'dc_computer'? This should be easy to add to the initiate.sh script.
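
As an alternative to changing initiate.sh, docker run accepts a --hostname option. The sketch below builds that option from a requested name. It replaces underscores with hyphens because underscores are not permitted in hostnames under RFC 1123; whether Docker would accept the underscore form was not checked here. The function name hostname_option is hypothetical.

```shell
# Hypothetical helper: build a --hostname option for docker run.
# Underscores are replaced with hyphens (an assumption made to stay
# within RFC 1123 hostname rules).
hostname_option() {
  name="$1"
  safe=$(printf '%s' "$name" | tr '_' '-')
  echo "--hostname $safe"
}
```

`hostname_option dc_computer` prints `--hostname dc-computer`, which could be added to the docker run command used to start the container.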

Workshop directory structure and sample data needed

We need to create a directory structure and appropriate sample data for the user of this docker image. Once created, it can be placed in /docker-persistant/dcuser. This can easily be hosted on CyVerse and copied into the machine using iRODS.
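
A starting point for the directory skeleton might look like the sketch below. The only paths used are ones that appear in output elsewhere on this page (dc_workshop/data/untrimmed_fastq and dc_genomics_r); the full layout and the sample data are still to be decided, and the function name make_workshop_dirs is hypothetical.

```shell
# Hypothetical helper: create a minimal workshop skeleton for one user.
# $1 = the user's folder, e.g. SOMEPATH/docker-persistant/dcuser
make_workshop_dirs() {
  user_dir="$1"
  mkdir -p "$user_dir/dc_workshop/data/untrimmed_fastq" \
           "$user_dir/dc_genomics_r"
  echo "created workshop directories under $user_dir"
}
```

Usage: `make_workshop_dirs SOMEPATH/docker-persistant/dcuser`.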

Adjust R-Studio home directory

Currently:

getwd()
[1] "/docker-persistant/dcuser/dc_genomics_r"

It would be nicer if this were /home/dcuser/dc_genomics_r. /home/dcuser is a symbolic link to /docker-persistant/dcuser, so we need to find out how to fix this.
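
The behaviour can be reproduced in the shell, which may help when looking for a fix. Inside a symlinked directory, the logical path (through the link) and the physical path (with links resolved) differ, and getwd() appears to report the physical one. The function name show_paths is hypothetical.

```shell
# Hypothetical helper: show the logical and physical working directory
# after changing into the given path (run in a subshell so the caller's
# working directory is not changed).
show_paths() {
  (
    cd "$1" || exit 1
    echo "logical:  $(pwd -L)"   # path as typed, symlinks kept
    echo "physical: $(pwd -P)"   # path with symlinks resolved
  )
}
```

Running `show_paths /home/dcuser` in the container should print /home/dcuser as the logical path and /docker-persistant/dcuser as the physical path, given the symlink described above.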
