
Comments (6)

kishwarshafin commented on July 19, 2024

@Carl-labhub, in this case you installed nucleus as user, but DeepVariant was installed differently? Can you provide the full command you used for DeepVariant?


Carl-labhub commented on July 19, 2024

The command for installing DeepVariant?

singularity build DeepVariant_1.6.1.sif docker://google/deepvariant:1.6.1

Or the full command that is written to stdout when DeepVariant runs? For the latter: after installing nucleus as user, the full command for DeepVariant is no longer written to the output; the last line of output is KeyError: 'SerializedDType'. I don't have the output saved from the test-data run to retrieve the full command and error from before installing nucleus as user, so I will re-run that and update this comment with it in a few hours. I did, however, get the same error when attempting to run DeepVariant on my own data (prior to installing nucleus as user); the command and output from that run are below:

for bam in $READS; do
	echo "running deepvariant on $bam"
	run_deepvariant --model_type=PACBIO --ref=$REF --reads=$bam --output_vcf=$OUTDIR/$bam.vcf.gz --output_gvcf=$OUTDIR/$bam.g.vcf.gz --logging_dir=$LOGDIR --num_shards=$CORES
	echo "finished with $bam"
done

#output in block comment below

# running deepvariant on /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam
# 2024-04-23 11:42:51.281492: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
# To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
# I0423 11:42:57.943745 140073410221888 run_deepvariant.py:519] Re-using the directory for intermediate results in /tmp/tmpkmab_2kw

# ***** Intermediate results will be written to /tmp/tmpkmab_2kw in docker. ****


# ***** Running the command:*****
# time seq 0 11 | parallel -q --halt 2 --line-buffer /opt/deepvariant/bin/make_examples --mode calling --ref "/work/cjm124/SWFst/lvar3ref/Lvar_scaffolds.fasta" --reads "/work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam" --examples "/tmp/tmpkmab_2kw/[email protected]" --add_hp_channel --alt_aligned_pileup "diff_channels" --gvcf "/tmp/tmpkmab_2kw/[email protected]" --max_reads_per_partition "600" --min_mapping_quality "1" --parse_sam_aux_fields --partition_size "25000" --phase_reads --pileup_image_width "199" --norealign_reads --sort_by_haplotypes --track_ref_reads --vsc_min_fraction_indels "0.12" --task {}

# perl: warning: Setting locale failed.
# perl: warning: Please check that your locale settings:
# 	LANGUAGE = (unset),
# 	LC_ALL = (unset),
# 	LC_CTYPE = "C.UTF-8",
# 	LANG = "en_US.UTF-8"
#     are supported and installed on your system.
# perl: warning: Falling back to the standard locale ("C").
# perl: warning: Setting locale failed.
# perl: warning: Please check that your locale settings:
# 	LANGUAGE = (unset),
# 	LC_ALL = (unset),
# 	LC_CTYPE = "C.UTF-8",
# 	LANG = "en_US.UTF-8"
#     are supported and installed on your system.
# perl: warning: Falling back to the standard locale ("C").
# I0423 11:43:12.358298 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# W0423 11:43:12.358482 140211385890624 make_examples_core.py:344] No non-empty sample name found in the input reads. DeepVariant will use default as the sample name. You can also provide a sample name with the --sample_name argument.
# I0423 11:43:12.365553 140211385890624 make_examples_core.py:301] Task 0/12: Preparing inputs
# I0423 11:43:12.377128 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# I0423 11:43:12.405545 140211385890624 make_examples_core.py:301] Task 0/12: Common contigs are ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'unplaced_scaffold20', 'unplaced_scaffold21', 'unplaced_scaffold22', 'unplaced_scaffold23', 'unplaced_scaffold24', 'unplaced_scaffold25', 'unplaced_scaffold26', 'unplaced_scaffold27', 'unplaced_scaffold28', 'unplaced_scaffold29', 'unplaced_scaffold30', 'unplaced_scaffold31', 'unplaced_scaffold32', 'unplaced_scaffold33', 'unplaced_scaffold34', 'unplaced_scaffold35', 'unplaced_scaffold36', 'unplaced_scaffold37', 'unplaced_scaffold38', 'unplaced_scaffold39', 'unplaced_scaffold40', 'unplaced_scaffold41', 'unplaced_scaffold42', 'unplaced_scaffold43', 'unplaced_scaffold44', 'unplaced_scaffold45', 'unplaced_scaffold46', 'unplaced_scaffold47', 'unplaced_scaffold48', 'unplaced_scaffold49', 'unplaced_scaffold50', 'unplaced_scaffold51', 'unplaced_scaffold52', 'unplaced_scaffold53', 'unplaced_scaffold54', 'unplaced_scaffold55', 'unplaced_scaffold56', 'unplaced_scaffold57', 'unplaced_scaffold58', 'unplaced_scaffold59', 'unplaced_scaffold60', 'unplaced_scaffold61', 'unplaced_scaffold62', 'unplaced_scaffold63', 'unplaced_scaffold64', 'unplaced_scaffold65', 'unplaced_scaffold66', 'unplaced_scaffold67', 'unplaced_scaffold68', 'unplaced_scaffold69', 'unplaced_scaffold70', 'unplaced_scaffold71', 'unplaced_scaffold72', 'unplaced_scaffold73', 'unplaced_scaffold74', 'unplaced_scaffold75', 'unplaced_scaffold76', 'unplaced_scaffold77', 'unplaced_scaffold78', 'unplaced_scaffold79', 'unplaced_scaffold80', 'unplaced_scaffold81', 'unplaced_scaffold82', 'unplaced_scaffold83', 'unplaced_scaffold84', 'unplaced_scaffold85', 'unplaced_scaffold86', 'unplaced_scaffold87', 'unplaced_scaffold88', 'unplaced_scaffold89', 'unplaced_scaffold90', 'unplaced_scaffold91', 'unplaced_scaffold92', 'unplaced_scaffold93', 'unplaced_scaffold94', 'unplaced_scaffold95', 'unplaced_scaffold96', 'unplaced_scaffold97', 'unplaced_scaffold98', 'unplaced_scaffold99', 'unplaced_scaffold100', 'unplaced_scaffold101', 'unplaced_scaffold102', 'unplaced_scaffold103', 'unplaced_scaffold104']
# I0423 11:43:12.466705 140211385890624 make_examples_core.py:301] Task 0/12: Starting from v0.9.0, --use_ref_for_cram is default to true. If you are using CRAM input, note that we will decode CRAM using the reference you passed in with --ref
# I0423 11:43:12.538744 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# I0423 11:43:12.636761 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# I0423 11:43:12.637369 140211385890624 make_examples_core.py:301] Task 0/12: Writing gvcf records to /tmp/tmpkmab_2kw/gvcf.tfrecord-00000-of-00012.gz
# I0423 11:43:12.637865 140211385890624 make_examples_core.py:301] Task 0/12: Writing examples to /tmp/tmpkmab_2kw/make_examples.tfrecord-00000-of-00012.gz
# I0423 11:43:12.637962 140211385890624 make_examples_core.py:301] Task 0/12: Overhead for preparing inputs: 0 seconds
# 2024-04-23 11:43:12.645232: W ./third_party/nucleus/util/proto_clif_converter.h:75] Failed to cast type N6google8protobuf14DynamicMessageE
# Traceback (most recent call last):
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 234, in <module>
#     app.run(main)
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/absl_py/absl/app.py", line 312, in run
#     _run_main(main, args)
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/absl_py/absl/app.py", line 258, in _run_main
#     sys.exit(main(argv))
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 224, in main
#     make_examples_core.make_examples_runner(options)
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 2838, in make_examples_runner
#     region_processor.process(region, region_n)
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 1695, in process
#     sample_reads = self.region_reads_norealign(
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 1817, in region_reads_norealign
#     reads = reservoir_sample_reads(
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 976, in reservoir_sample_reads
#     return utils.reservoir_sample(iterable_of_reads, k, random)
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/third_party/nucleus/util/utils.py", line 117, in reservoir_sample
#     for i, item in enumerate(iterable):
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/third_party/nucleus/io/clif_postproc.py", line 95, in __next__
#     record, not_done = self._raw_next()
#   File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/third_party/nucleus/io/clif_postproc.py", line 154, in _raw_next
#     not_done = self._cc_iterable.PythonNext(record)
# RuntimeError: PythonNext() argument read is not valid: Dynamic cast failed
# parallel: This job failed:
# /opt/deepvariant/bin/make_examples --mode calling --ref /work/cjm124/SWFst/lvar3ref/Lvar_scaffolds.fasta --reads /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam --examples /tmp/tmpkmab_2kw/[email protected] --add_hp_channel --alt_aligned_pileup diff_channels --gvcf /tmp/tmpkmab_2kw/[email protected] --max_reads_per_partition 600 --min_mapping_quality 1 --parse_sam_aux_fields --partition_size 25000 --phase_reads --pileup_image_width 199 --norealign_reads --sort_by_haplotypes --track_ref_reads --vsc_min_fraction_indels 0.12 --task 0

# real	0m15.200s
# user	0m3.162s
# sys	0m1.161s
# finished with /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam


pichuan commented on July 19, 2024

Hi @Carl-labhub , one thing to confirm:

In the original post, you said:

Installation method (Docker, built from source, etc.): Docker

But from the information you provided, it seems like the error you encountered was when you ran with Singularity.

Can you confirm: do you see the error with both Docker and Singularity, or only with Singularity?

I plan to try to reproduce this on my side, but clarifying that will be helpful. Thank you!
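For context, the two invocation styles typically look something like the sketch below. The paths, directory names, and output filenames here are placeholders for illustration, not the exact commands from this report:

# Docker (placeholder bind paths):
docker run -v "${INPUT_DIR}":"/input" -v "${OUTPUT_DIR}":"/output" \
  google/deepvariant:1.6.1 \
  /opt/deepvariant/bin/run_deepvariant \
    --model_type=PACBIO \
    --ref=/input/reference.fasta \
    --reads=/input/sample.bam \
    --output_vcf=/output/sample.vcf.gz \
    --num_shards=$(nproc)

# Singularity, using the image built with "singularity build" above (placeholder bind paths):
singularity run -B "${INPUT_DIR}":"/input" -B "${OUTPUT_DIR}":"/output" \
  DeepVariant_1.6.1.sif \
  /opt/deepvariant/bin/run_deepvariant \
    --model_type=PACBIO \
    --ref=/input/reference.fasta \
    --reads=/input/sample.bam \
    --output_vcf=/output/sample.vcf.gz \
    --num_shards=$(nproc)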


Carl-labhub commented on July 19, 2024

Sorry, it's Singularity. I built the image with Singularity, and I am using Singularity to run it. When I run it, I do so from an interactive session started with singularity exec.
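Presumably such a session looks roughly like the sketch below (this is only an illustration; $REF, $bam, $OUTDIR, $LOGDIR, and $CORES refer to the variables from the loop in the earlier comment, and "out" in the output filenames is a placeholder):

# start an interactive session inside the container image built earlier
singularity exec DeepVariant_1.6.1.sif bash

# then, inside that session, run the wrapper
/opt/deepvariant/bin/run_deepvariant --model_type=PACBIO --ref="$REF" --reads="$bam" \
  --output_vcf="$OUTDIR/out.vcf.gz" --output_gvcf="$OUTDIR/out.g.vcf.gz" \
  --logging_dir="$LOGDIR" --num_shards="$CORES"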


pichuan commented on July 19, 2024

Attempt to reproduce the issue

Get a machine with the same Linux version

I used GCP to get a machine to test. Hopefully this provides a similar environment:

gcloud compute instances create "${USER}-test" --scopes "compute-rw,storage-full,cloud-platform" --image-family almalinux-9 --image-project almalinux-cloud --machine-type "n1-standard-64" --zone "us-west1-b"

I ssh'ed into the machine:

gcloud compute ssh pichuan-test --zone "us-west1-b"

Check the Linux version:

$ uname -a
Linux pichuan-test.us-west1-b.c.brain-genomics.google.com.internal 5.14.0-362.24.2.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Mar 30 14:11:54 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

And I ran this too:

$ cat /etc/os-release
NAME="AlmaLinux"
VERSION="9.3 (Shamrock Pampas Cat)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="AlmaLinux 9.3 (Shamrock Pampas Cat)"
ANSI_COLOR="0;34"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:almalinux:almalinux:9::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"

ALMALINUX_MANTISBT_PROJECT="AlmaLinux-9"
ALMALINUX_MANTISBT_PROJECT_VERSION="9.3"
REDHAT_SUPPORT_PRODUCT="AlmaLinux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"

Install Singularity

I don't have Singularity on the machine yet, so:

https://docs.sylabs.io/guides/4.1/user-guide/quick_start.html#quick-installation-steps

sudo yum update -y && \
    sudo yum groupinstall -y 'Development Tools' && \
    sudo yum install -y \
    openssl-devel \
    libuuid-devel \
    libseccomp-devel \
    wget \
    squashfs-tools
sudo yum groupinstall -y 'Development Tools'
# Install RPM packages for dependencies
sudo yum install -y \
   autoconf \
   automake \
   cryptsetup \
   fuse3-devel \
   git \
   glib2-devel \
   libseccomp-devel \
   libtool \
   runc \
   squashfs-tools \
   wget \
   zlib-devel
sudo dnf install dnf-plugins-core
sudo dnf copr enable dctrud/squashfs-tools-ng
sudo dnf install squashfs-tools-ng
export VERSION=1.21.0 OS=linux ARCH=amd64 && \
  wget https://dl.google.com/go/go$VERSION.$OS-$ARCH.tar.gz && \
  sudo tar -C /usr/local -xzvf go$VERSION.$OS-$ARCH.tar.gz && \
  rm go$VERSION.$OS-$ARCH.tar.gz
echo 'export PATH=/usr/local/go/bin:$PATH' >> ~/.bashrc && \
  source ~/.bashrc
export VERSION=4.1.0 && \
    wget https://github.com/sylabs/singularity/releases/download/v${VERSION}/singularity-ce-${VERSION}.tar.gz && \
    tar -xzf singularity-ce-${VERSION}.tar.gz && \
    cd singularity-ce-${VERSION}
./mconfig && \
    make -C builddir && \
    sudo make -C builddir install

At this point, I have singularity installed.

$ singularity --version
singularity-ce version 4.1.0

Get data and run DeepVariant

Now, let me try to follow similar steps:

singularity build DeepVariant_1.6.1.sif docker://google/deepvariant:1.6.1

From here, I used https://github.com/google/deepvariant/blob/r1.6.1/docs/deepvariant-pacbio-model-case-study.md to test.

Download data:

mkdir -p reference

FTPDIR=ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids

curl ${FTPDIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz | gunzip > reference/GRCh38_no_alt_analysis_set.fasta
curl ${FTPDIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fai > reference/GRCh38_no_alt_analysis_set.fasta.fai
mkdir -p input
HTTPDIR=https://downloads.pacbcloud.com/public/dataset/HG003/deepvariant-case-study

curl ${HTTPDIR}/HG003.GRCh38.chr20.pFDA_truthv2.bam > input/HG003.GRCh38.chr20.pFDA_truthv2.bam
curl ${HTTPDIR}/HG003.GRCh38.chr20.pFDA_truthv2.bam.bai > input/HG003.GRCh38.chr20.pFDA_truthv2.bam.bai
ulimit -u 10000
BIN_VERSION="1.6.1"
mkdir -p deepvariant_output

@Carl-labhub mentioned "When I run it, I’m doing so from an interactive session with singularity exec". I'm a bit confused by this. Maybe you mean singularity shell? So I tried:

singularity shell --bind /usr/lib/locale/ DeepVariant_1.6.1.sif

This drops me into a shell inside the container, and then I ran:

Singularity> /opt/deepvariant/bin/run_deepvariant \
>     --model_type PACBIO \
>     --ref reference/GRCh38_no_alt_analysis_set.fasta \
>     --reads input/HG003.GRCh38.chr20.pFDA_truthv2.bam \
>     --output_vcf deepvariant_output/output.vcf.gz \
>     --num_shards $(nproc) \
>     --regions chr20

Running the command directly with singularity exec (just like in https://github.com/google/deepvariant/blob/r1.6.1/docs/deepvariant-pacbio-model-case-study.md) should be fine too; a sketch of that form is below.
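For reference, the non-interactive form would look something like the following. It uses the same flags and paths as the shell-mode run above, and the bind path mirrors the shell invocation, so it may need adjusting for other setups:

singularity exec --bind /usr/lib/locale/ DeepVariant_1.6.1.sif \
  /opt/deepvariant/bin/run_deepvariant \
    --model_type PACBIO \
    --ref reference/GRCh38_no_alt_analysis_set.fasta \
    --reads input/HG003.GRCh38.chr20.pFDA_truthv2.bam \
    --output_vcf deepvariant_output/output.vcf.gz \
    --num_shards "$(nproc)" \
    --regions chr20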

My make_examples step completed without any issues.

It took:

real    9m7.540s
user    215m38.303s
sys     6m41.297s

I let the whole run finish just to be sure: the full run also completed without any errors.

@Carl-labhub, can you check what I did above and see what might be different on your side? There are also previous GitHub issues that might be worth reading through for relevant clues, for example #677.
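One difference that might be worth checking (this is only a guess, not something confirmed in this thread): Singularity bind-mounts the host home directory by default, so Python packages installed on the host with pip install --user (such as a user-level nucleus install) can end up on sys.path inside the container. A rough sketch of how to check for that, assuming python3 is available in the image:

# show any host user-site directories visible inside the container
singularity exec DeepVariant_1.6.1.sif \
  python3 -c "import sys; print([p for p in sys.path if '.local' in p])"

# re-run without the host home directory and host environment to rule that out
singularity exec --no-home --cleanenv DeepVariant_1.6.1.sif \
  /opt/deepvariant/bin/run_deepvariant --help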


pichuan commented on July 19, 2024

Hi @Carl-labhub ,

Given that there has been no activity on this issue for a while, I'll close it. Feel free to update if you have more comments or questions.

