Comments (6)
@Carl-labhub, in this case, are you installing nucleus as user while DeepVariant is installed differently? Can you provide the full command for DeepVariant?
from deepvariant.
The command for installing DeepVariant?
singularity build DeepVariant_1.6.1.sif docker://google/deepvariant:1.6.1
Or the full command that is written to stdout when DeepVariant runs? In the latter case, after installing nucleus, the full command for DeepVariant is not written to the output. The last line of output is KeyError: 'SerializedDType'. I don't have the output saved from the test data run, so I can't retrieve the full command output with the error from before installing nucleus as user; I will re-run that and update this comment with it in a few hours. I did, however, get the same error when running DeepVariant with my own data (prior to installing nucleus as user), and the command and output from that are below:
for bam in $READS; do
echo "running deepvariant on $bam"
run_deepvariant --model_type=PACBIO --ref=$REF --reads=$bam --output_vcf=$OUTDIR/$bam.vcf.gz --output_gvcf=$OUTDIR/$bam.g.vcf.gz --logging_dir=$LOGDIR --num_shards=$CORES
echo "finished with $bam"
done
#output in block comment below
# running deepvariant on /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam
# 2024-04-23 11:42:51.281492: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
# To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
# I0423 11:42:57.943745 140073410221888 run_deepvariant.py:519] Re-using the directory for intermediate results in /tmp/tmpkmab_2kw
# ***** Intermediate results will be written to /tmp/tmpkmab_2kw in docker. ****
# ***** Running the command:*****
# time seq 0 11 | parallel -q --halt 2 --line-buffer /opt/deepvariant/bin/make_examples --mode calling --ref "/work/cjm124/SWFst/lvar3ref/Lvar_scaffolds.fasta" --reads "/work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam" --examples "/tmp/tmpkmab_2kw/make_examples.tfrecord@12.gz" --add_hp_channel --alt_aligned_pileup "diff_channels" --gvcf "/tmp/tmpkmab_2kw/gvcf.tfrecord@12.gz" --max_reads_per_partition "600" --min_mapping_quality "1" --parse_sam_aux_fields --partition_size "25000" --phase_reads --pileup_image_width "199" --norealign_reads --sort_by_haplotypes --track_ref_reads --vsc_min_fraction_indels "0.12" --task {}
# perl: warning: Setting locale failed.
# perl: warning: Please check that your locale settings:
# LANGUAGE = (unset),
# LC_ALL = (unset),
# LC_CTYPE = "C.UTF-8",
# LANG = "en_US.UTF-8"
# are supported and installed on your system.
# perl: warning: Falling back to the standard locale ("C").
# perl: warning: Setting locale failed.
# perl: warning: Please check that your locale settings:
# LANGUAGE = (unset),
# LC_ALL = (unset),
# LC_CTYPE = "C.UTF-8",
# LANG = "en_US.UTF-8"
# are supported and installed on your system.
# perl: warning: Falling back to the standard locale ("C").
# I0423 11:43:12.358298 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# W0423 11:43:12.358482 140211385890624 make_examples_core.py:344] No non-empty sample name found in the input reads. DeepVariant will use default as the sample name. You can also provide a sample name with the --sample_name argument.
# I0423 11:43:12.365553 140211385890624 make_examples_core.py:301] Task 0/12: Preparing inputs
# I0423 11:43:12.377128 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# I0423 11:43:12.405545 140211385890624 make_examples_core.py:301] Task 0/12: Common contigs are ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'unplaced_scaffold20', 'unplaced_scaffold21', 'unplaced_scaffold22', 'unplaced_scaffold23', 'unplaced_scaffold24', 'unplaced_scaffold25', 'unplaced_scaffold26', 'unplaced_scaffold27', 'unplaced_scaffold28', 'unplaced_scaffold29', 'unplaced_scaffold30', 'unplaced_scaffold31', 'unplaced_scaffold32', 'unplaced_scaffold33', 'unplaced_scaffold34', 'unplaced_scaffold35', 'unplaced_scaffold36', 'unplaced_scaffold37', 'unplaced_scaffold38', 'unplaced_scaffold39', 'unplaced_scaffold40', 'unplaced_scaffold41', 'unplaced_scaffold42', 'unplaced_scaffold43', 'unplaced_scaffold44', 'unplaced_scaffold45', 'unplaced_scaffold46', 'unplaced_scaffold47', 'unplaced_scaffold48', 'unplaced_scaffold49', 'unplaced_scaffold50', 'unplaced_scaffold51', 'unplaced_scaffold52', 'unplaced_scaffold53', 'unplaced_scaffold54', 'unplaced_scaffold55', 'unplaced_scaffold56', 'unplaced_scaffold57', 'unplaced_scaffold58', 'unplaced_scaffold59', 'unplaced_scaffold60', 'unplaced_scaffold61', 'unplaced_scaffold62', 'unplaced_scaffold63', 'unplaced_scaffold64', 'unplaced_scaffold65', 'unplaced_scaffold66', 'unplaced_scaffold67', 'unplaced_scaffold68', 'unplaced_scaffold69', 'unplaced_scaffold70', 'unplaced_scaffold71', 'unplaced_scaffold72', 'unplaced_scaffold73', 'unplaced_scaffold74', 'unplaced_scaffold75', 'unplaced_scaffold76', 'unplaced_scaffold77', 'unplaced_scaffold78', 'unplaced_scaffold79', 'unplaced_scaffold80', 'unplaced_scaffold81', 'unplaced_scaffold82', 'unplaced_scaffold83', 'unplaced_scaffold84', 'unplaced_scaffold85', 'unplaced_scaffold86', 'unplaced_scaffold87', 'unplaced_scaffold88', 'unplaced_scaffold89', 'unplaced_scaffold90', 'unplaced_scaffold91', 'unplaced_scaffold92', 'unplaced_scaffold93', 'unplaced_scaffold94', 
# 'unplaced_scaffold95', 'unplaced_scaffold96', 'unplaced_scaffold97', 'unplaced_scaffold98', 'unplaced_scaffold99', 'unplaced_scaffold100', 'unplaced_scaffold101', 'unplaced_scaffold102', 'unplaced_scaffold103', 'unplaced_scaffold104']
# I0423 11:43:12.466705 140211385890624 make_examples_core.py:301] Task 0/12: Starting from v0.9.0, --use_ref_for_cram is default to true. If you are using CRAM input, note that we will decode CRAM using the reference you passed in with --ref
# I0423 11:43:12.538744 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# I0423 11:43:12.636761 140211385890624 genomics_reader.py:222] Reading /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam with NativeSamReader
# I0423 11:43:12.637369 140211385890624 make_examples_core.py:301] Task 0/12: Writing gvcf records to /tmp/tmpkmab_2kw/gvcf.tfrecord-00000-of-00012.gz
# I0423 11:43:12.637865 140211385890624 make_examples_core.py:301] Task 0/12: Writing examples to /tmp/tmpkmab_2kw/make_examples.tfrecord-00000-of-00012.gz
# I0423 11:43:12.637962 140211385890624 make_examples_core.py:301] Task 0/12: Overhead for preparing inputs: 0 seconds
# 2024-04-23 11:43:12.645232: W ./third_party/nucleus/util/proto_clif_converter.h:75] Failed to cast type N6google8protobuf14DynamicMessageE
# Traceback (most recent call last):
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 234, in <module>
# app.run(main)
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/absl_py/absl/app.py", line 312, in run
# _run_main(main, args)
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/absl_py/absl/app.py", line 258, in _run_main
# sys.exit(main(argv))
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 224, in main
# make_examples_core.make_examples_runner(options)
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 2838, in make_examples_runner
# region_processor.process(region, region_n)
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 1695, in process
# sample_reads = self.region_reads_norealign(
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 1817, in region_reads_norealign
# reads = reservoir_sample_reads(
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 976, in reservoir_sample_reads
# return utils.reservoir_sample(iterable_of_reads, k, random)
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/third_party/nucleus/util/utils.py", line 117, in reservoir_sample
# for i, item in enumerate(iterable):
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/third_party/nucleus/io/clif_postproc.py", line 95, in __next__
# record, not_done = self._raw_next()
# File "/tmp/Bazel.runfiles_qy0tffir/runfiles/com_google_deepvariant/third_party/nucleus/io/clif_postproc.py", line 154, in _raw_next
# not_done = self._cc_iterable.PythonNext(record)
# RuntimeError: PythonNext() argument read is not valid: Dynamic cast failed
# parallel: This job failed:
# /opt/deepvariant/bin/make_examples --mode calling --ref /work/cjm124/SWFst/lvar3ref/Lvar_scaffolds.fasta --reads /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam --examples /tmp/tmpkmab_2kw/make_examples.tfrecord@12.gz --add_hp_channel --alt_aligned_pileup diff_channels --gvcf /tmp/tmpkmab_2kw/gvcf.tfrecord@12.gz --max_reads_per_partition 600 --min_mapping_quality 1 --parse_sam_aux_fields --partition_size 25000 --phase_reads --pileup_image_width 199 --norealign_reads --sort_by_haplotypes --track_ref_reads --vsc_min_fraction_indels 0.12 --task 0
# real 0m15.200s
# user 0m3.162s
# sys 0m1.161s
# finished with /work/cjm124/SWFst/VarCalling/reads/bc2001_aligned_sorted.bam
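Separately from the error itself, two things in the loop above stand out: $bam holds a full path, so $OUTDIR/$bam.vcf.gz embeds that whole path in the output name, and "finished with" is printed even when make_examples fails. A minimal sketch of a stricter loop follows. The helper names sample_prefix and call_variants and the DV_CMD override are hypothetical, introduced only for illustration; the flags are the ones already used above. This is not expected to fix the cast error.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip the directory and the .bam suffix from a BAM path.
sample_prefix() {
  basename "$1" .bam
}

# Hypothetical wrapper around one run. DV_CMD is an assumed override so the
# wrapper can be dry-run (for example DV_CMD=echo); it defaults to run_deepvariant.
call_variants() {
  local bam="$1" prefix
  prefix="$(sample_prefix "$bam")"
  "${DV_CMD:-run_deepvariant}" \
    --model_type=PACBIO \
    --ref="$REF" \
    --reads="$bam" \
    --output_vcf="$OUTDIR/$prefix.vcf.gz" \
    --output_gvcf="$OUTDIR/$prefix.g.vcf.gz" \
    --logging_dir="$LOGDIR" \
    --num_shards="$CORES"
}

# READS is expected to be a whitespace-separated list of BAM paths, as above.
for bam in ${READS:-}; do
  echo "running deepvariant on $bam"
  if call_variants "$bam"; then
    echo "finished with $bam"
  else
    echo "deepvariant FAILED on $bam" >&2
  fi
done
```

Running it once with DV_CMD=echo prints the command that would be executed for each BAM without starting DeepVariant.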
Hi @Carl-labhub, one thing to confirm:
In the original post, you said:
Installation method (Docker, built from source, etc.): Docker
But from the information you provided, it seems like the error you encountered was when you ran with Singularity.
Can you confirm: Do you see the error both when you use Docker and Singularity, or just Singularity?
I plan to try to reproduce this on my side, but clarifying that will be helpful. Thank you!
Sorry, it's Singularity. I built the image with Singularity and am using Singularity to run it. When I run it, I'm doing so from an interactive session with singularity exec
Attempt to reproduce the issue
Get a machine with the same Linux version
I used GCP to get a machine to test. Hopefully this provides a similar environment:
gcloud compute instances create "${USER}-test" --scopes "compute-rw,storage-full,cloud-platform" --image-family almalinux-9 --image-project almalinux-cloud --machine-type "n1-standard-64" --zone "us-west1-b"
I ssh'ed into the machine:
gcloud compute ssh pichuan-test --zone "us-west1-b"
Check the Linux version:
$ uname -a
Linux pichuan-test.us-west1-b.c.brain-genomics.google.com.internal 5.14.0-362.24.2.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Mar 30 14:11:54 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
And I ran this too:
$ cat /etc/os-release
NAME="AlmaLinux"
VERSION="9.3 (Shamrock Pampas Cat)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="AlmaLinux 9.3 (Shamrock Pampas Cat)"
ANSI_COLOR="0;34"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:almalinux:almalinux:9::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"
ALMALINUX_MANTISBT_PROJECT="AlmaLinux-9"
ALMALINUX_MANTISBT_PROJECT_VERSION="9.3"
REDHAT_SUPPORT_PRODUCT="AlmaLinux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"
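To make the comparison between the two machines easier, the same checks can be collected into one short report. This is only a sketch: env_report is a hypothetical helper name, and it assumes /etc/os-release is present and shell-sourceable, which is the case on AlmaLinux 9.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kernel, distribution and Singularity version
# so the output can be pasted into the issue from both machines.
env_report() {
  echo "kernel: $(uname -r)"
  if [ -r /etc/os-release ]; then
    # /etc/os-release is a key=value file that can be sourced by a shell;
    # the subshell keeps its variables out of the caller's environment.
    ( . /etc/os-release && echo "os: ${PRETTY_NAME:-unknown}" )
  fi
  if command -v singularity >/dev/null 2>&1; then
    echo "singularity: $(singularity --version)"
  else
    echo "singularity: not found"
  fi
}

env_report
```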
Install Singularity
I don't have Singularity on the machine yet, so I followed the quick installation steps at https://docs.sylabs.io/guides/4.1/user-guide/quick_start.html#quick-installation-steps:
sudo yum update -y && \
sudo yum groupinstall -y 'Development Tools' && \
sudo yum install -y \
openssl-devel \
libuuid-devel \
libseccomp-devel \
wget \
squashfs-tools
sudo yum groupinstall -y 'Development Tools'
# Install RPM packages for dependencies
sudo yum install -y \
autoconf \
automake \
cryptsetup \
fuse3-devel \
git \
glib2-devel \
libseccomp-devel \
libtool \
runc \
squashfs-tools \
wget \
zlib-devel
sudo dnf install dnf-plugins-core
sudo dnf copr enable dctrud/squashfs-tools-ng
sudo dnf install squashfs-tools-ng
export VERSION=1.21.0 OS=linux ARCH=amd64 && \
wget https://dl.google.com/go/go$VERSION.$OS-$ARCH.tar.gz && \
sudo tar -C /usr/local -xzvf go$VERSION.$OS-$ARCH.tar.gz && \
rm go$VERSION.$OS-$ARCH.tar.gz
echo 'export PATH=/usr/local/go/bin:$PATH' >> ~/.bashrc && \
source ~/.bashrc
export VERSION=4.1.0 && \
wget https://github.com/sylabs/singularity/releases/download/v${VERSION}/singularity-ce-${VERSION}.tar.gz && \
tar -xzf singularity-ce-${VERSION}.tar.gz && \
cd singularity-ce-${VERSION}
./mconfig && \
make -C builddir && \
sudo make -C builddir install
At this point, I have singularity installed.
$ singularity --version
singularity-ce version 4.1.0
Get data and run DeepVariant
Now, let me try to follow similar steps:
singularity build DeepVariant_1.6.1.sif docker://google/deepvariant:1.6.1
From here, I used https://github.com/google/deepvariant/blob/r1.6.1/docs/deepvariant-pacbio-model-case-study.md to test.
Download data:
mkdir -p reference
FTPDIR=ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids
curl ${FTPDIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz | gunzip > reference/GRCh38_no_alt_analysis_set.fasta
curl ${FTPDIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fai > reference/GRCh38_no_alt_analysis_set.fasta.fai
mkdir -p input
HTTPDIR=https://downloads.pacbcloud.com/public/dataset/HG003/deepvariant-case-study
curl ${HTTPDIR}/HG003.GRCh38.chr20.pFDA_truthv2.bam > input/HG003.GRCh38.chr20.pFDA_truthv2.bam
curl ${HTTPDIR}/HG003.GRCh38.chr20.pFDA_truthv2.bam.bai > input/HG003.GRCh38.chr20.pFDA_truthv2.bam.bai
ulimit -u 10000
BIN_VERSION="1.6.1"
mkdir -p deepvariant_output
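Because the curl downloads above write straight to their final paths, an interrupted transfer can leave an empty or partial file behind. A small check along the following lines may help before starting the run. require_nonempty is a hypothetical helper name, and the check only confirms that each file exists and is non-empty, not that it is complete.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return non-zero if any listed file is missing or empty.
require_nonempty() {
  local missing=0 f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# The paths are the ones created by the download commands above.
require_nonempty \
  reference/GRCh38_no_alt_analysis_set.fasta \
  reference/GRCh38_no_alt_analysis_set.fasta.fai \
  input/HG003.GRCh38.chr20.pFDA_truthv2.bam \
  input/HG003.GRCh38.chr20.pFDA_truthv2.bam.bai \
  || echo "re-download the files listed above before running DeepVariant" >&2
```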
@Carl-labhub mentioned "When I run it, I'm doing so from an interactive session with singularity exec". I'm a bit confused by this. Maybe you mean singularity shell? So I tried:
singularity shell --bind /usr/lib/locale/ DeepVariant_1.6.1.sif
This drops into an interactive shell inside the container, where I ran:
Singularity> /opt/deepvariant/bin/run_deepvariant \
> --model_type PACBIO \
> --ref reference/GRCh38_no_alt_analysis_set.fasta \
> --reads input/HG003.GRCh38.chr20.pFDA_truthv2.bam \
> --output_vcf deepvariant_output/output.vcf.gz \
> --num_shards $(nproc) \
> --regions chr20
Running singularity exec directly with the command (just like https://github.com/google/deepvariant/blob/r1.6.1/docs/deepvariant-pacbio-model-case-study.md) should be fine too.
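For reference, the non-interactive form might look like the sketch below. It reuses the local DeepVariant_1.6.1.sif image, the --bind option and the flags from the shell session above. dv_exec and the RUNNER override are hypothetical names, added only so the command can be printed without being executed.

```shell
#!/usr/bin/env bash
# Image built earlier with `singularity build`; SIF is an assumed override.
SIF="${SIF:-DeepVariant_1.6.1.sif}"

# Hypothetical wrapper. With RUNNER=echo the command is printed instead of run;
# with RUNNER unset it is executed through singularity exec.
dv_exec() {
  ${RUNNER:-} singularity exec --bind /usr/lib/locale/ "$SIF" \
    /opt/deepvariant/bin/run_deepvariant \
    --model_type PACBIO \
    --ref reference/GRCh38_no_alt_analysis_set.fasta \
    --reads input/HG003.GRCh38.chr20.pFDA_truthv2.bam \
    --output_vcf deepvariant_output/output.vcf.gz \
    --num_shards "$(nproc)" \
    --regions chr20
}

# Dry run: show the command that would be executed.
RUNNER=echo dv_exec
```

Calling dv_exec without RUNNER set starts the actual run.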
My make_examples step completed without any issues.
It took:
real 9m7.540s
user 215m38.303s
sys 6m41.297s
I let the whole run finish just to be sure. The full run also completed without any errors.
@Carl-labhub, can you check what I did above and see what might be different on your side? Also, there are previous GitHub issues that might be worth reading through for relevant clues, for example #677.
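One difference that may be worth ruling out, as an assumption rather than a confirmed cause: Singularity mounts the host home directory by default, so Python packages installed as user under ~/.local can be visible inside the container. The sketch below reports where Python inside the image resolves a module from. is_user_site_path and module_path_in_image are hypothetical helper names, and the sketch assumes python3 is on the PATH inside the image.

```shell
#!/usr/bin/env bash
SIF="${SIF:-DeepVariant_1.6.1.sif}"

# Hypothetical helper: succeed when a path lies under the per-user
# site-packages tree (~/.local/lib/python*).
is_user_site_path() {
  case "$1" in
    "$HOME"/.local/lib/python*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the file that Python inside the image would load
# for the given module name (empty output if the module is not found).
module_path_in_image() {
  singularity exec "$SIF" python3 -c \
    "import importlib.util, sys; s = importlib.util.find_spec(sys.argv[1]); print(s.origin if s else '')" \
    "$1"
}

# Only run the check when singularity and the image are actually available.
if command -v singularity >/dev/null 2>&1 && [ -f "$SIF" ]; then
  for mod in google.protobuf tensorflow; do
    path="$(module_path_in_image "$mod")"
    if is_user_site_path "$path"; then
      echo "WARNING: $mod resolves to a host user install: $path"
    else
      echo "$mod: $path"
    fi
  done
fi
```

If a path under ~/.local shows up, one way to test the idea would be to rerun with singularity exec --no-home, or with SINGULARITYENV_PYTHONNOUSERSITE=1 set on the host, and see whether the error changes.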
Hi @Carl-labhub,
Given that there has been no activity on this issue for a while, I'll close it. Feel free to update if you have more comments or questions.