micropipe's Issues

assembly:porechop terminated with an error exit status (255)

Dear micropipe team,
This looks like a wonderful tool for me. Sorry, this might be a newbie Nextflow question. I am running the micropipe sample data like this:
nextflow main.nf --samplesheet test_data/samples_1.csv --outdir micropipetest/ --gpu

I receive this error: NOTE: Process assembly:porechop (S24) terminated with an error exit status (255) -- Error is ignored

Where can I find log information about what went wrong?

System information

  • CentOS 7
  • Nextflow version 21.04.3
  • Singularity version 2.4.2-dist

Further output of nextflow

WARN: DSL 2 PREVIEW MODE IS DEPRECATED - USE THE STABLE VERSION INSTEAD -- Read more at https://www.nextflow.io/docs/latest/dsl2.html#dsl2-migration-notes
executor > local (1)
[94/71c68c] process > assembly:porechop (S24) [100%] 1 of 1, failed: 1 ?
[- ] process > assembly:japsa -
[- ] process > assembly:flye -
[- ] process > assembly:racon_cpu -
[- ] process > assembly:medaka_cpu -
[- ] process > assembly:nextpolish -
[- ] process > assembly:fixstart -
[- ] process > assembly:quast -
[barcode01, /home/software/micropipe/test_data/S24EC_1P_test.fastq.gz, /home/software/micropipe/test_data/S24EC_2P_test.fastq.gz]
[barcode01, /home/software/micropipe/test_data/barcode01.fastq.gz, S24, 5.5m]
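
On the question of where the logs are: Nextflow runs each task in its own directory under the work directory, and the bracketed hash in the run output (here `94/71c68c`) is the start of that directory's path. Each task directory holds `.command.sh`, `.command.out`, `.command.err`, `.command.log` and `.exitcode`, and the launch directory holds `.nextflow.log`. The sketch below is a minimal helper for printing those files. The function names are made up for illustration, and it assumes the default work directory name `work`.

```python
from pathlib import Path

# Files Nextflow writes into every task work directory.
LOG_FILES = [".command.sh", ".command.out", ".command.err", ".command.log", ".exitcode"]


def find_task_dirs(work_root, short_hash):
    """Resolve a short task hash such as '94/71c68c' to matching work directories."""
    prefix, _, rest = short_hash.partition("/")
    base = Path(work_root) / prefix
    if not base.is_dir():
        return []
    return sorted(p for p in base.iterdir() if p.is_dir() and p.name.startswith(rest))


def collect_logs(task_dir):
    """Return {file name: text} for each Nextflow log file present in task_dir."""
    logs = {}
    for name in LOG_FILES:
        path = Path(task_dir) / name
        if path.is_file():
            logs[name] = path.read_text(errors="replace")
    return logs


if __name__ == "__main__":
    # Assumption: the run used the default 'work' directory; the hash is
    # copied from the "[94/71c68c] process > assembly:porechop" line.
    for task_dir in find_task_dirs("work", "94/71c68c"):
        print(f"== {task_dir}")
        for name, text in collect_logs(task_dir).items():
            print(f"--- {name}\n{text}")
```

For the porechop failure above, `.command.err` and `.exitcode` in that directory are the first places to look.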

Nextpolish db_split failed

Hi,
_nextpolish.log says

INFO: Converting SIF file to temporary sandbox...
[INFO] 2022-03-01 02:45:24,637 start...
[INFO] 2022-03-01 02:45:24,637 logfile: pid2111778.log.info
[WARNING] 2022-03-01 02:45:24,637 Re-write workdir
[INFO] 2022-03-01 02:45:24,645 scheduled tasks:
[1, 2, 1, 2]
[INFO] 2022-03-01 02:45:24,645 options:
[INFO] 2022-03-01 02:45:24,645 {'polish_options': ' -p 40', 'rewrite': 1, 'job_prefix': 'nextPolish', 'job_type': 'local', 'cluster_options': '', 'snp_valid': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/%02d.snp_valid', 'kmer_count': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/%02d.kmer_count', 'sgs_max_depth': '100', 'align_threads': '40', 'sgs_block_size': 91759816L, 'lgs_max_read_len': '150k', 'parallel_jobs': '6', 'multithread_jobs': '40', 'snp_phase': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/%02d.snp_phase', 'genome': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/consensus.fasta', 'genome_size': 5505589L, 'workdir': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204', 'cleantmp': 0, 'sgs_align_options': 'bwa mem -p -t 40', 'sgs_unpaired': '0', 'sgs_fofn': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/sgs.fofn', 'lgs_polish': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/%02d.lgs_polish', 'sgs_use_duplicate_reads': 0, 'score_chain': '/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/%02d.score_chain', 'task': [1, 2, 1, 2], 'lgs_max_depth': '60', 'lgs_block_size': '500M', 'lgs_minimap2_options': '-x map-ont', 'rerun': 3, 'lgs_min_read_len': '1k'}
[INFO] 2022-03-01 02:45:24,645 step 0 and task 1 start:
[INFO] 2022-03-01 02:45:24,646 analysis tasks done
[INFO] 2022-03-01 02:45:24,647 total jobs: 3
[INFO] 2022-03-01 02:45:24,648 Throw jobID:[2111788] jobCmd:[/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/00.score_chain/01.db_split.sh.work/db_split0/nextPolish.sh] in the local_cycle.
[INFO] 2022-03-01 02:45:25,149 Throw jobID:[2111845] jobCmd:[/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[INFO] 2022-03-01 02:45:25,651 Throw jobID:[2112009] jobCmd:[/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[ERROR] 2022-03-01 02:45:27,799 db_split failed: please check the following logs:
[ERROR] 2022-03-01 02:45:27,799 /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/00.score_chain/01.db_split.sh.work/db_split0/nextPolish.sh.e
cat: '01.kmer_count/polish.ref.sh.work/polish_genome/genome.nextpolish.part*.fasta': No such file or directory
cat: '03.kmer_count/polish.ref.sh.work/polish_genome/genome.nextpolish.part*.fasta': No such file or directory
/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/.command.sh: line 12: //: Is a directory

And the log /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/00.score_chain/01.db_split.sh.work/db_split0/nextPolish.sh.e says

hostname
cd /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/00.score_chain/01.db_split.sh.work/db_split0
cd /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/00.score_chain/01.db_split.sh.work/db_split0
time /opt/NextPolish/bin/seq_split -d /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204 -m 91759816 -n 6 -t 40 -i 1 -s 550558900 -p input.sgspart /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/sgs.fofn
time /opt/NextPolish/bin/seq_split -d /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204 -m 91759816 -n 6 -t 40 -i 1 -s 550558900 -p input.sgspart /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/sgs.fofn
Error! /home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91e3204/110712RA1944_S13_L001_R1_001.fastq.gz does not exist!Command exited with non-zero status 1
0.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 4164maxresident)k
0inputs+0outputs (0major+132minor)pagefaults 0swaps
/home/software/micropipe/work/9c/4e9530b090e3cb82616fbca91

However, the file 110712RA1944_S13_L001_R1_001.fastq.gz does exist. It's basically a link to the raw data.

Thanks
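
One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the link resolves on the host but its target lies outside the directories visible inside the Singularity container, so `seq_split` sees a dangling link. The sketch below reads a file of file names such as `sgs.fofn` and reports, for each entry, whether it is a symlink, where it points, and whether the target exists. The function name `check_fofn` is made up for illustration. To be meaningful it would need to run in the same environment as the failing task.

```python
import os
from pathlib import Path


def check_fofn(fofn_path):
    """Report, for every path listed in a file of file names, whether it resolves."""
    fofn = Path(fofn_path)
    report = []
    for line in fofn.read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        path = Path(entry)
        if not path.is_absolute():
            # Assumption: relative entries are relative to the fofn's directory.
            path = fofn.parent / path
        report.append({
            "path": str(path),
            "is_symlink": path.is_symlink(),
            "target": os.path.realpath(path),
            "exists": path.exists(),  # False for a dangling symlink
        })
    return report


if __name__ == "__main__":
    # Assumption: run from the task work directory that contains sgs.fofn.
    for item in check_fofn("sgs.fofn"):
        status = "ok" if item["exists"] else "MISSING"
        print(f'{status}\t{item["path"]} -> {item["target"]}')
```

If an entry is reported as missing only inside the container, copying the reads instead of linking them, or binding the directory that holds the raw data, would be the things to try.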

Test Data for Basecalling Not working

I have been trying to run micropipe, and it always fails at the demultiplexing step with my own data (basecalling always succeeds, and guppy is definitely being located, as I made the appropriate changes in the config file). So I decided to run it with the example samplesheet, and that fails from the very start: no basecalling happens at all. I tried this to get a better understanding of a working samplesheet, but it has not helped. I get the following error:
[- ] process > basecalling_demultiplexing_... -
[- ] process > pycoqc -
[- ] process > assembly:porechop -
[- ] process > assembly:japsa -
[- ] process > assembly:flye -
[- ] process > assembly:racon_cpu -
[- ] process > assembly:medaka_cpu -
[- ] process > assembly:nextpolish -
[- ] process > assembly:fixstart -
[- ] process > assembly:quast -
[barcode01, S24, 5.5m]
[barcode01, /scicomp/home-pure/suj7/test_data/S24EC_1P_test.fastq.gz, /scicomp/home-pure/suj7/test_data/S24EC_2P_test.fastq.gz]
No such file: /scicomp/home-pure/suj7/false

Flye not creating assembly file

I am trying to run micropipe assembly-only. This is my sample sheet:
(base) [suj7@login02 ~]$ head sample0.txt
barcode_id,sample_id,long_fastq,genome_size
barcode13,barcode13,demux_guppy_fastq/barcode13/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode13,barcode13,demux_guppy_fastq/barcode13/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode14,barcode14,demux_guppy_fastq/barcode14/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode14,barcode14,demux_guppy_fastq/barcode14/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode15,barcode15,demux_guppy_fastq/barcode15/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode15,barcode15,demux_guppy_fastq/barcode15/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode16,barcode16,demux_guppy_fastq/barcode16/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode16,barcode16,demux_guppy_fastq/barcode16/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode16,barcode16,demux_guppy_fastq/barcode16/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_2.fastq,5m

I have attached an error file here
myerror.txt
I know I used nano-hq instead of the original nano-raw, but it doesn't work any better with nano-raw or nano-corr.

Working sample sheet

Hi,

I was wondering what a working sample sheet would look like. I have no Illumina files, and I wish to start from the FAST5 input. I added guppy to the config file and basecalling finishes successfully, but the demultiplexing step fails to identify the barcodes and keeps failing. I ran guppy_barcoder outside of micropipe and it identified the barcodes, so I suspect something in the sample sheet is causing the issue.
I ran the pipeline in assembly-only mode at first. I originally assumed that, when starting from assembly, long_fastq would be the already demultiplexed FASTQ files, but I keep getting errors. This is the sample sheet I have been using:
barcode_id,sample_id,long_fastq,genome_size
barcode13,barcode13,demux_guppy_fastq/barcode13/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode13,barcode13,demux_guppy_fastq/barcode13/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode14,barcode14,demux_guppy_fastq/barcode14/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode14,barcode14,demux_guppy_fastq/barcode14/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode15,barcode15,demux_guppy_fastq/barcode15/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode15,barcode15,demux_guppy_fastq/barcode15/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode16,barcode16,demux_guppy_fastq/barcode16/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode16,barcode16,demux_guppy_fastq/barcode16/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode16,barcode16,demux_guppy_fastq/barcode16/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_2.fastq,5m
barcode17,barcode17,demux_guppy_fastq/barcode17/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode17,barcode17,demux_guppy_fastq/barcode17/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_1.fastq,5m
barcode20,barcode20,demux_guppy_fastq/barcode20/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode27,barcode27,demux_guppy_fastq/barcode27/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode46,barcode46,demux_guppy_fastq/barcode46/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode55,barcode55,demux_guppy_fastq/barcode55/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode58,barcode58,demux_guppy_fastq/barcode58/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode60,barcode60,demux_guppy_fastq/barcode60/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode64,barcode64,demux_guppy_fastq/barcode64/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode67,barcode67,demux_guppy_fastq/barcode67/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode72,barcode72,demux_guppy_fastq/barcode72/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
barcode79,barcode79,demux_guppy_fastq/barcode79/fastq_runid_5363c81269c5ff44cc45e657b79b1135be0297cb_0.fastq,5m
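
The sheet above lists several FASTQ chunks per barcode as separate rows with the same sample_id. If the pipeline expects exactly one long_fastq per sample (an assumption, not something confirmed in this thread), one thing to try is concatenating the chunks and writing one row per sample. The sketch below does that. The function name `merge_samplesheet` and the output locations are made up for illustration, and it assumes uncompressed FASTQ files that each end with a newline.

```python
import csv
import shutil
from pathlib import Path


def merge_samplesheet(sheet_in, sheet_out, merged_dir):
    """Concatenate long_fastq files of rows sharing a sample_id; write one row per sample."""
    groups = {}  # sample_id -> list of rows, in order of first appearance
    with open(sheet_in, newline="") as handle:
        reader = csv.DictReader(handle)
        fields = reader.fieldnames
        for row in reader:
            groups.setdefault(row["sample_id"], []).append(row)

    merged_dir = Path(merged_dir)
    merged_dir.mkdir(parents=True, exist_ok=True)
    with open(sheet_out, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        for sample_id, rows in groups.items():
            merged = merged_dir / f"{sample_id}.fastq"
            with open(merged, "wb") as out:
                for row in rows:
                    # Relative long_fastq paths are resolved against the
                    # current working directory.
                    with open(row["long_fastq"], "rb") as src:
                        shutil.copyfileobj(src, out)
            first = dict(rows[0])  # keep barcode_id and genome_size of the first row
            first["long_fastq"] = str(merged.resolve())
            writer.writerow(first)
    return list(groups)


if __name__ == "__main__":
    # Hypothetical file names; adjust to the actual sheet and output folder.
    print(merge_samplesheet("sample0.txt", "sample0_merged.csv", "merged_fastq"))
```

Writing absolute paths into the new sheet avoids depending on the directory from which the pipeline is launched.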

Problem with NextPolish

Dear micropipe team

While micropipe works fine with ONT data alone, we have problems when combining it with Illumina reads. The problem is that the file sample1/4_polishing_short_reads/04RR0090_flye_polishedLR_SR.fasta is empty, so quast throws an error. sample1_flye_polishedLR_SR_fixstart.log says "db_split failed:" and asks me to check /home/software/micropipe/work/ce/cf7d05849bd4f81c067beb16e92367/00.score_chain/01.db_split.sh.work/db_split0/nextPolish.sh.e. However, the folder 00.score_chain does not exist.

This is the call

nextflow main.nf --basecalling --demultiplexing --gpu --samplesheet /data/samplesheet.csv --fast5 /data/20210525_0850_MN32008_FAQ18836_9cbcdc36/fast5/ --datadir /home/testmicropipemalle/illumina/ --outdir /home/testmicropipemallei nextflow

No changes regarding nextpolish were made in nextflow.config, so it should use this container: docker://pvstodghill/nextpolish:1.1.0__2020-05-12. Accordingly, nextpolish_version.txt says v.1.1.0.

Installing NextPolish v1.3.1 by hand from its Git repository and running the sample data worked fine.

Thanks

error at assembly (flye step)

I am working with a student who is having this issue when running the pipeline with Nextflow:


executor >  local (3)
[30/a60146] process > assembly:porechop (H37Rv.1) [100%] 1 of 1 ✔
[24/79a658] process > assembly:japsa (H37Rv.1)    [100%] 1 of 1 ✔
[dd/860979] process > assembly:flye (H37Rv.1)     [  0%] 0 of 1
[-        ] process > assembly:racon_cpu          -
[-        ] process > assembly:medaka_cpu         -
[-        ] process > assembly:nextpolish         -
[-        ] process > assembly:fixstart           -
[-        ] process > assembly:quast              -
Error executing process > 'assembly:flye (H37Rv.1)'

Caused by:
  Missing output file(s) `assembly.fasta` expected by process `assembly:flye (H37Rv.1)`

Command executed:

  set +eu
  flye --nano-raw filtered.fastq.gz --genome-size 5.0m --threads 4 --out-dir $PWD --plasmids
  flye -v 2> flye_version.txt

Command exit status:
  0

Command output:
  (empty)

Command error:
  WARNING: Skipping mount /var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
  [2022-07-22 17:41:29] INFO: Starting Flye 2.5-release
  [2022-07-22 17:41:29] INFO: >>>STAGE: configure
  [2022-07-22 17:41:29] INFO: Configuring run
  [2022-07-22 17:43:47] INFO: Total read length: 5089510998
  [2022-07-22 17:43:47] INFO: Input genome size: 5000000
  [2022-07-22 17:43:47] INFO: Estimated coverage: 1017
  [2022-07-22 17:43:47] WARNING: Expected read coverage is 1017, the assembly is not guaranteed to be optimal in this setting. Are you sure that the genome size was entered correctly?
  [2022-07-22 17:43:47] INFO: Reads N50/N90: 9733 / 2679
  [2022-07-22 17:43:47] INFO: Minimum overlap set to 3000
  [2022-07-22 17:43:47] INFO: Selected k-mer size: 15
  [2022-07-22 17:43:47] INFO: >>>STAGE: assembly
  [2022-07-22 17:43:47] INFO: Assembling disjointigs
  [2022-07-22 17:43:47] INFO: Reading sequences
  [2022-07-22 17:45:15] INFO: Generating solid k-mer index
  [2022-07-22 17:45:32] INFO: Counting k-mers (1/2):
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-22 17:48:26] INFO: Counting k-mers (2/2):
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-22 17:54:34] INFO: Filling index table
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-22 18:05:38] INFO: Extending reads
  [2022-07-22 18:24:23] INFO: Overlap-based coverage: 868
  [2022-07-22 18:24:23] INFO: Median overlap divergence: 0.0852075
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2022-07-24 03:32:08] INFO: Assembled 0 disjointigs
  [2022-07-24 03:32:08] INFO: Generating sequence
  [2022-07-24 03:32:09] ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct

Work dir:
  /projectsp/alland/PanGenome_Project/ReviewerResponses/testing_pipelines/work/dd/8609795cae4b8d69393b8e7daee1bf

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

Looking for some guidance on how to proceed.

Best,
Paul
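
The Flye log above warns that the expected coverage is 1017 and asks whether the genome size was entered correctly. That figure is simply the total read length divided by the genome size. The sketch below reproduces the arithmetic and computes what fraction of the read bases would give a chosen target coverage. The target of 100x is an illustrative assumption, not a value from this thread or from Flye, and whether the very high coverage is what causes "No disjointigs were assembled" is not established here.

```python
def estimated_coverage(total_read_bases, genome_size):
    """Coverage as Flye reports it: total bases in the reads / genome size."""
    return total_read_bases / genome_size


def subsample_fraction(total_read_bases, genome_size, target_coverage):
    """Fraction of read bases to keep in order to reach target_coverage (capped at 1.0)."""
    coverage = estimated_coverage(total_read_bases, genome_size)
    return min(1.0, target_coverage / coverage)


if __name__ == "__main__":
    total = 5_089_510_998   # "Total read length" from the Flye log
    genome = 5_000_000      # "Input genome size" from the Flye log
    print(int(estimated_coverage(total, genome)))  # 1017, matching the log
    # Assumption: 100x is a hypothetical target chosen only for illustration.
    print(subsample_fraction(total, genome, 100))
```

With these numbers, roughly a tenth of the read bases would already give about 100x, so checking the genome size and downsampling the reads before assembly are both things to try.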
