cancerit / ascatngs

Somatic copy number analysis using paired-end whole-genome sequencing (WGS)

Home Page: http://cancerit.github.io/ascatNgs/

License: GNU Affero General Public License v3.0

Perl 77.24% R 10.38% Shell 8.90% Dockerfile 3.48%

ascatngs's Issues

Indicate removal of legacy files

Enough time has passed to indicate the removal of legacy scripts but otherwise don't mention:

.../bin/CN_to_VCF.pl
.../bin/failed_cn_csv.pl

Need to get absolute path for outdir

Add the following here:

$tmp = abs_path($tmp);
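
Until that change is in place, a caller-side workaround might be to resolve the output directory before invoking ascat.pl. This is a minimal sketch; `abs_dir` is a hypothetical helper, not part of ascatNgs.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of ascatNgs): print the absolute, physical
# path of a directory, creating it first so the lookup works for a new run.
abs_dir() {
    mkdir -p "$1" && (cd "$1" && pwd -P)
}

# Example use (commented out so the sketch has no side effects):
# OUTDIR=$(abs_dir results/run1)
# ascat.pl -outdir "$OUTDIR" ...
```

The subshell around `cd` keeps the caller's working directory unchanged.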

Error in calling ascat.GCcorrect

Hi there,
I am trying to run ascat on my data

srun ascat.pl -outdir $OUTDIR -tumour $TUMOR -normal $NORMAL -reference $GENOME \
-snp_gc SnpGcCorrections.tsv -protocol WGS \
-gender XX -genderChr Y -cpus 18

and got an error no matter what SnpGcCorrections reference file I use.

Executing: /opt/cesga/ascatngs/4.0.0/gcc/5.3.0/bin/ascat.pl -outdir ASCAT/M7vsM6 -tumour MCRL007_S7/MCRL007_S7.sorted.dedup.bam -normal MCRL006_S6/MCRL006_S6.sorted.dedup.bam -reference hg19/genome.fa -snp_gc ASCAT/SnpGcCorrections.tsv -protocol WGS -gender XX -genderChr Y -cpus 18
"/usr/bin/time /mnt/lustre/scratch/home/usc/mg/szd/ASCAT/M7vsM6/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.sh 1> /mnt/lustre/scratch/home/usc/mg/szd/ASCAT/M7vsM6/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.out 2> /mnt/lustre/scratch/home/usc/mg/szd/ASCAT/M7vsM6/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.err" unexpectedly returned exit value 1 at /opt/cesga/pcap-core/3.4.1/gcc/5.3.0/lib/perl5/PCAP/Threaded.pm line 229.
 at /opt/cesga/pcap-core/3.4.1/gcc/5.3.0/lib/perl5/PCAP/Threaded.pm line 227
srun: error: c7145: task 0: Exited with exit code 1
srun: Terminating job step 1244346.0

cat /M7vsM6/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.err


/opt/cesga/R/3.4.0/gcc/5.3.0/bin/Rscript /opt/cesga/ascatngs/4.0.0/gcc/5.3.0/lib/perl5/auto/share/module/Sanger-CGP-Ascat-Implement/ascat/runASCAT.R /opt/cesga/ascatngs/4.0.0/gcc/5.3.0/lib/perl5/auto/share/module/Sanger-CGP-Ascat-Implement/ascat /mnt/lustre/scratch/home/usc/mg/szd/ASCAT/M7vsM6/tmpAscat/SnpPositions.tsv /mnt/lustre/scratch/home/usc/mg/szd/ASCAT/M7vsM6/tmpAscat/SnpGcCorrections.tsv Patient_3 Patient_3.count Patient_3 Patient_3.count XX 24 /mnt/lustre/scratch/home/usc/mg/szd/ASCAT/M7vsM6/tmpAscat/ascat/Patient_3.Rdata
Error in apply(corr_tot, 1, function(x) sum(abs(x * length_tot))/sum(length_tot)) : 
  dim(X) must have a positive length
Calls: ascat.GCcorrect -> apply
Execution halted
453.29user 2.62system 7:36.40elapsed 99%CPU (0avgtext+0avgdata 2141944maxresident)k
34928inputs+438728outputs (35major+708846minor)pagefaults 0swaps

What's wrong? I am not able to spot the error. Thank you in advance.

Kind regards,
Sonia

setup.sh alleleCount check gives wrong result

I'm doing an experimental install of some of the CancerIT tools right now (on behalf of a researcher -- not intending to use it myself), and I got to ascatNgs (using release 4.1.0). Running the setup gives:

$ ./setup.sh ../prefix
App::cpanminus is up to date. (1.7043)
/home/uccaiki/Scratch/CancerIT-login05/prefix/bin/cpanm
PREREQUISITE: Please install alleleCount version >= 3.3.0 before proceeding (Found version 3.3.1):
  https://github.com/cancerit/alleleCount/releases

So, it's complaining that 3.3.1 is less than 3.3.0?

Looking at the way the check result is determined, it looks like the result is the wrong way around:

version_gt () 
{ 
    test $(printf '%s\n' $@ | sort -V | head -n 1) == "$1"
}

### This is bash, so 0 is success/true, 1 is fail/false
$ version_gt 1 2; echo $?
0
$ version_gt 3 2; echo $?
1
$ version_gt 2 2; echo $?
0

So currently it's checking whether the version is less than or equal to.

The check in setup.sh has the obtained version as the first argument, and the required version as the second, so if the version is sufficient, it fails...

I'm having a little trouble understanding how this error got here: surely it would have turned up at some point?
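
One possible correction, sketched here rather than taken from setup.sh, is to test whether the first argument sorts last (or ties) instead of first. The name `version_ge` is hypothetical, and the sketch assumes a `sort` that supports `-V`, as the quoted function already does.

```bash
#!/usr/bin/env bash
# Hypothetical replacement for the quoted check: succeed (exit 0) when
# version $1 is greater than or equal to version $2.
version_ge() {
    # sort -V orders version strings; if $1 is the last entry it is the
    # larger of the two, or equal to $2.
    [ "$(printf '%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}
```

Called as `version_ge "$found" 3.3.0` (with `$found` standing in for the detected alleleCount version), this would accept 3.3.1 and 3.3.0 and reject 3.2.9.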

Human reference files from 1000 genomes VCFs : Illegal division by zero error

We get 'Illegal division by zero at ascatSnpPanelGcCorrections.pl line 81, <$SNPIN> line 1.'
while running the commands generated by the following step from "Convert SnpPositions.tsv to SnpGcCorrections.tsv":
ls -1 splitPos/ | xargs -I {} echo '(ascatSnpPanelGcCorrections.pl genome.fa splitPos/{} > splitGc/{}) >& splitGcLogs/{}.log &'

error while running ascatngs

I have tried to run ascatngs recently but got errors like:
"/usr/bin/time /scratch/wangyong/all/0A4I3S_ASCAT/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.sh 1> /scratch/wangyong/all/0A4I3S_ASCAT/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.out 2> /scratch/wangyong/all/0A4I3S_ASCAT/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.err" unexpectedly returned exit value 1 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 270.
at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 268

my command line is:
module load ascatngs;ascat.pl -o 0A4I0W_ASCAT -t 0A4I0W_Tumor.realigned.md.bam -n 0A4I0W_Normal.realigned.md.bam -r /fdb/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa -snp_gc /data/CCRBioinfo/wangyh/SnpGcCorrections.tsv -gc chrY -pr wgs -g XY

Not quite sure where the errors came from. Would you please give me some suggestions? For your convenience, I list the BAM header below. Checking the output directory, I found four subfolders (allele_count, ascat, logs and progress); all the files in "progress" have zero size.

bam file header:
@hd VN:1.5 GO:none SO:coordinate
@sq SN:chrM LN:16571
@sq SN:chr1 LN:249250621
@sq SN:chr2 LN:243199373
@sq SN:chr3 LN:198022430
@sq SN:chr4 LN:191154276
@sq SN:chr5 LN:180915260
@sq SN:chr6 LN:171115067
@sq SN:chr7 LN:159138663
@sq SN:chr8 LN:146364022
@sq SN:chr9 LN:141213431
@sq SN:chr10 LN:135534747
@sq SN:chr11 LN:135006516
@sq SN:chr12 LN:133851895
@sq SN:chr13 LN:115169878
@sq SN:chr14 LN:107349540
@sq SN:chr15 LN:102531392
@sq SN:chr16 LN:90354753
@sq SN:chr17 LN:81195210
@sq SN:chr18 LN:78077248
@sq SN:chr19 LN:59128983
@sq SN:chr20 LN:63025520
@sq SN:chr21 LN:48129895
@sq SN:chr22 LN:51304566
@sq SN:chrX LN:155270560
@sq SN:chrY LN:59373566
@sq SN:chr1_gl000191_random LN:106433
@sq SN:chr1_gl000192_random LN:547496
@sq SN:chr4_ctg9_hap1 LN:590426
@sq SN:chr4_gl000193_random LN:189789
@sq SN:chr4_gl000194_random LN:191469
@sq SN:chr6_apd_hap1 LN:4622290
@sq SN:chr6_cox_hap2 LN:4795371
@sq SN:chr6_dbb_hap3 LN:4610396
@sq SN:chr6_mann_hap4 LN:4683263
@sq SN:chr6_mcf_hap5 LN:4833398
@sq SN:chr6_qbl_hap6 LN:4611984
@sq SN:chr6_ssto_hap7 LN:4928567
@sq SN:chr7_gl000195_random LN:182896
@sq SN:chr8_gl000196_random LN:38914
@sq SN:chr8_gl000197_random LN:37175
@sq SN:chr9_gl000198_random LN:90085
@sq SN:chr9_gl000199_random LN:169874
@sq SN:chr9_gl000200_random LN:187035
@sq SN:chr9_gl000201_random LN:36148
@sq SN:chr11_gl000202_random LN:40103
@sq SN:chr17_ctg5_hap1 LN:1680828
@sq SN:chr17_gl000203_random LN:37498
@sq SN:chr17_gl000204_random LN:81310
@sq SN:chr17_gl000205_random LN:174588
@sq SN:chr17_gl000206_random LN:41001
@sq SN:chr18_gl000207_random LN:4262
@sq SN:chr19_gl000208_random LN:92689
@sq SN:chr19_gl000209_random LN:159169
@sq SN:chr21_gl000210_random LN:27682
@sq SN:chrUn_gl000211 LN:166566
@sq SN:chrUn_gl000212 LN:186858
@sq SN:chrUn_gl000213 LN:164239
@sq SN:chrUn_gl000214 LN:137718
@sq SN:chrUn_gl000215 LN:172545
@sq SN:chrUn_gl000216 LN:172294
@sq SN:chrUn_gl000217 LN:172149
@sq SN:chrUn_gl000218 LN:161147
@sq SN:chrUn_gl000219 LN:179198
@sq SN:chrUn_gl000220 LN:161802
@sq SN:chrUn_gl000221 LN:155397
@sq SN:chrUn_gl000222 LN:186861
@sq SN:chrUn_gl000223 LN:180455
@sq SN:chrUn_gl000224 LN:179693
@sq SN:chrUn_gl000225 LN:211173
@sq SN:chrUn_gl000226 LN:15008
@sq SN:chrUn_gl000227 LN:128374
@sq SN:chrUn_gl000228 LN:129120
@sq SN:chrUn_gl000229 LN:19913
@sq SN:chrUn_gl000230 LN:43691
@sq SN:chrUn_gl000231 LN:27386
@sq SN:chrUn_gl000232 LN:40652
@sq SN:chrUn_gl000233 LN:45941
@sq SN:chrUn_gl000234 LN:40531
@sq SN:chrUn_gl000235 LN:34474
@sq SN:chrUn_gl000236 LN:41934
@sq SN:chrUn_gl000237 LN:45867
@sq SN:chrUn_gl000238 LN:39939
@sq SN:chrUn_gl000239 LN:33824
@sq SN:chrUn_gl000240 LN:41933
@sq SN:chrUn_gl000241 LN:42152
@sq SN:chrUn_gl000242 LN:43523
@sq SN:chrUn_gl000243 LN:43341
@sq SN:chrUn_gl000244 LN:39929
@sq SN:chrUn_gl000245 LN:36651
@sq SN:chrUn_gl000246 LN:38154
@sq SN:chrUn_gl000247 LN:36422
@sq SN:chrUn_gl000248 LN:39786
@sq SN:chrUn_gl000249 LN:38502
@rg ID:2675 SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina
@rg ID:2683 SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep_corrected PL:Illumina
@rg ID:2696 SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina
@rg ID:825 SM:0A4I0W_Tumor LB:TARGET-40-0A4I0W-01A-01D-LibPrep PL:Illumina
@pg ID:GATK IndelRealigner VN:3.6-0-g89b7209 CL:knownAlleles=[(RodBinding name=knownAlleles source=/data/CCRBioinfo/zhujack/Ref/hg19/1000G_phase1.indels.hg19.vcf), (RodBinding name=knownAlleles2 source=/data/CCRBioinfo/zhujack/Ref/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf)] targetIntervals=/lscratch/31973919/realignment.intervals LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=500000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null
@pg ID:bwa PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @rg SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31952976/5_1_64H0FAAXX.293_BUSTARD-2011-11-19.fq.gz /lscratch/31952976/5_3_64H0FAAXX.293_BUSTARD-2011-11-19.fq.gz
@pg ID:bwa.1 PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @rg SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep_corrected PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31952895/1_1_70CTCAAXX.278_BUSTARD-2011-09-10.fq.gz /lscratch/31952895/1_2_70CTCAAXX.278_BUSTARD-2011-09-10.fq.gz
@pg ID:bwa.2 PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @rg SM:0A4I0W_Tumor LB:TARGET-40-0A4I0W-01A-01D-LibPrep PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31953249/TACAAG_3_1_AC0A7FACXX.297_BUSTARD-2011-12-15.fq.gz /lscratch/31953249/TACAAG_3_3_AC0A7FACXX.297_BUSTARD-2011-12-15.fq.gz
@pg ID:bwa.3 PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @rg SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31952936/2_1_70LKTAAXX.273_BUSTARD-2011-09-03.fq.gz /lscratch/31952936/2_2_70LKTAAXX.273_BUSTARD-2011-09-03.fq.gz
@pg ID:GATK PrintReads VN:3.6-0-g89b7209 CL:readGroup=null platform=null number=-1 sample_file=[] sample_name=[] simplify=false no_pg_tag=false
@pg ID:MarkDuplicates VN:2.1.1(6a5237c0f295ddce209ee3a3a5b83a3779408b1b_1457101272) CL:picard.sam.markduplicates.MarkDuplicates INPUT=[bam/0A4I0W_Tumor.realigned.bam] OUTPUT=/lscratch/31983274/realigned.md.bam METRICS_FILE=bam/0A4I0W_Tumor.realigned.md.bam.dupmetrics REMOVE_DUPLICATES=false ASSUME_SORTED=true TMP_DIR=[/lscratch/31983274] VALIDATION_STRINGENCY=SILENT MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json PN:MarkDuplicates

Thanks for the help

Add options for rho and psi to ascat.pl

This will enable partial rerun with improved params.

setup.sh fails to find pre-req PERL5LIB dirs

Hi,
Is supplying CGP_PERLLIBS the supported mechanism for telling setup.sh where the dependencies are?
If so, the installation instructions should say so. At the end of the installation of each dependency, PERL5LIB is set/updated, but the first thing the setup script does is reset it. The following seems to work, though:
export CGP_PERLLIBS=$PERL5LIB
./setup.sh /some/ascatngs/2.0.1

Some reference files are unnecessary

The data for -snp_loci and -snp_pos file options can all be generated on the fly from the -snp_gc file.

Update to do this internally rather than have duplication of data.
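
As a rough illustration of the idea for the positions file only, the sketch below assumes that the first three columns of the tab-separated -snp_gc file are SNP ID, chromosome and position, the same layout the SNP panel snippet elsewhere on this page prints for SnpPositions.tsv. That layout is an assumption and should be checked against real reference files; `snp_positions_from_gc` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Assumption: columns 1-3 of the GC corrections file are SNP ID, chromosome
# and position. Reads the GC file on stdin and writes those columns only.
snp_positions_from_gc() {
    cut -f 1-3
}

# Example use (commented out so the sketch has no side effects):
# snp_positions_from_gc < SnpGcCorrections.tsv > SnpPositions.tsv
```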

chdir doesn't work before execution of runASCAT.R (as it isn't the thread being executed)

The easiest fix is to prepend a cd to the command executed on line 114 of Implement.pm:

my $command = "cd $ascat_out; "._which('Rscript');

protect against accidental overwrite

Should fail to start if base output folder contains logs/ which indicates that a run has completed.
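
The same check can be sketched as a wrapper-side guard. This is only an illustration of the rule described above, not the ascat.pl implementation; `refuse_if_completed` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical guard: refuse to start when the base output folder already
# holds a logs/ directory, which this issue treats as the sign of a
# completed run.
refuse_if_completed() {
    local outdir=$1
    if [ -d "$outdir/logs" ]; then
        echo "ERROR: $outdir/logs exists; refusing to overwrite a completed run" >&2
        return 1
    fi
}

# Example use (commented out so the sketch has no side effects):
# refuse_if_completed "$OUTDIR" && ascat.pl -outdir "$OUTDIR" ...
```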

Speedup ASPCF

Found a line in the original ASCAT paper describing how we can improve turnaround time.

Relevant quote:

The ASPCF segmentation algorithm is the computationally intensive step of the pipeline. However, this step can be executed in parallel, by using, e.g.

ascat.bc = ascat.aspcf(ascat.bc, 1:5)

to segment the first five samples of a dataset. For every sample, two files are created containing the segmented Log R and BAF data. When these files exist upon execution of the ascat.aspcf() function, the results are read from disk rather than recalculating. Hence, by first splitting the segmentation over multiple processors, copying the resulting segmentation files to one directory and finally executing

ascat.bc = ascat.aspcf(ascat.bc)

this segmentation can be easily parallelized.

From this it looks like we would change the following line:

ascatNgs/perl/share/ascat/runASCAT.R

Line 170 in a8a02a7

ascat.bc = ascat.aspcf(ascat.bc)

ascat.bc = ascat.aspcf(ascat.bc, 1:2) # we only have 2 samples in a paired analysis
# possibly move some files around here not completely clear.
ascat.bc = ascat.aspcf(ascat.bc)

Typo in ascat.pl -h usage

Need to check all of it but this jumped out at me:

Please defined as many of the parameters as possible

should be

Please define as many of the parameters as possible

How to make ASCAT input files

I am very new to ASCAT and have WGS data. Can you please tell me how to create the ASCAT inputs (BAF and LogR files) from WGS data? What do I need to know, and where can I find it?

Thanks

GC content file

I am trying to create a custom set of probe positions, different from the provided SNP6 probes, to use with my exome sequences. For the GC_content file, I am using bedtools nuc to compute GC content in bins of length 200bp, 400bp, 1M, 10M etc. around each probe position. For example, the 10M bin spans 5M on each side, and if the probe's distance to the beginning or end of the chromosome is less than 5M, I take 5M to one side and all the remaining distance to the beginning/end. Is that right?
What does the column named "Probe" mean in the file?

readme formatting is messed up

Need to clean this up

Add even more obvious indicator not to be used for WXS

As repeatedly handling requests relating to Zfish WXS

Strip prefix chr from counts file before passing to ASCAT

Need to handle chr prefix on chromosome names.
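
A minimal sketch of the stripping step, assuming the counts file has the chromosome name at the start of each data line and a header beginning with `#CHR`, as in the allele_count output quoted in another issue on this page. `strip_chr_prefix` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical filter: remove a leading "chr" from every line read on stdin.
# The "#CHR ..." header is left alone because it starts with "#".
strip_chr_prefix() {
    sed 's/^chr//'
}

# Example use (commented out so the sketch has no side effects):
# strip_chr_prefix < tumour.count > tumour.nochr.count
```

This only removes the prefix; it does not rename contigs whose names differ in other ways (for example chrM versus MT).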

Extract min/max ploidy/purity from distance matrix

As per new feature:

VanLoo-lab/ascat@647c30b#diff-73a99cdb7e85519b298eaa5990c7d425

Also ensure that the distance matrix is not saved to Rdata object.

RColorBrewer not listed as dependency for R

Known issue handled on docker images but not explicit here.

How to output all the solutions that ASCATngs found?

How can I output all the solutions that ASCATngs found? At the moment the software only outputs the best solution, with purity, ploidy and copy numbers. However, sometimes that solution is not perfect. So how can I output the other solutions?

Update readme to show core ascat release

Add a section linking to the release we are basing ours on.

Running ascatNGS without matched normal data

Hello developers,
I'd like to apply ascatNGS to a cell line sample. I wonder if it is possible to run ascatNGS without a matching normal BAM file, in the same way as its microarray counterpart https://www.crick.ac.uk/peter-van-loo/software/ASCAT

Thank you
Vlad

ascat.pl -gc usage typo

Should say:

Specify the 'Male' sex chromosome: Y,chrY...

Handle variable chr count

Currently builds data structures with 24 elements, setting 22,X,Y

ascatNgs/perl/share/ascat/runASCAT.R

Lines 79 to 80 in e606c12

 ctrans = 1:24 

 names(ctrans)=c(1:22,"X","Y")

Need to get this to take in the max numeric chr as a parameter (or determine internally).

Mouse works with this as is, but likely will cause side-effects or break for genomes with more chrs.

Patch in unmerged ASCAT fix

Patch in the changes suggested here:

VanLoo-lab/ascat#21

error about Genotype.pm not readable

(sorry problem identified)

setting gender to XY

Hi, I was using ascatNGS with WGS data.
I just found out that if I set the gender to XY, the CNV segment result shows a deletion in chrX.
Even when the data is from a female, setting gender=XY produces a deletion in chrX.
The CNV segments on the other chromosomes also differ between setting gender to XY and XX.
I'm wondering why this happens?

CN_to_VCF.pl not being deployed

Looks like the new script was not added to Makefile.PL.

Understanding Ascat output

Hello,

We are using ascatNgs 1.7.2 to analyze some whole-exome datasets. We are getting unexpected outputs for most of our samples. Here is the output for one of the samples:

Can you tell me if there could be something wrong with the input files or whether it is common to see such an output?

Usage documentation for ASCAT NGS

Dear developers,

I am a research associate at Sloan Kettering Cancer Center in NYC. I am building a pipeline from different software tools for cancer exome data and downstream analysis, and would be very interested in including ASCAT NGS.

Would you by any chance have a manual available somewhere on how to install and, specifically, how to use ASCAT NGS?

I took a look at the ASCAT official webpage and I could not see any reference yet to ASCAT NGS. I then contacted someone managing the webpage and they prompted me to ask you.

I would highly appreciate your valuable help.

Many thanks in advance for your kind attention,

Yours sincerely,

Pedro.

Link to ASCAT web-site

Update the link in README.md to point to:

https://www.crick.ac.uk/peter-van-loo/software/ASCAT

Also add a link to the GitHub site for the original R code:

https://github.com/Crick-CancerGenomics/ascat

hg38 Snp GC correction file

Is there a possibility to generate such file for hg38?

Expected ASCAT output file missing: TUMORNAME.aberrationreliability.png

Hi,

I am running ascatNgs v1.5.2 for whole-exome data. At the second step, i.e. -p ascat, the .aberrationreliability.png file is sometimes not created, which causes finalise to fail. Do you have a workaround for this issue?

Thanks a lot,
Komal

Generate usable generic output when 'force' option is specified

Primarily added to allow dependent process to continue if ASCAT fails to generate a solution. Basically generate copynumber and contamination values equal to the CaVEMan defaults for genomes. Comment/warning in sample.statistics.csv to indicate this is the case.

Fold in the C alleleCount script

Use new version of alleleCount and update code to use the C implementation

allele_count from ascatNgs are all zeros

Hello guys,

I have a problem running ascatNgs (version 4.0.0).

We used a tumor sample BAM file and a normal sample BAM file as inputs, as the program asks. Also, following this page (https://github.com/cancerit/ascatNgs/wiki/Mouse-reference-files-from-Mouse-Genome-Project-VCFs), we produced the "Snp GC correction file". We used the following command to run the program:
ascat.pl -outdir ./result/ -tumour tumor.bam -normal normal.bam -reference mm10.fa -snp_gc SnpGcCorrections.tsv -protocol WGS -gender XY -genderChr Y -species GRCm38 -assembly GRCm38.p3 -platform Illumina HiSeq 2500 -cpus 2 -force -noclean

But the program aborted automatically, and I noticed that the results (allele_count/*.allct) from the program are all zeros, as shown below:
#CHR POS Count_A Count_C Count_G Count_T Good_depth
7 3255367 0 0 0 0 0
7 3475891 0 0 0 0 0
7 3476046 0 0 0 0 0
7 3478293 0 0 0 0 0
7 3480843 0 0 0 0 0
7 3482833 0 0 0 0 0
7 3485157 0 0 0 0 0
7 3486183 0 0 0 0 0
7 3489356 0 0 0 0 0

I'm not sure why I got a result like this and I don't know where the problem is. Could anyone help me with this issue? Thanks in advance.

Best,

Freya

samplename not found

Hi,

I am getting the following error while running ascat.pl :

WARN: Failed to find samplename in RG headers of /<path>/*.bam
Use of uninitialized value $sample in substitution (s///) at /apps/ascatngs/2.0.1/lib/perl5/Sanger/CGP/Ascat/Implement.pm line 280.

Is there a workaround for this?

Many thanks,
Hasan
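
Taking the warning at face value, one way to check whether the BAM header carries a sample name is to look for an SM tag on its @RG lines. The sketch below reads header text on stdin; `rg_sample_names` is a hypothetical helper, and whether a missing SM tag is the actual cause here is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the distinct SM values found on @RG lines of a
# SAM header read from stdin. Exits non-zero when no SM tag is present.
rg_sample_names() {
    awk -F'\t' '$1 == "@RG" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SM:/) print substr($i, 4)
    }' | sort -u | grep .
}

# Example use (commented out; needs samtools and a real BAM file):
# samtools view -H tumour.bam | rg_sample_names
```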

Handle chr/non-chr magically

It should be possible to check the BAM @SQ lines against the loci files and automatically handle the mismatch.
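
A sketch of the detection half of this idea, working on plain text so it can be tried without a BAM file. Both function names are hypothetical, and the chromosome column of the loci file is passed in by the caller because its position is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report "chr" if any @SQ line of a SAM header (stdin)
# has a sequence name starting with "chr", otherwise "plain".
header_chr_style() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:chr/) found = 1
    }
    END { print (found ? "chr" : "plain") }'
}

# Hypothetical helper: the same report for a tab-separated loci file (stdin),
# where $1 is the 1-based column that holds the chromosome name.
loci_chr_style() {
    awk -F'\t' -v col="$1" '$col ~ /^chr/ { found = 1 }
    END { print (found ? "chr" : "plain") }'
}

# Example use (commented out; needs samtools and real input files):
# bam_style=$(samtools view -H tumour.bam | header_chr_style)
# loci_style=$(loci_chr_style 2 < SnpGcCorrections.tsv)
# [ "$bam_style" = "$loci_style" ] || echo "chr naming mismatch" >&2
```

Once a mismatch is detected, the prefix could be added or stripped on one side before counting; that step is left out here.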

Move 'Executing:...' to after version output

Also output version as VERSION: x.x.x for clarity

Incomplete output running ascat.pl

Dear all,

I am having trouble running ascat.pl. It was installed with all the tools offered by dockstore-cgpwgs, which is installed as a Singularity container on my university's supercomputer. I am using the test data offered on your website along with my own data. So far I have run bas_stats and compareBamGenotypes.pl, and those tools worked perfectly fine.

However, I was running the ascat.pl with the required parameters as follows:

module load brass
singularity exec --bind /scratch /local/software/brass/source/brass2.img ascat.pl -o /home/lym1e14/YURANY/Bioinformatic_tools/test_data/test_1/ascat/output -t /home/lym1e14/YURANY/Bioinformatic_tools/test_data/data/cell-line/HCC1143_BL.bam -n /home/lym1e14/YURANY/Bioinformatic_tools/test_data/data/cell-line/HCC1143.bam -r /home/lym1e14/YURANY/Bioinformatic_tools/test_data/cgp-test/reference_files/genome.fa -sg /home/lym1e14/YURANY/Bioinformatic_tools/test_data/cgp-test/reference_files/ascat/SnpGcCorrections.tsv -sp /home/lym1e14/YURANY/Bioinformatic_tools/test_data/cgp-test/reference_files/ascat/SnpPositions.tsv -s /home/lym1e14/YURANY/Bioinformatic_tools/test_data/cgp-test/reference_files/ascat/SnpLocus.tsv -q 20 -g L -rs HUMAN -ra GRCh37 -pr WGS -pl ILLUMINA -c 8

I got the tmpAscat directory with the following directories and files:
0.normal_gender.tsv SnpGcCorrections.tsv SnpPositions.tsv allele_count ascat logs progress

In ascat, there are the count files for the normal and tumour BAM files. In progress, there are files named Sanger_CGP_Ascat_Implement_allele_count.XX (XX = 1 to 44), but they are empty. In logs, all the .out files are empty.

At the end, I got the following message:

"/usr/bin/time /scratch/EXOME_DATA/YURANY/Bioinformatic_tools/test_data/test_1/ascat/output/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.sh 1> /scratch/EXOME_DATA/YURANY/Bioinformatic_tools/test_data/test_1/ascat/output/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.out 2> /scratch/EXOME_DATA/YURANY/Bioinformatic_tools/test_data/test_1/ascat/output/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.err" unexpectedly returned exit value 1 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 263.
at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 261

I thought I had to invoke R within the container, and I tried a couple of options without success.

I would be really grateful if you could help me understand the reason for the failure of this command.

Many thanks,

Yurany

Input step - make per-chromosome

The input step takes far longer than necessary (wall-time). Should be modified to work on per-chr basis.

Current usage can be maintained by doing the merging in the beginning of the ascat process.

Remove ascat.R from distribution

Have setup.sh pull from the ascat repository archives:

https://github.com/Crick-CancerGenomics/ascat

Probably a good idea to allow the version number to be presented to the setup script so that users can specify later versions without a release of our code (assuming no API changes).
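
A sketch of how setup.sh might take the version as a parameter. It relies on GitHub's standard `archive/<tag>.tar.gz` download path; the tag naming used by the ascat repository is an assumption, and `ascat_archive_url` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the GitHub source-archive URL for a given tag of
# the ascat repository. The tag format is an assumption; check the
# repository's tag list for the real names.
ascat_archive_url() {
    local version=${1:?usage: ascat_archive_url <tag>}
    printf 'https://github.com/Crick-CancerGenomics/ascat/archive/%s.tar.gz\n' "$version"
}

# Example use (commented out; needs network access and a real tag):
# curl -sSL "$(ascat_archive_url "$ASCAT_TAG")" | tar -xz
```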

fail on CN_to_VCF step

ascatNgs fails on CN_to_VCF at the end of the run with the following error:
"/usr/bin/perl /opt/share/ascatNgs/bin/CN_to_VCF.pl -o xx.copynumber.caveman.vcf -r/human_g1k_v37.fasta -i tumor.copynumber.caveman.csv -sbm tumor.bam -sbw normal.bam" unexpectedly returned exit value 1 at (eval 319) line 13.
at /opt/share/ascatNgs/lib/perl5/PCAP/Threaded.pm line 207

running the command gives:
ERROR: rs|reference-species must be defined.

If the species is defined with -rs, it then complains about the assembly, and if the assembly is defined, it complains about the platform.

My bam header looks like this
@RG ID:1 PL:Illumina PU:machine1 LB:library1 SM:tumor
@PG ID:bwa PN:bwa VN:0.5.9-r16

I don't need the VCF output, I just want the run to complete successfully.

Modifying Ascat.R parameters

Hi,

I want to modify some core ASCAT parameters that are hardcoded in the ascat.R script (such as MINPLOIDY, etc.). My question is, if I modify the ascat.R present under $(INST_LIB)/auto/share/module/Sanger-CGP-Ascat-Implement/ascat/ascat.R , will those be picked up during the ascatNgs run? Or should I modify those prior to installation?
I am illiterate in perl, so I am having trouble trying to track how the core ascat.R script is being used in ascatNgs.
Many thanks,
Jose

Add detail of anticipated minimum depth of genome for successful results

We'd expect it to work with ~10x+

all cmd paths need to be fully expanded

Or internals need re-write to handle relative paths

1000 genome SNP panel generation

The code snippet intended to create the 1000g SNP panel:

$ export TG_DATA=ftp://ftp.ensembl.org/pub/grch37/release-83/variation/vcf/homo_sapiens/1000GENOMES-phase_3.vcf.gz
$ curl -sSL $TG_DATA | zgrep -F 'E_Multiple_observations' | grep -F 'TSA=SNV' |
perl -ane '
next if($F[0] !~ m/^\d+$/ && $F[0] !~ m/^[XY]$/);
next if($F[0] eq $l_c && $F[1]-1000 < $l_p);
$F[7]=~m/MAF=([^;]+)/; next if($1 < 0.05);
printf "%s\t%s\t%d\n", $F[2],$F[0],$F[1];
$l_c=$F[0]; $l_p=$F[1];
' > SnpPositions_GRCh37_1000g.tsv

gives the error:

Can't modify single ref constructor in scalar assignment at -e line 6, near "];"
syntax error at -e line 7, at EOF
Execution of -e aborted due to compilation errors.

Can it work without matched control?

Hi, ascat can work without a matched control when analysing array data. Is it possible to do the same with ascatNGS? I tried with unmatched control and the BAF was a mess and affected the absolute copy number inference.

X.samplestatistics.csv should be X.samplestatistics.txt

The file is neither CSV nor TSV; correct the file naming.
