mikessh / vdjtools
Post-analysis of immune repertoire sequencing data
Home Page: vdjtools-doc.readthedocs.io
License: Other
subj
Hi everybody!
I am using VDJtools for RNA-seq analysis, but I am not strong in informatics, mathematics, or statistics. I am just a biologist :)
So I am asking how the "relative overlap diversity" is measured. I expected this value to be quite different from a similarity index, but the two seem to follow the same variations when I overlap my samples...
Thanks in advance for the answer.
Best regards
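One possible reason the two values move together: assuming relative overlap diversity is the number of shared clonotypes divided by the product of the two sample diversities (my reading of the VDJtools documentation, not something confirmed in this thread), both it and a Jaccard-style similarity index grow with the number of shared clonotypes. A minimal Python sketch with made-up CDR3 strings, where the function name and the formula for D are assumptions for illustration only:

```python
def overlap_metrics(sample_a, sample_b):
    """Return (relative overlap diversity, Jaccard index) for two clonotype sets.

    ASSUMPTION: D = shared / (diversity_a * diversity_b). This is an
    illustration, not the VDJtools implementation.
    """
    a, b = set(sample_a), set(sample_b)
    if not a or not b:
        raise ValueError("both samples must contain at least one clonotype")
    shared = len(a & b)                 # clonotypes present in both samples
    d = shared / (len(a) * len(b))      # assumed relative overlap diversity
    jaccard = shared / len(a | b)       # classic Jaccard similarity index
    return d, jaccard


# Made-up CDR3 amino acid sequences, for illustration only
sample_1 = {"CASSLGQ", "CASSPDR", "CASSQET", "CASRGTE"}
sample_2 = {"CASSQET", "CASRGTE", "CASSYEQ"}
d, jaccard = overlap_metrics(sample_1, sample_2)
print(d, jaccard)
```

Because both values share the same numerator, samples of similar size will tend to rank pairs in the same order under either metric; they diverge mainly when sample diversities differ a lot.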
When running the ApplySampleAsFilter routine, I get the same output whether or not I include "-e" or "--negative" in [options]. I am running vdjtools-1.0.6. Any advice would be much appreciated.
The full software list is available here
subj.
EDIT: Use suffix tree searches (SequenceTreeMap from milib)
Clonotypes and frequencies:
Segments:
A Java NullPointerException occurs if I call the CalcCdrAAProfile routine. As far as I can see, this seems to happen even before command-line arguments are read. Here's what the _vdjtools_error.log says:
[Wed Jul 27 11:37:35 CEST 2016 BEGIN]
[Script]
CalcCdrAAProfile
[CommandLine]
executing vdjtools-1.1.0.jar CalcCdrAAProfile -h
[Message]
java.lang.NullPointerException: Cannot invoke method join() on null object
[StackTrace-Short]
com.antigenomics.vdjtools.profile.CalcCdrAAProfile.run(CalcCdrAAProfile.groovy:41)
com.antigenomics.vdjtools.profile.CalcCdrAAProfile$run.call(Unknown Source)
com.antigenomics.vdjtools.misc.ExecUtil.run(ExecUtil.groovy:94)
com.antigenomics.vdjtools.misc.ExecUtil$run.call(Unknown Source)
com.antigenomics.vdjtools.VdjTools.run(VdjTools.groovy:207)
com.antigenomics.vdjtools.VdjTools.main(VdjTools.groovy)
[StackTrace-Full]
java.lang.NullPointerException: Cannot invoke method join() on null object
at org.codehaus.groovy.runtime.NullObject.invokeMethod(NullObject.java:88)
at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:45)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.NullCallSite.call(NullCallSite.java:32)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
at com.antigenomics.vdjtools.profile.CalcCdrAAProfile.run(CalcCdrAAProfile.groovy:41)
at com.antigenomics.vdjtools.profile.CalcCdrAAProfile$run.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:112)
at com.antigenomics.vdjtools.misc.ExecUtil.run(ExecUtil.groovy:94)
at com.antigenomics.vdjtools.misc.ExecUtil$run.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:120)
at com.antigenomics.vdjtools.VdjTools.run(VdjTools.groovy:207)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1085)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:909)
at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:901)
at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:884)
at org.codehaus.groovy.runtime.InvokerHelper.runScript(InvokerHelper.java:406)
at org.codehaus.groovy.runtime.InvokerHelper$runScript.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:120)
at com.antigenomics.vdjtools.VdjTools.main(VdjTools.groovy)
[END]
Investigate V/D/J end/start coords reporting convention in various tools and update clonotype parsing accordingly
Implement sample pool (mass join)
Total diversity estimate for a population of samples using Chao2
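For reference, the classic incidence-based Chao2 estimator can be sketched as below. Each sample is treated as one sampling unit, Q1 and Q2 are the numbers of clonotypes seen in exactly one and exactly two samples, and the bias-corrected form is used when Q2 is zero. Identifying clonotypes by plain strings and the function name `chao2` are assumptions for illustration; this is not VDJtools code.

```python
from collections import Counter


def chao2(samples):
    """Chao2 estimate of total richness from presence/absence across samples.

    samples: iterable of collections of clonotype identifiers (hypothetical
    plain strings here), one collection per sample.
    """
    samples = [set(s) for s in samples]
    m = len(samples)
    if m == 0:
        raise ValueError("at least one sample is required")
    incidence = Counter()
    for s in samples:
        incidence.update(s)             # number of samples containing each clonotype
    s_obs = len(incidence)              # observed richness over all samples
    q1 = sum(1 for c in incidence.values() if c == 1)   # uniques
    q2 = sum(1 for c in incidence.values() if c == 2)   # duplicates
    k = (m - 1) / m                     # small-sample correction factor
    if q2 > 0:
        return s_obs + k * q1 * q1 / (2 * q2)
    # Bias-corrected form, defined even when no clonotype occurs in exactly two samples
    return s_obs + k * q1 * (q1 - 1) / (2 * (q2 + 1))
```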
More meaningful error message = less confused biologists
hydropathy and charge in source code.
Base all sample connections on InputStream
I'm using the latest version of MiTCR on a TRA experiment (mouse). The conversion fails as shown here. Can you advise on what is wrong?
$ vdjtools Convert -S mitcr full/mid1_clones.csv vdjtools/mid1_clones.csv
Executing com.antigenomics.vdjtools.misc.Convert -S mitcr full/mid1_clones.csv vdjtools/mid1_clones.csv
[Mon Sep 12 13:18:04 CEST 2016 Convert] Reading sample(s)
[Mon Sep 12 13:18:04 CEST 2016 Convert] 1 sample(s) loaded
[Mon Sep 12 13:18:04 CEST 2016 SampleStreamConnection] Loading sample mid1_clones
[ERROR] java.lang.RuntimeException: Unable to parse clonotype string 1 2687 0.0605548419083677 TGTGCTTTGCGGGGGCAGCAAGGCACTGGGTCTAAGCTGTCATTT GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG TRAV12-1*03(306.7),TRAV12-1*04(306.6),TRAV12D-2*04(300.1),TRAV12D-2*02(298.9),TRAV12N-2*01(298.6) TRAJ58*01(308.7) 270|279|304|0|9||45.0;270|279|302|0|9||45.0;267|276|300|0|9||45.0;267|276|301|0|9||45.0;267|276|301|0|9||45.0 21|52|83|14|45||155.0 TGTGCTTTGCGGGGGCAGCAAGGCACTGGGTCTAAGCTGTCATTT 38 CALRGQQGTGSKLSF :::::::::0::9:::::14::45::: for MiTcr input type., see _vdjtools_error.log for details
FullCdr3 does not extend Cdr3Region, which leads to an exception in KnownCdr3Regions.
...
private final Map<String, Cdr3Region> regions
...
Add R scripts for NDN region size, N region size, CDR3 length
subj
We have ImmunoSeq datasets containing the value "unresolved" in the columns v, d or j.
If this occurs at least once in both v and d, the fancyvj plot fails.
I think the reason is line 75 of vj_pairing_plot.r. This line, grid.col = c(rcols, ccols), produces a vector with two occurrences of "unresolved" (one from v and one from d). Upon transformation to a factor, this messes up the plotting.
However, you could also argue that filtering "unresolved" before analyzing would be the preferred way to go anyway. This is just to let you know that we ran into this problem.
All the best and thanks a lot for this amazing piece of software!
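The workaround of filtering "unresolved" before analysis could look roughly like the sketch below. It assumes a tab-delimited table with lowercase v, d and j columns as named in the report; the helper name `drop_unresolved` and the example rows are made up for illustration.

```python
import csv
import io


def drop_unresolved(rows, columns=("v", "d", "j"), marker="unresolved"):
    """Yield only rows in which none of the given columns equals the marker."""
    for row in rows:
        if all(row.get(col) != marker for col in columns):
            yield row


# Tiny made-up table, for illustration only
table = (
    "count\tv\td\tj\n"
    "10\tTRBV1\tTRBD1\tTRBJ1-1\n"
    "5\tunresolved\tTRBD1\tTRBJ1-2\n"
    "3\tTRBV2\tunresolved\tTRBJ2-1\n"
)
rows = list(csv.DictReader(io.StringIO(table), delimiter="\t"))
kept = list(drop_unresolved(rows))
print(len(rows), "rows read,", len(kept), "kept")
```

Note that dropping rows changes the total count, so any frequency column would need to be recomputed afterwards.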
Sample Annotation class
database.each { String seq ->
dbCdrFreqs.put(seq, 0)
}
causes an error (0 is an Integer value)
resolved by
database.each { String seq ->
dbCdrFreqs.put(seq, (Double) 0)
}
Implement/refine tools for cloneset annotation using our curated database
Hello all.
I installed vdjtools 1.2 on my server as root.
But when my users run the tool, they get this error:
$ /srv/dna_tools/vdjtools-1.1.1/vdjtools CalcSegmentUsage -m /shared/vdjtools/CalcSegmentUsage/input/metadata.txt -p -f age -n /shared/vdjtools/CalcSegmentUsage/output/2
[RUtil] Executing Rscript vexpr_plot.r /shared/vdjtools/CalcSegmentUsage/output/2.segments.wt.V.txt 48 0 3 TRUE /shared/vdjtools/CalcSegmentUsage/output/2.segments.wt.V.pdf
[ERROR] Loading required package: gplots
Attaching package: ‘gplots’
The following object is masked from ‘package:stats’:
lowess
Loading required package: RColorBrewer
Loading required package: ggplot2
Loading required package: plotrix
Attaching package: ‘plotrix’
The following object is masked from ‘package:gplots’:
plotCI
Error in pdf(fname) :
cannot open file '/shared/vdjtools/CalcSegmentUsage/output/2.segments.wt.V.pdf'
Calls: custom.dev -> pdf
Execution halted
[RUtil] Executing Rscript vexpr_plot.r /shared/vdjtools/CalcSegmentUsage/output/2.segments.wt.J.txt 13 0 3 TRUE /shared/vdjtools/CalcSegmentUsage/output/2.segments.wt.J.pdf
[ERROR] Loading required package: gplots
Attaching package: ‘gplots’
The following object is masked from ‘package:stats’:
lowess
Loading required package: RColorBrewer
Loading required package: ggplot2
Loading required package: plotrix
Attaching package: ‘plotrix’
The following object is masked from ‘package:gplots’:
plotCI
Error in pdf(fname) :
cannot open file '/shared/vdjtools/CalcSegmentUsage/output/2.segments.wt.J.pdf'
Calls: custom.dev -> pdf
Execution halted
All files in the tool folder have permission 777.
vdjtools Rinstall ran correctly and installed the packages into the Rpackages folder inside the tool folder. When I run the tool as root, everything works.
Why does this error occur?
subj.
Hi there!
I'm running the latest version of VDJtools on Mac OS X (installed via Homebrew). I can't run PlotFancySpectratype; the following error pops up:
Executing com.antigenomics.vdjtools.basic.PlotFancySpectratype VDJ_.3_Nt-sequences.txt FINAL
[Thu Nov 17 21:31:17 CST 2016 PlotFancySpectratype] Reading sample
[Thu Nov 17 21:31:17 CST 2016 SampleStreamConnection] Loading sample VDJ_.3_Nt-sequences
[Thu Nov 17 21:31:18 CST 2016 ClonotypeStreamParser] Finished parsing. 1 header and 0 bad line(s) were skipped.
[Thu Nov 17 21:31:18 CST 2016 SampleStreamConnection] Loaded sample VDJ_.3_Nt-sequences with 508 clonotypes and 518 cells. Memory usage: 4 of 8 GB
[Thu Nov 17 21:31:18 CST 2016 PlotFancySpectratype] Writing output and plotting data
[RUtil] Executing Rscript fancy_spectratype.r FINAL.fancyspectra.txt FINAL.fancyspectra.pdf Clonotype TRUE
[ERROR] Loading required package: ggplot2
*** caught segfault ***
address 0x18, cause 'memory not mapped'
Traceback:
1: dyn.load(file, DLLpath = DLLpath, ...)
2: library.dynam(lib, package, package.lib)
3: loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
4: asNamespace(ns)
5: namespaceImportFrom(ns, loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]), i[[2L]], from = package)
6: loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
7: namespaceImport(ns, loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]), from = package)
8: loadNamespace(package, lib.loc)
9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call)[1L] prefix <- paste("Error in", dcall, ": ") LONG <- 75L msg <- conditionMessage(e) sm <- strsplit(msg, "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e), "\n") .Internal(seterrmessage(msg[1L])) if (!silent && identical(getOption("show.error.messages"), TRUE)) { cat(msg, file = stderr()) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
13: try({ attr(package, "LibPath") <- which.lib.loc ns <- loadNamespace(package, lib.loc) env <- attachNamespace(ns, pos = pos, deps)})
14: library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, warn.conflicts = warn.conflicts, quietly = quietly)
15: doTryCatch(return(expr), name, parentenv, handler)
16: tryCatchOne(expr, names, parentenv, handlers[[1L]])
17: tryCatchList(expr, classes, parentenv, handlers)
18: tryCatch(library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, warn.conflicts = warn.conflicts, quietly = quietly), error = function(e) e)
19: require(ggplot2)
An irrecoverable exception occurred. R is aborting now ...
The error occurs when I run the following part of the fancy_spectratype.R script in RStudio:
ggplot(df.m, aes(x = Len, y = value, fill = variable)) +
geom_bar(width = 1, stat = "identity") +
xlab("CDR3 length, bp") +
labs(fill=label) +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
scale_fill_manual(values=c("grey75", pal)) +
theme_bw() +
theme(legend.text=element_text(size=8), axis.title.y=element_blank()) +
guides(fill = guide_legend(reverse = TRUE))
Currently running R 3.3.1, and have been able to use PlotVJFancy and CalcBasicStats and CalcSpectratype without issue. Thanks in advance for your help!
Check data from https://clients.adaptivebiotech.com/publishedProjects, update test file & re-implement parser
Utilize methods from Common/Misc classes instead of manual implementation in worker scripts
Hi Mikhail and @antigenomics team!
Have really enjoyed using VDJTools so far for human data. Had to do a vanilla install of R 3.3.0 to get it working on my Mac but otherwise no issues until I tried to import IMGT alignments of mouse TRB sequences. The overwhelming majority of reads were rejected as bad lines (e.g. 38/18276 lines were retained after conversion from IMGT) -- is this because the mouse gene lists are currently unsupported in VDJTools? Thanks a ton for your help.
Implement re-normalizing utility (clonotype freqs in output sum to 1.0)
Consider cases when it is needed to retain clonotype frequencies (total < 1.0), e.g. VDJdb output re-annotation. Perhaps this can be made the default option.
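A minimal sketch of the re-normalization idea, assuming clonotype frequencies are held in a plain mapping from clonotype identifier to frequency. The function name and the `keep_original` switch are hypothetical and only illustrate the "retain frequencies" case described above; this is not VDJtools code.

```python
def renormalize(freqs, keep_original=False):
    """Return clonotype frequencies rescaled to sum to 1.0.

    freqs: dict mapping a clonotype identifier to its frequency.
    keep_original: if True, return the frequencies unchanged, e.g. when a
    filtered subset should keep its share of the parent sample (total < 1.0).
    """
    if keep_original:
        return dict(freqs)
    total = sum(freqs.values())
    if total <= 0:
        raise ValueError("total frequency must be positive")
    return {clonotype: f / total for clonotype, f in freqs.items()}


# Made-up frequencies left over after filtering, summing to 0.5
subset = {"CASSLGQ": 0.2, "CASSPDR": 0.3}
print(renormalize(subset))
print(renormalize(subset, keep_original=True))
```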
cdr3align
Introduction
VDJtools is an open-source Java/Groovy-based framework designed to facilitate analysis of immune repertoire sequencing (RepSeq) data. VDJtools computes a wide set of statistics and is able to perform diverse cross-sample analysis. Both comprehensive tabular output and publication-ready plots are provided.
RepSeq link does not work
http://www.ncbi.nlm.nih.gov/pubmed/22043864a
Likely an issue with ggplot
java -Xmx20G -jar vdjtools-1.0.7.jar OverlapPair -p ./samples/TW437.txt ./samples/TW438.txt out/A
Executing com.antigenomics.vdjtools.overlap.OverlapPair -p ./samples/TW437.txt ./samples/TW438.txt out/A
[Sat Mar 26 23:15:51 CST 2016 OverlapPair] Reading samples ./samples/TW437.txt and ./samples/TW438.txt
[Sat Mar 26 23:15:51 CST 2016 SampleStreamConnection] Loading sample TW437
[Sat Mar 26 23:15:53 CST 2016 ClonotypeStreamParser] Finished parsing. 1 header and 0 bad line(s) were skipped.
[Sat Mar 26 23:15:53 CST 2016 SampleStreamConnection] Loaded sample TW437 with 18262 clonotypes and 3272130 cells. Memory usage: 1 of 18 GB
[Sat Mar 26 23:15:53 CST 2016 SampleStreamConnection] Loading sample TW438
[Sat Mar 26 23:15:54 CST 2016 ClonotypeStreamParser] Finished parsing. 1 header and 0 bad line(s) were skipped.
[Sat Mar 26 23:15:54 CST 2016 SampleStreamConnection] Loaded sample TW438 with 28973 clonotypes and 3317916 cells. Memory usage: 1 of 18 GB
[Sat Mar 26 23:15:54 CST 2016 OverlapPair] Intersecting
[Sat Mar 26 23:15:54 CST 2016 Overlap] Intersecting samples #0 and 1
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing Correlation
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing Diversity
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing Frequency
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing Frequency2
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing vJSD
[Sat Mar 26 23:15:54 CST 2016 SegmentUsage] Processing sample TW437
[Sat Mar 26 23:15:54 CST 2016 SegmentUsage] Processing sample TW438
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing vjJSD
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing vj2JSD
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing sJSD
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing Jaccard
[Sat Mar 26 23:15:54 CST 2016 OverlapEvaluator] Computing MorisitaHorn
[Sat Mar 26 23:15:54 CST 2016 OverlapPair] Writing output
[Sat Mar 26 23:15:54 CST 2016 OverlapPair] Plotting
[RUtil] Executing Rscript a5816912-425f-4434-9974-9619cf6d4d03_intersect_pair_scatter.r TW437 TW438 out/A.xy.txt out/A.xx.txt out/A.yy.txt out/A.strict.paired.scatter.pdf
null device
1
[RUtil] Executing Rscript e6669afc-2f85-4cfb-b7cc-4fd8152689da_intersect_pair_area.r TW437 TW438 out/A.paired.strict.table.collapsed.txt out/A.paired.strict.table.collapsed.pdf
[ERROR] During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_COLLATE failed, using "C"
3: Setting LC_TIME failed, using "C"
4: Setting LC_MESSAGES failed, using "C"
5: Setting LC_MONETARY failed, using "C"
6: Setting LC_PAPER failed, using "C"
7: Setting LC_MEASUREMENT failed, using "C"
Loading required package: ggplot2
Loading required package: RColorBrewer
Error: Unknown parameters: guide
Execution halted
This line causes a null pointer exception with the MiTcr and MiGec Software options
CalcPairwiseDistances