imminfo / tcr

[DEPRECATED, see https://immunarch.com/] tcR: an R package for immune receptor repertoire advanced data analysis.

Home Page: https://immunarch.com/

R 96.20% C++ 3.80%

immunoinformatics immunology tcr data-analysis ig tcr-repertoire ig-repertoire bioinformatics bioinformatics-analysis

tcr's Introduction

tcR

The tcR package is no longer supported and current issues will not be fixed.

A new package is available that is designed to replace tcR called immunarch: https://immunarch.com/

We have solved most of the problems the tcR package had and improved the overall pipeline, providing functions for painless repertoire file parsing and publication-ready plot making. We will be happy to help you integrate the new package into your pipelines. Should any question arise, please do not hesitate to contact us by email via https://immunarch.com/ or by opening an issue on https://github.com/immunomind/immunarch.

Sincerely, immunarch dev team and Vadim I. Nazarov, lead developer

tcR is a platform designed for TCR and Ig repertoire data analysis in R after preprocessing data with software tools for CDR3 extraction and gene segment alignment (MiTCR, MiXCR, MiGEC, ImmunoSEQ, IMSEQ, etc.). With the power and flexibility of the R language and the procedures supported by tcR, users can perform advanced statistical analysis of TCR and Ig repertoires. The package was published in BMC Bioinformatics; please cite it if you use it:

Nazarov et al., tcR: an R package for T cell receptor repertoire advanced data analysis

The project was developed mainly in the Laboratory of Comparative and Functional Genomics.

tcr's Issues

parse.cloneset

Hi, here is a question:
if I want to parse a cloneset .txt file whose column names and structure are different, I use parse.cloneset, where I can manually map the column names in the file to those in tcR. But if some columns are missing, for example v.end, it throws an error:

Error in .make.names(.vend) :
argument ".vend" is missing, with no default

And if I manually set NA everywhere, then:

b = parse.cloneset(.filename = '/media/RAID/users/KK/mitcr/Luk/chu/10months.txt', .aa.seq = 'AA.Sequence', .reads = 'Seq.Count', .vgenes = 'V.segments', .nuc.seq = 'N.Sequence', .jgenes = 'J.segments', .skip = 1, .sep = '\t' , .barcodes = NA, .dgenes = NA, .vend = NA, .jstart = NA, .dalignments = NA,.vd.insertions = NA,.dj.insertions = NA,.total.insertions = NA)
Error in `[.data.frame`(df, , make.names(.reads)) :
undefined columns selected

function pca.segments.2D - error when using .text = TRUE

In the function pca.segments.2D there seems to be an error when using .text = TRUE. It seems that p is overwritten by geom_text (p <- geom_text()) instead of modified (p <- p + geom_text()).

Issue with name fixes

V/D/J segment name fixes for immunoSEQ parsers won't work correctly even if the parsers were working. Example: V20-1 is left as V20-1 even though tcR uses V20 for gene usage analysis.

Roadmap / todos

Code

2.2.2 version

2.3 version

Add a general function for gene usage analysis / comparison repGeneAnalysis (PCA, entropy, JS-div).
Add the repOverlapAnalysis function.
Add error correction subroutine based on in-frame parent search and check other error correction subroutines for errors.
Fix other error correction and decontamination procedures.

2.4 version

Do something with find.clonotypes to make it simpler and cleaner.
Optimise shared.repertoire.
Add to shared.repertoire an option to skip some pairs of input data frames.

future

Add a data structure for efficient representation of repertoires. (???)
- Implement gene segments as factors.
- Fast search for mutated neighbors between repertoires represented as such data structures.
- Fast grep.
- Memory efficient.
  - Implement nucleotide sequences as bit sequences with two-bit per bp (~ .5 memory consumption).
Optimise other functions like morisita.index for use with lists of data frames, and / or rewrite them in C++, and / or write parallel versions of them.

Documentation

Vignette

2.3 version

vis.shared.clonotypes manual.
Add kmers examples and functions' descriptions.
Add rarefaction analysis examples and functions' descriptions.
Add description and examples for data filtering and decontamination.
Add cool plots for mutation networks.
Add repGeneAnalysis manual.
Add repOverlapAnalysis manual.

2.4 version

Check examples for find.clonotypes and make them cleaner (explain NAs in the output and / or add new examples).

Other

2.2.2 version

Add TCR and Ig gene segment alphabets (for all possible chains) and tables for human.
Add TCR and Ig gene alleles alphabets (for all possible chains) and tables for human.
Add TCR and Ig subgroups alphabets (for all possible chains) and tables for human.
Add TRAV / TRAJ data for mouse.

2.3 version

Add TCR and Ig gene segment alphabets and tables for species other than human (from here)
- Mouse

future

Add data for time points analysis (e.g., vaccination).
Add a pipeline .Rmd file for time points analysis.
Add tables with known antigenic clones and peptides:
- Influenza
- CMV
- EBV

vis.clonal.space bug when applying to groups

function chao1 - what if no singletons but duplicates

The current function seems to result in errors when there are no singletons but duplicates. It runs the third part of the code (i.e. else) but the value for f1 is NA since this is the value of counts for '1'. Also it seems that this option is not really covered by the comment lines. Three options are considered: no singletons and no duplicates, singletons but no duplicates, singletons and duplicates. But what if no singletons but duplicates?
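
The four cases listed above can be covered by one expression. The sketch below is not tcR's chao1; it is a minimal illustration, assuming that "duplicates" means clonotypes seen exactly twice. It counts singletons and doubletons with tabulate(), so an absent class is 0 rather than NA, and it uses the bias-corrected form of the Chao1 estimator, which stays defined when there are no doubletons. The name chao1_sketch is hypothetical.

```r
# Hypothetical helper, not the tcR implementation.
# counts: vector with the number of reads (or UMIs) per clonotype.
chao1_sketch <- function(counts) {
  counts <- counts[counts > 0]
  s_obs <- length(counts)              # observed number of clonotypes
  tab <- tabulate(counts, nbins = 2)   # tab[1] = singletons, tab[2] = doubletons
  f1 <- tab[1]
  f2 <- tab[2]
  # Bias-corrected Chao1: defined for every combination of f1 and f2,
  # and equal to s_obs when there are no singletons.
  s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
}

no_singletons <- chao1_sketch(c(2, 2, 5, 9))     # no singletons, two doubletons
with_both     <- chao1_sketch(c(1, 1, 1, 2, 7))  # three singletons, one doubleton
```

With no singletons the estimate falls back to the observed richness, which is the behaviour the missing fourth case would need.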

plot "vis.gene.usage" by "Read.proportion"

vis.gene.usage(twb, HUMAN_TRBJ, .main = 'twb J-usage dodge', .dodge = T)

is based on "Read.count", can I use "Read.proportion" to plot this figure?

Thanks!

tracking_function

Maybe it wouldn't be very difficult to add a modification of the shared/overlap/find.clonotype functions that would let you "track" a repertoire across several other repertoires? The difference from find.clonotype(d=c(a,b,c), target=a$CDR3.nucleotide.sequence, ...) is that a track.clonotype function would offer two important possibilities:
a - it would let you search for several target vectors (several target sets of clonotypes) in several sets of repertoires
b - it would let you customize the output and get several columns from the repertoire being tracked, not only CDR3.nuc.seq and V.seg
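
A rough sketch of what point b could look like in base R is given below. It is not a tcR function: track_clonotypes_sketch, its arguments and the toy data frames are all hypothetical, and only the column names are taken from tcR's data frame layout. Point a could be obtained by calling it once per target set with lapply().

```r
# Hypothetical sketch of the requested behaviour; not a tcR function.
# targets: character vector of sequences to follow.
# reps:    named list of repertoire data frames.
# key:     column used to match clonotypes.
# cols:    columns to report from every repertoire.
track_clonotypes_sketch <- function(targets, reps,
                                    key = "CDR3.nucleotide.sequence",
                                    cols = c("Read.count", "V.gene")) {
  out <- data.frame(target = targets, stringsAsFactors = FALSE)
  for (nm in names(reps)) {
    idx <- match(targets, reps[[nm]][[key]])      # NA when a target is absent
    found <- reps[[nm]][idx, cols, drop = FALSE]  # NA index gives a row of NAs
    names(found) <- paste(cols, nm, sep = ".")
    rownames(found) <- NULL
    out <- cbind(out, found)
  }
  out
}

a <- data.frame(CDR3.nucleotide.sequence = c("TGTGCC", "TGCAGC"),
                Read.count = c(10L, 3L), V.gene = c("TRBV4-3", "TRBV6-1"),
                stringsAsFactors = FALSE)
b <- data.frame(CDR3.nucleotide.sequence = c("TGCAGC", "TGTAAA"),
                Read.count = c(7L, 1L), V.gene = c("TRBV6-1", "TRBV9"),
                stringsAsFactors = FALSE)
tracked <- track_clonotypes_sketch(a$CDR3.nucleotide.sequence, list(a = a, b = b))
```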

Differential abundance calculation?

Do you have plans to add a test to identify sequences that are differentially abundant between two samples? Such as for expansion of clones after some stimulation or other intervention?
There is a method developed by Adaptive team:
J. Virol. April 2015 vol. 89 no. 8 4517-4526
That calculates p-value and adjusts for multiple hypotheses. I have written an R function, but it is slow... maybe you want to use it and see if you can speed it up?
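
For readers who want a starting point, here is a generic sketch of a per-clone test with multiple-testing adjustment. It is not the method from the cited paper and not a tcR function: it simply applies Fisher's exact test to a 2x2 table for each clone and adjusts the p-values with the Benjamini-Hochberg procedure. The function name and the example counts are invented for illustration.

```r
# Generic sketch; not the published method and not part of tcR.
# x, y: named integer vectors of clone counts in two samples.
diff_abundance_sketch <- function(x, y) {
  clones <- union(names(x), names(y))
  cx <- unname(x[clones]); cx[is.na(cx)] <- 0   # clones absent from a sample count as 0
  cy <- unname(y[clones]); cy[is.na(cy)] <- 0
  nx <- sum(cx)
  ny <- sum(cy)
  p <- vapply(seq_along(clones), function(i) {
    # Rows: this clone vs all other clones; columns: sample x vs sample y.
    tab <- matrix(c(cx[i], nx - cx[i], cy[i], ny - cy[i]), nrow = 2)
    fisher.test(tab)$p.value
  }, numeric(1))
  data.frame(clone = clones, count.x = cx, count.y = cy,
             p.value = p, p.adjusted = p.adjust(p, method = "BH"),
             stringsAsFactors = FALSE)
}

x <- c(CASSA = 50L, CASSB = 5L, CASSC = 945L)
y <- c(CASSA = 5L, CASSB = 6L, CASSD = 989L)
res <- diff_abundance_sketch(x, y)
```

One test per clone is slow for repertoires with hundreds of thousands of clonotypes, so filtering out very rare clones before testing is a common way to keep the run time manageable.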

Bug when parsing immunoseq Error in `[.data.frame`(df, , make.names(.reads)) : undefined columns selected

Hi, thanks for sharing your package!

I found an error when parsing Immunoseq files. The header doesn't match my files, so I had to fix it by changing:

reads <- 'count (reads)'

into

reads <- 'count (templates/reads)'

I'd suggest combining all 3 immunoseq parsers and allow passing a custom header as an argument.
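
The suggestion could be sketched as a small helper that accepts several candidate header names and returns the first one present in the file. pick_column is a hypothetical name, and apart from the two read-count headers quoted above, the example header values are illustrative assumptions.

```r
# Hypothetical helper: return the first candidate header present in the file.
pick_column <- function(header, candidates) {
  hit <- candidates[candidates %in% header]
  if (length(hit) == 0) {
    stop("None of these columns were found: ",
         paste(candidates, collapse = ", "))
  }
  hit[1]
}

# Example header; only the read-count names come from the report above.
header <- c("nucleotide", "aminoAcid", "count (templates/reads)", "vGeneName")
reads_col <- pick_column(header, c("count (reads)", "count (templates/reads)"))
```

When the file is read with read.table or read.delim, passing check.names = FALSE keeps headers such as "count (templates/reads)" unchanged, so they can be compared literally.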

Unable to parse.vdjtools due to Error

Hey,

I am trying to analyse data from IMGT, VDJtools allows conversion of IMGT data into the VDJ-format, which in turn I would like to read into R to analyse with tcR. When I run parse.vdjtools("filename.txt") it returns following Error:

Error in `[.data.frame`(df, , make.names(.reads)) : 
  undefined columns selected

So does that mean that the file VDJtools created from my IMGT input cannot be read/parsed by tcR, or is there a setting or step I have missed? I have attached a sample file in the same format.

VDJSF6.3_Nt-sequences.txt

I saw the same error reported for MiTCR, but the answer there did not help with the same issue for VDJtools.

tcR conflict with other R packages?

This is not my observation, check out mikessh/vdjtools#56 (comment)

Error log:

Loading required package: ggplot2

 *** caught segfault ***
address 0x18, cause 'memory not mapped'

Traceback:
 1: dyn.load(file, DLLpath = DLLpath, ...)
 2: library.dynam(lib, package, package.lib)
 3: loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
 4: asNamespace(ns)
 5: namespaceImportFrom(ns, loadNamespace(j <- i[[1L]], c(lib.loc,     .libPaths()), versionCheck = vI[[j]]), i[[2L]], from = package)
 6: loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
 7: namespaceImport(ns, loadNamespace(i, c(lib.loc, .libPaths()),     versionCheck = vI[[i]]), from = package)
 8: loadNamespace(package, lib.loc)
 9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        msg <- conditionMessage(e)        sm <- strsplit(msg, "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && identical(getOption("show.error.messages"),         TRUE)) {        cat(msg, file = stderr())        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
13: try({    attr(package, "LibPath") <- which.lib.loc    ns <- loadNamespace(package, lib.loc)    env <- attachNamespace(ns, pos = pos, deps)})
14: library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,     warn.conflicts = warn.conflicts, quietly = quietly)
15: doTryCatch(return(expr), name, parentenv, handler)
16: tryCatchOne(expr, names, parentenv, handlers[[1L]])
17: tryCatchList(expr, classes, parentenv, handlers)
18: tryCatch(library(package, lib.loc = lib.loc, character.only = TRUE,     logical.return = TRUE, warn.conflicts = warn.conflicts, quietly = quietly),     error = function(e) e)
19: require(ggplot2)
An irrecoverable exception occurred. R is aborting now ...

V D J gene name incompatibility with MiGEC output

Hello,

Seems the genesegments.rda object does not contain the *01 suffix of the allele names that MiGEC reports in its output resulting in no results when running the geneUsage function.

Perhaps a regular expression line can be added to remove the *01 from the MiGEC reported alleles?

Thank you ... best,

Kory
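
As a workaround on the user side, the allele suffix can be removed before calling geneUsage. The sketch below assumes the suffix always has the form of an asterisk followed by digits; strip_allele is a hypothetical name.

```r
# Remove allele suffixes such as "*01" from gene names.
# gsub() replaces every match, so comma-separated gene lists work too.
strip_allele <- function(genes) {
  gsub("\\*[0-9]+", "", genes)
}

cleaned <- strip_allele(c("TRBV4-3*01", "TRBV12-4*01, TRBV12-3*01", "TRBJ2-7"))
```

Applied to the V.gene, D.gene and J.gene columns of a parsed data frame, this leaves names without a suffix untouched.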

importing mixcr dataset via tcR

hi all,

I'm new to this: I've run mixcr/1.8.1 on my RNA-seq dataset using the partialAssemble workflow described on the main page and would like to create the analog of the twa list object in R which represents the Cloneset. I have several questions:

I'm guessing I need to run exportClones with a couple of specified fields in order to use

parse.mitcr

directly on it. Is this correct? If so, would anybody have the exact call to exportClones which would allow the output to be directly input into a Cloneset object in R? I don't know in other words how twa or twb were generated or cherrypicked from exportClones output

Along the same lines, how do I combine all the exportClones output from mixcr so that I can represent my entire dataset in a Cloneset, e.g. all four subjects are in twa

Basically, if there was a concrete example of how to go from N clns files (from mixcr assemble) to N exportClones files with the right fields picked out to be directly input into tcR, and then make a N x 1 list containing the entire dataset in a Cloneset object, that would be great!

Looks like a great set of routines and visualization scripts, so I can't wait to use it.

all the best,
zo

diversity() function warning message

The Umi.proportion column does sum to 1, I've checked, but

diversity(a[[1]]$Umi.proportion, .q = 1)
Warning! Sum of the input vector is NOT equal to 1. Function may produce incorrect results.
To fix this try to set .do.norm = TRUE in the function's parameters.
[1] 152946.6
diversity(a[[1]]$Umi.proportion, .q = 1, .do.norm = T)
Warning! Sum of the input vector is NOT equal to 1. Function may produce incorrect results.
To fix this try to set .do.norm = TRUE in the function's parameters.
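
One possible cause, stated here as an assumption and not as a description of tcR internals, is an exact comparison of a floating-point sum with 1. The sketch below shows how a vector can sum to 1 for practical purposes and still fail an exact test; sums_to_one and its tolerance are hypothetical.

```r
# Compare with a tolerance instead of testing sum(x) == 1 exactly.
sums_to_one <- function(x, tol = 1e-8) {
  abs(sum(x) - 1) < tol
}

p <- c(0.25, 0.25, 0.5 - 1e-12)   # off from 1 by a rounding-sized amount
exact    <- sum(p) == 1           # FALSE: the exact comparison rejects it
tolerant <- sums_to_one(p)        # TRUE: within tolerance
```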

function vis.kmer.histogram - additional parameter

When using the vis.kmer.histogram function, It would be interesting to have the x-axis sorted on decreasing frequency instead of alphabetically. Maybe it is an option to include an additional parameter where you can specify whether you would like to have the plot showing decreasing frequency or how you want to sort the x-axis?
(If desired I can pull and push changes since I modified the function for myself.)
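
Until such a parameter exists, one way to get this ordering is to turn the k-mer column into a factor whose levels are sorted by count, because ggplot2 draws discrete axes in level order. The column names Kmer and Count below are assumptions for the example.

```r
kmers <- data.frame(Kmer = c("CASSL", "ASSLG", "SSLGG"),
                    Count = c(12L, 40L, 7L),
                    stringsAsFactors = FALSE)

# Levels in order of decreasing count, instead of the default alphabetical order.
ord <- order(kmers$Count, decreasing = TRUE)
kmers$Kmer <- factor(kmers$Kmer, levels = kmers$Kmer[ord])
kmer_levels <- levels(kmers$Kmer)
```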

Performance of JSD is inconsistent with the raw (original) data versus randomised data

Hi Vadim,
I computed the JSD after shuffling the frequency count and read count values of my data set and got a result similar to the one for the original data. Can you please clarify the behavior of the JSD algorithm in this respect?

Thanks

VJ combination geneUsage result missed the last V and J segments

When using geneUsage with .genes = list(HUMAN_TRBV_MITCR, HUMAN_TRBJ) either on a single df or on list with df's
geneUsage(datalist[[1]], .genes = list(HUMAN_TRBV_MITCR, HUMAN_TRBJ), .quant = NA, .norm = T, .ambig = F)
, it results in a 47x12 tables, missing TRBV9 row and TRBJ2-7 column.

I use the latest version of the master branch.

find.clonotype: again (

Error if you are trying to get more than 4 columns via .col.name

V-segments usage plot only shows fixed 50 V?

I have noticed that in the gene usage plot (http://imminfo.github.io/tcr/tcrvignette.html#gene-usage), V-usage always shows only a fixed set of 50 V segments (human TRBV). Why does tcR show only these 50? I know my samples contain more than 50, and some are not shown in the graph, e.g. TRBV3-2, V5-3, V5-7, V6-9, V7-5, V12-1, V12-2, V17, V22, V26.

Any idea?

Incorrect index calculations in shared.repertoire()

Hi Vadim,
when trying to get a shared repertoire with Index numbers

sh111 = shared.repertoire(ash111, .type = 'avi', .min.ppl = 1, .clear = T, .verbose = T)

something went wrong: I see a lot of identical(!) indices in the columns, e.g. (data.frame ordered by column i111-0):

But when using .sum.col='Index' all works fine.

New input formats

Please specify here which software you use for processing NGS data and extracting CDR3 sequences, so I can update tcR with a function for reading its output.

pca.segments.2D

Hi,
I ran into the following problem while learning how to use tcR to plot PCA.

pca.segments.2D(twb, .genes = HUMAN_TRBV)
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
In addition: Warning message:
In prcomp.default(do.call(rbind, .data), ...) :
extra argument ‘.genes’ will be disregarded

add pattern column to find.clonotypes

function check.distribution - warning and .do.norm

The warnings inside the function check.distribution are emitted with 'cat', so the text is printed to the output. Is it possible to use warning or message instead?
Currently, setting .do.norm = TRUE is suggested whenever sum(.data) != 1, even if the parameter is already TRUE. Maybe the suggestion should be shown only when .do.norm is NA or FALSE?

(If desired I can pull and push changes since I modified the function for myself.)
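
The two requests could look roughly like the sketch below. It is a hypothetical re-implementation for illustration, not the tcR source: the argument names .data and .do.norm follow the text above, while the function name and the .tol argument are assumptions.

```r
# Hypothetical re-implementation for illustration only.
check_distribution_sketch <- function(.data, .do.norm = NA, .tol = 1e-8) {
  if (isTRUE(.do.norm)) {
    return(.data / sum(.data))   # normalise silently when asked to
  }
  if (abs(sum(.data) - 1) > .tol) {
    # warning() goes through R's condition system, so callers can
    # suppress or catch it, unlike text printed with cat().
    warning("Sum of the input vector is not equal to 1. ",
            "Consider setting .do.norm = TRUE.", call. = FALSE)
  }
  .data
}

normalised <- check_distribution_sketch(c(2, 2), .do.norm = TRUE)
caught <- tryCatch(check_distribution_sketch(c(2, 2)),
                   warning = function(w) "warned")
```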

New immunoSEQ files won't parse.

I'm trying to parse a few hundred immunoSEQ files and I don't think it's ID'ing the columns. Could there be a new format that is not compatible?

function vis.heatmap - additional parameters

Is it maybe possible to add an additional parameter which allows to request a scientific notation in the heatmap? Since otherwise there might be a lot of digits in the plot, making the plot unclear.
(If desired I can pull and push changes since I modified the function for myself.)
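
For reference, base R can already produce compact scientific labels with formatC, which such a parameter could use internally. How vis.heatmap builds its labels is not shown here, so this is only a sketch of the formatting step with made-up values.

```r
values <- c(152946.6, 0.000123, 42)

# format = "e" requests scientific notation; digits = 2 keeps two decimals
# in the mantissa, so every label has the same short width.
labels <- formatC(values, format = "e", digits = 2)
```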

Genes / gene segments / segments

There is some disagreement among scientists about how to name the Variable, Diversity and Joining gene segments of receptors. Currently in tcR they are named "V.gene", "J.gene" and "D.gene". Any suggestions for a more correct way to name them?

Regarding count.frame

Hi,

When I use a function 'count.frames', should the sum of # of in-frame and out-of-frame sequences be equal to # of all sequences? For example, from my analysis, I found out that

count.frames(tcr_mm[[1]], 'all')
[1] 1223
count.frames(tcr_mm[[1]], 'in')
[1] 1166
count.frames(tcr_mm[[1]], 'out')
[1] 52

I thought that
count.frames(tcr_mm[[1]], 'out') + count.frames(tcr_mm[[1]], 'in') = count.frames(tcr_mm[[1]], 'all')

Did I miss something? Please let me know.

Thanks,
Joon
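
In the numbers above, 1166 + 52 = 1218, so 5 of the 1223 sequences fall into neither group. One possible explanation, given here as an assumption and not as a statement about how count.frames works, is that "in" and "out" are not exact complements, for example if in-frame sequences containing a stop codon are excluded from "in". The sketch below shows such a three-way classification; classify_frame and the toy sequences are hypothetical.

```r
# Illustrative three-way classification; not tcR's count.frames.
# "out":  length is not a multiple of 3
# "stop": in frame but contains a stop codon
# "in":   in frame without a stop codon
classify_frame <- function(nuc) {
  stops <- c("TAA", "TAG", "TGA")
  vapply(nuc, function(s) {
    n <- nchar(s)
    if (n == 0) return(NA_character_)
    if (n %% 3 != 0) return("out")
    codons <- substring(s, seq(1, n, by = 3), seq(3, n, by = 3))
    if (any(codons %in% stops)) "stop" else "in"
  }, character(1), USE.NAMES = FALSE)
}

seqs <- c("TGTGCCAGC", "TGTGCCAG", "TGTTAAAGC")
frames <- classify_frame(seqs)
```

Tabulating such a classification on the real data with table() would show whether the 5 unaccounted sequences belong to a third category.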

Gibbs sampler broken

Using latest stable version of tcR via devtools. Workflow as follows:

Used get.kmers to store kmers from dataframe in a new variable, set meat as T.
Ran command: newvariable <- gibbs.sampler(get.kmers.dataframe, .k = 5, .niter = 500)

Errors as follows:

Error in gzfile(file, "wb") : cannot open connection
In addition: Warning message:
In gzfile(file, "wb") :

cannot open compressed file 'gibbs.500Wed Sep 21 10:11:07 2016.rda', probable reason 'Invalid argument'
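
The file name in the message contains a timestamp with spaces and colons, and colons are not allowed in file names on Windows, which would be consistent with the 'Invalid argument' reason. Which operating system was used is not stated, so this is a guess. A sketch of a safer file name, with a placeholder object standing in for the sampler output:

```r
# Build a file name without spaces or colons.
safe_stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
out_file <- file.path(tempdir(), paste0("gibbs.500.", safe_stamp, ".rda"))

result <- list(niter = 500)   # placeholder for the sampler output
save(result, file = out_file)
saved_ok <- file.exists(out_file)
```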

Monkey TCR support

Does tcR support monkey TCR? I found segments.alphabets support human and mouse only. Could you please add monkey TCR. There are 59 TRBV and 13 TRBJ genes of monkey in IMGT.

Can I analyse the IMGT results?

Thank you.

Error parsing MiXCR file

The command parse.mixcr("<path to mixcr.txt>")

produces the following error:

Error in FUN(X[[i]], ...) : subscript out of bounds

Converting the file into vdjtools format and importing using parse.vdjtools works.

Using tcR version 2.2.1

R CMD check issues

checking whether package ‘tcR’ can be installed ... WARNING
Found the following significant warnings:
  Warning: replacing previous import by ‘grid::arrow’ when loading ‘tcR’
  Warning: replacing previous import by ‘grid::unit’ when loading ‘tcR’
See ‘/private/tmp/Rtmp7xChS1/check_cran15edd13c5afb7/tcR.Rcheck/00install.out’ for details.

checking installed package size ... NOTE
  installed size is  5.5Mb
  sub-directories of 1Mb or more:
    data   1.2Mb
    doc    3.9Mb

checking re-building of vignette outputs ... NOTE
Error in re-building vignettes:
  ...

    union

Warning: replacing previous import by 'grid::arrow' when loading 'tcR'
Warning: replacing previous import by 'grid::unit' when loading 'tcR'

Attaching package: 'tcR'

The following object is masked from 'package:igraph':

    diversity

Using People as id variables
Using Gene as id variables
Using Gene as id variables
Using Gene as id variables
Warning: Removed 4 rows containing missing values (geom_text).
Warning: Removed 20 rows containing missing values (geom_point).
Warning: Removed 20 rows containing missing values (geom_point).
Warning: Removed 20 rows containing missing values (geom_point).
Warning: Removed 20 rows containing missing values (geom_point).
Quitting from lines 501-503 (tcrvignette.Rmd) 
Error: processing vignette 'tcrvignette.Rmd' failed with diagnostics:
Unknown parameters: binwidth, bins, origin, right
Execution halted

Please fix ASAP. I am submitting to CRAN tomorrow.

add parameter to parse.f*** for dots in names

regarding 'find.clonotype'

Hi Vadim,

I have a question about a tcR function 'find.clonotype'. I am reading through tcR vignette. One of the examples is to show how to search for a target CDR3 sequence. As shown in the vignette, if I run the command lines,

cmv.imm.hamm.v <- find.clonotypes(twb[1:3], cmv, 'hamm', 'Rank',
                             .target.col = c('CDR3.amino.acid.sequence', 'V.gene'), .verbose = F)
head(cmv.imm.hamm.v)

then we get the following output
CDR3.amino.acid.sequence V.gene Rank.Subj.A Rank.Subj.B Rank.Subj.C
CASSALGGAGTGELFF CASSALGGAGTGELFF TRBV4-1 NA NA NA
CASSLIGVSSYNEQFF CASSLIGVSSYNEQFF TRBV4-1 NA NA NA
CASSLTGNTEAFF CASSLTGNTEAFF TRBV4-1 NA NA NA
CASSSANYGYTF CASSSANYGYTF TRBV4-1 NA NA NA
CSVGRAQNEQFF CSVGRAQNEQFF TRBV4-1 NA NA NA

I thought that information from the column "Rank" (which is generated by using set.rank) would be shown in those columns but those three columns are just 'NA'. Would you please explain this? Thanks a lot.

Best,
Joon

js.div.seg

Если остальные параметры в js.div.seg - это параметры freq.Vb, то там не хватает .laplace.
Не разобралась с Labels..

pca.segments (ggplot color & names)

Hi,
Is it possible to make changes in pca.segments? I would like to change the colours in ggplot and also drop the sample names in the figure. Would you please, let me know how to fix it?

function geneUsage when joint gene distribution

In the function geneUsage there seems to be a mistake in case of the joint gene distribution. For the part of gencols, the do.call function should contain cbind instead of rbind.

pca.segments doesn't work with other than HUMAN_TRBV_MITCR segments

function js.div.seg - additional parameter for .do.norm

Is it possible to create an additional parameter in this function: .do.norm? Currently it is not possible to specify .do.norm = TRUE when calling the entropy function. (Although you might get the suggestion from the function check.distribution.)

geneUsage error if .norm=T

x=geneUsage(d0.new.data[[1]], .norm = T)
Error in apply(res[, -1], 2, function(col) col/sum(col)) :
dim(X) must have a positive length

str(d0.new.data[[1]])
'data.frame': 270080 obs. of 16 variables:
$ Umi.count : int 8086 6312 4731 4401 3745 3542 3534 3506 3010 2693 ...
$ Umi.proportion : num 0.01376 0.01074 0.00805 0.00749 0.00637 ...
$ Read.count : int 8086 6312 4731 4401 3745 3542 3534 3506 3010 2693 ...
$ Read.proportion : num 0.01376 0.01074 0.00805 0.00749 0.00637 ...
$ CDR3.nucleotide.sequence: chr "TGCGCCAGCAGCCAAGATACCGGGATGAAATTAAGCTCCTACAATGAGCAGTTCTTC" "TGTGCCAGCAGTGAAAGACCGCCTAAACCTCAAAACATTCAGTACTTC" "TGCAGCGTTGATGTGGTACAATACACTAGCACAGATACGCAGTATTTT" "TGTGCCAGCAGCTTCCTATCTAGCTCCTACGAGCAGTACTTC" ...
$ CDR3.amino.acid.sequence: chr "CASSQDTGMKLSSYNEQFF" "CASSERPPKPQNIQYF" "CSVDVVQYTSTDTQYF" "CASSFLSSSYEQYF" ...
$ V.gene : chr "TRBV4-3" "TRBV6-1" "TRBV29-1" "TRBV12-4, TRBV12-3" ...
$ J.gene : chr "TRBJ2-1" "TRBJ2-4" "TRBJ2-3" "TRBJ2-7" ...
$ D.gene : chr "TRBD2" "TRBD2" "TRBD1, TRBD2" "TRBD2" ...
$ V.end : int 16 14 10 10 14 14 11 15 15 15 ...
$ J.start : int 35 31 27 23 30 32 22 25 29 34 ...
$ D5.end : int 20 27 22 19 15 20 13 22 18 22 ...
$ D3.end : int 24 30 24 22 27 26 16 24 26 28 ...
$ VD.insertions : int 3 12 11 8 0 5 1 6 2 6 ...
$ DJ.insertions : int 10 0 2 0 2 5 5 0 2 5 ...
$ Total.insertions : int 13 12 13 8 2 10 6 6 4 11 ...
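
The error message is what apply() reports when it receives a plain vector. That would happen if res has one gene column and a single count column, because res[, -1] then drops to a vector. Assuming this is the situation inside geneUsage (the internals are not shown here), drop = FALSE keeps the data frame shape. The toy res below is made up.

```r
res <- data.frame(Gene = c("TRBV4-3", "TRBV6-1", "TRBV29-1"),
                  Sample = c(8086, 6312, 4731),
                  stringsAsFactors = FALSE)

# res[, -1] would collapse to a plain vector here, and apply() needs dimensions.
# drop = FALSE keeps a one-column data frame, so apply() works for one sample too.
norm <- apply(res[, -1, drop = FALSE], 2, function(col) col / sum(col))
col_totals <- colSums(norm)
```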

How to update to the latest version?

devtools::install_github("imminfo/tcr", build_vignettes = FALSE)

Downloading github repo imminfo/tcr@master
Installing tcR
'/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD INSTALL
'/private/var/folders/w7/b1cyd7nn5p1_kc6b9kgz8n5w0000gn/T/RtmpZYK2Vn/devtools523538bc4d15/imminfo-tcr-624ead0'
--library='/Library/Frameworks/R.framework/Versions/3.1/Resources/library' --install-tests

installing source package ‘tcR’ ...
Error : Invalid DESCRIPTION file

Malformed package version.

See the information on DESCRIPTION files in section 'Creating R
packages' of the 'Writing R Extensions' manual.

ERROR: installing package DESCRIPTION failed for package ‘tcR’

removing ‘/Library/Frameworks/R.framework/Versions/3.1/Resources/library/tcR’
restoring previous ‘/Library/Frameworks/R.framework/Versions/3.1/Resources/library/tcR’
Error: Command failed (1)

Skipping files while parsing

Hello

I am trying to parse many mixcr files using tcR package but along the way I get this error:

Error in `$<-.data.frame`(`*tmp*`, "VD.insertions", value = -1) :
replacement has 1 row, data has 0

I guess some of my files do not have VD insertions. Is there an arg I can pass to continue parsing if it finds a file with no VD insertions?

Thanks
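
R raises "replacement has 1 row, data has 0" when a value is assigned to a column of a data frame with zero rows, so the failing files may simply contain no clonotypes. Whatever the cause, a wrapper around the parser can skip files that fail. In the sketch below parse_fun stands in for whichever parser is used, and toy_parser only imitates a failure; both names are hypothetical.

```r
# Parse every file, report the ones that fail and keep going.
parse_all_sketch <- function(files, parse_fun) {
  parsed <- lapply(files, function(f) {
    tryCatch(parse_fun(f), error = function(e) {
      message("Skipping ", f, ": ", conditionMessage(e))
      NULL
    })
  })
  names(parsed) <- basename(files)
  Filter(Negate(is.null), parsed)   # drop the files that failed
}

# Toy parser that fails on one input, to show the behaviour.
toy_parser <- function(f) {
  if (grepl("empty", f)) stop("replacement has 1 row, data has 0")
  data.frame(file = f, stringsAsFactors = FALSE)
}
ok <- parse_all_sketch(c("a.txt", "empty.txt", "b.txt"), toy_parser)
```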

cloneset.stats(twb)

Hi Nazarov,

Thanks for the examples online: https://imminfo.github.io/tcr/

however when I am executing

cloneset.stats(twb)

I am getting the following error:
Error in get.outframes(.data, .head, .coding) : unused argument (.coding)

Could you please help me to fix it?

Thanks,
Pramod

entropy.seg & the js.div.seg function Warning messages ...

Hello,

In attempting to run the entropy.seg function, I am encountering the below Warning message:

"Sum of the input vector is NOT equal to 1. Function may produce incorrect results.
To fix this try to set .do.norm = TRUE in the function's parameters."

When exploring the options I can pass to the entropy.seg function, the .do.norm parameter is not a definable option by the user.

When I explore the code for the entropy.seg function, I see the entropy function call within the higher entropy.seg function. When I attempt to add the .do.norm = T option to the entropy function call within the entropy.seg function, the Message still persists?

The Warning message also occurs when I attempt to use the js.div.seg function.

Need resolution please.

Thank you ... best,

Kory

js.div.seg is not using .quant !

I get nearly the same numbers with .quant=NA and .quant='umi.count'

the problem in here:

"
js.div.seg
function (.data, .genes = HUMAN_TRBV, .frame = c("all", "in",
"out"), .quant = c(NA, "read.count", "umi.count", "read.prop",
"umi.prop"), .norm.entropy = T, .ambig = F, .verbose = F,
.data2 = NULL)
...
if (has.class(.genes, "list") && length(.genes) == 2) {
freq.alpha <- geneUsage(.data, .genes = .genes, .ambig = .ambig)
freq.beta <- geneUsage(.data2, .genes = .genes, .ambig = .ambig)
}
...
"

ver 2.1.1

Include MOUSE_TRAV and MOUSE_TRAJ in segments.alphabets within tcR

Some labs study T-cell development using high-throughput sequencing of TCR alpha chains in TCR-beta chain transgenic mice. To be more useful to these labs, alphabets for mouse alpha chains, i.e. MOUSE_TRAV and MOUSE_TRAJ, could be included in segments.alphabets within tcR.

shared(overlap, findclonotype - whatever) - need more columns

Why not allow getting more informative columns when using the shared.. functions? For example, if I need a set like 'Rank', 'Barcode.count', 'Percentage' for shared clonotypes...

unable to parse MiTCR outputs

Hi,

Currently, I am using MiTCR java application (v1.0.3) and tcR 2.1. I was attempting to parse MiTCR files by using the command "parse.mitcr" but I had trouble loading those MiTCR files. The error message that I've got is as follows:

immdata1 <- parse.file("/Volumes/AdHoc_Analysis/Misc/TCR-Seq/MiTCR_results/tcR/ELM19.txt", 'mitcr')
Error in `[.data.frame`(df, , make.names(.reads)) :
undefined columns selected

The following is the first three lines from one of those MiTCR outputs

Read count Percentage CDR3 nucleotide sequence CDR3 amino acid sequence V segments J segments D segments Last V nucleotide position First D nucleotide position Last D nucleotide position First J nucleotide position VD insertions DJ insertions Total insertions
1524 0.058755493869997684 TGTGCCAGCAGTCGCCCGGACTTTAGCTCCTATGAACAGTACTTC CASSRPDFSSYEQYF TRBV26, TRBV24 TRBJ2-7 TRBD2 14 17 21 26 2 4 6
1297 0.050003855347366795 TGTGCCAGCTCTCTCGATTCAGGGGGACTGGGGGGGGCTAGTGCAGAAACGCTGTATTTT CASSLDSGGLGGASAETLYF TRBV12-2 TRBJ2-3 TRBD2 8 23 35 39 14 3 17

Would you please help me with this?

Thanks a lot!

Joon

bunch.translate() doesn't translate lowercase letters

bunch.translate() doesn't translate lowercase letters, but it would be useful if it did.
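
Until this is supported, converting the input with toupper() before translation gives the same effect. The sketch below is not bunch.translate: it uses a four-codon toy table and a hypothetical translate_sketch function, handles one sequence of at least three nucleotides, and only shows where the case conversion would go.

```r
# Tiny codon table for illustration only; a real table has 64 entries.
codon_table <- c(TGT = "C", GCC = "A", AGC = "S", TTC = "F")

translate_sketch <- function(nuc) {
  nuc <- toupper(nuc)                       # accept lowercase input
  starts <- seq(1, nchar(nuc) - 2, by = 3)  # start position of each codon
  codons <- substring(nuc, starts, starts + 2)
  aa <- codon_table[codons]
  aa[is.na(aa)] <- "?"                      # codon not in the toy table
  paste(aa, collapse = "")
}

lower <- translate_sketch("tgtgccagcttc")
upper <- translate_sketch("TGTGCCAGCTTC")
```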
