Giter Club home page Giter Club logo

phybreak's Introduction

phybreak

Outbreak reconstruction with sequence data

The package implements the method described in Klinkenberg et al (2016), doi: http://dx.doi.org/10.1101/069195

Workflow:

  • enter data and priors by constructing an object of S3-class 'phybreak', with function 'phybreak'

  • do mcmc-updates with functions 'burnin_phybreak' and 'sample_phybreak'; remove samples with 'thin.phybreak'

  • access the 'phybreak'-object by get_phybreak-functions such as 'get_transtree', 'get_data', 'get_parameters'

  • summarize the mcmc-chain with the functions 'ESS', 'transtree', 'infectorsets', 'phylotree', 'get_mcmc', 'get_phylo'

  • plotting with 'plot', 'plotTrans', and 'plotPhylo'

  • it is possible to simulate data with 'sim_phybreak'

phybreak's People

Contributors

donkeyshot avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

phybreak's Issues

feature request: set seed

Hi Don,

Is there an option to set the seed from where to start the analysis? I would like to run the MCMC multiple times from different starting 'locations' to assess whether the runs are converging on the same likelihood.

Thanks,

Mark

tree$edge does not reconcile with phylotree() data.frame

Hi,

Thanks for the software. It looks like it will be very useful for us.

I'm trying to use it to plot an MCC tree with posterior support values overlaid. In the example below, I create the phybreak object using phybreak(data, times). I burnin with 1000 cycles and sample for 10000 cycles.

MCMCstate = phybreak(data=core_tab_chrmsm, times=d3)
MCMCstate = burnin.phybreak(MCMCstate, ncycles = 1000)
MCMCstate = sample.phybreak(MCMCstate, nsample=10000, thin=1)

Then, I get the the phylotree() data.frame and tree:

phylo_tree_matrix = phylotree(MCMCstate, support = "proportion")
phylo_tree = phylotree(MCMCstate, support = "proportion", phylo.class = T)

My aim is to overlay the posterior supports from the phylo_tree_matrix on the phylo_tree, but the phylo_tree$edge matrix and the phylo_tree_matrix$parent columns are irreconcilable, as can be found when comparing the plot of the tree (with tiplabels() and nodelabels()) with the following two tables:

plot(phylo_tree)
nodelabels() #these are the blue squares
tiplabels() #these are the yellow squares

screen shot 2016-08-30 at 14 19 03

That is, the parent-child relationships between the following two tables do not align. Here is the edge table (the blue squares are column 1 and the yellow squares are column 2):

> phylo_tree$edge
      [,1] [,2]
 [1,]   26   27
 [2,]   27   28
 [3,]   28   29
 [4,]   29   30
 [5,]   30   33
 [6,]   33   38
 [7,]   38   36
 [8,]   36   37
 [9,]   37   35
[10,]   35   41
[11,]   41   44
[12,]   44   45
[13,]   45   48
[14,]   48    9
[15,]   48   12
[16,]   45   14
[17,]   44   49
[18,]   49   46
[19,]   46   11
[20,]   46   13
[21,]   49   10
[22,]   41   18
[23,]   35   22
[24,]   37   42
[25,]   42   19
[26,]   42   24
[27,]   36   16
[28,]   38   39
[29,]   39   43
[30,]   43    6
[31,]   43   17
[32,]   39   47
[33,]   47   21
[34,]   47   23
[35,]   33    3
[36,]   30    1
[37,]   29   31
[38,]   31    2
[39,]   31   15
[40,]   28   34
[41,]   34    4
[42,]   34    7
[43,]   27   40
[44,]   40    5
[45,]   40   20
[46,]   26   32
[47,]   32    8
[48,]   32   25

And here is the data.frame (rows 35, 36, 38, 39, 42, 43, 47 have bad parent-child relationships, where child is the row-name and parent is the corresponding cell value in column 1) :

> phylo_tree_matrix
   parent    support node.times.mean node.times.sd node.times.mc.tree
1      30 1.00000000      0.00000000    0.00000000         0.00000000
2      31 1.00000000      0.30000000    0.00000000         0.30000000
3      33 1.00000000      0.40000000    0.00000000         0.40000000
4      34 1.00000000      0.81000000    0.00000000         0.81000000
5      40 1.00000000      0.86000000    0.00000000         0.86000000
6      43 1.00000000      0.90000000    0.00000000         0.90000000
7      34 1.00000000      0.94000000    0.00000000         0.94000000
8      32 1.00000000      1.56000000    0.00000000         1.56000000
9      48 1.00000000      1.88000000    0.00000000         1.88000000
10     49 1.00000000      2.26000000    0.00000000         2.26000000
11     46 1.00000000      2.38000000    0.00000000         2.38000000
12     48 1.00000000      2.52000000    0.00000000         2.52000000
13     46 1.00000000      2.55000000    0.00000000         2.55000000
14     45 1.00000000      2.79000000    0.00000000         2.79000000
15     31 1.00000000      0.71000000    0.00000000         0.71000000
16     38 1.00000000      0.88000000    0.00000000         0.88000000
17     43 1.00000000      0.90000000    0.00000000         0.90000000
18     41 1.00000000      0.90000000    0.00000000         0.90000000
19     42 1.00000000      0.90000000    0.00000000         0.90000000
20     40 1.00000000      0.90000000    0.00000000         0.90000000
21     47 1.00000000      0.91000000    0.00000000         0.91000000
22     35 1.00000000      0.91000000    0.00000000         0.91000000
23     38 1.00000000      0.92000000    0.00000000         0.92000000
24     42 1.00000000      0.92000000    0.00000000         0.92000000
25     32 1.00000000      1.56000000    0.00000000         1.56000000
26      0 1.00000000     -1.51126789    0.90524149        -1.67566591
27     26 0.65118812     -1.42796803    0.61749479        -1.29354997
28     27 0.74316832     -1.11291185    0.59893442        -1.05383677
29     28 0.99732673     -0.49866271    0.39040749        -0.53178046
30     29 0.41752475     -0.29407567    0.17194680        -0.21870728
31     29 0.52405941     -0.21429529    0.25190572        -0.14663459
32     26 0.99940594     -0.04581645    0.65440055        -0.41669197
33     30 0.96425743      0.16568535    0.19175916         0.13751129
34     28 0.99732673     -0.21341553    0.35359052        -0.36316498
35     39 0.09653465      0.78918218    0.07259913         0.78166042
36     33 0.71603960      0.43536170    0.11343933         0.28688438
37     36 0.03366337      0.56681280    0.10266716         0.36312121
38     37 0.08178218      0.63215878    0.11109348         0.76105123
39     37 0.03792079      0.63332049    0.10035708         0.58598696
40     27 0.99930693      0.07795032    0.40078764         0.03305225
41     35 0.21554455      0.88923055    0.01873173         0.89209207
42     39 0.10227723      0.73601620    0.11624829         0.75397219
43     47 0.97485149      0.73915054    0.12418226         0.71125887
44     41 0.80980198      1.28028753    0.26203872         1.01338487
45     44 0.70732673      1.50948099    0.27434150         1.49648718
46     49 0.99782178      2.15139832    0.20358827         1.88874520
47     36 0.06138614      0.50711223    0.31232034         0.48576206
48     45 0.80425743      1.64236086    0.30733507         1.58391568
49     44 0.99930693      1.74499925    0.31908351         1.62513775

I have seen in http://rpackages.ianhowson.com/cran/phybreak/man/phylotree.html that the phylotree data.frame should be: "A data.frame with per item (=node) its parent and support per clade, and optionally summary node times. The first n nodes are the samples, the last n-1 nodes are the internal nodes." But, still, this does not seem to add up correctly.

Is this an error or am I missing something? Could you please add a feature that adds these posterior support values to the tree (phylo) object tree$node.label so that I don't need to manually try to do this myself?

Also, will be adding more to the documentation and providing tutorials? This would also be very helpful.

All the best,

Mark

collection dates unclear

Hi Don,

In my case, the collection dates of certain strains are unclear, e.g. like these 2009-11-XX or 2009-XX-XX. Is it possible to apply phybreak for such analysis?

Thanks!

Zhuofei

ordering problem in phybreakdata

I think there is a minor bug that causes problems when different hosts share the same sampling times. I think line 122 in phybreakdata needs to be adjusted from

sapply(allhosts, function(x) allfirsttimes[which(min(sample.times[host.names == x]) == sample.times)[1]] <<- TRUE)

to

sapply(allhosts, function(x) allfirsttimes[which(min(sample.times[host.names == x]) == sample.times & (host.names == x))[1]] <<- TRUE)

as the first condition min(sample.times[host.names == x]) == sample.times may match with a different host that shares the same sampling time. This leads to cryptic error messages later on when sampling times precede infection times.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.