Giter Club home page Giter Club logo

ggmcmc's People

Contributors

akrawitz avatar alvinwmtan avatar brianstock avatar johnaponte avatar jonathan-g avatar jucor avatar juste avatar knokknok avatar mbjoseph avatar schloerke avatar stragu avatar xfim avatar zmjones avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ggmcmc's Issues

ggs_Rhat gives different answers than gelman.diag

Hi! This is a great package, but I just had a question about the ggs_Rhat function. It seems to be giving different estimates than the gelman.diag function from the coda package. I'm not sure if they are different, but I thought I'd put this out there to see if this is either an explanation, or to learn about the difference between the two functions. Here's a reproducible example:

library(coda)
data(radon)
radon.long <- radon$s.radon
g <- gelman.diag(radon.long, autoburnin = F, multivariate = F)$psrf
r <- ggs_Rhat(ggs(radon.long))$data

g[rownames(g) == "sigma.beta",]
r[r$Parameter == "sigma.beta",]

You can see that the gelman.diag estimate gives a point estimate of 1.22 for this parameter, while the ggs_Rhat function gives a point estimate of 1.09.

Fix: error in ggs_autocorrelation

Calling ggs_autocorrelation with a number of lags larger than the number of iterations actually present produces an error. Commit be37820, appended to pull-request #21 , fixes this problem, generates a warning and plot until the max possible lag instead.

Cannot include the warmup using ggs for a stanfit object

Hi there

When I run the following command, I cant seem to get out the warmup samples..

But I guess this is the idea with inc_warmup = T?

> ggs(STAN_fit, family = 'sigma', inc_warmup = T)
Source: local data frame [1,000 x 4]

   Iteration Chain          Parameter    value
1          1     1 transactions_sigma 208765.6
2          2     1 transactions_sigma 206573.2
3          3     1 transactions_sigma 198310.6
4          4     1 transactions_sigma 214327.9
5          5     1 transactions_sigma 209830.3
6          6     1 transactions_sigma 205860.9
7          7     1 transactions_sigma 199150.3
8          8     1 transactions_sigma 209000.2
9          9     1 transactions_sigma 200834.9
10        10     1 transactions_sigma 217648.9
..       ...   ...                ...      ...
Warning message:
In ggs(STAN_fit, family = "sigma", inc_warmup = T) :
  inc_warmup must be 'FALSE', so it is ignored.

ggs_Rhat error with one chain (makes ggmcmc fail)

If you have only one chain, ggs_Rhat produces an error.

If you call ggs_Rhat individually, this is a minor problem, even if an warning "At least two chains..." would be nicer.

However, when you run ggmcmc, it stops processing and does not cleanly close the pdf file.

Take burn-in period into account

It would be nice if the gggmcmc-package (I am using version ggmcmc_0.2) would take a burn-in period into account. Let's assume a burn-in period of 10,000, then coda's traceplot() command starts at iteration 10,000, whereas ggs_traceplot() starts at iteration 0. Thanks, Bernd

Error using ggs()

Hi Xavier,

I'm writing an R package that uses your ggmcmc routine to create MCMC output plots from JAGS. It's been working very nicely until today, when I made some changes. Now when I call ggs() on my mcmc.list object, I get the following error:

> ggmcmc(ggs(jags1.mcmc),file="diagnostics.pdf")
Error in eval(expr, envir, enclos) : object 'Parameter' not found

I've tried supplying ggs() with par_labels from varnames(jags1.mcmc), but that didn't help. Any idea what the problem could be? I can send my R workspace if that would help.

Thanks,
Brian

Suggestion: Make parallel = FALSE by default

Since setting up parallel is quite a bit of additional work (especially on Windows), I suggest to make parallel=FALSE by default to avoid many warnings (or suppress the warnings)

Error with 0-var check in ggs_geweke from sde0f

Hello,

I just submitted a pull request with a fix, but posting an issue as well. When using your ggmcmc package to look at diagnostics for a JAGS model, I encountered an error at line 40 of ggs_geweke:

> D.geweke <- D.geweke %>%
+     dplyr::group_by (Parameter, Chain, part) %>%
+     dplyr::summarize(m=mean(value), sde0f=sde0f(value), n=n())
Error in summarise_impl(.data, dots) : 
  Evaluation error: zero-variance series.

Traced this to the check for 0 variance in sde0f and believe my pull request fixes the issue: #63

Problem with dev tidyr

checking examples ... ERROR
Running examples in ‘ggmcmc-Ex.R’ failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: ggmcmc
> ### Title: Wrapper function that creates a single pdf file with all plots
> ###   that ggmcmc can produce.
> ### Aliases: ggmcmc ggmcmc-package
> 
> ### ** Examples
> 
> data(linear)
> ggmcmc(ggs(s))  # Directly from a coda object
Plotting histograms
Error in factor(x, levels = 1:n, labels = labels) : 
  invalid 'labels'; length 0 should be 1 or 3
Calls: ggmcmc ... print -> ggs_histogram -> cbind -> gl_unq -> factor
Execution halted

Could you please check to see if this is a new bug in tidyr?

How was the binary dataset created?

The documentation says

in order to illustrate the functions suitable for binary outcomes a new dataset must be simulated, and samples of parameters from a simple logistic regression model must be obtained

But it does not explain how this dataset was derived or created. An actual example would be useful. I have a logistic regression model and I would like to plot a roc-curve. I would like to understand how to simulate and create a dataset suitable for creating a roc-curve.

Allow custom facet_wrap layout

Hello,

Big fan of the package so far. Could you change the facet_wrap in all the plotting from nrow = 1 to the default or as an optional setting? If I want to plot more than 5 parameters they quickly run off the edge of the plots so a grid layout would be helpful.

Thanks

Kieran

ggmcmc: allow NULL for file name

I frequently try to append other things to the pdf. I suggest that NULL can be passed as a filename, in which case pdf and dev.off() are skipped. This does not break old code, and is 2 additional lines.

Choosing layout when plotting multiple parameters

My plotting panel in Rstudio is quite small, rendering the graphs hard to read. See below. It would be very convenient to either (1) have an ncol argument to choose number of columns or (2) have an faceting argument which defaults to TRUE but could be set to FALSE so that one could do a manual +facet_wrap(~Parameter, ncol=3) or, say, a +facet_grid(chain~Parameter). I like solution 2 better.

image

ggmcmc is really great! I just discovered it a week ago and I am thrilled to get MCMC into the safe home of ggplot2!

ggs_caterpillar ignores thick_ci and thin_ci arguments

Calling ggs_caterpillar with values for thick_ci and thin_ci doesn't change the plot.

When I call ggs_caterpillar(D) and ggs_caterpillar(D, thick_ci = c(0.4,0.6), I get identical plots. Using ggmcmc_0.7.3 from CRAN.

I get the same result when I install the github verson with devtools::install_github('xfim/ggmcmc')

stat_bin warning

Plotting histogram gives many ugly infos:

stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

I know these are not critical, but it would be better if they were suppressed by providing a fixed binwidth.

stan and ggs

The documentation says that ggs should work with stan output, but it does not.
Quite close to the top there is

S <- as.array(S)

which will always fail for the complex stanfit output. Since no examples or test cases are provided (would be nice....) this goes unnoticed.

HPD intervals in density plots and traceplots

It took me some time to figure out how to add HPD intervals to density plots. I think it would be great to have an argument like HPD=FALSE to ggs_density (horizontal) and ggs_traceplot (vertical).

Here you go for density:

# Functions to get HPD
hpd_lower = function(x, prob=0.95) HPDinterval(as.mcmc(x), prob=prob)[1]
hpd_upper = function(x, prob=0.95) HPDinterval(as.mcmc(x), prob=prob)[2]

# Make data for geom_segment to use
hpd_data = dplyr::group_by(my_ggs_data, Parameter) %>% 
  dplyr::summarize(x=hpd_lower(value), xend=hpd_upper(value))

# Density and HPD
ggplot(plot1_ggs, aes(x=value)) + 
  geom_density(aes(fill='gray', y=..scaled..)) + scale_fill_manual(values = "gray") +
  geom_segment(data=hpd_data, aes(x=x, xend=xend, y=0.01, yend=0.01), size=3) + 

Solve R CMD check in R-devel

Unexported object imported by a ‘:::’ call: ggplot2:::bin’

Otherwise in next R release, R CMD check will not pass.

ggp_ppsd and ggs_ppmean give error message

data(samples)
ggs_ppsd(ggs(S, family="y"), outcome="y")

Error in ggs_ppsd(ggs(d.stan, family = "beta"), outcome = "beta") :
The length of the outcome must be equal to the number of Parameters of the ggs object.

Version: 0.4.1

setting greek=T mixes up labels

When I call ggs_caterpillar with greek=T, the parsed labels appear on the chart in alphabetical order, not in the order of the medians of the distributions, so the bars on the chart get the wrong labels.

Calling ggs_caterpillar(g) and ggs_caterpillar(g, greek=T) the bars appear in the same places, but the labels appear in different order.
not_greek
greek

g.zip

Disable/change sorting in ggs_catarpillar

I make inference on a variable where there is a meaningful order to the indices. I would like to keep that order in the plot. Could you implement either (1) a method to disable value-sorting and keep the alphabetical sorting? or (2) an option to manually specify a sorting.

Here is the plot. In this case, I would just like them in the order of the index to see if discriminability covaries with item difficulty:
image

problem with ggs()

Hi, I'm having problems trying to make the ggs() function select a "family", with ggs() returning "Error in S[[1]] : this S4 class is not subsettable".
This error does not occur when family=NA, and also does not occur for mcmc.list objects (confirmed using the radon dataset).
I'm using R 3.5.3, the newest ggmcmc and rstan package (downloaded today) under a windows 10 64 bit environment.

Below is a reproducible example using the 8school example from the Rstan "getting started" page.

// saved as test.stan
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // standard error of effect estimates
}
parameters {
real mu; // population treatment effect
real<lower=0> tau; // standard deviation in treatment effects
vector[J] eta; // unscaled deviation from mu by school
}
transformed parameters {
vector[J] theta = mu + tau * eta; // school treatment effects
}
model {
target += normal_lpdf(eta | 0, 1); // prior log-density
target += normal_lpdf(y | theta, sigma); // log-likelihood
}

// data
schools_dat <- list(J = 8,
y = c(28, 8, -3, 7, -1, 1, 18, 12),
sigma = c(15, 10, 16, 11, 9, 11, 10, 18))

// stan and ggmcmc code
library(rstan)
fit <- stan(file = 'test.stan', data = schools_dat)

library(ggmcmc)
ggs(fit, family="mu")

multiple models for caterplot

I'd like to add the ability to pass a list of models (in ggs format) to the caterplot function and then use facet_grid to split up the plots horizontally or vertically. Are you interested in having this functionality in the package and if so would you prefer I have it as a separate function or integrated into the original caterplot function?

R CMD check fails with dev version of tidyr

checking whether package ‘ggmcmc’ can be installed ... WARNING
Found the following significant warnings:
  Warning: replacing previous import by ‘tidyr::%>%’ when loading ‘ggmcmc’
See ‘/private/tmp/Rtmp5nYH93/check_cranb673158d7feb/ggmcmc.Rcheck/00install.out’ for details.

probably because you're doing too many unqualified imports

ggmcmc() can produce only one type of plots

Right now ggmcmc() produces a pdf file with all the types of plots. It would be possible to make it produce only one type of plots, and still take advantage of the division in several pages.

Suggested by dmenne by email.

Add HPD interval to ggs_density

In ggs_density, it would be nice to have a thick line along the x-axis indicating the X% HPD interval. Kruschke's BEST package does this as do many others. For consistency, maybe use the thick-and-thin ggs_catarpillar style.

ggmcmc is great, thank you!

separate the Rhat code from ggs_rhat

Hi,

Thanks for your package, which I'm making very heavy use of! I wonder if you'd be open to separating out the rhat calculation from ggs_rhat into a separate function? Happy to push some code doing that if it makes it easier.

A big chunk of what I use it for is simply the ggs() conversion of mcmc to a data frame, which I then conduct lots of post processing of. As part of that post-processing, I often find myself wanting to calculate the Rhat, and while ggs_rhat() does that for me, it's a bit round-about to try and pull the values from the ggplot object. I've ended up simply cutting the relevant rhat calculation code out of the code body, and using that as a stand-alone function instead.

Keeping order of parameters

Hi, I have a feature request. I would like to keep the order of parameters the same as it is from the output of the mcmc.list. I mostly use runjags package, so the summary() output of a runjags object will typically be in the order I set it in the monitor argument of run.jags().

E.g. The plots from the runjags model mod <- run.jags(..., monitor = c("gamma", "beta" )) will sort the parameters in alphabetical order, so it will appear as beta[1], beta[2],..., gamma[1], gamma[2],... etc.

I understand there is the sort=FALSE option in ggs() function, but this doesn't seem to be doing what I'd like it to.

One quick and lazy way that I found to work was to edit the lines in ggs.R where it says D$Parameter <- factor(D$Parameter) to D$Parameter <- factor(D$Parameter, levels = unique(D$Parameter)), in particular under the if statement for sort. This then allows me to run ggs() function with sort = FALSE, and it will give me plots exactly in the order as it appears from the runjags output.

The only thing is I don't know if this breaks the sort = FALSE functionality that you originally intended. I can't seem to figure out what this does?

Hope I was clear. Thanks for a great package!

Posterior predictive checks

Add visual ways to have a way to compare the distribution of the observed data against the distribution of the posterior predicted outcomes.

import (coda) still required

ggs_density as an example still required import of coda when used together with rstan. When coda is imported, traceplot(with implicit) rstan falsely dispatches to coda::traceplot, giving an error. rstan::traceplot works in this case, but is not nice.

ggmcmc github version, installed today.

Simple Stan test case

library(ggmcmc)
stanmodelcode <- "
data {
int<lower=0> N;
real y[N];
} 
parameters {
real mu;
} 
model {
mu ~ normal(0, 10);
y ~ normal(mu, 1); 
} 
"

y <- rnorm(20) 
dat <- list(N = 20, y = y); 
fit <- stan(model_code = stanmodelcode, data = dat, chains = 3)
#### Another bug in ggmcmc: this dependency is missing. 
# Things should work without the next line
library(coda) # Bit this one gives an warning which should be avoided by using namespaces
s = ggs(fit)
#Error in 1:n.samples : argument of length 0

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.