docker-rstats's People

Contributors

benhamner, brandenkmurray, dansbecker, dchudz, djherbis, emzeq, ifigotin, jplotts, kevinykuo, mrisdal, neil-schneider, nerdcha, philmod, rosbo, sebbov, vimota, wcuk

docker-rstats's Issues

update package "mlr3verse"

Hi, I tried to update the following package on Kaggle with the command:

devtools::update_packages("mlr3verse", upgrade = "always", dependencies = TRUE)

Unfortunately, it doesn't work. Can you help me, please?

The packages on Kaggle are very old:

mlr3verse (0.1.1 -> 0.2.1 ) [CRAN]
lgr (0.3.4 -> 0.4.2 ) [CRAN]
mlr3misc (0.2.0 -> 0.9.1 ) [CRAN]
paradox (0.2.0 -> 0.7.1 ) [CRAN]
future.apply (1.5.0 -> 1.7.0 ) [CRAN]
mlr3measures (0.1.3 -> 0.3.1 ) [CRAN]
parallelly (NA -> 1.25.0) [CRAN]
palmerpen... (NA -> 0.1.0 ) [CRAN]
future (1.17.0 -> 1.21.0) [CRAN]
globals (0.12.5 -> 0.14.0) [CRAN]
clue (0.3-57 -> 0.3-59) [CRAN]
mlr3 (0.2.0 -> 0.11.0) [CRAN]
bbotk (NA -> 0.3.2 ) [CRAN]
mlr3pipel... (0.1.3 -> 0.3.4 ) [CRAN]
distr6 (NA -> 1.5.2 ) [CRAN]
R62S3 (NA -> 1.4.1 ) [CRAN]
set6 (NA -> 0.2.1 ) [CRAN]
rlang (0.4.10 -> 0.4.11) [CRAN]
tibble (3.1.1 -> 3.1.2 ) [CRAN]
ellipsis (0.3.1 -> 0.3.2 ) [CRAN]
fansi (0.4.2 -> 0.5.0 ) [CRAN]
pillar (1.6.0 -> 1.6.1 ) [CRAN]
vctrs (0.3.7 -> 0.3.8 ) [CRAN]
colorspace (2.0-0 -> 2.0-1 ) [CRAN]
cli (2.4.0 -> 2.5.0 ) [CRAN]
mlr3cluster (NA -> 0.1.1 ) [CRAN]
mlr3data (NA -> 0.3.1 ) [CRAN]
mlr3filters (0.2.0 -> 0.4.1 ) [CRAN]
mlr3fselect (NA -> 0.5.1 ) [CRAN]
mlr3learners (0.2.0 -> 0.4.5 ) [CRAN]
mlr3proba (NA -> 0.4.0 ) [CRAN]
mlr3tuning (0.1.2 -> 0.8.0 ) [CRAN]
mlr3viz (0.1.1 -> 0.5.3 ) [CRAN]
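
A quick way to see which of these packages are missing or behind in a given session can be sketched in base R. This is only an illustration: `outdated()` is a hypothetical helper name, not part of devtools, and the target versions are the ones from the list above.

```r
# Hypothetical helper (not part of devtools): report which packages are
# missing or older than a wanted version, using base R only.
outdated <- function(wanted) {
  # wanted: named character vector, names = packages, values = minimum versions
  vapply(names(wanted), function(pkg) {
    installed <- tryCatch(as.character(utils::packageVersion(pkg)),
                          error = function(e) NA_character_)
    if (is.na(installed)) return("missing")
    if (utils::compareVersion(installed, wanted[[pkg]]) < 0) "outdated" else "ok"
  }, character(1))
}

# Target versions taken from the list above.
print(outdated(c(mlr3verse = "0.2.1", mlr3 = "0.11.0", paradox = "0.7.1")))
```

The result is a named character vector with one of "missing", "outdated" or "ok" per package, which makes it easy to paste into an issue report.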

Update R environment

Would it be possible to update the R kernel environment? I was looking to use the sf package (which is on CRAN), but noticed that it's not available in the kernel.

RStudio Server starts with R 3.1.1 by default

platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          1.1                         
year           2014                        
month          07                          
day            10                          
svn rev        66115                       
language       R                           
version.string R version 3.1.1 (2014-07-10)
nickname       Sock it to Me

It should be upgraded to the latest version.

Cannot open parquet files

I cannot open parquet files. Loading the reticulate package and importing pandas to access the read_parquet() function doesn't work:

library(reticulate)
pandas <- import("pandas")

Error in py_module_import(module, convert = convert): ImportError: No module named pandas

If I configure the environment as follows

Sys.setenv(RETICULATE_PYTHON="/opt/conda/envs/py36/bin/python3.6", required=TRUE)

the import is successful, but the call to the read_parquet function results in a new error:

mydata <- pandas$read_parquet("mydata.parquet")

Error in py_call_impl(callable, dots$args, dots$keywords): ImportError: Unable to find a usable engine; tried using: 'pyarrow', 'fastparquet'. pyarrow or fastparquet is required for parquet support

I would need support to open parquet files with reticulate or by other means in my R notebooks at Kaggle.

Thank you
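
One route that avoids Python entirely is the arrow R package, which provides its own `read_parquet()`. The sketch below assumes arrow can be installed in the session (it is not confirmed to be in the image); `read_parquet_if_possible()` is a hypothetical wrapper name used only for illustration.

```r
# Hypothetical wrapper: read a parquet file with the arrow R package,
# failing with a clear message when arrow is not available.
read_parquet_if_possible <- function(path) {
  if (!requireNamespace("arrow", quietly = TRUE)) {
    stop("arrow is not installed; try install.packages(\"arrow\") first")
  }
  arrow::read_parquet(path)
}

# Usage with the file name from the report above:
# mydata <- read_parquet_if_possible("mydata.parquet")
```

If the pandas route is preferred, the second error message suggests that installing pyarrow or fastparquet into the Python environment that reticulate uses would also be needed.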

Distance Package

This warning shows up in a Kaggle notebook:

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
also installing the dependency ‘mrds’
Warning message in install.packages("Distance"):
“installation of package ‘mrds’ had non-zero exit status”
Warning message in install.packages("Distance"):
“installation of package ‘Distance’ had non-zero exit status”

Then I tried it this way:

Warning message in install.packages("Distance", dependencies = FALSE):
“installation of package ‘Distance’ had non-zero exit status”

It still shows a non-zero exit status.
Can you please add this package? I have also tried to install it without dependencies, but nothing worked.

resuming docker downloads

Is there any way of resuming docker run --rm -it kaggle/rstats on connection failure?

Or can our good providers help us with a better way of downloading the files bit by bit? Some of us have extremely spotty connections. My download failed after I had downloaded more than 10GB, and I had to think about either starting over or giving up.

Spark

Hi all.

I was trying to use Spark from within a notebook, but I'm having some versioning problems. It looks like the current version of Spark/Hadoop is not compatible with the current version of Java.

library(sparklyr)
sc <- spark_connect("local")

Error: Java 11 is only supported for Spark 3.0.0+
Traceback:

  1. spark_connect("local")
  2. shell_connection(master = master, spark_home = spark_home, method = method,
    . app_name = app_name, version = version, hadoop_version = hadoop_version,
    . shell_args = shell_args, config = config, service = spark_config_value(config,
    . "sparklyr.gateway.service", FALSE), remote = spark_config_value(config,
    . "sparklyr.gateway.remote", spark_master_is_yarn_cluster(master,
    . config)), extensions = extensions, batch = NULL,
    . scala_version = scala_version)
  3. validate_java_version(master, spark_home)
  4. stop("Java 11 is only supported for Spark 3.0.0+", call. = FALSE)

spark_installed_versions()

spark  hadoop  dir
2.4.3  2.7     /root/spark/spark-2.4.3-bin-hadoop2.7
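
Since the error says Java 11 needs Spark 3.0.0 or later, one possible workaround is to install a Spark 3.x build and connect to it explicitly. This is a sketch under assumptions: it needs internet access, the installed sparklyr must know about the requested version, and `connect_spark3()` is a hypothetical helper name.

```r
# Hypothetical helper: make sure a Spark 3.x build is present, then connect.
connect_spark3 <- function(version = "3.0.0") {
  if (!requireNamespace("sparklyr", quietly = TRUE)) {
    stop("sparklyr is not installed")
  }
  # Download the requested Spark build only if it is not already present.
  installed <- sparklyr::spark_installed_versions()
  if (!version %in% installed$spark) {
    sparklyr::spark_install(version = version)
  }
  sparklyr::spark_connect(master = "local", version = version)
}

# Usage:
# sc <- connect_spark3()
```

The other direction, pointing JAVA_HOME at a Java 8 installation so that Spark 2.4.3 keeps working, may also be an option if such a JDK exists in the image.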

Upgrade h2o package

It looks like the R Docker image is using an older version of h2o. Using the h2o.deeplearning() function produces the following error:

label: unnamed-chunk-5
java version "1.7.0_91"
OpenJDK Runtime Environment (IcedTea 2.6.2) (7u91-2.6.2-1)
OpenJDK 64-Bit Server VM (build 24.91-b01, mixed mode)
Quitting from lines 49-91 (script.Rmd) 
Error: 
  unexpected argument "data", is this legacy code? Try ?h2o.shim

Execution halted

Using h2o.randomForest() works fine. Below is the code I am running in Kaggle RMarkdown Notebook with data already read in:

library(h2o)
## start a local cluster
localH2O = h2o.init(max_mem_size = '6g', # use 6GB of RAM of 8GB available on Kaggle
                    nthreads = -1) # use all CPUs (8 on my personal computer :3)

## import MNIST data as H2O
train_h2o = as.h2o(localH2O,train)
test_h2o = as.h2o(localH2O,test)

## set timer
s <- proc.time()

## train model
model =
  h2o.deeplearning(x = 2:785,  # column numbers for predictors
                   y = 1,   # column number for label
                   data = train_h2o, # data in H2O format
                   activation = "RectifierWithDropout", # algorithm
                   input_dropout_ratio = 0.2, # % of inputs dropout
                   hidden_dropout_ratios = c(0.5,0.5), # % for nodes dropout
                   balance_classes = TRUE, 
                   hidden = c(100,100), # two layers of 100 nodes
                   momentum_stable = 0.99,
                   nesterov_accelerated_gradient = T, # use it for speed
                   epochs = 10) # max. no. of epochs

P.S.
The link to R Docker image has a trailing ")" and, therefore, is broken.
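
For reference, the error message ("unexpected argument "data", is this legacy code?") points at the argument name rather than the package version: in h2o 3.x the frame is passed as `training_frame`, and `as.h2o()` no longer takes the connection object as its first argument. A sketch of the same model in that style follows; `fit_dl()` is a hypothetical wrapper, and the column layout (label in column 1, predictors in 2:785) is taken from the code above.

```r
# Hypothetical wrapper around the h2o 3.x calling convention.
# `train` is assumed to be a data.frame with the label in column 1.
fit_dl <- function(train) {
  if (!requireNamespace("h2o", quietly = TRUE)) {
    stop("h2o is not installed")
  }
  h2o::h2o.init(max_mem_size = "6g", nthreads = -1)

  train_h2o <- h2o::as.h2o(train)               # no connection argument in 3.x
  train_h2o[, 1] <- h2o::as.factor(train_h2o[, 1])  # classification needs a factor label

  h2o::h2o.deeplearning(
    x = 2:785,
    y = 1,
    training_frame = train_h2o,                 # replaces the legacy `data` argument
    activation = "RectifierWithDropout",
    input_dropout_ratio = 0.2,
    hidden_dropout_ratios = c(0.5, 0.5),
    balance_classes = TRUE,
    hidden = c(100, 100),
    epochs = 10
  )
}

# Usage:
# model <- fit_dl(train)
```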

cmdstanr

Is it possible to add cmdstanr? If yes, what would be the path to it, as passed to set_cmdstan_path("~/CMDSTAN")?

Include new ML package - NNS

Hi folks, would it be possible to include the NNS package? I'm trying to use it in the Optiver competition, where I can only submit with internet access disabled, but I need internet access to install NNS.

My code to install it on Kaggle:

install.packages("NNS"); 
library(NNS)

Unable to install topicmodels package

I am having trouble installing the topicmodels package. It looks like there is a broken dependency with the gsl library. I have tried installing gsl through apt-get, but it seems the headers are getting installed someplace where R can't find them.

* installing *source* package ‘topicmodels’ ...
** package ‘topicmodels’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include  -fpic  "-I/usr/include/gsl/" -c cokus.c -o cokus.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include  -fpic  "-I/usr/include/gsl/" -c common.c -o common.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include  -fpic  "-I/usr/include/gsl/" -c ctm.c -o ctm.o
ctm.c:29:25: fatal error: gsl/gsl_rng.h: No such file or directory
 #include <gsl/gsl_rng.h>
                         ^
compilation terminated.
/usr/local/lib/R/etc/Makeconf:167: recipe for target 'ctm.o' failed
make: *** [ctm.o] Error 1
ERROR: compilation failed for package ‘topicmodels’
* removing ‘/usr/local/lib/R/site-library/topicmodels’

The downloaded source packages are in
	‘/tmp/RtmpNWaQuO/downloaded_packages’
Warning message:
In install.packages("topicmodels") :
  installation of package ‘topicmodels’ had non-zero exit status

Thank you.
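
A small base R check can confirm whether the header that `ctm.c` includes is actually on disk before retrying the install. This is a sketch: `find_gsl_header()` is a hypothetical helper, and the default directories are assumptions about common Linux layouts.

```r
# Hypothetical helper: look for gsl/gsl_rng.h under candidate include dirs.
find_gsl_header <- function(dirs = c("/usr/include", "/usr/local/include")) {
  paths <- file.path(dirs, "gsl", "gsl_rng.h")
  paths[file.exists(paths)]
}

found <- find_gsl_header()
if (length(found) == 0) {
  message("gsl/gsl_rng.h not found; on Debian/Ubuntu the headers come from libgsl-dev")
} else {
  message("Found: ", paste(found, collapse = ", "))
}
```

On Debian-based images the runtime library and the development headers are separate packages, so installing only the runtime library would leave the header missing.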

How to set TF 2.2 in R ?

Hi. I wonder why TF is 2.2 in the Python Docker image but not in the R one.

RUN pip install --user virtualenv && R -e 'keras::install_keras(tensorflow = "1.15", extra_packages = c("pandas", "numpy", "pycryptodome"))'

I tried to reinstall several times via tensorflow::install_tensorflow(version='2.2'). However, it did not work. Could you help?

lstat patches/: no such file or directory

I'm trying to build a docker image from your Dockerfile and getting this error:

Step 6/15 : ADD patches/ /tmp/patches/
lstat patches/: no such file or directory

There's no patches dir in your repository. Should I create it manually and leave it empty?

Issue installing autoxgboost

Hi,
I was trying to install the following package from GitHub: https://github.com/ja-thomas/autoxgboost

It is not a CRAN package, but the build does not change much. Anyway, I ran into the following problem:

devtools::install_github("ja-thomas/autoxgboost", dependencies = TRUE)

Installing 11 packages: digest, R6, waldo, testthat, generics, data.table, RcppArmadillo, cmaes, DiceKriging, mlrCPO, mlrMBO

Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Error: Failed to install 'autoxgboost' from GitHub:
(converted from warning) installation of package ‘R6’ had non-zero exit status
Traceback:

  1. devtools::install_github("ja-thomas/autoxgboost", dependencies = TRUE)
  2. pkgbuild::with_build_tools({
    . ellipsis::check_dots_used(action = getOption("devtools.ellipsis_action",
    . rlang::warn))
    . {
    . remotes <- lapply(repo, github_remote, ref = ref, subdir = subdir,
    . auth_token = auth_token, host = host)
    . install_remotes(remotes, auth_token = auth_token, host = host,
    . dependencies = dependencies, upgrade = upgrade, force = force,
    . quiet = quiet, build = build, build_opts = build_opts,
    . build_manual = build_manual, build_vignettes = build_vignettes,
    . repos = repos, type = type, ...)
    . }
    . }, required = FALSE)
  3. install_remotes(remotes, auth_token = auth_token, host = host,
    . dependencies = dependencies, upgrade = upgrade, force = force,
    . quiet = quiet, build = build, build_opts = build_opts, build_manual = build_manual,
    . build_vignettes = build_vignettes, repos = repos, type = type,
    . ...)
  4. tryCatch(res[[i]] <- install_remote(remotes[[i]], ...), error = function(e) {
    . stop(remote_install_error(remotes[[i]], e))
    . })
  5. tryCatchList(expr, classes, parentenv, handlers)
  6. tryCatchOne(expr, names, parentenv, handlers[[1L]])
  7. value[3L]

And:

install.packages("R6", dependencies = TRUE, verbose = TRUE)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

system (cmd0): /usr/local/lib/R/bin/R CMD INSTALL

foundpkgs: R6, /tmp/RtmpNMSWAV/downloaded_packages/R6_2.5.0.tar.gz

files: /tmp/RtmpNMSWAV/downloaded_packages/R6_2.5.0.tar.gz

Warning message in install.packages("R6", dependencies = TRUE, verbose = TRUE):
“installation of package ‘R6’ had non-zero exit status”

I am a bit surprised by how terse the R error message is; it says little about the cause.
Do you see a way to solve this?
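
In a notebook the compiler output of a failed source install is often not shown. One way to get at it is the `keep_outputs` argument of `install.packages()`, which saves each package's build log as `<package>.out`. The wrapper below is a sketch; `install_with_log()` is a hypothetical name, and it assumes a CRAN mirror is already configured.

```r
# Hypothetical helper: install a package from source and print its build log.
install_with_log <- function(pkg, log_dir = tempdir()) {
  # keep_outputs accepts a directory in which to save "<pkg>.out".
  install.packages(pkg, keep_outputs = log_dir)
  log_file <- file.path(log_dir, paste0(pkg, ".out"))
  if (file.exists(log_file)) {
    writeLines(readLines(log_file))
  }
  invisible(log_file)
}

# Usage:
# install_with_log("R6")
```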

R error.

Today I'm facing the following error when using reticulate. This did not happen yesterday.

Describe your issue.

Error when importing sklearn.mixture

Reproducing Code Example

import("sklearn.mixture")

Error message

Error in py_module_import(module, convert = convert): ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /root/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/scipy/optimize/_highs/_highs_wrapper.cpython-38-x86_64-linux-gnu.so)

Traceback:

1. import("sklearn.mixture")
2. py_module_import(module, convert = convert)

Install fastai Library in GPU and CPU Kernels

Hi Team Kaggle

Please install this library for both GPU and CPU kernels:
https://github.com/Kaggle/docker-rstats/blob/master/Dockerfile
https://github.com/Kaggle/docker-rstats/blob/master/gpu.Dockerfile

Thanks to Turgut, who published the notebook on how to use fastai in R for image classification.

I am unable to use it for my learning and for competition submissions because it's not part of your kernels.

Please acknowledge and help add it as soon as possible.

Regards
Gayathri

`naryn` package

Hi,

Please add the naryn R package. It is a toolkit for medical records data analysis. It implements an efficient data structure for storing and querying medical records, which can be useful in many competitions.

It is available on CRAN but not in the Kaggle notebooks:

https://cran.r-project.org/web/packages/naryn/index.html

Thank you very much

Install patchwork and update cowplot package

Hello, first of all, I don't know how to make a pull request on GitHub, which is why I am writing here (sorry).

Are you able to install the package patchwork from GitHub? It can be found here: https://github.com/thomasp85/patchwork

I also wonder if you can update the cowplot package. It was updated three months ago on CRAN, but Kaggle uses the older version.

Thanks beforehand!

Thank you for all the hard work (^_^)

Problem with Rmd script

I am having a problem with the following block of code when running an Rmd script:

fit <- auto.arima(northts, lambda=0, d=0, D=1, max.order=4,
                  stepwise=FALSE, approximation=FALSE)
autoplot(forecast(fit, h = 36)) + xlim(2010, 2018) 

Executing this results in an 'unsupported character in output path' error, which seems to arise only from plotting the figure. This is strange, as ggplot seems to be well supported everywhere else, and I get the same error using the basic 'plot(forecast(fit, h = 36)) + xlim(2010, 2018)' too.

The Rmd file in question can be found here

torch library update

To whom it may concern,

Hi, I am using the torch package in a Kaggle notebook, but I found that the installed version is a little behind.

packageVersion("torch")

shows that the installed version is 0.3.0, but the latest version is 0.6.0. Could you update the torch package in the Dockerfile?

Sincerely,
Issac

size of kaggle/rstats

I've been running docker run --rm -it kaggle/rstats for two days now (my internet is slightly slow). I've got all the parts, but there's a file f0b24ff7f2aa that is currently at 6GB and doesn't show how much is left.

Can someone please tell me the maximum size of that file?

Error: Python module h2o4gpu was not found.

Hi there.

Using the h2o4gpu package (which comes already installed in the image) gives the following error:
Error: Python module h2o4gpu was not found.

Thanks a lot for the very useful Docker image +++

library(h2o4gpu)
x <- iris[1:4]
y <- as.integer(iris$Species)
model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)

Update TSstudio Package

Hi!

Would it be possible to update the TSstudio R package to version 0.1.1?

Thank you in advance,
Rami
