appengine-rstudio's Introduction

RStudio App Engine

App Engine is a managed compute service that recently started to support Docker environments, which can run any code. This repo is a demo of deploying RStudio on it.

App Engine offers benefits such as auto scaling, which turns off the instance when you are not using it and starts it again when the application URL is requested.

This builds on top of the persistent RStudio image developed for googleComputeEngineR, with which you can configure persistent GitHub/SSH keys and backups on Cloud Storage.

Pricing

See the App Engine pricing page for current rates. The flexible environment does not offer a free tier yet, and the minimum is 1 instance running 24/7.

A rough guide is $1.26 per core per 24 hours, plus $0.17 per GB of RAM per 24 hours.

Running a 1-core instance with 2 GB of RAM costs about $1.60 per 24 hours ($1.26 + 2 × $0.17), which is $0.07 an hour or $48 a month. This is more expensive than running an equivalent GCE instance, so this solution is only worth looking at if you have more users.
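
The arithmetic above can be sketched as a small shell helper. The function name `daily_cost` is hypothetical, and the rates are the rough per-day figures quoted above rather than current prices.

```shell
# Rough cost estimate from the per-day rates quoted above (assumed, not current prices).
# daily_cost CORES RAM_GB -> prints the estimated cost in dollars for 24 hours
daily_cost() {
  awk -v cores="$1" -v ram="$2" \
    'BEGIN { printf "%.2f\n", cores * 1.26 + ram * 0.17 }'
}

daily_cost 1 2   # 1 core with 2 GB of RAM
```

Multiplying the result by 30 gives the rough monthly figure.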

Launch

  1. If you haven't got one already, set up a Google Cloud Storage bucket that will contain the R session backup data
  2. Clone this repo
  3. The authentication service email will be the default App Engine project one: e.g. [email protected] - give this at least Cloud Storage access to the bucket from step 1
  4. Change this line in the Renviron.site to the bucket from step 1
GCS_SESSION_BUCKET="your bucket"
  5. [Optional] In the Dockerfile, add the default username that you use on other RStudio backups by altering your_username and your_password in the lines below. If you don't do this, files are saved under the default rstudio user in /home/rstudio.
## add your default user and password
RUN useradd --create-home --shell /bin/bash your_username && \
    echo your_username:your_password | chpasswd
  6. In the same directory, run:
gcloud app deploy --project your-project

Deployment takes a while (10+ minutes).

  7. Once ready, log in at https://your-project.appspot.com with username rstudio and password rstudio, or with the credentials you set in step 5.
  8. Configure Identity-Aware Proxy (https://console.cloud.google.com/iam-admin/iap) for the App Engine project URL, following Google's how-to guide. This adds a Google OAuth2 login in front of the RStudio login page, which is much more secure.
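
The command-line parts of the launch steps can be sketched roughly as follows. This is a sketch under assumptions, not the repo's own tooling: `set_session_bucket` is a hypothetical helper, and `your-bucket`, `SERVICE_ACCOUNT_EMAIL` and `your-project` are placeholders. The `objectAdmin` role is just one choice that grants Cloud Storage access to the bucket.

```shell
# set_session_bucket FILE BUCKET (hypothetical helper for the Renviron.site step)
# Removes any existing GCS_SESSION_BUCKET line from FILE and appends the new one.
set_session_bucket() {
  file="$1"
  bucket="$2"
  tmp="$(mktemp)"
  # Keep every other line, then write the new bucket setting at the end.
  grep -v '^GCS_SESSION_BUCKET=' "$file" > "$tmp" || true
  printf 'GCS_SESSION_BUCKET="%s"\n' "$bucket" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Bucket creation, bucket access, the Renviron.site edit and the deploy as
# commands; every name below is a placeholder:
#   gsutil mb gs://your-bucket
#   gsutil iam ch serviceAccount:SERVICE_ACCOUNT_EMAIL:objectAdmin gs://your-bucket
#   set_session_bucket Renviron.site your-bucket
#   gcloud app deploy --project your-project
```

The helper rewrites the file in place rather than using `sed -i`, since that flag behaves differently between GNU and BSD sed.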

You should now be able to log in and access all the files you saved under /home/your_username in Cloud Storage. On first login, it should load any SSH and GitHub configurations, and when you create a new RStudio Project with the same name as one already saved, it should download any missing files.

The server will turn off after some time of inactivity. See this link for more details.

Configuration

The app.yaml configures how the RStudio instance performs:

runtime: custom
env: flex
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 1
  
resources:
  cpu: 2
  memory_gb: 4

The Dockerfile downloads a prepared RStudio image with tools to persist data, and rserver.conf puts RStudio on port 8080 as required by App Engine, which then routes traffic to the normal web port 80. The Dockerfile then adds any environment arguments saved in Renviron.site. At minimum it needs GCS_SESSION_BUCKET, but you can add other settings here, such as API keys and defaults.
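
A quick pre-deploy check of the port setting might look like the sketch below. It assumes the repo's rserver.conf sets the port with RStudio Server's `www-port` option; the function names `rstudio_port` and `check_port` are hypothetical.

```shell
# rstudio_port FILE -> prints the www-port value set in an rserver.conf-style file
rstudio_port() {
  awk -F= '$1 ~ /^[ \t]*www-port[ \t]*$/ {
    gsub(/[ \t\r]/, "", $2); port = $2
  } END { if (port != "") print port }' "$1"
}

# check_port FILE -> succeeds only if RStudio is configured for port 8080
check_port() {
  [ "$(rstudio_port "$1")" = "8080" ]
}

# Example: check_port rserver.conf || echo "rserver.conf is not set to port 8080" >&2
```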

Authentication

Enable Identity-Aware Proxy (https://console.cloud.google.com/iam-admin/iap) for the App Engine app, following Google's how-to guide.

The instance uses this to add a Google OAuth2 proxy login on top of the normal RStudio login, and you can configure who has access.

After Google Auth, you can log into RStudio using the default rstudio / rstudio login.

Scaling config

With the default scaling settings the RStudio session disconnects, because only one session is allowed per RStudio Server connection and App Engine defaults to 2 instances. See https://cloud.google.com/appengine/docs/standard/python/how-instances-are-managed

Configure app.yaml to get around this:

app.yaml configuration

See the App Engine documentation for how to configure app.yaml.

Limit the service to a single instance via:

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 1

If you want the instance to run all the time, replace automatic scaling with manual scaling, but this is not worth doing as it's much more expensive than running a GCE instance.

manual_scaling:
  instances: 1

The flexible environment does not support basic_scaling.
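
Since a second instance breaks the RStudio session, a pre-deploy sanity check on app.yaml can help. The sketch below is not a real YAML parser: it only looks for simple `key: value` lines, it only covers the automatic_scaling form shown above, and the function names are hypothetical.

```shell
# yaml_value FILE KEY -> prints the value of the first "KEY: value" line in FILE
yaml_value() {
  awk -v key="$2" '$1 == (key ":") { print $2; exit }' "$1"
}

# single_instance FILE -> succeeds only if min and max instance counts are both 1
single_instance() {
  [ "$(yaml_value "$1" min_num_instances)" = "1" ] &&
    [ "$(yaml_value "$1" max_num_instances)" = "1" ]
}

# Example: single_instance app.yaml || echo "app.yaml allows more than one instance" >&2
```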

Instance size

You can set the size of the instance via the resources config.

e.g.

resources:
  cpu: 2
  memory_gb: 2

References

https://cloudyr.github.io/googleComputeEngineR/articles/persistent-rstudio.html

https://support.rstudio.com/hc/en-us/articles/200552316-Configuring-the-Server

https://github.com/rocker-org/rocker/blob/master/rstudio/testing/Dockerfile

https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine

Debug

gcloud app --project [PROJECT-ID] instances enable-debug

appengine-rstudio's Issues

Can't see Dockerfile env

Logging in to the instance in debug mode shows the env is set:

sudo docker inspect f5cc23207152

...
            "Env": [
                "GAE_DEPLOYMENT_ID=403820045745075277",
                "GAE_MEMORY_MB=4096",
                "GAE_SERVICE=default",
                "GAE_VERSION=20170902t120706",
                "GCLOUD_PROJECT=mark-rstudio",
                "GOOGLE_CLOUD_PROJECT=mark-rstudio",
                "PORT=8080",
                "GAE_INSTANCE=aef-default-20170902t120706-jf9c",
                "PATH=/usr/lib/rstudio-server/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "R_VERSION=3.4.1",
                "LC_ALL=en_US.UTF-8",
                "LANG=en_US.UTF-8",
                "TERM=xterm",
                "PANDOC_TEMPLATES_VERSION=1.18",
                "GCS_SESSION_BUCKET=gcer-mark-edmondson-gde-2017-08-11"
            ],
...

but they can't be accessed within the RStudio terminal

 printenv
R_PDFVIEWER=/usr/bin/xdg-open
R_RD4PDF=times,inconsolata,hyper
LD_LIBRARY_PATH=/usr/local/lib/R/lib::/lib:/usr/local/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server
RS_RPOSTBACK_PATH=/usr/lib/rstudio-server/bin/rpostback
R_SHARE_DIR=/usr/local/lib/R/share
LANG=en_US.UTF-8
DISPLAY=:0
RSTUDIO_HTTP_REFERER=https://mark-rstudio.appspot.com/
TAR=/usr/bin/tar
R_BROWSER=xdg-open
R_DOC_DIR=/usr/local/lib/R/doc
R_SESSION_TMPDIR=/tmp/RtmpZb3SSn
LN_S=ln -s
R_LIBS_USER=/usr/local/lib/R/site-library
GIT_EDITOR=/usr/lib/rstudio-server/bin/postback/rpostback-editfile
R_INCLUDE_DIR=/usr/local/lib/R/include
USER=rstudio
PAGER=/bin/cat
PWD=/home/rstudio
R_LIBS=/usr/local/lib/R/site-library:/usr/local/lib/R/library:/usr/lib/R/library
SSH_ASKPASS=rpostback-askpass
HOME=/home/rstudio
SED=/usr/bin/sed
RSTUDIO_PANDOC=/usr/lib/rstudio-server/bin/pandoc
R_PAPERSIZE=letter
R_SYSTEM_ABI=linux,gcc,gxx,gfortran,?
RSTUDIO=1
SVN_EDITOR=/usr/lib/rstudio-server/bin/postback/rpostback-editfile
R_ZIPCMD=/usr/bin/zip
R_HOME=/usr/local/lib/R
R_BZIPCMD=/usr/bin/bzip2
RSTUDIO_USER_IDENTITY=rstudio
RMARKDOWN_MATHJAX_PATH=/usr/lib/rstudio-server/resources/mathjax-26
R_PRINTCMD=/usr/bin/lpr
TERM=dumb
R_GZIPCMD=/usr/bin/gzip
RSTUDIO_WINUTILS=bin/winutils
SHLVL=1
R_PLATFORM=x86_64-pc-linux-gnu
RSTUDIO_SESSION_STREAM=rstudio-d
R_LIBS_SITE=
LOGNAME=rstudio
R_UNZIPCMD=/usr/bin/unzip
GIT_ASKPASS=rpostback-askpass
PATH=/usr/lib/rstudio-server/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:bin/msys-ssh-1000-18
PS1=\w$ 
R_TEXI2DVICMD=/usr/bin/texi2dvi
MAKE=make
_=/usr/bin/printenv

and consequently not within RStudio either

> Sys.getenv("GCE_SESSION_BUCKET")
[1] ""
> Sys.getenv("GCS_SESSION_BUCKET")
[1] ""
> Sys.getenv("GAE_DEPLOYMENT_ID")
[1] ""
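
One possible workaround, in line with the Renviron.site approach described under Configuration, is to copy the container variables into Renviron.site before RStudio starts. This is a sketch under assumptions and is not confirmed to resolve the issue. The helper name is hypothetical, the Renviron.site path is inferred from R_HOME=/usr/local/lib/R in the printenv output, and values containing double quotes or newlines are not handled.

```shell
# append_env_to_renviron FILE PATTERN (hypothetical helper)
# Appends each environment variable whose NAME=value line matches the extended
# regex PATTERN to FILE, rewritten as NAME="value".
append_env_to_renviron() {
  env | grep -E "$2" | sed 's/^\([^=]*\)=\(.*\)$/\1="\2"/' >> "$1"
}

# Example, run as root when the container starts and before rserver launches
# (the path is an assumption based on R_HOME in the output above):
#   append_env_to_renviron /usr/local/lib/R/etc/Renviron.site '^(GAE_|GCS_)'
```

R reads Renviron.site when a session starts, so variables written this way should be visible through Sys.getenv() in new sessions.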
