rstudio / ai-blog

Repository for the RStudio AI Blog (formerly: TensorFlow for R Blog)

Home Page: https://blogs.rstudio.com/ai/

HTML 96.39% TeX 1.65% CSS 0.06% JavaScript 1.88% Scala 0.02%

ai-blog's Issues

post for RStudio/tensorflow blog

Hi, I have just finished writing a post for the TensorFlow blog (Loading a BERT model from R).
Could you review it, please?
https://github.com/henry090/BERT-from-R

VAE with FNN

Hello,

I am trying to replicate the results of the following article written by @skeydan:

https://blogs.rstudio.com/ai/posts/2020-07-31-fnn-vae-for-noisy-timeseries/

I would like some help understanding why the following code:

library(tidyverse)
library(tensorflow)
library(keras)
library(tfdatasets)
library(tfautograph)
library(reticulate)
library(purrr)
library(listarrays)
library(abind)
library(deSolve)


parameters <- c(a = .2,
                b = .2,
                c = 5.7)

initial_state <-
  c(x = 1,
    y = 1,
    z = 1.05)

roessler <- function(t, state, parameters) {
  with(as.list(c(state, parameters)), {
    dx <- -y - z
    dy <- x + a * y
    dz = b + z * (x - c)
    
    list(c(dx, dy, dz))
  })
}

times <- seq(0, 2500, length.out = 20000)

roessler_ts <-
  ode(
    y = initial_state,
    times = times,
    func = roessler,
    parms = parameters,
    method = "lsoda"
  ) %>% unclass() %>% as_tibble()

n <- 10000
roessler <- roessler_ts$x[1:n]

roessler <- scale(roessler)



# add noise
noise <- 1 # also used 1.5, 2, 2.5
roessler <- roessler + rnorm(10000, mean = 0, sd = noise)
str(roessler)



vae_encoder_model <- function(n_timesteps,
                              n_features,
                              n_latent,
                              name = NULL) {
  keras_model_custom(name = name, function(self) {
    self$conv1 <- layer_conv_1d(kernel_size = 3,
                                filters = 16,
                                strides = 2)
    self$act1 <- layer_activation_leaky_relu()
    self$batchnorm1 <- layer_batch_normalization()
    self$conv2 <- layer_conv_1d(kernel_size = 7,
                                filters = 32,
                                strides = 2)
    self$act2 <- layer_activation_leaky_relu()
    self$batchnorm2 <- layer_batch_normalization()
    self$conv3 <- layer_conv_1d(kernel_size = 9,
                                filters = 64,
                                strides = 2)
    self$act3 <- layer_activation_leaky_relu()
    self$batchnorm3 <- layer_batch_normalization()
    self$conv4 <- layer_conv_1d(
      kernel_size = 9,
      filters = n_latent,
      strides = 2,
      activation = "linear" 
    )
    self$batchnorm4 <- layer_batch_normalization()
    self$flat <- layer_flatten()
    
    function (x, mask = NULL) {
      x %>%
        self$conv1() %>%
        self$act1() %>%
        self$batchnorm1() %>%
        self$conv2() %>%
        self$act2() %>%
        self$batchnorm2() %>%
        self$conv3() %>%
        self$act3() %>%
        self$batchnorm3() %>%
        self$conv4() %>%
        self$batchnorm4() %>%
        self$flat()
    }
  })
}




vae_decoder_model <- function(n_timesteps,
                              n_features,
                              n_latent,
                              name = NULL) {
  keras_model_custom(name = name, function(self) {
    self$reshape <- layer_reshape(target_shape = c(1, n_latent))
    self$conv1 <- layer_conv_1d_transpose(kernel_size = 15,
                                          filters = 64,
                                          strides = 3)
    self$act1 <- layer_activation_leaky_relu()
    self$batchnorm1 <- layer_batch_normalization()
    self$conv2 <- layer_conv_1d_transpose(kernel_size = 11,
                                          filters = 32,
                                          strides = 3)
    self$act2 <- layer_activation_leaky_relu()
    self$batchnorm2 <- layer_batch_normalization()
    self$conv3 <- layer_conv_1d_transpose(
      kernel_size = 9,
      filters = 16,
      strides = 2,
      output_padding = 1
    )
    self$act3 <- layer_activation_leaky_relu()
    self$batchnorm3 <- layer_batch_normalization()
    self$conv4 <- layer_conv_1d_transpose(
      kernel_size = 7,
      filters = 1,
      strides = 1,
      activation = "linear"
    )
    self$batchnorm4 <- layer_batch_normalization()
    
    function (x, mask = NULL) {
      x %>%
        self$reshape() %>%
        self$conv1() %>%
        self$act1() %>%
        self$batchnorm1() %>%
        self$conv2() %>%
        self$act2() %>%
        self$batchnorm2() %>%
        self$conv3() %>%
        self$act3() %>%
        self$batchnorm3() %>%
        self$conv4() %>%
        self$batchnorm4()
    }
  })
}




# to reparameterize encoder output before calling decoder
reparameterize <- function(mean, logvar = 0) {
  eps <- k_random_normal(shape = n_latent)
  eps * k_exp(logvar * 0.5) + mean
}

# loss FNN


loss_false_nn <- function(x) {
  
  # changing these parameters is equivalent to
  # changing the strength of the regularizer, so we keep these fixed (these values
  # correspond to the original values used in Kennel et al 1992).
  rtol <- 10 
  atol <- 2
  k_frac <- 0.01
  
  k <- max(1, floor(k_frac * batch_size))
  
  ## Vectorized version of distance matrix calculation
  tri_mask <-
    tf$linalg$band_part(
      tf$ones(
        shape = c(tf$cast(n_latent, tf$int32), tf$cast(n_latent, tf$int32)),
        dtype = tf$float32
      ),
      num_lower = -1L,
      num_upper = 0L
    )
  
  # latent x batch_size x latent
  batch_masked <-
    tf$multiply(tri_mask[, tf$newaxis,], x[tf$newaxis, reticulate::py_ellipsis()])
  
  # latent x batch_size x 1
  x_squared <-
    tf$reduce_sum(batch_masked * batch_masked,
                  axis = 2L,
                  keepdims = TRUE)
  
  # latent x batch_size x batch_size
  pdist_vector <- x_squared + tf$transpose(x_squared, perm = c(0L, 2L, 1L)) -
    2 * tf$matmul(batch_masked, tf$transpose(batch_masked, perm = c(0L, 2L, 1L)))
  
  #(latent, batch_size, batch_size)
  all_dists <- pdist_vector
  # latent
  all_ra <-
    tf$sqrt((1 / (
      batch_size * tf$range(1, 1 + n_latent, dtype = tf$float32)
    )) *
      tf$reduce_sum(tf$square(
        batch_masked - tf$reduce_mean(batch_masked, axis = 1L, keepdims = TRUE)
      ), axis = c(1L, 2L)))
  
  # Avoid singularity in the case of zeros
  #(latent, batch_size, batch_size)
  all_dists <-
    tf$clip_by_value(all_dists, 1e-14, tf$reduce_max(all_dists))
  
  #inds = tf.argsort(all_dists, axis=-1)
  top_k <- tf$math$top_k(-all_dists, tf$cast(k + 1, tf$int32))
  # (#(latent, batch_size, batch_size)
  top_indices <- top_k[[1]]
  
  #(latent, batch_size, batch_size)
  neighbor_dists_d <-
    tf$gather(all_dists, top_indices, batch_dims = -1L)
  #(latent - 1, batch_size, batch_size)
  neighbor_new_dists <-
    tf$gather(all_dists[2:-1, , ],
              top_indices[1:-2, , ],
              batch_dims = -1L)
  
  # Eq. 4 of Kennel et al.
  #(latent - 1, batch_size, batch_size)
  scaled_dist <- tf$sqrt((
    tf$square(neighbor_new_dists) -
      # (9, 8, 2)
      tf$square(neighbor_dists_d[1:-2, , ])) /
      # (9, 8, 2)
      tf$square(neighbor_dists_d[1:-2, , ])
  )
  
  # Kennel condition #1
  #(latent - 1, batch_size, batch_size)
  is_false_change <- (scaled_dist > rtol)
  # Kennel condition 2
  #(latent - 1, batch_size, batch_size)
  is_large_jump <-
    (neighbor_new_dists > atol * all_ra[1:-2, tf$newaxis, tf$newaxis])
  
  is_false_neighbor <-
    tf$math$logical_or(is_false_change, is_large_jump)
  #(latent - 1, batch_size, 1)
  total_false_neighbors <-
    tf$cast(is_false_neighbor, tf$int32)[reticulate::py_ellipsis(), 2:(k + 2)]
  
  # Pad zero to match dimensionality of latent space
  # (latent - 1)
  reg_weights <-
    1 - tf$reduce_mean(tf$cast(total_false_neighbors, tf$float32), axis = c(1L, 2L))
  # (latent,)
  reg_weights <- tf$pad(reg_weights, list(list(1L, 0L)))
  
  # Find batch average activity
  
  # L2 Activity regularization
  activations_batch_averaged <-
    tf$sqrt(tf$reduce_mean(tf$square(x), axis = 0L))
  
  loss <- tf$reduce_sum(tf$multiply(reg_weights, activations_batch_averaged))
  loss
  
}



# loss has 3 components: NLL, KL, and FNN
# otherwise, this is just normal TF2-style training 
train_step_vae <- function(batch) {
  with (tf$GradientTape(persistent = TRUE) %as% tape, {
    code <- encoder(batch[[1]])
    z <- reparameterize(code)
    prediction <- decoder(z)
    
    l_mse <- mse_loss(batch[[2]], prediction)
    # see loss_false_nn in 2 previous posts
    l_fnn <- loss_false_nn(code)
    # KL divergence to a standard normal
    l_kl <- -0.5 * k_mean(1 - k_square(z))
    # overall loss is a weighted sum of all 3 components
    loss <- l_mse + fnn_weight * l_fnn + kl_weight * l_kl
  })
  
  encoder_gradients <-
    tape$gradient(loss, encoder$trainable_variables)
  decoder_gradients <-
    tape$gradient(loss, decoder$trainable_variables)
  
  optimizer$apply_gradients(purrr::transpose(list(
    encoder_gradients, encoder$trainable_variables
  )))
  optimizer$apply_gradients(purrr::transpose(list(
    decoder_gradients, decoder$trainable_variables
  )))
  
  train_loss(loss)
  train_mse(l_mse)
  train_fnn(l_fnn)
  train_kl(l_kl)
}

# wrap it all in autograph
training_loop_vae <- tf_function(autograph(function(ds_train) {
  
  for (batch in ds_train) {
    train_step_vae(batch) 
    #str(batch)
  }
  
  tf$print("Loss: ", train_loss$result())
  tf$print("MSE: ", train_mse$result())
  tf$print("FNN loss: ", train_fnn$result())
  tf$print("KL loss: ", train_kl$result())
  
  train_loss$reset_states()
  train_mse$reset_states()
  train_fnn$reset_states()
  train_kl$reset_states()
  
}))




n_latent <- 10L
n_features <- 1

encoder <- vae_encoder_model(n_timesteps,
                             n_features,
                             n_latent)

decoder <- vae_decoder_model(n_timesteps,
                             n_features,
                             n_latent)
mse_loss <-
  tf$keras$losses$MeanSquaredError(reduction = tf$keras$losses$Reduction$SUM)

train_loss <- tf$keras$metrics$Mean(name = 'train_loss')
train_fnn <- tf$keras$metrics$Mean(name = 'train_fnn')
train_mse <-  tf$keras$metrics$Mean(name = 'train_mse')
train_kl <-  tf$keras$metrics$Mean(name = 'train_kl')

fnn_multiplier <- 1 # default value used in nearly all cases (see text)
fnn_weight <- fnn_multiplier * nrow(x_train)/batch_size

kl_weight <- 1

optimizer <- optimizer_adam(learning_rate = 1e-3)

n_timesteps <- 120
batch_size <- 32

gen_timesteps <- function(x, n_timesteps) {
  do.call(rbind,
          purrr::map(seq_along(x),
                     function(i) {
                       start <- i
                       end <- i + n_timesteps - 1
                       out <- x[start:end]
                       out
                     })
  ) %>%
    na.omit()
}


train <- gen_timesteps(roessler[1:(n/2)], 2 * n_timesteps)
test <- gen_timesteps(roessler[(n/2):n], 2 * n_timesteps) 

dim(train) <- c(dim(train), 1)
dim(test) <- c(dim(test), 1)

x_train <- train[ , 1:n_timesteps, , drop = FALSE]
y_train <- train[ , (n_timesteps + 1):(2*n_timesteps), , drop = FALSE]

ds_train <- tensor_slices_dataset(list(x_train, y_train)) %>%
  dataset_shuffle(nrow(x_train)) %>%
  dataset_batch(batch_size)

x_test <- test[ , 1:n_timesteps, , drop = FALSE]
y_test <- test[ , (n_timesteps + 1):(2*n_timesteps), , drop = FALSE]

ds_test <- tensor_slices_dataset(list(x_test, y_test)) %>%
  dataset_batch(nrow(x_test))

for (epoch in 1:100) {
  cat("Epoch: ", epoch, " -----------\n")
  training_loop_vae(ds_train)

  test_batch <- as_iterator(ds_test) %>% iter_next()
  encoded <- encoder(test_batch[[1]][1:1000])
  test_var <- tf$math$reduce_variance(encoded, axis = 0L)
  print(test_var %>% as.numeric() %>% round(5))
}

fails with the following message:

Error in py_call_impl(callable, call_args$unnamed, call_args$named) : 
  KeyError: 'The optimizer cannot recognize variable conv1d_transpose_24/kernel:0. This usually means you are trying to call the optimizer to update different parts of the model separately. Please call `optimizer.build(variables)` with the full list of trainable variables before the training loop or use legacy optimizer `tf.keras.optimizers.legacy.Adam.'
Run `reticulate::py_last_error()` for details.

Thanks.

Aiming for tidyverse style in blog posts?

Hi, I really enjoy reading your blog 👍. As an author of {styler}, I have a trained eye for code styling, and I noticed a few deviations from the tidyverse style guide in the last few blog entries I read. I wonder whether it's worth adding a section on coding style to CONTRIBUTING.md (if you want to adhere to the tidyverse style). {styler} can style Rmd files via the addin or the CLI API, and {knitr} can pretty-print the output (without modifying the source) when `tidy = 'styler'` is set, as described in the R Markdown Cookbook.

The blog seems to be down

could not find function "%<-%"

c(indices, target, segments) %<-% list(list(),list(),list())
Error in c(indices, target, segments) %<-% list(list(), list(), list()) :
could not find function "%<-%"

Can you help me, please?
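
`%<-%` is the destructuring-assignment operator from the zeallot package, which keras re-exports, so this error usually means that neither package is attached in the current session. A minimal sketch, assuming zeallot is installed (the thread does not say which packages were loaded):

```r
# Assumption: zeallot is installed, e.g. via install.packages("zeallot").
# library(keras) would also make %<-% available, since keras re-exports it.
library(zeallot)

# Unpack a list of three empty lists into three variables.
c(indices, target, segments) %<-% list(list(), list(), list())

str(indices)
```

If the operator is still not found after attaching one of these packages, checking `search()` for the package name is a reasonable next step.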

Update _site.yml with move to blogs.posit.co/ai

Hi @skeydan -

If/when the AI Blog moves to posit.co, please update the Google Analytics tracking code. Details on how to do that are in Confluence.

Thank you!
Sarah

getting error in quo(sym(.))

Hello, I am testing code in https://blogs.rstudio.com/tensorflow/posts/2018-06-25-sunspots-lstm/
I am getting an error in this line:

cols <- map(coln, quo(sym(.)))

the error is:

Error in is_symbol(x): object '.' not found
Traceback:
1. map(coln, quo(sym(.)))
2. .f(.x[[i]], ...)
3. eval_tidy(~sym(.))
4. sym(.)
5. is_symbol(x)

I think it is tidy-eval related, but I am not sure how to fix it.
Could you help me?
Thanks.
Cheers
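
The traceback shows `sym(.)` being evaluated with no `.` bound, which suggests the quosure is no longer treated as a function of each element. One possible workaround, offered as a sketch rather than a confirmed fix, is to convert the names to symbols directly. The column names below are hypothetical stand-ins for `coln` in the post:

```r
# Hypothetical column names standing in for `coln` from the post.
coln <- c("index", "value")

# Base R: convert each name to a symbol, one list element per column.
cols <- lapply(coln, as.name)
# With rlang available, rlang::syms(coln) builds an equivalent list of symbols.

str(cols)
```

The resulting list can be spliced into tidy-eval calls with `!!!cols` in the same way the original `cols` was meant to be used.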

Move `master` branch to `main`

Cc @skeydan

The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.

That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.

The purpose of this issue is to:

Help us firm up the list of targeted repositories
Make sure all maintainers are aware of what's coming
Give us an issue to close when the job is done
Give us a place to put advice for collaborators re: how to adapt

message id: entire_lizard

Principal mistake in blog code example

In your article
Chollet & Allaire (2017, Dec. 20). TensorFlow for R: Time Series Forecasting with Recurrent Neural Networks. Retrieved from https://blogs.rstudio.com/tensorflow/posts/2017-12-20-time-series-forecasting-with-recurrent-neural-networks/
the code of the generator function appears to be untested. The generator function expects samples in rows and features in columns, but the data shown at the top, and as described in the text, is laid out the other way around: samples in columns and features in rows. Because the generator indexes rows as samples and there are only 16 features, it runs out of bounds immediately.
Regards,
hk
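
If the data really is laid out with features in rows, as this report describes, a generator that indexes rows as time steps needs the matrix transposed first. A minimal base R sketch, using made-up data and hypothetical names (`data_wide`, `data_long`, `lookback`) rather than the post's actual objects:

```r
# Hypothetical stand-in: 16 features in rows, 1000 time steps in columns.
set.seed(1)
data_wide <- matrix(rnorm(16 * 1000), nrow = 16, ncol = 1000)

# A generator that indexes rows as time steps needs one row per time step,
# so transpose to (time steps x features) before passing the data in.
data_long <- t(data_wide)

# Window of `lookback` consecutive time steps ending just before row `i`.
lookback <- 10
i <- 500
window <- data_long[(i - lookback):(i - 1), , drop = FALSE]

dim(window)
```

Checking `dim()` of the data before calling the generator is a quick way to see which orientation is actually in memory.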

Trouble with install_tensorflow(version = "somekindofversion")

Hi, I'm having trouble installing a suitable version of TensorFlow on my laptop.
First I installed the default version (just using install_tensorflow()), but when I check whether the installation succeeded, I get:

library(tensorflow)

sess = tf$Session()
2018-08-28 12:06:53.362284: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
hello <- tf$constant('Hello, TensorFlow!')
sess$run(hello)
b'Hello, TensorFlow!'

So I tried removing my conda environment and then reinstalling with:

library(reticulate)
conda_remove("r-tensorflow")
install.packages("tensorflow")
library(tensorflow)

install_tensorflow(version = "somekindofversion")

but I get this error: Error: Error 2 occurred installing packages into conda environment r-tensorflow

This is the entire script:

install_tensorflow(version = "https://github.com/fo40225/tensorflow-windows-wheel/blob/master/1.4.0/py36/CPU/avx2/tensorflow-1.4.0-cp36-cp36m-win_amd64.whl")
Creating r-tensorflow conda environment for TensorFlow installation...
Solving environment: ...working... done

Package Plan

environment location: G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow

added / updated specs:
- python=3.6

The following NEW packages will be INSTALLED:

certifi:        2018.8.24-py36_1  
pip:            10.0.1-py36_0     
python:         3.6.6-hea74fb7_0  
setuptools:     40.2.0-py36_0     
vc:             14.1-h0510ff6_3   
vs2015_runtime: 15.5.2-3          
wheel:          0.31.1-py36_0     
wincertstore:   0.2-py36h7fe50ca_0

Preparing transaction: ...working... done
Verifying transaction: ...working...
SafetyError: The package for setuptools located at G:\G_PROGRAMMI\Anaconda3\pkgs\setuptools-40.2.0-py36_0
appears to be corrupted. The path 'Scripts/easy_install.exe'
has a sha256 mismatch.
reported sha256: 993203a406e04936a07829b1f482fd27d739b640482e213f4c49ea1ee78a5fcf
actual sha256: 0cc372d47e0e71c25012697ae0a2feca19cfa46caccb872a555c3669e79f701b

SafetyError: The package for wheel located at G:\G_PROGRAMMI\Anaconda3\pkgs\wheel-0.31.1-py36_0
appears to be corrupted. The path 'Scripts/wheel.exe'
has a sha256 mismatch.
reported sha256: 993203a406e04936a07829b1f482fd27d739b640482e213f4c49ea1ee78a5fcf
actual sha256: 2a882e6c7ea316634261d68f7ccb7aadc34fe864b4b834611c9a4cd26efc9d35

done
Executing transaction: ...working... done

To activate this environment, use:

> activate r-tensorflow

To deactivate an active environment, use:

> deactivate

* for power-users using bash, you must source

Determining latest installable release of TensorFlow...done
Installing TensorFlow...
Collecting tensorflow==1.4.0 from https://github.com/fo40225/tensorflow-windows-wheel/blob/master/1.4.0/py36/CPU/avx2/tensorflow-1.4.0-cp36-cp36m-win_amd64.whl
Downloading https://github.com/fo40225/tensorflow-windows-wheel/blob/master/1.4.0/py36/CPU/avx2/tensorflow-1.4.0-cp36-cp36m-win_amd64.whl
Exception:
Traceback (most recent call last):
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\basecommand.py", line 228, in main
status = self.run(options, args)
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\commands\install.py", line 291, in run
resolver.resolve(requirement_set)
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\resolve.py", line 103, in resolve
self._resolve_one(requirement_set, req)
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\resolve.py", line 257, in _resolve_one
abstract_dist = self._get_abstract_dist_for(req_to_install)
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\resolve.py", line 210, in _get_abstract_dist_for
self.require_hashes
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\operations\prepare.py", line 310, in prepare_linked_requirement
progress_bar=self.progress_bar
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\download.py", line 837, in unpack_url
progress_bar=progress_bar
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\download.py", line 678, in unpack_http_url
unpack_file(from_path, location, content_type, link)
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\utils\misc.py", line 575, in unpack_file
flatten=not filename.endswith('.whl')
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\site-packages\pip_internal\utils\misc.py", line 460, in unzip_file
zip = zipfile.ZipFile(zipfp, allowZip64=True)
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\zipfile.py", line 1108, in init
self._RealGetContents()
File "G:\G_PROGRAMMI\Anaconda3\envs\r-tensorflow\lib\zipfile.py", line 1175, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
Error: Error 2 occurred installing packages into conda environment r-tensorflow

The version I tried to install is taken from "https://github.com/fo40225/tensorflow-windows-wheel/blob/master/1.4.0/py36/CPU/avx2/tensorflow-1.4.0-cp36-cp36m-win_amd64.whl"

Any help would be greatly appreciated, thanks!
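
One possible cause, not confirmed in this thread: a GitHub `/blob/` URL serves the HTML page for a file rather than the file itself, which would be consistent with pip reporting "File is not a zip file". Swapping `/blob/` for `/raw/` gives a direct-download link. The helper name `blob_to_raw` below is hypothetical, and whether the wheel then installs cleanly is untested:

```r
# Hypothetical helper: turn a GitHub "blob" page URL into a direct-download URL.
blob_to_raw <- function(url) {
  sub("/blob/", "/raw/", url, fixed = TRUE)
}

wheel_page <- paste0(
  "https://github.com/fo40225/tensorflow-windows-wheel/blob/master/",
  "1.4.0/py36/CPU/avx2/tensorflow-1.4.0-cp36-cp36m-win_amd64.whl"
)
wheel_url <- blob_to_raw(wheel_page)
print(wheel_url)

# The result could then be passed on, e.g. install_tensorflow(version = wheel_url).
```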

Cannot run `torch` intro code, see traceback attached

Trying to run the code chunks from the blog post:
https://github.com/rstudio/ai-blog/tree/master/_posts/2020-09-29-introducing-torch-for-r
I get reproducible failures at the central code chunk:

for (epoch in 1:5) {

  l <- c()

  for (b in enumerate(train_dl)) {
    # make sure each batch's gradient updates are calculated from a fresh start
    optimizer$zero_grad()
    # get model predictions
    output <- model(b[[1]]$to(device = "cuda"))
    # calculate loss
    loss <- nnf_cross_entropy(output, b[[2]]$to(device = "cuda"))
    # calculate gradient
    loss$backward()
    # apply weight updates
    optimizer$step()
    # track losses
    l <- c(l, loss$item())
  }

  cat(sprintf("Loss at epoch %d: %3f\n", epoch, mean(l)))
}

This is with the torch and torchvision packages installed today from GitHub, as suggested at the beginning of the blog post.

Error in parent.env(x)[["batch"]][[name]] : object of type 'symbol' is not subsettable
40. `[[.enum_env`(b, 1)
39. b[[1]]
38. mget(x = c("input", "weight", "bias", "stride", "padding", "dilation", "groups"))
37. torch_conv2d(input = input, weight = weight, bias = bias, stride = stride, padding = padding, dilation = dilation, groups = groups)
36. nnf_conv2d(input, weight, self$bias, self$stride, self$padding, self$dilation, self$groups)
35. self$conv_forward_(input, self$weight)
34. self$conv1(.)
33. mget(x = c("self"))
32. torch_relu(input)
31. nnf_relu(.)
30. mget(x = c("input", "weight", "bias", "stride", "padding", "dilation", "groups"))
29. torch_conv2d(input = input, weight = weight, bias = bias, stride = stride, padding = padding, dilation = dilation, groups = groups)
28. nnf_conv2d(input, weight, self$bias, self$stride, self$padding, self$dilation, self$groups)
27. self$conv_forward_(input, self$weight)
26. self$conv2(.)
25. mget(x = c("self"))
24. torch_relu(input)
23. nnf_relu(.)
22. mget(x = c("self", "kernel_size", "stride", "padding", "dilation", "ceil_mode"))
21. torch_max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
20. nnf_max_pool2d(., 2)
19. mget(x = c("input", "p", "train"))
18. torch_feature_dropout(input, p, training)
17. nnf_dropout2d(input, self$p, self$training, self$inplace)
16. self$dropout1(.)
15. mget(x = c("self", "dims", "start_dim", "end_dim", "out_dim"))
14. torch_flatten(., start_dim = 2)
13. nnf_linear(input, self$weight, self$bias)
12. self$fc1(.)
11. mget(x = c("self"))
10. torch_relu(input)
9. nnf_relu(.)
8. mget(x = c("input", "p", "train"))
7. torch_feature_dropout(input, p, training)
6. nnf_dropout2d(input, self$p, self$training, self$inplace)
5. self$dropout2(.)
4. nnf_linear(input, self$weight, self$bias)
3. self$fc2(.)
2. x %>% self$conv1() %>% nnf_relu() %>% self$conv2() %>% nnf_relu() %>% nnf_max_pool2d(2) %>% self$dropout1() %>% torch_flatten(start_dim = 2) %>% self$fc1() %>% nnf_relu() %>% self$dropout2() %>% self$fc2()
1. model(b[[1]]$to(device = "cuda"))

R version:

R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-suse-linux-gnu (64-bit)

RStudio 1.4.1623 (from the daily builds)

That the rsession uses CUDA properly has been verified by:

torch::torch_tensor(1, device = "cuda")

and

/usr/local/cuda-10.2/extras/demo_suite> nvidia-smi
Sat Mar 13 14:29:55 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro M1200        Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P0    N/A /  N/A |   1054MiB /  4046MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     13898      C   .../lib/rstudio/bin/rsession      515MiB |
|    0   N/A  N/A     25740      C   .../lib/rstudio/bin/rsession      535MiB |
+-----------------------------------------------------------------------------+

I do not know enough about all this to debug the issue myself. I think this native torch package for R is a great way to get up-to-date NNs going in R. If my problem with the sample code is a general one, it might hamper the success of the package, as it is a big hurdle when starting out with torch in R.

Attention layer model fails with 'Could not find valid device for node.'

Hello, I am trying to reproduce the example from https://blogs.rstudio.com/tensorflow/posts/2018-07-30-attention-layer/.

The following is my code:


reticulate::use_condaenv("tf-gpu", required = TRUE)



library(keras)
use_implementation("tensorflow")

library(tensorflow)
tfe_enable_eager_execution()

library(tfdatasets)

library(purrr)
library(stringr)
library(reshape2)
library(viridis)
library(ggplot2)
library(tibble)

filepath <- file.path("data", "nld.txt")

lines <- readLines(filepath, n = 10000)
sentences <- str_split(lines, "\t")
str(sentences)

space_before_punct <- function(sentence) {
  str_replace_all(sentence, "([?.!])", " \\1")
}

replace_special_chars <- function(sentence) {
  str_replace_all(sentence, "[^a-zA-Z?.!,¿]+", " ")
}

add_tokens <- function(sentence) {
  paste0("<start> ", sentence, " <stop>")
}

add_tokens <- Vectorize(add_tokens, USE.NAMES = FALSE)

preprocess_sentence <- compose(add_tokens,
                               str_squish,
                               replace_special_chars,
                               space_before_punct)

word_pairs <- map(sentences, preprocess_sentence)

create_index <- function(sentences) {
  unique_words <- sentences %>% unlist() %>% paste(collapse = " ") %>%
    str_split(pattern = " ") %>% .[[1]] %>% unique() %>% sort()
  index <- data.frame(
    word = unique_words,
    index = 1:length(unique_words),
    stringsAsFactors = FALSE
  ) %>%
    add_row(word = "<pad>",
            index = 0,
            .before = 1)
  index
}

word2index <- function(word, index_df) {
  index_df[index_df$word == word, "index"]
}
index2word <- function(index, index_df) {
  index_df[index_df$index == index, "word"]
}

src_index <- create_index(map(word_pairs, ~ .[[1]]))
target_index <- create_index(map(word_pairs, ~ .[[2]]))

sentence2digits <- function(sentence, index_df) {
  map((sentence %>% str_split(pattern = " "))[[1]], function(word)
    word2index(word, index_df))
}

sentlist2diglist <- function(sentence_list, index_df) {
  map(sentence_list, function(sentence)
    sentence2digits(sentence, index_df))
}

src_diglist <- sentlist2diglist(map(word_pairs, ~ .[[1]]), src_index)
src_maxlen <- map(src_diglist, length) %>% unlist() %>% max()
src_matrix <- pad_sequences(src_diglist, maxlen = src_maxlen,  padding = "post")

target_diglist <- sentlist2diglist(map(word_pairs, ~ .[[2]]), target_index)
target_maxlen <- map(target_diglist, length) %>% unlist() %>% max()
target_matrix <- pad_sequences(target_diglist, maxlen = target_maxlen, padding = "post")

train_indices <-
  sample(nrow(src_matrix), size = nrow(src_matrix) * 0.8)

validation_indices <- setdiff(1:nrow(src_matrix), train_indices)

x_train <- src_matrix[train_indices, ]
y_train <- target_matrix[train_indices, ]

str(x_train)
str(y_train)

x_valid <- src_matrix[validation_indices, ]
y_valid <- target_matrix[validation_indices, ]

str(x_valid)
str(y_valid)

buffer_size <- nrow(x_train)

# just for convenience, so we may get a glimpse at translation 
# performance during training
train_sentences <- sentences[train_indices]
validation_sentences <- sentences[validation_indices]
validation_sample <- sample(validation_sentences, 5)

str(train_sentences)

batch_size <- 32
embedding_dim <- 64
gru_units <- 256

src_vocab_size <- nrow(src_index)
target_vocab_size <- nrow(target_index)

train_dataset <- 
  tensor_slices_dataset(keras_array(list(x_train, y_train)))  %>%
  dataset_shuffle(buffer_size = buffer_size) %>%
  dataset_batch(batch_size, drop_remainder = TRUE)

str(train_dataset)

validation_dataset <-
  tensor_slices_dataset(keras_array(list(x_valid, y_valid))) %>%
  dataset_shuffle(buffer_size = buffer_size) %>%
  dataset_batch(batch_size, drop_remainder = TRUE)

str(validation_dataset)


attention_encoder <-
  
  function(gru_units,
           embedding_dim,
           src_vocab_size,
           name = NULL) {
    
    keras_model_custom(name = name, function(self) {
      
      self$embedding <-
        layer_embedding(
          input_dim = src_vocab_size,
          output_dim = embedding_dim
        )
      
      self$gru <-
        layer_gru(
          units = gru_units,
          return_sequences = TRUE,
          return_state = TRUE
        )
      
      function(inputs, mask = NULL) {
        
        x <- inputs[[1]]
        hidden <- inputs[[2]]
        
        x <- self$embedding(x)
        c(output, state) %<-% self$gru(x, initial_state = hidden)
        
        list(output, state)
      }
    })
  }


attention_decoder <-
  function(object,
           gru_units,
           embedding_dim,
           target_vocab_size,
           name = NULL) {
    
    keras_model_custom(name = name, function(self) {
      
      self$gru <-
        layer_gru(
          units = gru_units,
          return_sequences = TRUE,
          return_state = TRUE
        )
      
      self$embedding <-
        layer_embedding(input_dim = target_vocab_size, 
                        output_dim = embedding_dim)
      
      gru_units <- gru_units
      self$fc <- layer_dense(units = target_vocab_size)
      self$W1 <- layer_dense(units = gru_units)
      self$W2 <- layer_dense(units = gru_units)
      self$V <- layer_dense(units = 1L)
      
      function(inputs, mask = NULL) {
        
        x <- inputs[[1]]
        hidden <- inputs[[2]]
        encoder_output <- inputs[[3]]
        
        hidden_with_time_axis <- k_expand_dims(hidden, 2)
        
        score <- self$V(k_tanh(self$W1(encoder_output) + 
                                 self$W2(hidden_with_time_axis)))
        
        attention_weights <- k_softmax(score, axis = 2)
        
        context_vector <- attention_weights * encoder_output
        context_vector <- k_sum(context_vector, axis = 2)
        
        x <- self$embedding(x)
        
        x <- k_concatenate(list(k_expand_dims(context_vector, 2), x), axis = 3)
        
        c(output, state) %<-% self$gru(x)
        
        output <- k_reshape(output, c(-1, gru_units))
        
        x <- self$fc(output)
        
        list(x, state, attention_weights)
        
      }
      
    })
  }

encoder <- attention_encoder(
  gru_units = gru_units,
  embedding_dim = embedding_dim,
  src_vocab_size = src_vocab_size
)

decoder <- attention_decoder(
  gru_units = gru_units,
  embedding_dim = embedding_dim,
  target_vocab_size = target_vocab_size
)


optimizer <- tf$compat$v1$train$AdamOptimizer()

cx_loss <- function(y_true, y_pred) {
  mask <- ifelse(y_true == 0L, 0, 1)
  loss <-
    tf$nn$sparse_softmax_cross_entropy_with_logits(labels = y_true,
                                                   logits = y_pred) * mask
  tf$reduce_mean(loss)
}


n_epochs <- 50

encoder_init_hidden <- k_zeros(c(batch_size, gru_units))

for (epoch in seq_len(n_epochs)) {
  
  total_loss <- 0
  iteration <- 0
  
  iter <- make_iterator_one_shot(train_dataset)
  
  until_out_of_range({
    
    batch <- iterator_get_next(iter)
    loss <- 0
    x <- batch[[1]]
    y <- batch[[2]]
    iteration <- iteration + 1
    
    with(tf$GradientTape() %as% tape, {
      c(enc_output, enc_hidden) %<-% encoder(list(x, encoder_init_hidden))
      
      dec_hidden <- enc_hidden
      dec_input <-
        k_expand_dims(rep(list(
          word2index("<start>", target_index)
        ), batch_size))
      
      
      for (t in seq_len(target_maxlen - 1)) {
        c(preds, dec_hidden, weights) %<-%
          decoder(list(dec_input, dec_hidden, enc_output))
        loss <- loss + cx_loss(y[, t], preds)
        
        dec_input <- k_expand_dims(y[, t])
      }
      
    })
    
    total_loss <-
      total_loss + loss / k_cast_to_floatx(dim(y)[2])
    
    paste0(
      "Batch loss (epoch/batch): ",
      epoch,
      "/",
      iter,
      ": ",
      (loss / k_cast_to_floatx(dim(y)[2])) %>% 
        as.double() %>% round(4),
      "\n"
    )
    
    variables <- c(encoder$variables, decoder$variables)
    gradients <- tape$gradient(loss, variables)
    
    optimizer$apply_gradients(
      purrr::transpose(list(gradients, variables)),
      global_step = tf$train$get_or_create_global_step()
    )
    
  })

  paste0(
    "Total loss (epoch): ",
    epoch,
    ": ",
    (total_loss / k_cast_to_floatx(buffer_size)) %>% 
      as.double() %>% round(4),
    "\n"
  )
}

This code fails with the following error:

2020-02-12 12:48:30.175011: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
Error: NotFoundError: Could not find valid device for node.
Node:{{node SparseSoftmaxCrossEntropyWithLogits}}
All kernels registered for op SparseSoftmaxCrossEntropyWithLogits :
  device='CPU'; T in [DT_FLOAT]; Tlabels in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tlabels in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tlabels in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tlabels in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tlabels in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tlabels in [DT_INT64]
  device='GPU'; T in [DT_FLOAT]; Tlabels in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tlabels in [DT_INT64]
  device='GPU'; T in [DT_HALF]; Tlabels in [DT_INT32]
  device='GPU'; T in [DT_HALF]; Tlabels in [DT_INT64]
 [Op:SparseSoftmaxCrossEntropyWithLogits]

It is not clear to me what the reason for this failure is.
Do you get the same result?
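
One way to read the error message above: every kernel registered for `SparseSoftmaxCrossEntropyWithLogits` expects floating-point logits (`T`) and integer labels (`Tlabels`). The sketch below only encodes the kernel list from that message into a lookup. The helper `has_kernel()` is hypothetical, not part of TensorFlow, and which dtypes `y_true` and `y_pred` actually have in the failing run is not known from the report.

```r
# Kernels listed in the error message, grouped by device.
# Each listed kernel accepts labels of type DT_INT32 or DT_INT64 only.
kernel_logits <- list(
  CPU = c("DT_FLOAT", "DT_DOUBLE", "DT_HALF"),
  GPU = c("DT_FLOAT", "DT_HALF")
)
kernel_labels <- c("DT_INT32", "DT_INT64")

# Hypothetical helper: is there a registered kernel for this combination?
has_kernel <- function(device, logits_dtype, labels_dtype) {
  logits_dtype %in% kernel_logits[[device]] && labels_dtype %in% kernel_labels
}

has_kernel("GPU", "DT_FLOAT", "DT_INT32")   # TRUE: listed in the message
has_kernel("GPU", "DT_FLOAT", "DT_DOUBLE")  # FALSE: labels must be integers
has_kernel("GPU", "DT_DOUBLE", "DT_INT64")  # FALSE: no double kernel on GPU
```

If the labels turn out to be floating point (for instance because the dataset was built from R doubles), casting them with `tf$cast(y_true, tf$int32)` before the loss call would be one thing to try. Printing `y_true$dtype` and `y_pred$dtype` inside `cx_loss()` should show which case applies.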

TensorFlow for R Blog: Auto-Keras post proposal.

I would be pleased to propose my post entitled "Auto-Keras: An R easily accessible deep learning library" to be published in the TensorFlow for R Blog.
I have carefully followed the blog contribution guide at https://blogs.rstudio.com/tensorflow/contributing.html.
You can find my proposed post at https://github.com/jcrodriguez1989/tf_blog_autokeras.
I am at your entire disposal and open to any suggestions that will undoubtedly lead to improvements in my article.

Regards,
Juan Cruz Rodriguez

Can't load imdb dataset in text classification example

I was unable to load the example dataset in the "Tutorial: Text Classification" example. I'm using the keras library from RStudio:

imdb <- dataset_imdb(num_words = 10000)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17473536/17464789 [==============================] - 1s 0us/step
 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  ValueError: Object arrays cannot be loaded when allow_pickle=False

Thanks...Keith Erskine

ps: My environment is rstudio.cloud

> Sys.getenv()
_                                    /usr/bin/env
CLICOLOR_FORCE                       1
DISPLAY                              :0
EDITOR                               vi
GIT_ASKPASS                          rpostback-askpass
HOME                                 /home/rstudio-user
LANG                                 C.UTF-8
LD_LIBRARY_PATH                      /opt/R/3.5.2/lib/R/lib::/lib:/usr/local/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server
LD_PRELOAD                           /lib/x86_64-linux-gnu/libSegFault.so
LN_S                                 ln -s
LOGNAME                              rstudio-user
MAIL                                 /var/mail/rstudio-user
MAKE                                 make
PAGER                                /usr/bin/pager
PATH                                 /cloud/project/r-tensorflow/bin:/cloud/project/r-tensorflow/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/opt/R/3.5.2/lib/R/bin
PWD                                  /
R_BROWSER                            xdg-open
R_BZIPCMD                            /bin/bzip2
R_DOC_DIR                            /opt/R/3.5.2/lib/R/doc
R_ENVIRON                            /etc/R/Renviron.site
R_GZIPCMD                            /bin/gzip
R_HOME                               /opt/R/3.5.2/lib/R
R_INCLUDE_DIR                        /opt/R/3.5.2/lib/R/include
R_LIBS_SITE                          
R_LIBS_USER                          ~/R/x86_64-pc-linux-gnu-library/3.5
R_PAPERSIZE                          letter
R_PDFVIEWER                          /usr/bin/xdg-open
R_PLATFORM                           x86_64-pc-linux-gnu
R_PRINTCMD                           /usr/bin/lpr
R_PROFILE                            /etc/R/Rprofile.site
R_RD4PDF                             times,inconsolata,hyper
R_SESSION_INITIALIZED                PID=1683:NAME="reticulate"
R_SESSION_TMPDIR                     /tmp/Rtmp8rfxNG
R_SHARE_DIR                          /opt/R/3.5.2/lib/R/share
R_SYSTEM_ABI                         linux,gcc,gxx,gfortran,?
R_TEXI2DVICMD                        /usr/bin/texi2dvi
R_UNZIPCMD                           /usr/bin/unzip
R_ZIPCMD                             /usr/bin/zip
RETICULATE_REQUIRED_MODULE           tensorflow
RMARKDOWN_MATHJAX_PATH               /usr/lib/rstudio-server/resources/mathjax-26
RS_RPOSTBACK_PATH                    /usr/lib/rstudio-server/bin/rpostback
RSESSION_DIAGNOSTICS_ENABLED         1
RSESSION_DIAGNOSTICS_FILE            /tmp/rsession-diagnostics-rstudio-user.log
RSESSION_PROFILE_OPTIONS             
RSTUDIO                              1
RSTUDIO_CONSOLE_COLOR                256
RSTUDIO_CONSOLE_WIDTH                126
RSTUDIO_DISABLE_PROJECT_SHARING      1
RSTUDIO_DISABLE_SECURE_DOWNLOAD_WARNING
                                     1
RSTUDIO_HTTP_REFERER                 https://keith-erskine.rstudio.cloud/dcad4180e3f94b24b7ed14dce1a0667e/
RSTUDIO_PANDOC                       /usr/lib/rstudio-server/bin/pandoc
RSTUDIO_R_MODULE                     
RSTUDIO_R_PRELAUNCH_SCRIPT           
RSTUDIO_R_VERSION_LABEL              
RSTUDIO_SESSION_STREAM               rstudio-user-d
RSTUDIO_USER_IDENTITY                rstudio-user
RSTUDIO_VERSION                      1.2.1206-2
RSTUDIO_WINUTILS                     bin/winutils
SED                                  /bin/sed
SEGFAULT_OUTPUT_NAME                 /dev/fd/63
SEGFAULT_SIGNALS                     abrt segv
SHLVL                                2
SSH_ASKPASS                          rpostback-askpass
TAR                                  /bin/tar
TERM                                 xterm-256color
USER                                 rstudio-user
VIRTUAL_ENV                          /cloud/project/r-tensorflow
WORKON_HOME                          /cloud/project
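
A possible explanation for the `allow_pickle=False` error reported above, not confirmed for this environment: NumPy 1.16.3 changed the default of `np.load()` to `allow_pickle=False`, which breaks loaders that rely on the old default. The sketch below is a small version check in base R. The helper name `numpy_blocks_pickle()` is made up for illustration, and the commented-out reticulate call assumes a configured Python environment.

```r
# Hypothetical helper: TRUE if the given NumPy version defaults to
# allow_pickle=False in np.load() (the default changed in NumPy 1.16.3).
numpy_blocks_pickle <- function(numpy_version) {
  package_version(numpy_version) >= package_version("1.16.3")
}

numpy_blocks_pickle("1.16.2")  # FALSE
numpy_blocks_pickle("1.16.3")  # TRUE

# In a live session, the installed version could be read via reticulate
# (assumption: reticulate is installed and Python is configured):
# numpy_blocks_pickle(as.character(reticulate::py_config()$numpy$version))
```

If the check returns `TRUE` for the installed version, pinning an older NumPy or updating the TensorFlow/Keras installation are the usual directions to look in. Which one fits rstudio.cloud would need to be confirmed.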

Blog proposal: the text-package

We are proposing a post entitled "The text-package" for the RStudio AI blog.

We have followed the instructions from https://blogs.rstudio.com/ai/contributing.html.

Our proposed blog post can be found at https://github.com/OscarKjell/ai-blog/tree/text_huggingface_in_r/_posts/2022-09-29-r-text
Looking forward to your feedback.
Kind Regards,
Oscar

Narrative differs from code on 2018-06-25-sunspots-lstm for training, testing, and skip periods.

The narrative says a training period of 50 years, a testing period of 10 years, and a skip span of 20 years,
but the code has a training period of 100 years, a testing period of 50 years, and a skip span of 22 years.

Again, thanks for what you do. 👍

broken link in tensorflow-blog/_posts/2018-06-25-sunspots-lstm/

On line 194, the link to "Time Series Analysis Example" should point to

https://tidymodels.github.io/rsample/articles/Applications/Time_Series.html

instead of

https://topepo.github.io/rsample/articles/Applications/Time_Series.html

since the latter is currently broken; I think the former is the new link. Thanks for this EXCELLENT post and keep up the good good good good work. 💯

URL to RStudio AI Blog

Could you add the URL of the RStudio AI Blog to the About section of this repository?

Only trigger deploy to Netlify from main branch

The Netlify step appears to run on every change, regardless of branch, resulting in no-op deployments in Netlify.
Should it have if: github.ref == 'refs/heads/main' as a guard clause like the GitHub deploy?

https://github.com/rstudio/ai-blog/blob/main/.github/workflows/main.yml#L93

Reconcile and generate the report from 2 data sets

I have two data files that I need to reconcile using RStudio. I cannot do this in Excel because the data are too large. My current VB tool is manual and too time-consuming, so I need to build a quicker tool. Can you please advise on the best method and tool to use? I am also open to using Python.

Data1
C1	C2	C3	C4	Total
A	B	C	D	5
A	B	C	D	2
A	B	C	D1	1
A	B1	C	D2	0
A	B2	C	D	0
A	B	C	D	3

Data2
C5	C6	C7	C7.1	C8	Total
A	B	C	Cq	D	5
A	B	C	Cq	D	2
A	B	C	Cq	D1	1
A	B1	C	Cq	D2	0
A	B2	C	Cq	D	0
A	B	C	Cq	D	3

The order of the columns remains the same; however, the order of the records might differ.
We need to reconcile the data between Data1 and Data2.
Currently I have a VB tool that concatenates the data from the required columns into a string and compares the strings.
The report needs to show how many records match, how many mismatch, and, in case of a mismatch, the difference up to 6 decimal places.
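
The key-and-compare approach described above could be sketched in base R roughly as follows. This is a minimal sketch under assumptions the report does not confirm: the columns are taken to correspond as C1~C5, C2~C6, C3~C7, C4~C8 (with C7.1 ignored), and repeated keys are summed before comparing. The data frames reuse the sample rows from this report; real files would be read with `read.csv()` or similar.

```r
# Sample data copied from the report.
data1 <- data.frame(
  C1 = "A",
  C2 = c("B", "B", "B", "B1", "B2", "B"),
  C3 = "C",
  C4 = c("D", "D", "D1", "D2", "D", "D"),
  Total = c(5, 2, 1, 0, 0, 3)
)
data2 <- data.frame(
  C5 = "A",
  C6 = c("B", "B", "B", "B1", "B2", "B"),
  C7 = "C",
  C7.1 = "Cq",
  C8 = c("D", "D", "D1", "D2", "D", "D"),
  Total = c(5, 2, 1, 0, 0, 3)
)

# Composite key from the columns assumed to correspond (C7.1 is left out).
key1 <- paste(data1$C1, data1$C2, data1$C3, data1$C4, sep = "|")
key2 <- paste(data2$C5, data2$C6, data2$C7, data2$C8, sep = "|")

# Sum totals per key so repeated keys do not multiply rows in the join.
sum1 <- aggregate(list(total1 = data1$Total), by = list(key = key1), FUN = sum)
sum2 <- aggregate(list(total2 = data2$Total), by = list(key = key2), FUN = sum)

# Full outer join: keys present in only one file show up with NA totals.
report <- merge(sum1, sum2, by = "key", all = TRUE)
report$difference <- round(report$total1 - report$total2, 6)
report$status <- ifelse(
  !is.na(report$difference) & report$difference == 0, "match", "mismatch"
)

print(report)
print(table(report$status))
```

Because row order does not affect the key, the comparison is insensitive to how the records are sorted in either file. For very large files, the same steps map onto `data.table` or `dplyr` joins.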

no subscription button/link

I don't see a button/link I could click to subscribe (on Firefox)

<div id="subscribe-caption" style="line-height: 1.2; margin-bottom: 2px; width: 377px; font-size: 15px;">
Enjoy this blog? Get notified of new posts by email:
</div>

Feature Request

Hi, Sigrid.
Readers of the blog report a lot of issues with installing and configuring TensorFlow.
Would it be better if we included reproducible examples at the end of the posts?
In that case, I can help configure Python + TF for recent posts on Google Colab.
What do you think?

What counts as a modern GPU: Post about word embeddings

Hi,

My lab's server has 2 NVIDIA GTX 1080 Ti GPUs, and when I run the code as shown in this post it takes ~3000 s per epoch with the exact same parameters as in the post. I was wondering if you used a different GPU that is much better than the 1080 Ti.

Please let me know if you need me to post the code sample (I literally copy-pasted it to see how long my lab's GPU takes to do this task) or anything else!

Thanks

Some posts require packages to run

We should try to disable chunks that actually run R code; see for instance:

ai-blog/_posts/2020-07-30-state-of-the-art-nlp-models-from-r/state-of-the-art-nlp-models-from-r.Rmd

Lines 255 to 262 in 30a4bea

```{r eval=T,echo=F}
library(dplyr)
res = data.table::fread('files/res.csv') %>%
  filter(rowname %in% 'val_auc') %>% arrange(desc(V2)) %>%
  rename(epoch_1 = V1, epoch_2 = V2, metric = rowname) %>%
  mutate(epoch_1 = round(epoch_1,3),epoch_2 = round(epoch_2,3))
DT::datatable(res, options = list(dom = 't'))
```

Otherwise, when building the blog automatically, this might break or change inadvertently.

Credit risk classification with deep learning

Hi all,

I would like to publish this article on the TensorFlow for R blog.
Here is the link to my GitHub page:

https://dsaada.github.io/CreditRiskTensorFlow/

Thanks,
David
