Comments (7)

DavisVaughan commented on September 26, 2024

This article is referenced in one sentence at the bottom of the docs for step_embed()! https://keras.rstudio.com/articles/faq.html#how-can-i-obtain-reproducible-results-using-keras-during-development

from embed.

topepo commented on September 26, 2024

Can you provide a reprex where you run the same recipe more than once? Try setting the seed before prepping the recipe. The underlying code should set the TF seed(s) based on R's random number stream.

ciberger commented on September 26, 2024

Thanks for your quick replies, guys!

Basically, each time I rerun the embeddings I get a new session seed number, and there seems to be no way to modify the seeds parameter of the tf_coefs2 function.

On the other hand, using use_session_with_seed(...) before calling the step_embed function does not work either.

Test 1

library(vcd)
data(Arthritis)

require(recipes)
require(embed)
#> Loading required package: embed
recipe.obj <- recipe(formula("Improved ~ Treatment + Sex + Age"), data = Arthritis) %>%
  step_embed(
    all_nominal(), -all_outcomes(),
    predictors = vars(all_numeric()),
    outcome = vars(Improved),
    num_terms = 2,
    hidden_units = 0,
    options = embed_control(
      epochs = 400, validation_split = .2
    )
  ) %>%
  prep(training = Arthritis)
#> Set session seed to 1695 (disabled GPU, CPU parallelism)

Created on 2019-01-20 by the reprex package (v0.2.1)

Test 2

library(vcd)
data(Arthritis)

require(recipes)
require(embed)
#> Loading required package: embed
recipe.obj <- recipe(formula("Improved ~ Treatment + Sex + Age"), data = Arthritis) %>%
  step_embed(
    all_nominal(), -all_outcomes(),
    predictors = vars(all_numeric()),
    outcome = vars(Improved),
    num_terms = 2,
    hidden_units = 0,
    options = embed_control(
      epochs = 400, validation_split = .2
    )
  ) %>%
  prep(training = Arthritis)
#> Set session seed to 5576 (disabled GPU, CPU parallelism)

Created on 2019-01-20 by the reprex package (v0.2.1)

skeydan commented on September 26, 2024

To me it seems that when

tf_coefs2 <- function(x, y, z, opt, num, lab, h, seeds = sample.int(10000, 4), ...) {

gets called, you get different seeds vectors in fresh R sessions, as well as when calling the function several times in a row:

# restart R
sample.int(10000, 4)
sample.int(10000, 4)
## restart R
sample.int(10000, 4)
sample.int(10000, 4)
> # restart R
> sample.int(10000, 4)
[1] 8528 4771 4492 7039
> sample.int(10000, 4)
[1] 2927 7478  803 7227

Restarting R session...

> # restart R
> sample.int(10000, 4)
[1] 4363 6519 7028 6792
> sample.int(10000, 4)
[1] 7621 7222 9856 2809
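
The behaviour above can be sketched in base R without embed or TensorFlow. This is only an illustration: draw_seeds() is a hypothetical stand-in, not a function from the package. It copies only the `seeds = sample.int(10000, 4)` default from the signature quoted above. R evaluates a default argument when the function is called, so each call draws from wherever the RNG stream currently is.

```r
# Hypothetical stand-in (not part of embed): only the default argument
# mirrors the tf_coefs2 signature quoted above.
draw_seeds <- function(seeds = sample.int(10000, 4)) {
  # The default is evaluated lazily, at call time, from the current RNG state.
  seeds
}

first_call  <- draw_seeds()
second_call <- draw_seeds()

# Each call consumes more of the RNG stream, so the two vectors
# will generally differ unless the seed is reset in between.
print(first_call)
print(second_call)
```

Passing `seeds` explicitly, e.g. `draw_seeds(seeds = 1:4)`, bypasses the random draw in this sketch.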

topepo commented on September 26, 2024

If you have never set the seed, you should get different results.

Since the recipe steps pull the tensorflow seeds from R's random numbers, you only need to set R's seed to get the same TF seeds:

library(embed)
#> Loading required package: recipes
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> 
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#> 
#>     step
data(okc)

set.seed(34523)
take_1 <- 
  recipe(Class ~ age + location, data = okc) %>%
  step_embed(location, outcome = vars(Class),
             options = embed_control(epochs = 10)) %>% 
  prep(training = okc) %>% 
  tidy(number = 1)
#> Set session seed to 5504 (disabled GPU, CPU parallelism)

set.seed(34523)
take_2 <- 
  recipe(Class ~ age + location, data = okc) %>%
  step_embed(location, outcome = vars(Class),
             options = embed_control(epochs = 10)) %>% 
  prep(training = okc) %>% 
  tidy(number = 1)
#> Set session seed to 5504 (disabled GPU, CPU parallelism)

all.equal(take_1, take_2)
#> [1] TRUE

Created on 2019-01-21 by the reprex package (v0.2.1)
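
The same idea can be checked in base R alone, without fitting a model. This is a minimal sketch assuming only that the seeds are drawn with sample.int(10000, 4), as in the signature quoted earlier in the thread. The seed value is reused from the reprex above and is otherwise arbitrary.

```r
# Resetting R's seed before each draw rewinds the RNG stream,
# so the same four integers come out both times.
set.seed(34523)
seeds_1 <- sample.int(10000, 4)

set.seed(34523)
seeds_2 <- sample.int(10000, 4)

identical(seeds_1, seeds_2)
#> [1] TRUE
```

Calling set.seed() once and then drawing twice would not give matching vectors, because the second draw continues the stream rather than restarting it.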

skeydan commented on September 26, 2024

That makes sense!

github-actions commented on September 26, 2024

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
