Hi there. I've been checking out your package and it's very nice. I was wondering if you have an associated publication discussing the library / methods? Also, is there any other documentation beyond the R package docs and the vignette linked from the main page? Just curious.


Confusion about bias in LaplacianSVM

I am trying to figure out how LapSVM works by reading code, and your package has helped me a lot.
I am a little confused about how you choose the bias. Could you explain it?
Many thanks.

Is 'Class Mass Normalisation' correctly specified in GRFClassifier()?

Hi @jkrijthe, thanks for putting this package together, it's helped me learn a lot about semi-supervised learning 😃

At line 49 in the GRFClassifier() code there is a switch to assign class labels based on either the raw values from the harmonic function or the 'class mass normalised' values, but it appears the options are the wrong way around.

For example, should this

if (class_mass_normalization) {
  class_ind <- which_rowMax(responsibilities)
} else {
  class_ind <- which_rowMax(harmonic_f$fu_cmn)
}

look like this instead?

if (class_mass_normalization) {
  class_ind <- which_rowMax(harmonic_f$fu_cmn)
} else {
  class_ind <- which_rowMax(responsibilities)
}
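
To make the difference between the two branches concrete, here is a minimal base R sketch of class mass normalisation on made-up numbers. It is not the RSSL implementation: the matrix `fu`, the vector `prior`, and the use of `max.col()` in place of `which_rowMax()` are assumptions for illustration only.

```r
# Toy harmonic-function output: rows are unlabeled objects, columns are classes.
# These numbers are made up for illustration.
fu <- matrix(c(0.60, 0.40,
               0.55, 0.45,
               0.70, 0.30,
               0.52, 0.48),
             ncol = 2, byrow = TRUE)

# Assumed class priors, e.g. estimated from the labeled objects.
prior <- c(0.5, 0.5)

# Class mass normalisation: rescale each column so that its total mass
# equals the corresponding class prior.
fu_cmn <- sweep(fu, 2, prior / colSums(fu), `*`)

# Assign labels by row-wise maximum, without and with normalisation.
labels_raw <- max.col(fu, ties.method = "first")
labels_cmn <- max.col(fu_cmn, ties.method = "first")

print(labels_raw)  # every object is assigned to class 1
print(labels_cmn)  # the normalised scores split the objects over both classes
```

On this toy input the raw scores put every object in class 1, while the normalised scores do not, so swapping the two branches would visibly change the predictions.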

The current version of the package does not work for the example in README.Rmd

In a clean session in R, I got this result after doing install_github("jkrijthe/RSSL").

> library(dplyr,warn.conflicts = FALSE)
> library(ggplot2,warn.conflicts = FALSE)
Warning message:
package ‘ggplot2’ was built under R version 3.4.4 
> library(RSSL)
> set.seed(2)
> df <- generate2ClassGaussian(200, d=2, var = 0.2, expected=TRUE)
> # Randomly remove labels
> df <- df %>% add_missinglabels_mar(Class~.,prob=0.98) 
> # Train classifier
> g_nm <- NearestMeanClassifier(Class~.,df,prior=matrix(0.5,2))
Error in factor_to_dummy_cpp(na.omit(y), length(levs)) : 
  object '_RSSL_factor_to_dummy_cpp' not found
> g_self <- SelfLearning(Class~.,df,
+                        method=NearestMeanClassifier,
+                        prior=matrix(0.5,2))
Error in factor_to_dummy_cpp(na.omit(y), length(levs)) : 
  object '_RSSL_factor_to_dummy_cpp' not found
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RSSL_0.7      ggplot2_2.2.1 dplyr_0.7.4  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16      cluster_2.0.6     bindr_0.1         magrittr_1.5      MASS_7.3-47       munsell_0.4.3    
 [7] colorspace_1.3-2  lattice_0.20-35   R6_2.2.2          rlang_0.2.0.9001  quadprog_1.5-5    stringr_1.3.0    
[13] plyr_1.8.4        tools_3.4.3       grid_3.4.3        gtable_0.2.0      yaml_2.1.18       lazyeval_0.2.1   
[19] assertthat_0.2.0  tibble_1.4.2      Matrix_1.2-12     bindrcpp_0.2      kernlab_0.9-26    reshape2_1.4.3   
[25] glue_1.2.0        stringi_1.1.6     compiler_3.4.3    pillar_1.2.0      scales_0.5.0.9000 pkgconfig_2.0.1  

install failure

Interested in checking out your library. It's failing to install though:

> library(devtools)
> install_github("jkrijthe/RSSL")
Downloading github repo jkrijthe/RSSL@master
Installing RSSL
'/usr/lib64/R/bin/R' --vanilla CMD INSTALL  \
  '/tmp/RtmptFh612/devtoolsc8663127fe66/jkrijthe-RSSL-e0738eb'  \
  --library='/home/me/R/x86_64-redhat-linux-gnu-library/3.1' --install-tests 

* installing *source* package ‘RSSL’ ...
** libs
make: Nothing to be done for `all'.
installing to /home/me/R/x86_64-redhat-linux-gnu-library/3.1/RSSL/libs
** R
** tests
** preparing package for lazy loading
Creating a new generic function for ‘predict’ in package ‘RSSL’
Creating a new generic function for ‘plot’ in package ‘RSSL’
Error in library(klaR) : there is no package called ‘klaR’
Error : unable to load R code in package ‘RSSL’
ERROR: lazy loading failed for package ‘RSSL’
* removing ‘/home/me/R/x86_64-redhat-linux-gnu-library/3.1/RSSL’
Error: Command failed (1)

Question about category variables


I'm new to this package and have 2 questions related to it:

  1. Is there any function in this package that can deal with categorical variables? I tried some, but all gave an error: Error in storage.mode(from) <- "double" : (list) object cannot be coerced to type 'double'

  2. Is there any function that can deal with more than 2 classes? I notice that all the sample data used in the examples have only 2 classes.
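
For the first question, one possible workaround (a sketch only, not something confirmed by the package) is to expand the categorical columns into numeric 0/1 dummy columns with base R's `model.matrix()` before passing the data on. The data frame `df` and its column names are made up, and whether a given RSSL classifier accepts the resulting matrix is an assumption.

```r
# Toy data frame with one numeric and one categorical feature (made-up values).
df <- data.frame(
  size   = c(1.2, 3.4, 2.2, 0.7),
  colour = factor(c("red", "blue", "red", "green"))
)

# model.matrix() expands the factor into 0/1 dummy columns;
# "- 1" drops the intercept column so only feature columns remain.
X <- model.matrix(~ . - 1, data = df)

print(X)
str(X)  # a plain numeric matrix, one row per object
```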


Problem with EMLinearDiscriminantClassifier


I guess that there is a problem with the function EMLinearDiscriminantClassifier. Using it, I get
Error in while (abs(logmarginal - logmarginal_old) > eps) { : missing value where TRUE/FALSE needed
which I think is due to the following code chunk:

    logmarginal <- losspart(g_iteration,Xe,Ye) #losslogsum(g_iteration,X,Y,X_u,responsibilities)
#     print(logmarginal)
    while (abs(logmarginal-logmarginal_old) > eps)

Am I right?
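
This error message is what R reports when the condition of a `while` loop evaluates to `NA`, which happens if `logmarginal` is `NA` or `NaN`. The base R sketch below reproduces that behaviour in isolation. The values are made up, and `has_converged()` is a hypothetical guard, not a function from RSSL.

```r
# A non-finite value in the convergence quantity makes the comparison NA.
logmarginal     <- NaN
logmarginal_old <- -Inf
eps             <- 1e-4

cond <- abs(logmarginal - logmarginal_old) > eps
print(is.na(cond))  # TRUE

# Using an NA condition in while() raises the error from the report.
msg <- tryCatch({
  while (cond) {}
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical guard: fail early with a clearer message instead.
has_converged <- function(new, old, eps) {
  if (!is.finite(new)) {
    stop("log marginal is not finite; check the input data")
  }
  abs(new - old) <= eps
}
```

If this is what is happening, the line to investigate is the one that computes `logmarginal`, not the loop condition itself.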

Installation Issue:

Hi jkrijthe,

I got the following message when installing RSSL through GitHub. Do you have any clue how to solve this issue? Thanks in advance.

Downloading GitHub repo jkrijthe/RSSL@master
Installing RSSL
Skipping 9 packages not available: dplyr, ggplot2, kernlab, quadprog, Rcpp, RcppArmadillo, reshape2, scales, tidyr
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ --no-save --no-restore CMD INSTALL
--library='/Library/Frameworks/R.framework/Versions/3.2/Resources/library' --install-tests

* installing *source* package ‘RSSL’ ...
** libs
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/Rcpp/include" -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/RcppArmadillo/include" -fPIC -Wall -mtune=core2 -g -O2 -c GRFClassifier.cpp -o GRFClassifier.o
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/Rcpp/include" -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/RcppArmadillo/include" -fPIC -Wall -mtune=core2 -g -O2 -c RcppExports.cpp -o RcppExports.o
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/Rcpp/include" -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/RcppArmadillo/include" -fPIC -Wall -mtune=core2 -g -O2 -c ssl.cpp -o ssl.o
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/Rcpp/include" -I"/Library/Frameworks/R.framework/Versions/3.2/Resources/library/RcppArmadillo/include" -fPIC -Wall -mtune=core2 -g -O2 -c svmlin_rcpp.cpp -o svmlin_rcpp.o
clang++ -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o GRFClassifier.o RcppExports.o ssl.o svmlin_rcpp.o -L/Library/Frameworks/R.framework/Resources/lib -lRlapack -L/Library/Frameworks/R.framework/Resources/lib -lRblas -L/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2 -lgfortran -lquadmath -lm -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: directory not found for option '-L/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [] Error 1
ERROR: compilation failed for package ‘RSSL’

* removing ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library/RSSL’
ERROR: Command failed (1)

Can't install this package from GitHub.

Error: Failed to install 'RSSL' from GitHub:
(converted from warning) installation of package ‘/var/folders/v1/h75nz5l95c3gx54wjtjwctzw0000gn/T//Rtmpn1XptP/filedef6ecb6e02/RSSL_0.9.1.tar.gz’ had non-zero exit status

  1. install_github("jkrijthe/RSSL")
  2. pkgbuild::with_build_tools({
    . ellipsis::check_dots_used(action = getOption("devtools.ellipsis_action",
    . rlang::warn))
    . {
    . remotes <- lapply(repo, github_remote, ref = ref, subdir = subdir,
    . auth_token = auth_token, host = host)
    . install_remotes(remotes, auth_token = auth_token, host = host,
    . dependencies = dependencies, upgrade = upgrade, force = force,
    . quiet = quiet, build = build, build_opts = build_opts,
    . build_manual = build_manual, build_vignettes = build_vignettes,
    . repos = repos, type = type, ...)
    . }
    . }, required = FALSE)
  3. install_remotes(remotes, auth_token = auth_token, host = host,
    . dependencies = dependencies, upgrade = upgrade, force = force,
    . quiet = quiet, build = build, build_opts = build_opts, build_manual = build_manual,
    . build_vignettes = build_vignettes, repos = repos, type = type,
    . ...)
  4. tryCatch(res[[i]] <- install_remote(remotes[[i]], ...), error = function(e) {
    . stop(remote_install_error(remotes[[i]], e))
    . })
  5. tryCatchList(expr, classes, parentenv, handlers)
  6. tryCatchOne(expr, names, parentenv, handlers[[1L]])
  7. value[3L]

Confusion about ?LearningCurveSSL

I am not sure I understand this correctly, but in the sentence below, should type="unlabeled" be changed to type="fraction"? Please correct me if I am wrong. Thanks!

If type="unlabeled" the total number of objects remains fixed, while the fraction of labeled objects is changed. frac sets the fractions of labeled objects that should be considered

SSL for Regression

Hi, thanks for developing such a great package!
Just wanted to check: does SSL support regression problems?
I have found few packages in R or Python that can be used for regression. Perhaps SSL is more suitable for classification problems?

question about evaluating the performance

Hello jkrijthe,
I have some questions.

  1. When I use 'CrossValidationSSL' to evaluate the performance, can I only use the labeled data to classify and run the 'CrossValidationSSL' function?
  2. When I try the examples for 'EMLeastSquaresClassifier', how can I plot the results when my dataset has many columns? (The examples only have 2 columns, X1 and X2.)
  3. If I want to use both labeled and unlabeled datasets to classify, which functions should I try, and how can I evaluate the performance?

Thanks in advance.
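
For the second question, one common option (a sketch only, not a feature of the package) is to project the features onto their first two principal components with base R's `prcomp()` and plot that projection. The data below are random and purely illustrative. The plot shows the data, not the classifier's decision boundary in the original feature space.

```r
# Toy data with five feature columns (random values, for illustration only).
set.seed(1)
X <- matrix(rnorm(100 * 5), ncol = 5)
y <- factor(sample(c("a", "b"), 100, replace = TRUE))

# Project onto the first two principal components.
pca    <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]

# Scatter plot of the projection, coloured by class.
plot(scores, col = as.integer(y), pch = 19, xlab = "PC1", ylab = "PC2")
```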

