Feedzai OpenML Provider for R

Implementations of the Feedzai OpenML API that add support for machine learning models written in the R programming language, using Rserve.

Modules

Generic R

The openml-generic-r module contains a provider that allows developers to load R code that conforms to a simple API. This is the most powerful approach, though also the most cumbersome, because models can hold state.

The provider can be pulled from Maven Central:

<dependency>
  <groupId>com.feedzai</groupId>
  <artifactId>openml-generic-r</artifactId>
  <!-- See project tags for latest version -->
  <version>0.4.0</version>
</dependency>

Caret

The implementation in the openml-caret module adds support for models built with Caret.

This module can be pulled from Maven Central:

<dependency>
  <groupId>com.feedzai</groupId>
  <artifactId>openml-caret</artifactId>
  <!-- See project tags for latest version -->
  <version>0.4.0</version>
</dependency>

Building

This is a Maven project, which you can build with:

mvn clean install

Prerequisites for running tests

To use these providers, you need R installed in your environment. After installing R, install the R packages that the provider uses. The easiest way is to install them from CRAN.

Note that this section only describes the known prerequisites that are common to any model generated in R. Before importing a model, you need to ensure that the packages required by that model are also installed.

Finally, you must install Rserve.

Example on CentOS 7:

Execute the following bash commands:

# repo that has R
yum -y install epel-release;

# needed for R dependencies
yum -y install libcurl-devel openssl-devel gsl-devel libwebp-devel librsvg2-devel R;

# start R
R

Execute the following R instructions:

# Install caret
install.packages("caret", dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")

# Install the packages for all classification model implementations
# https://topepo.github.io/caret/available-models.html
# https://github.com/tobigithub/caret-machine-learning/wiki/caret-ml-setup
library(caret)
modNames <- unique(modelLookup()[modelLookup()$forClass,c(1)])
install.packages(modNames, dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")

# Install Rserve (needed for Pulse <-> R communication)
install.packages("Rserve", dependencies=TRUE, repos = "http://cran.radicaldevelop.com/")

Docker

Feedzai provides a Docker image for testing, available on Docker Hub, which this repository's continuous integration uses. See the commands in the Travis CI configuration for how to use it.

feedzai-openml-r's Issues

Possible concurrency problem when classifying

Access to the rConnection is synchronized, which prevents concurrency problems on the connection itself. However, we use a global variable to set up the instance to classify: https://github.com/feedzai/feedzai-openml-r/blob/master/openml-r-common/src/main/java/com/feedzai/openml/r/ClassificationGenericRModel.java#L111

This means that if two threads try to classify an instance at the same time, thread_1 may set up its instance and, in the meantime, thread_2 may set up a different instance, so both threads end up classifying the same instance.

The classify and getClassDistribution methods should probably be synchronized.
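
The proposed fix could look roughly like the sketch below. It is a minimal illustration, not the project's code: the class SynchronizedClassifierSketch, its field and its methods are hypothetical names, and a plain Java field stands in for the global variable that the real model sets on the shared R session. The point it shows is that setting the instance and evaluating it must happen under the same lock, so that no other thread can overwrite the instance between the two steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names come from feedzai-openml-r.
public class SynchronizedClassifierSketch {

    /** Stand-in for the global variable that holds the instance in the shared R session. */
    private double[] sharedInstance;

    /** Both steps run under one lock, so another thread cannot replace the instance in between. */
    public synchronized int classify(final double[] instance) {
        this.sharedInstance = instance;   // step 1: publish the instance to shared state
        return evaluateSharedInstance();  // step 2: classify whatever is in shared state
    }

    /** Toy model (assumption): the first feature is returned as the class index. */
    private int evaluateSharedInstance() {
        return (int) this.sharedInstance[0];
    }

    /** Runs concurrent classifications and counts calls that returned another thread's result. */
    public static int countMismatches(final int threads, final int callsPerThread) {
        final SynchronizedClassifierSketch model = new SynchronizedClassifierSketch();
        final ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            final List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int expectedClass = t;
                final Callable<Integer> task = () -> {
                    int mismatches = 0;
                    for (int i = 0; i < callsPerThread; i++) {
                        if (model.classify(new double[] {expectedClass}) != expectedClass) {
                            mismatches++;
                        }
                    }
                    return mismatches;
                };
                results.add(pool.submit(task));
            }
            int total = 0;
            for (final Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (final InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Concurrent classification failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(final String[] args) {
        System.out.println("mismatches: " + countMismatches(4, 10_000));
    }
}
```

Removing the synchronized keyword from classify in this sketch reintroduces the window described above: one thread can overwrite sharedInstance after another has set it but before that thread has evaluated it. A getClassDistribution method that follows the same two-step pattern would need to hold the same lock.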

Set up CI with Rserve

OpenML tests for R models require Rserve to be running. Therefore, we need to set up an automated environment where it is available (e.g., using Docker).
