Giter Club home page Giter Club logo

contana-data-science-example-r's Introduction

Cortana Analytics Conference Data Science Tutorial Example:

Forecasting with Azure ML and R

Overview

This repository contains R code for a Microsoft Azure Machine Learning experiment used for the Data Science Example at the 2015 Cortana Analytics Conference. The experiment is in the Microsoft Azure ML gallery.

The Azure ML experiment illustrates the basics of building and evaluating a regression forecasting machine learning model, using Azure Machine Learning and R. The goal of this experiment is to forecast the monthly milk production for the State of California.

Data

The data are contained in a .csv file. These data are derived from dairy production information published by the United States Department of Agriculture. The data set contains a time series of dairy production data for several products, along with milk fat pricing, for 128 months.

Code organization

The Azure ML experiment contains R code running in three Execute R Script modules.

The milkutilities.R file contains utility functions used in the Execute R Script modules.

Two columns containing the square and cube of the month count are computed in an Execute R Script module with the code in the transform.R file. These new features are used in a polynomial regression of the time series trend.

The code milkvisualize.R file generates graphics for the visualization and exploration of the data set. Specifically, one can see that the time series has a strong trend. Further, these data exhibit a significant seasonal (monthly) variation.

The splitdata.R file contains final bit of data munging code. This code divides the data into training and test sets. The last 12 months of milk production data are held back to test the forecasting power of our model. Note, the Split module would not work in this case, since it randomly samples the data.

The lmevaluate.R file contains code to compute summary statistics and visualize the model scoring (prediction) results.This information is used to evaluate model performance.

The milklm.R and lmscore.R files contain code to compute an R linear model and predictions (scores) for that model. This code is used for testing in RStudio and to explore feature selection.

contana-data-science-example-r's People

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.