UCIMLRepo.jl

UCI Machine Learning Repository

A Julia package for UCI ML repositories

The UC Irvine Machine Learning Repository is one of the most popular collections of datasets that are available for free.

This package provides functions that let the user download a dataset from the website directly into a DataFrame.

Additionally, another function allows the user to view the accompanying metadata about the dataset.

Installation

julia> Pkg.clone("git://github.com/siddhantjain/UCIMLRepo.jl.git")

Note: Errors have been reported when running this package on Windows. This section will be updated as those errors are resolved.

Exported Functions

Three functions are available

1. ucirepodata("DataSetName")
2. ucirepoinfo("DataSetName")
3. ucirepolist()

Basic Examples

Obtain a DataFrame with the entire iris data set

using UCIMLRepo
df = ucirepodata("iris") 

Alternatively, you may pass the exact URL of the dataset to be loaded. To do so, set the optional second argument to false.

using UCIMLRepo
df = ucirepodata("http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data",false) 
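The name-or-URL switch can be pictured with a small sketch. The helper below is hypothetical and is not part of UCIMLRepo.jl; the names `resolve_data_url`, `isname`, and `UCI_BASE` are assumptions made for illustration, and the `<base>/<name>/<name>.data` layout is inferred only from the iris URL above, so it may not hold for other datasets.

```julia
# Hypothetical helper, not the package's actual implementation: it only
# illustrates how a Bool flag can switch between "dataset name" and "full URL".
const UCI_BASE = "http://archive.ics.uci.edu/ml/machine-learning-databases"

function resolve_data_url(source::AbstractString, isname::Bool=true)
    # When isname is false, the caller has already supplied a full URL.
    isname || return String(source)
    # Otherwise assume the <base>/<name>/<name>.data layout seen for iris.
    return string(UCI_BASE, "/", source, "/", source, ".data")
end

# Under the assumed layout, both calls resolve to the same iris URL.
println(resolve_data_url("iris"))
println(resolve_data_url(string(UCI_BASE, "/iris/iris.data"), false))
```

Defaulting the flag to true keeps the short form, a bare dataset name, as the common case, while a full URL has to be requested explicitly.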

Fetching information on the dataset

Print all the relevant information about the dataset to STDOUT

using UCIMLRepo
ucirepoinfo("iris") 

As before, the exact URL may be given to fetch the information on the dataset

using UCIMLRepo
ucirepoinfo("http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.names", false)

Fetching list of all datasets and default task

The package can also list all the datasets that are available in the UCI ML repository, along with each dataset's default task. To this end, a simple function can be used as follows

using UCIMLRepo
ucirepolist()

TO DO

  • Add functionality to parse the output from ucirepoinfo and automatically name the attributes in the DataFrame

  • Add functionality to have a separate datatype for each attribute in the dataset based on the output from ucirepoinfo

  • Better error handling routines

  • Allow the user to enter the URL of the dataset

  • Improve speed of ucirepolist

Contributors

henri-gerard, jiahao, siddhantjain
