
rickfarmer / data-science-vm


A Big Data analytics VM for data science. This project automates the creation of a data scientist's toolbox on a virtual machine (VM), so that within a few minutes you can work in a fully configured data science lab instead of performing the complex installation and configuration that a working development environment requires. The VM comes with R, Git, Python, Cloudera, Hadoop, YARN, MRv2, Mahout, MongoDB, Spark, Neo4j, and more pre-installed. It is built automatically as a single CentOS VM on VMware Fusion, using Vagrant with Chef and shell scripts.

License: GNU General Public License v2.0

Ruby 85.18% HTML 8.02% Shell 6.80%

data-science-vm's Introduction

Data Science VM

The following Vagrant plugins (distributed as gems) need to be installed:

vagrant plugin install vagrant-omnibus
vagrant plugin install vagrant-env
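The plugin step can be made repeatable with a small wrapper. This is a minimal sketch, not part of the project: the `missing_plugins` helper and the `REQUIRED_PLUGINS` variable are hypothetical names, and the sketch assumes `vagrant plugin list` prints one installed plugin per line in the form `<name> (<version>, ...)`.

```shell
#!/bin/sh
# Plugins named in this README.
REQUIRED_PLUGINS="vagrant-omnibus vagrant-env"

# Hypothetical helper: reads `vagrant plugin list` output on stdin and
# prints each required plugin that does not appear in it, one per line.
missing_plugins() {
  installed="$(cat)"
  for plugin in $REQUIRED_PLUGINS; do
    # Assumption: an installed plugin is listed as "<name> (<version>...)".
    if ! printf '%s\n' "$installed" | grep -q "^${plugin} "; then
      printf '%s\n' "$plugin"
    fi
  done
}

# Install only what is missing; skipped entirely when vagrant is not on PATH.
if command -v vagrant >/dev/null 2>&1; then
  for plugin in $(vagrant plugin list | missing_plugins); do
    vagrant plugin install "$plugin"
  done
fi
```

Running the script a second time should install nothing, since both plugins then appear in the list.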

Users

root/vagrant
joe/joe
chuck/chuck
cloudera/cloudera

Hive Embedded DB

PostgreSQL
Host: e63:7432
DB name: hive
Username: hive
Password: 8xlpmpA6NE
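To inspect the Hive metastore database directly, the settings above can be assembled into a connection URI. This is a sketch under assumptions: the `hive_db_uri` helper and the `HIVE_DB_*` variables are hypothetical names, the `psql` client is assumed to be installed, and the host name `e63` is assumed to resolve from wherever the command runs.

```shell
#!/bin/sh
# Values copied from the README; the password is deliberately not embedded.
HIVE_DB_HOST="e63"
HIVE_DB_PORT="7432"
HIVE_DB_NAME="hive"
HIVE_DB_USER="hive"

# Hypothetical helper: prints a libpq-style connection URI for the
# metastore database.
hive_db_uri() {
  printf 'postgresql://%s@%s:%s/%s\n' \
    "$HIVE_DB_USER" "$HIVE_DB_HOST" "$HIVE_DB_PORT" "$HIVE_DB_NAME"
}

# Example (not run here): list the metastore tables, supplying the
# password through the PGPASSWORD environment variable.
#   PGPASSWORD='<password listed above>' psql "$(hive_db_uri)" -c '\dt'
```

Passing the password through the environment keeps it out of the URI and out of the shell history for the `psql` call itself.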

Hue

http://e63:8888/
hdfs/hdfs

Test the Cluster

sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
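In the command above, the two numbers passed to the `pi` example are the number of map tasks and the number of sample points per map, so `pi 10 100` evaluates 1000 points in total. The sketch below parameterizes the smoke test; `pi_job_cmd` and `pi_total_samples` are hypothetical helper names, and the jar path is the one given in this README.

```shell
#!/bin/sh
# Path taken from the README's command.
EXAMPLES_JAR="/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar"

# Hypothetical helper: prints the smoke-test command for a given number of
# map tasks ($1, default 10) and samples per map ($2, default 100).
pi_job_cmd() {
  maps="${1:-10}"
  samples="${2:-100}"
  printf 'sudo -u hdfs hadoop jar %s pi %s %s\n' "$EXAMPLES_JAR" "$maps" "$samples"
}

# Hypothetical helper: total sample points = maps * samples per map.
pi_total_samples() {
  echo $(( ${1:-10} * ${2:-100} ))
}

# Example (not run here): a heavier run with 20 maps of 1000 samples each.
#   eval "$(pi_job_cmd 20 1000)"
```

More samples give a closer estimate of pi at the cost of a longer job, which makes the larger settings useful for checking that YARN schedules several map tasks.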

data-science-vm's People

Contributors

rickfarmer

