Repository for Big Data and Data Analytics (CMPN451) labs and project in R and Python.

Lab | Description | Tools |
---|---|---|
Lab 01 | Get insights from the Titanic dataset | R |
Lab 02 | Apply MapReduce to the dataset | Hadoop, Java |
Lab 03 | Predictive analysis using Naive Bayes and Decision Tree | R |
Lab 04 | Clustering using K-means | R |
Lab 05 | Market Basket Analysis | R |
Assignment | Data Analytics on a cars dataset | Pandas, scikit-learn, Regression, Matplotlib, Seaborn |
Project | Customer Segmentation and Market Basket Analysis on this dataset | PySpark, Python, Matplotlib, Plotly |