vishalv91
Name: Vishal V
Type: User
Bio: Aspiring Data Scientist
Location: Bengaluru
The objective of the project was to build various models and compare their prediction performance based on accuracy.
Data: Boston Housing Dataset (HousingData.csv)
Programming language(s): R
Tool(s): RStudio
Business problem: To understand the drivers behind the value of houses in Boston and provide data-driven recommendations to the client on how they can increase the value of housing. The Boston housing dataset consisted of 506 observations and 14 variables.
Project challenge(s): MEDV (median value of homes in Boston) was identified as the dependent variable, and the remaining variables were the independent variables. The goal was to find out which of the independent variables were statistically significant in driving house prices (MEDV). The dataset contained missing values and outliers, some of the variables had a skewed distribution, and there was multicollinearity among a few of the independent variables.
Our Approach: Prior to model building, we tidied the dataset by comparing three ways of handling missing values: eliminating the rows that contained them, replacing them with the variable's median, and replacing them with the variable's mean. Of the three, median imputation (replacing missing values with the median) was found to be the best approach. As the dependent variable MEDV was continuous (numerical), we used multiple linear regression to build our model. Additional models were built with decision trees and random forest. On further investigation, we discovered that the dependent variable had a skewed distribution. By log-transforming this variable, we were able to get an approximately normal distribution. After the transformation, the multiple linear regression model with log-transformed MEDV was the best in terms of MSE (mean squared error) and adjusted R^2. All the assumptions of linear regression were met.
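The median-imputation and log-transformation steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the project's actual code: the data frame is synthetic (HousingData.csv is not read), the predictors RM and LSTAT are chosen only for illustration, and the helper names impute_median and mse are hypothetical.

```r
# Synthetic stand-in for HousingData.csv; every value below is made up.
set.seed(42)
n <- 200
housing <- data.frame(
  RM    = rnorm(n, mean = 6.3, sd = 0.7),  # assumed predictor: rooms per dwelling
  LSTAT = runif(n, min = 2, max = 35)      # assumed predictor: % lower-status population
)

# Right-skewed response: log(MEDV) is linear in the predictors plus noise.
housing$MEDV <- exp(2.0 + 0.15 * housing$RM - 0.03 * housing$LSTAT +
                      rnorm(n, sd = 0.15))

# Blank out a few predictor values to mimic missing data.
housing$RM[sample(n, 15)] <- NA

# Median imputation: replace each missing value with the median of the
# observed values in the same column (hypothetical helper).
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}
housing[] <- lapply(housing, impute_median)

# Multiple linear regression on the raw and on the log-transformed response.
fit_raw <- lm(MEDV ~ RM + LSTAT, data = housing)
fit_log <- lm(log(MEDV) ~ RM + LSTAT, data = housing)

# Mean squared error (hypothetical helper), computed on the original MEDV
# scale for both models so the two numbers are comparable.
mse <- function(actual, predicted) mean((actual - predicted)^2)
mse_raw <- mse(housing$MEDV, fitted(fit_raw))
mse_log <- mse(housing$MEDV, exp(fitted(fit_log)))

# Adjusted R^2 as reported by summary(); note each value refers to the
# response scale its own model was fitted on.
adj_r2_raw <- summary(fit_raw)$adj.r.squared
adj_r2_log <- summary(fit_log)$adj.r.squared

print(c(mse_raw = mse_raw, mse_log = mse_log))
print(c(adj_r2_raw = adj_r2_raw, adj_r2_log = adj_r2_log))
```

Because the two models are fitted on different response scales, their adjusted R^2 values are not directly comparable. The sketch therefore back-transforms the log model's fitted values with exp() before computing MSE. Whether the original project compared the models this way is not stated in the description.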
The project concerns an international e-commerce company* based in the USA that wants to discover key insights from its customer database. The company wants to use some of the most advanced machine learning techniques to study its customers.
Cheat Sheets
The objective of the project was to create innovative and interactive Tableau dashboards focusing on potential commodities, countries, years, trade amounts and quantities. The client wanted to launch a new business unit focused on global trade and logistics, mainly in countries such as the USA, Canada and Australia. The dataset provided by the client contained 59090 observations of 10 variables. The client insisted that the data be cleaned using Excel or R. The dataset contained missing values and was cleaned using the R programming language. Tableau dashboards were then created from the cleaned dataset.