This GitHub repository collects the projects from the course INDE 577: Data Science and Machine Learning at Rice University.
This data science course focuses on machine learning, covering algorithms from both supervised and unsupervised learning. Its key feature is that, instead of relying on well-established packages, we build the algorithms from scratch. In the process, we learn the details of each algorithm, including basic concepts, computational steps, model design, and evaluation metrics. The course is taught in Python, version 3.6 or higher.
The course is taught by Dr. Randy Davila, Assistant Professor of Data Science at the University of Houston-Downtown.
Textbook
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition, by Aurélien Géron
The following topics are included:
Supervised Learning
- K Nearest Neighbors (KNN)
- Gradient Descent
- Linear Regression
- Logistic Regression
- Perceptron
- Neural Network
- Decision Tree
- Ensemble Learning and Random Forest
- Support Vector Machines (SVMs)
Unsupervised Learning
Reinforcement Learning
Each sub-repository in this repository is named after the algorithm it covers. A sub-repository contains two Jupyter notebook files that introduce and implement the algorithm and, where needed, Data and Image folders holding the datasets or images used in that sub-repository.
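To give a sense of what building an algorithm "from scratch" can look like, here is a minimal sketch of K Nearest Neighbors in plain Python, without scikit-learn. It is not taken from the course notebooks: the function name `knn_predict`, the toy data, and the choice of Euclidean distance with majority voting are assumptions made purely for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; not the implementation used in the notebooks.
    """
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of training points")

    # Euclidean distance from the query to every training point, paired with its label.
    distances = []
    for point, label in zip(train_points, train_labels):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))
        distances.append((dist, label))

    # Keep the labels of the k closest points and take the most common one.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data (made up): two well-separated clusters labeled "a" and "b".
    points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (1.2, 1.3), k=3))
    print(knn_predict(points, labels, (6.8, 7.0), k=3))
```

The sketch uses only the standard library and avoids `math.dist`, which requires Python 3.8, so it runs on Python 3.6 and higher. A fuller implementation would typically also handle ties between labels and vectorize the distance computation.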
Acknowledgement
I would like to thank Dr. Davila for his helpful and informative lectures during the Fall 2021 semester.
Machine learning is an important tool in modern data science practice. It can extract key insights from huge datasets that would otherwise take enormous manual effort to analyze.
There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. You can check each sub-repository in this repository to learn more.