Introduction
This lesson summarizes the topics we'll be covering in section 34 and why they'll be important to you as a data scientist.
Objectives
You will be able to:
- Understand and explain what is covered in this section
- Understand and explain why the section will help you to become a data scientist
Dimensionality Reduction with PCA
Unsupervised Learning
In this section, we start by introducing the concept of unsupervised learning and give an overview of the strengths and weaknesses of unsupervised machine learning techniques.
The Curse of Dimensionality
We then highlight one of the biggest challenges you face as a data scientist: you are often given data with so many dimensions that modeling it directly is computationally impractical.
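To get a rough feel for why high-dimensional data becomes impractical, the sketch below counts how many cells are needed to cover a feature space when every axis is split into the same number of bins. The function names `grid_cells` and `cell_coverage` are hypothetical helpers made up for this illustration, and the uniform grid is a simplifying assumption, not a model of any particular data set.

```python
def grid_cells(bins_per_axis: int, n_dims: int) -> int:
    """Number of cells when each of n_dims axes is split into bins_per_axis bins."""
    return bins_per_axis ** n_dims


def cell_coverage(n_samples: int, bins_per_axis: int, n_dims: int) -> float:
    """Upper bound on the fraction of cells that can hold at least one sample."""
    return min(1.0, n_samples / grid_cells(bins_per_axis, n_dims))


if __name__ == "__main__":
    # Illustrative setup: one million samples, 10 bins per axis.
    for n_dims in (1, 2, 5, 10, 20):
        cells = grid_cells(10, n_dims)
        coverage = cell_coverage(1_000_000, 10, n_dims)
        print(f"dims={n_dims:2d}  cells={cells:.3e}  max coverage={coverage:.3e}")
```

The cell count grows exponentially with the number of dimensions, so a fixed number of samples covers a vanishing share of the space as dimensions are added.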
Principal Component Analysis (PCA)
We then spend the rest of the section learning about PCA, one of the most common techniques for combining dimensions in a way that minimizes the information lost when reducing dimensionality.
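As a preview of the idea, here is a minimal sketch that finds the first principal component of a small data set and projects the data onto it. It uses power iteration on the covariance matrix, which is a stand-in chosen to keep the example dependency-free; it is not necessarily the method this section teaches, and libraries typically compute PCA differently. All function names and the toy data are assumptions made for illustration.

```python
import math


def center(rows):
    """Subtract the per-column mean so every feature has mean zero."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1)
             for j in range(dims)] for i in range(dims)]


def first_component(cov, iterations=200):
    """Approximate the top eigenvector of cov by power iteration.

    Caveat: this fails if the start vector is orthogonal to the top eigenvector.
    """
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(centered, direction):
    """Reduce each row to a single score along the given direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in centered]


def explained_ratio(cov, direction):
    """Share of total variance captured along a unit-length direction."""
    dims = len(cov)
    captured = sum(direction[i] * cov[i][j] * direction[j]
                   for i in range(dims) for j in range(dims))
    return captured / sum(cov[i][i] for i in range(dims))


if __name__ == "__main__":
    # Toy data lying exactly on the line y = 2x (hypothetical values).
    data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
    centered = center(data)
    cov = covariance(centered)
    pc1 = first_component(cov)
    print("first component:", pc1)
    print("1-D scores:", project(centered, pc1))
    print("variance explained:", explained_ratio(cov, pc1))
```

Because the toy points lie on a single line, one component can capture essentially all of the variance, so two columns collapse to one with almost nothing lost. Real data is rarely this clean, which is why the share of variance explained is worth checking.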
Summary
The curse of dimensionality is an issue you'll deal with on a regular basis, and dimensionality reduction techniques like PCA are a common tool for making your data set more tractable while losing as little information as possible.