udacity-001-aws-ml-foundations's Introduction

AWS Machine Learning Foundations Course

Lesson 2: Software Engineering Practices Part 1

  • "using vectorized operations and more efficient data structures can optimize your code" - what are vectorized operations?

Lesson 3: Software Engineering Practices Part 2

  • TDD: Test Driven Development: write tests before the code

  • unit tests: a test that covers a small unit of code

  • install pytest:

pip install -U pytest
  • Your test files need to start with "test_" as in test_nearest.py (pytest also picks up files ending in _test.py), and each test function must start with "test" as in def test_nearest_square_5():
  • You then just type pytest in the terminal to run the tests
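As a minimal sketch of what such a test file could look like: the notes only name the file and one test, so the nearest_square function below is an assumption (here it returns the largest perfect square less than or equal to its input). In a real project it would normally live in its own module and be imported by the test file.

```python
# test_nearest.py: hypothetical example. nearest_square is an assumed helper,
# defined inline so that this file is self-contained.

def nearest_square(num):
    """Return the largest perfect square that is <= num (0 for negative input)."""
    if num < 0:
        return 0
    root = 0
    while (root + 1) ** 2 <= num:
        root += 1
    return root ** 2


def test_nearest_square_5():
    assert nearest_square(5) == 4


def test_nearest_square_9():
    assert nearest_square(9) == 9


def test_nearest_square_negative():
    assert nearest_square(-12) == 0
```

Running pytest in the folder that holds this file should collect the three test functions by their "test" prefix.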

Questions to Ask Yourself When Conducting a Code Review

Is the code clean and modular?

  • Can I understand the code easily?
  • Does it use meaningful names and whitespace?
  • Is there duplicated code?
  • Can you provide another layer of abstraction?
  • Is each function and module necessary?
  • Is each function or module too long?

Is the code efficient?

  • Are there loops or other steps we can vectorize?
  • Can we use better data structures to optimize any steps?
  • Can we shorten the number of calculations needed for any steps?
  • Can we use generators or multiprocessing to optimize any steps?
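To illustrate the data-structure and generator questions, here is a small sketch. The two ID lists are made-up data, not from the course. It compares a list-based membership check with a set intersection, and uses a generator expression to avoid building an intermediate list.

```python
# Hypothetical data: IDs of recent books and IDs of books with coding content.
recent_books = list(range(0, 3000, 2))
coding_books = list(range(0, 3000, 3))

# List version: every `in` check scans coding_books, so the work grows
# with len(recent_books) * len(coding_books).
common_slow = [book for book in recent_books if book in coding_books]

# Set version: membership is hash-based, so the work grows roughly
# with len(recent_books) + len(coding_books).
common_fast = set(recent_books) & set(coding_books)

# Generator expression: values are produced one at a time instead of
# being collected into a temporary list before summing.
total = sum(book for book in recent_books if book % 3 == 0)
```

Both approaches find the same common IDs; the set version simply does less work as the lists grow.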

Is documentation effective?

  • Are in-line comments concise and meaningful?
  • Is there complex code that's missing documentation?
  • Do functions use effective docstrings?
  • Is the necessary project documentation provided?

Is the code well tested?

  • Does the code have high test coverage?
  • Do tests check for interesting cases?
  • Are the tests readable?
  • Can the tests be made more efficient?

Is the logging effective?

  • Are log messages clear, concise, and professional?

  • Do they include all relevant and useful information?

  • Do they use the appropriate logging level?

  • Use a linter like pylint
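For the logging questions above, a short sketch using Python's standard logging module may help. The logger name "training" and the safe_divide function are illustrative assumptions; the point is only how the level is matched to the severity of each message.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(levelname)s:%(name)s:%(message)s")
logger = logging.getLogger("training")  # hypothetical logger name


def safe_divide(numerator, denominator):
    """Divide two numbers, logging each step at a level that fits its severity."""
    # DEBUG: fine-grained detail, hidden at the INFO level configured above.
    logger.debug("Dividing %s by %s", numerator, denominator)
    if denominator == 0:
        # ERROR: something went wrong and the caller gets no result.
        logger.error("Division failed: denominator is 0")
        return None
    result = numerator / denominator
    # INFO: normal progress worth recording.
    logger.info("Division succeeded with result %s", result)
    return result
```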

Lesson 4: Introduction to Object-Oriented Programming

Code for the lesson

  • Objects have characteristics and can perform actions
  • An object is a specific instance of something whereas a class is the generic version of the object, or blueprint of it
  • Here are some terms worth knowing:
    • class - a blueprint consisting of methods and attributes
    • object - an instance of a class. It can help to think of objects as something in the real world like a yellow pencil, a small dog, a blue shirt, etc. However, as you'll see later in the lesson, objects can be more abstract.
    • attribute - a descriptor or characteristic. Examples would be color, length, size, etc. These attributes can take on specific values like blue, 3 inches, large, etc.
    • method - an action that a class or object could take
    • OOP - a commonly used abbreviation for object-oriented programming
    • encapsulation - one of the fundamental ideas behind object-oriented programming is called encapsulation: you can combine functions and data all into a single entity. In object-oriented programming, this single entity is called a class. Encapsulation allows you to hide implementation details much like how the scikit-learn package hides the implementation of machine learning algorithms.
  • method vs function:
    • a method is a function inside a class while a function is outside of a class
  • when calling a method on an object, notice how you don't have to pass self in as an argument; it is passed implicitly (the method definition itself still lists self as its first parameter)
  • If you saved your Shirt class in a file called shirt.py, you would import it by doing the following:
from shirt import Shirt
  • this assumes that your class is named "Shirt" (with a capital "S")
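  • There are a number of drawbacks to accessing object attributes directly vs. using getter and setter methods: callers can set invalid values, and their code breaks if the internal representation changes. Python is looser than other OO languages: it has no truly private attributes, only the convention that a leading underscore marks an attribute as internal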
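One common Python way to get getter and setter behavior without changing how callers write their code is the built-in property decorator. The sketch below is an illustration, not the course's Shirt class: the validation rule and the _price attribute name are assumptions.

```python
class Shirt:
    """Sketch of a class that guards its price behind a property."""

    def __init__(self, color, price):
        self.color = color
        self.price = price  # assignment goes through the setter below

    @property
    def price(self):
        # Leading underscore marks _price as internal by convention.
        return self._price

    @price.setter
    def price(self, value):
        if value < 0:
            raise ValueError("price cannot be negative")
        self._price = value


shirt = Shirt("red", 15)
shirt.price = 10  # looks like direct attribute access, but runs the setter
```

Callers still write shirt.price, so validation can be added later without changing any code that uses the class.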

Gaussian Package

gaussian_one = Gaussian(25, 3)
gaussian_two = Gaussian(30, 4)
gaussian_sum = gaussian_one + gaussian_two # __add__ magic method
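A minimal sketch of how the + operator above could be backed by the __add__ magic method. This is a stand-in, not the lesson's actual class: the attribute names mean and stdev are assumptions, and the sum treats the two Gaussians as independent, so means add and variances add.

```python
import math


class Gaussian:
    """Minimal stand-in for the lesson's Gaussian class (assumed attribute names)."""

    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma

    def __add__(self, other):
        # Sum of two independent Gaussians: means add, variances add.
        result = Gaussian()
        result.mean = self.mean + other.mean
        result.stdev = math.sqrt(self.stdev ** 2 + other.stdev ** 2)
        return result

    def __repr__(self):
        return "mean {}, standard deviation {}".format(self.mean, self.stdev)


gaussian_one = Gaussian(25, 3)
gaussian_two = Gaussian(30, 4)
gaussian_sum = gaussian_one + gaussian_two  # calls gaussian_one.__add__(gaussian_two)
print(gaussian_sum)  # mean 55, standard deviation 5.0
```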

Inheritance

  • Inheritance is pretty self-explanatory in Python. Here is an example of the Shirt class that inherits from Clothing:
class Clothing:

    def __init__(self, color, size, style, price):
        self.color = color
        self.size = size
        self.style = style
        self.price = price
        
    def change_price(self, price):
        self.price = price
        
    def calculate_discount(self, discount):
        return self.price * (1 - discount)
    
    def calculate_shipping(self, weight, rate):
        return weight * rate
        
class Shirt(Clothing):
    
    def __init__(self, color, size, style, price, long_or_short):
        
        Clothing.__init__(self, color, size, style, price)
        self.long_or_short = long_or_short
    
    def double_price(self):
        self.price = 2*self.price
  • Clothing is pretty normal, nothing exciting there
  • Shirt first has (Clothing) on the class definition line
  • Notice the __init__ method; it's a normal __init__ method except you first call Clothing.__init__ to set the inherited attributes and then set any properties specific to your Shirt class
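To see the inheritance in action, here is a trimmed, self-contained sketch of the same two classes with a short usage example. It uses super().__init__(...), which has the same effect here as calling Clothing.__init__(self, ...) directly; the argument values are made up.

```python
class Clothing:
    def __init__(self, color, size, style, price):
        self.color = color
        self.size = size
        self.style = style
        self.price = price

    def calculate_discount(self, discount):
        return self.price * (1 - discount)


class Shirt(Clothing):
    def __init__(self, color, size, style, price, long_or_short):
        # Let the parent class set the shared attributes.
        super().__init__(color, size, style, price)
        self.long_or_short = long_or_short

    def double_price(self):
        self.price = 2 * self.price


shirt = Shirt("red", "M", "casual", 20, "short")
shirt.double_price()                         # defined on Shirt
discounted = shirt.calculate_discount(0.25)  # inherited from Clothing
```

After these calls shirt.price is 40 and discounted is 30.0, which shows that a Shirt can use both its own methods and the ones it inherits.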

Advanced OOP Topics

Here are some Python-focused OOP articles and concepts:

Making a package

  • I won't go through everything; here are the basics. You can see what's really happening in the folder in this repo: 3a_python_package

  • my_python_package (package_root)

    • setup.py (sets up package)
    • distributions (code for my package)
      • __init__.py: the init code for my package
      • Generaldistribution.py: the parent class for my Gaussian distribution class
      • Gaussiandistribution.py: Gaussian distribution class
  • To use it, I could go to that folder and do

pip install .
  • this will install it. And then do python in the Terminal to bring up the interpreter:
from distributions import Gaussian
gaussian_one = Gaussian(25, 2)
gaussian_one.mean
gaussian_one + gaussian_one

Uploading Package to PyPI

cd binomial_package_files
python setup.py sdist
pip install twine

# commands to upload to the pypi test repository
twine upload --repository-url https://test.pypi.org/legacy/ dist/*
pip install --index-url https://test.pypi.org/simple/ dsnd-probability

# command to upload to the pypi repository
twine upload dist/*
pip install dsnd-probability

Lesson 5: Machine Learning with AWS DeepComposer

ML Techniques and Generative AI

  • Types of ML Techniques
    • Supervised Learning
      • every training example has a corresponding label
    • Unsupervised Learning
      • No labels for training data
      • Most Generative AI is unsupervised learning
    • Reinforcement Learning
      • learns through consequences of action in specific environment
  • Generative AI is one of the most promising new technologies
  • Generative Adversarial Networks (GANs), one generative AI technique, pit two different neural networks against each other to produce new and original digital works based on sample inputs

their notes: Machine Learning Techniques

Supervised Learning: Models are presented with input data and the desired results. The model will then attempt to learn rules that map the input data to the desired results.

Unsupervised Learning: Models are presented with datasets that have no labels or predefined patterns, and the model will attempt to infer the underlying structures from the dataset. Generative AI is a type of unsupervised learning.

Reinforcement Learning: The model or agent will interact with a dynamic world to achieve a certain goal. The dynamic world will reward or punish the agent based on its actions. Over time, the agent will learn to navigate the dynamic world and accomplish its goal(s) based on the rewards and punishments that it has received.

AWS DeepComposer

  • AWS DeepComposer is how Amazon is teaching developers to use GANs (Generative Adversarial Networks) to generate music
  • GANs pit 2 networks, a generator and a discriminator, against each other to generate new content.
    • generator: creates new output
    • discriminator: evaluates the quality of output AND provides feedback
  • Each iteration of the training cycle is called an epoch
  • The goal of iterating and completing epochs is to improve the output or prediction of the model.
  • output that deviates from the ground truth is referred to as an error.
    • The measure of an error, given a set of weights, is called a loss function.
    • Weights represent how important an associated feature is to determining the accuracy of a prediction
  • loss functions are used to update the weights after every iteration.
  • Ideally, as the weights update, the model improves, making fewer and fewer errors.
  • Convergence happens once the loss functions stabilize.
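The terms above (epoch, loss function, weights, convergence) can be seen in a toy training loop. This is not a GAN: it is a one-weight linear model trained with plain gradient descent, swapped in to keep the sketch short. The data, the learning rate, and the number of epochs are all assumed values.

```python
# Toy data generated by y = 3 * x, so the "ground truth" weight is 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def loss(weight):
    """Mean squared error of the predictions weight * x against ys."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(weight):
    """Derivative of the loss with respect to the weight."""
    return sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)


weight = 0.0
learning_rate = 0.01  # assumed value, small enough for this problem to converge
history = []          # loss recorded at the start of every epoch

for epoch in range(100):  # each pass over the data is one epoch
    history.append(loss(weight))
    # The gradient of the loss tells us how to update the weight.
    weight -= learning_rate * gradient(weight)
```

The recorded loss starts at 67.5 and shrinks toward zero as the weight approaches 3; once the values in history stop changing noticeably, the model has converged.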

Challenges with GANs

  • Clean datasets are hard to obtain

  • Not all melodies sound good in all genres

  • Convergence in GAN is tricky – it can be fleeting rather than being a stable state

    • if you keep training it, it could be training on junk feedback
  • Complexity in defining meaningful quantitative metrics to measure the quality of music created

  • the Similarity Index trends toward zero but does not necessarily reach zero

Generative AI Overview

  • Generative AI techniques include:
    • Generative Adversarial Networks (GANs)
    • Variational Autoencoders
    • Transformers

GANs

  • The generator and the discriminator are trained in alternating cycles such that the generator learns to produce more and more realistic data while the discriminator iteratively gets better at learning to differentiate real data from the newly created data.

GAN Overview

Model Architecture

  • The generator network used in AWS DeepComposer is adapted from the U-Net architecture, a popular convolutional neural network
  • Order of steps in the U-Net architecture:
    • Input
    • Encoder
    • Latent space
    • Decoder
    • Output
  • In the case of AWS DeepComposer:

The network consists of an “encoder” that maps the single track music data (represented as piano roll images) to a relatively lower dimensional “latent space” and a “decoder” that maps the latent space back to multi-track music data.

Evaluation

  • The discriminator loss has been found to correlate well with sample quality.

Building a Custom GAN

How the Model Works

The model consists of two networks, a generator and a critic (the discriminator). These two networks work in a tight loop:

  • The generator takes in a batch of single-track piano rolls (melody) as the input and generates a batch of multi-track piano rolls as the output by adding accompaniments to each of the input music tracks.
  • The discriminator evaluates the generated music tracks and predicts how far they deviate from the real data in the training dataset.
  • The feedback from the discriminator is used by the generator to help it produce more realistic music the next time.
  • As the generator gets better at creating better music and fooling the discriminator, the discriminator needs to be retrained by using music tracks just generated by the generator as fake inputs and an equivalent number of songs from the original dataset as the real input.
  • We alternate between training these two networks until the model converges and produces realistic music.
  • The discriminator is a binary classifier which means that it classifies inputs into two groups, e.g. “real” or “fake” data.

Lesson 6: Dive Deeper into Machine Learning