This assignment is part of the DL-systems course presented by CMU.
This homework is just a warm-up for the course. Navigating this repo, you will find implementations of the following:
- How to read MNIST data (a minimal parsing sketch follows this list)
- Softmax loss
- Softmax regression for one epoch
- A two-layer neural net for one epoch
- Softmax regression implemented in C++
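As a flavor of the data-loading part, here is a minimal sketch (not the repo's reference code) of parsing the standard gzipped MNIST IDX files into NumPy arrays; the function name `parse_mnist` and the scaling of pixels to [0, 1] are assumptions for illustration:

```python
import gzip
import struct
import numpy as np

def parse_mnist(image_filename, label_filename):
    """Hypothetical helper: load gzipped MNIST IDX files into NumPy arrays."""
    with gzip.open(image_filename, "rb") as f:
        # IDX image header: magic, #images, #rows, #cols (big-endian uint32)
        _, num, rows, cols = struct.unpack(">IIII", f.read(16))
        X = np.frombuffer(f.read(), dtype=np.uint8).reshape(num, rows * cols)
        X = X.astype(np.float32) / 255.0  # scale pixels to [0, 1]
    with gzip.open(label_filename, "rb") as f:
        # IDX label header: magic, #labels
        _, num = struct.unpack(">II", f.read(8))
        y = np.frombuffer(f.read(), dtype=np.uint8)
    return X, y
```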
To run the tests, use the following commands in the repo directory (P.S. if you are running in Jupyter, prefix each command with `!`):
- `python3 -m pytest -k "softmax_loss"`
- `python3 -m pytest -k "softmax_regression_epoch and not cpp"`
- `python3 -m pytest -k "nn_epoch"`
- `make`, then `python3 -m pytest -k "softmax_regression_epoch_cpp"`
Logical mistake I made while implementing the softmax loss:
- The softmax loss equation is
$\ell_{\mathrm{softmax}}(z, y) = \log\sum_{i=1}^k \exp z_i - z_y.$
At first I implemented it as if it were
$\ell_{\mathrm{softmax}}(z, y) = \log\left(\sum_{i=1}^k \exp (z_i - z_y)\right)$ and it still worked out. BUT WHY?!
To answer this, let's break it into a few steps:
- $z_y$ enters the log as $\log(\exp(z_y))$, which is equal to $z_y$, so
$\log\left(\sum_{i=1}^k\exp(z_i)\right) - \log(\exp(z_y)) = \log\left(\frac{\sum_{i=1}^k\exp(z_i)}{\exp(z_y)}\right)$
- $\log\left(\frac{\sum_{i=1}^k\exp(z_i)}{\exp(z_y)}\right) = \log\left(\sum_{i=1}^k\exp(z_i-z_y)\right)$
So this wrong formulation of the softmax loss still gave the right answer! But it is also computationally more expensive, since it recomputes $z_i - z_y$ for every class of every example.
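To convince myself, here is a quick NumPy check (a throwaway sketch, not the homework's reference implementation) that evaluates both formulations on a single random logit vector and confirms they agree:

```python
import numpy as np

def softmax_loss_correct(z, y):
    # log-sum-exp over all logits, minus the true-class logit
    return np.log(np.sum(np.exp(z))) - z[y]

def softmax_loss_shifted(z, y):
    # the "wrong" formulation: shift every logit by z_y before exponentiating
    return np.log(np.sum(np.exp(z - z[y])))

z = np.random.randn(10)  # logits for k = 10 classes
y = 3                    # true class index
print(np.isclose(softmax_loss_correct(z, y), softmax_loss_shifted(z, y)))  # True
```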
- Why do we take the log of the softmax instead of the softmax itself?
In the literature the softmax is often written as
$\frac{\exp(z_y)}{\sum_{i=1}^k \exp(z_i)}$, but whenever we turn it into code we usually take the negative log, which reduces it to
$\ell_{\mathrm{softmax}}(z, y) = \log\sum_{i=1}^k \exp z_i - z_y.$
The answer: the term $\sum_{i=1}^k \exp(z_i)$ can get so large that putting it in the denominator may cause the result to underflow to zero. For intuition, think of $\frac{1}{10000}\approx 0$.
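Here is a small sketch of that failure mode (an illustrative example, not code from this repo); the max-subtraction step is the standard log-sum-exp stabilization trick:

```python
import numpy as np

z = np.array([1000.0, 1001.0, 1002.0])  # large logits
y = 2

# Naive probability: exp() overflows to inf, so the ratio comes out as nan
naive_prob = np.exp(z[y]) / np.sum(np.exp(z))
print(naive_prob)  # nan (with overflow warnings)

# Working in log space with the max subtracted keeps everything finite
m = np.max(z)
loss = (np.log(np.sum(np.exp(z - m))) + m) - z[y]
print(loss)  # ~0.4076, a perfectly ordinary number
```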