deep_learning's Introduction

MOVING

I'm moving the tutorial to vict0rsch.github.io to make it nicer for you guys. This is an ongoing process and your feedback is welcome! Basically it will be better organized and comments will be possible. This repo will stay active for the code, issues, pull requests and collaborations.

Deep-Learning

Table Of Contents

Presentation

How to Learn from this Tutorial

The Toothbrush Technique

Disclaimer

Presentation

This repository contains Deep Learning implementations tutorials. For more general knowledge regarding Machine/Deep Learning, have a look at useful_resources.md.

Lasagne and Keras are Theano-based, so I recommend you get familiar with Theano before starting with them.

However, Keras is much closer to usual Python than Lasagne, so it requires a weaker understanding of Theano. The main thing to understand to get started with Keras is Theano's graph structure.

In theano.md we concentrate on a few features of Theano that will be needed mostly in Lasagne and just a little in Keras. You will not learn Theano there, but you will get a glimpse of how it works and how it can be used in a Deep Learning context. Theano is about much more than this, especially regarding GPU computation and automatic differentiation. If you use TensorFlow (as a backend for Keras, for instance) you can still go through the Theano introduction, since TensorFlow's philosophy on using graphs is very similar: you declare "sessions" instead of "compiling" the graph, but the underlying process is conceptually the same.
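
As a tiny illustration of that graph structure, here is a minimal sketch (the variables and sizes are made up, not taken from theano.md): you first declare symbolic and shared variables, build an expression, and only then compile it into a callable function.

import numpy as np
import theano
import theano.tensor as T

# Declare symbolic variables: nothing is computed yet, we only describe a graph.
x = T.dvector('x')
w = theano.shared(np.ones(3), name='w')  # a shared variable, persistent between calls
y = T.dot(w, x) ** 2

# "Compile" the graph into a callable function (the step TensorFlow replaces with sessions).
f = theano.function(inputs=[x], outputs=y)

print(f([1.0, 2.0, 3.0]))  # (1 + 2 + 3)**2 = 36.0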

See the official Theano tutorial here.

I have not worked a lot with Convolutional Networks so I won't mention them here, for now.

Set up | Theano | Keras | Lasagne | Recurrent | Resources | AWS + GPU | Lose Time

Amazon Instances

You will find that Neural Network computations are very expensive and slow on CPUs, which is why all(?) such frameworks are GPU-accelerated. What if you don't have access to a GPU? You can still use Amazon's machines for ~1€/hour. See my attempt at a tutorial here.

How to Learn from this Tutorial

Machine learning is a vast area. Time and concentration are the two things you need the most to get into it. Don't jump to the next step if you're not sure you've mastered the current one's outcomes.

  1. Learn about Machine Learning -> Resources -> Starting with Machine Learning

    • Requirements: None, except basic knowledge of maths
    • Outcomes: Understand what (un)supervised learning and training mean, what some of the most famous techniques are, and the importance of data (feature selection/extraction, overfitting).
  2. Learn about Deep Learning Theory and feedforward networks (your best bet may very well be M. Nielsen's blog) -> Starting with Deep Learning

    • Requirements: Python, very basic linear algebra and analysis (vector products and differentiation, basically) + outcome (1)
    • Outcomes: Understand how neural networks are built, trained, improved. Both on the theory and the implementation side. You'll also understand how networks are coded to get a sense of how frameworks work.
  3. Get familiar with Theano -> Theano

    • Requirements: Python
    • Outcomes: Be able to understand Theano code and write functions relying on (shared) variables.
  4. Get into some code

    a - Start easy with Keras and feedforward networks -> Keras

    • Requirements: Python + outcomes (1) and (2)
    • Outcomes: Understand how the Keras framework can be used and therefore implement any dense feedforward network you like (a minimal Keras sketch follows this list).


    b - Go into the details with Lasagne (still with feedforward networks) -> Lasagne

    • Requirements: Python + outcomes (1), (2) and (3)
    • Outcomes: Understand how the Lasagne framework can be used and therefore implement any dense feedforward network you like. Understand the differences with Keras.

  5. Dig into Recurrent Networks -> Resources

    • Requirements: outcomes (1) and (2) (strong)
    • Outcomes: Understand the core concepts and usage of recurrent nets, and get a sense of the variety of structures.
  6. Spend some time understanding the handling of dimensions in recurrent nets -> Recurrent

    • Requirements: outcomes (1), (2), one of (4), (5)
    • Outcomes: Be able to create the appropriate dataset and format your data according to the task at hand.
  7. Get back to code

    • Requirements:
    • Outcomes:
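
To make step 4.a concrete, here is a minimal, hedged sketch of a dense feedforward network in Keras; the data, layer sizes and hyperparameters below are placeholders, not the tutorial's actual code.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy data: 1000 samples with 20 features and a binary label (placeholders only).
X = np.random.rand(1000, 20)
y = (X.sum(axis=1) > 10).astype(int)

# A dense feedforward network: input -> hidden layer -> output.
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, nb_epoch=10, batch_size=32, validation_split=0.1)  # nb_epoch was renamed to epochs in Keras 2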

The Toothbrush Technique

The Toothbrush technique is used to debug code. The concept is simple: pick up your toothbrush, a pen or a spoon and walk it through your code as if it understood it. Better yet, use a friend or coworker: you won't need their brain, just their ears.

The thing is that debugging can be hard and the error might very well be silly. However, reading someone else's code is often hard and/or laborious, so asking a friend/coworker to debug it is rarely practical.
On the other hand, explaining it to your toothbrush makes you rethink the whole coding process you went through and hopefully find that (silly?) mistake or incoherence.

Contact Clément to learn more, or see the Feynman technique, the Nobel laureate's famous technique for understanding and remembering things.

Disclaimer

I am currently an MSc student, so I don't have much time to improve this tutorial regularly and solve issues, but I'll do my best. If you have some experience and would like to contribute, get in touch, and in any case feel free to send pull requests. See more at vict0rsch.github.io


Icons made by Freepik and Gregor Cresnar from Flaticon licensed by CC BY 3.0 and Favicon basic license

deep_learning's People

Contributors

callofdutyops, cwmoo740, mehrdadscomputer, nategeorge, slasnista, stiger104, vict0rsch


deep_learning's Issues

Observation of plot figure

Did you notice that the prediction looks like a time-shifted copy of the original time series? Is this a doomed pattern when applying this network to a time series?

Syntax error in np.random.seed

I'm getting the error below while executing the program. Please help me solve it:
./recurrent_keras_power.py: line 8: syntax error near unexpected token `1234'
./recurrent_keras_power.py: line 8: `np.random.seed(1234)'
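
This looks like a bash error rather than a Python one: the shell tried to interpret the script itself, which happens when ./recurrent_keras_power.py is run without a Python shebang. A hedged fix is to run it as python recurrent_keras_power.py, or to add a shebang as the very first line of the file:

#!/usr/bin/env python
# With this shebang first (and the file marked executable, e.g. via chmod +x),
# "./recurrent_keras_power.py" is interpreted by Python instead of bash.
import numpy as np

np.random.seed(1234)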

Predicting X_t as X_{t-1} Gives MSE 0.07

Thanks for making your LSTM time series code available; it reports an MSE of 0.07. I tried to create a baseline for comparison by simply predicting X_t as X_{t-1}. This also gives an MSE of 0.07. Is this odd, does the LSTM maybe generalize better, or is my MSE computation faulty?

Thanks,

import pandas as pd
import numpy as np

# Load the dataset and keep only the target column.
df = pd.read_csv('household_power_consumption.txt', sep=';')
df = df[['Global_active_power']]
df = df[df.Global_active_power != '?']  # drop missing values marked with '?'

# Naive baseline: predict X_t as the previous value X_{t-1}.
df['G2'] = df['Global_active_power'].shift(1)
print(df.head())
df = df.astype(float)

# Mean squared error of the baseline.
df['err'] = df['G2'] - df['Global_active_power']
df['err'] = np.power(df['err'], 2)
print(df.err.sum() / len(df))
# print(np.sqrt(df['err'].sum()) / len(df))

low accuracy

I tried your code. I think it has very low accuracy.
Let me try to explain:

  1. You only predict the next single value (1 minute ahead). Loss: 0.0854 - val_loss: 0.0721. The error is quite large.
    If you start predicting a few steps forward, based on previously predicted steps, it becomes obvious.
    [attached plot: "predict 50"]
    Have you tried to predict a few steps ahead?
  2. Using a simple and fast multi-layer perceptron provides a significant increase in accuracy and performance, for example a "dense" 100-200-1 network. I tested it on my task.

dfdff

@vict0rsch Could you please delete this issue completely? I will post the question from my other account. 😃

Thank you!

show accuracy is deprecated

Thanks, very well explained for a beginner like me.

Minor note: I get

The show accuracy argument is deprecated, instead you should pass the accuracy metric to the model at compile time: model.compile(optimizer, loss, metrics=["accuracy"])
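
For reference, the replacement the warning points to looks roughly like this (the layer, optimizer and loss below are placeholders):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1, input_dim=10))

# Instead of show_accuracy=True in fit()/evaluate(), pass the metric at compile time:
model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])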

Issues in keras installation via anaconda on ubuntu

Hi,

Can you please help me solve the error below? I want to install Keras via Anaconda on Ubuntu.

UnsatisfiableError: The following specifications were found to be in conflict:

  • keras -> python 2.7* -> openssl 1.0.1*
  • python 3.6*
    Use "conda info " to see the dependencies for each package.

Hyperparameters and Error Metric

Just a few basic questions:

  1. Why did you choose this particular network structure? Since you have two LSTM layers, would you get better performance from using the "relu" activation function?
  2. Why are you only running 1 epoch? Is that just for testing purposes?
  3. How are you measuring errors for forecasting, what is your chosen error metric? Most other results use RMSE, however, is there a better metric? RMSE feels like a necessary but not sufficient condition for measuring the "learning quality" of a neural net.
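
For what it's worth, RMSE is just the square root of the MSE that Keras reports as the loss; a minimal NumPy computation with made-up arrays y_true and y_pred:

import numpy as np

# Hypothetical ground truth and predictions.
y_true = np.array([0.2, 0.5, 0.1])
y_pred = np.array([0.25, 0.40, 0.15])

mse = np.mean((y_true - y_pred) ** 2)
rmse = np.sqrt(mse)
print(mse, rmse)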

how to predict the next n steps not one?

The tutorial's output is the next single value, but how can the same time series data be used to predict the next n steps?
How should the model be designed for that? I know that TimeDistributed() can produce a 3D output, but the output shape is the same as the input shape; I can only change output_dim, so the problem is how to change the time_step parameter of the output.
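
One common approach, sketched below as an assumption rather than something from this repo, is iterative forecasting: predict one step, append that prediction to the input window, and repeat n times. The model and window below are hypothetical, with the usual (nb_samples, timesteps, features) input shape.

import numpy as np

def predict_n_steps(model, window, n_steps):
    # window: array of shape (timesteps, 1) holding the last observed values.
    # Each one-step prediction is fed back in as the newest input value.
    window = window.copy()
    preds = []
    for _ in range(n_steps):
        next_val = model.predict(window[np.newaxis, :, :])[0, 0]  # (1, timesteps, 1) -> scalar
        preds.append(next_val)
        window = np.vstack([window[1:], [[next_val]]])  # slide the window one step forward
    return np.array(preds)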

the problem in running

Thanks, your tutorial and code explain LSTMs very well for a beginner like me. However, when I run it in PyCharm (Python 3.6, Keras 1.0.7), it fails as follows:
File "C:/Users/Guo/Desktop/household_power_consumption/predict.py", line 132, in
run_network()
File "C:/Users/Guo/Desktop/household_power_consumption/predict.py", line 105, in run_network
model = build_model()
File "C:/Users/Guo/Desktop/household_power_consumption/predict.py", line 67, in build_model
return_sequences=True))
TypeError: Expected int32, got <tf.Variable 'lstm_1_W_i:0' shape=(1, 50) dtype=float32_ref> of type 'Variable' instead.

I wonder if it's a version-compatibility issue; could you give me some advice?
Really, thank you again.

Diagram of architecture

Could you please update your time-series README with a diagram of the architecture? You explain in great detail what return_sequences does, but a simple diagram would be more helpful.

Cheers

image inputs, conv w LSTM

Thanks for your examples and explanation regarding use of RNN with a LSTM.

I have a robotics application that takes as input an image and a vector of IMU measurements. I'm wondering if you have an idea about how to incorporate a time sequence of images and IMU readings into a network that uses conv2d layers and dense layers as input?

train error

I used 1,000 samples to test your example, but I get the following error.
What does "too many indices for array" mean?

Compilation Time :  0.00799989700317
Train on 418 samples, validate on 23 samples
Epoch 1/1
418/418 [==============================] - 228s - loss: 0.6930 - val_loss: 0.5179
too many indices for array
Training duration (s) :  279.796000004

why you reshape the X into 3D shape

Hi, the code runs well, thanks.
I have one question that confuses me: why do you reshape X into 3 dimensions, X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))? Do these represent nb_samples, timesteps and features?
Many thanks
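
Yes: Keras recurrent layers expect a 3D input of shape (nb_samples, timesteps, features), so the reshape just adds a trailing feature axis of size 1. A minimal illustration with made-up data:

import numpy as np

# 100 windows of 50 consecutive values each: shape (nb_samples, timesteps).
X_train = np.random.rand(100, 50)

# Keras LSTMs expect (nb_samples, timesteps, features); here there is a single feature.
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
print(X_train.shape)  # (100, 50, 1)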

Original Dataset availability

I would love to recreate that example with the same data that was used.

However, the dataset is no longer available for download.
Is there any way to retrieve it, or could you upload it somewhere I can download it?

Thanks :)
