deep_learning's Introduction

MOVING

I'm moving the tutorial to vict0rsch.github.io to make it nicer for you guys. This is an ongoing process and your feedback is welcome! Basically it will be better organized and comments will be possible. This repo will stay active for the code, issues, pull requests and collaborations.

Deep-Learning

Table Of Contents

Presentation

How to Learn from this Tutorial

The Toothbrush Technique

Disclaimer

Presentation

This repository contains Deep Learning implementations tutorials. For more general knowledge regarding Machine/Deep Learning, have a look at useful_resources.md.

Lasagne and Keras are Theano-based, so I recommend you get familiar with Theano before starting with them.

However, Keras is much closer to usual Python than Lasagne, so it requires a weaker understanding of Theano. The main thing to understand to get started with Keras is Theano's graph structure.

In theano.md we concentrate on a few features of Theano that will be needed mostly in Lasagne and just a little in Keras. You will not learn Theano there, but you will get a glimpse of how it works and how it can be used in a Deep Learning context. Theano is about much more than this, especially regarding GPU computation and automatic differentiation. If you use TensorFlow (as a backend for Keras, for instance) you can still go through the Theano introduction, since TensorFlow's philosophy on using graphs is very similar: you declare "sessions" instead of "compiling" the graph, but the underlying process is conceptually the same.
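
As a tiny illustration of that graph structure, here is a minimal sketch (the variables and sizes are made up, not taken from theano.md): you first declare symbolic and shared variables, build an expression, and only then compile it into a callable function.

import numpy as np
import theano
import theano.tensor as T

# Declare symbolic variables: nothing is computed yet, we only describe a graph.
x = T.dvector('x')
w = theano.shared(np.ones(3), name='w')  # a shared variable, persistent between calls
y = T.dot(w, x) ** 2

# "Compile" the graph into a callable function (the step TensorFlow replaces with sessions).
f = theano.function(inputs=[x], outputs=y)

print(f([1.0, 2.0, 3.0]))  # (1 + 2 + 3)**2 = 36.0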

See the official Theano tutorial here.

I have not worked a lot with Convolutional Networks so I won't mention them here, for now.

Set up | Theano | Keras | Lasagne | Recurrent | Resources | AWS + GPU | Lose Time

Amazon Instances

You will find that Neural Network computations are very expensive and slow on CPUs, which is why all(?) such frameworks are GPU-accelerated. What if you don't have access to a GPU? You can still use Amazon's machines for ~1€/hour. See my attempt at a tutorial here.

How to Learn from this Tutorial

Machine learning is a vast area. Time and concentration are the two things you need the most to get into it. Don't jump to the next step if you're not sure you've mastered the current one's outcomes.

  1. Learn about Machine Learning -> Resources -> Starting with Machine Learning

    • Requirements: None, except basic knowledge of maths
    • Outcomes: Understand what (un)supervised learning and training mean, what some of the most famous techniques are, and the importance of data (feature selection/extraction, overfitting).
  2. Learn about Deep Learning Theory and feedforward networks (your best bet may very well be M. Nielsen's blog) -> Starting with Deep Learning

    • Requirements: Python, very basic linear algebra and analysis (vector products and differentiation, basically) + outcome (1)
    • Outcomes: Understand how neural networks are built, trained, improved. Both on the theory and the implementation side. You'll also understand how networks are coded to get a sense of how frameworks work.
  3. Get familiar with Theano -> Theano

    • Requirements: Python
    • Outcomes: Be able to understand Theano code and write functions relying on (shared) variables.
  4. Get into some code

    a - Start easy with Keras and feedforward networks -> Keras

    • Requirements: Python + outcomes (1) and (2)
    • Outcomes: Understand how the Keras framework can be used and therefore implement any dense feedforward network you like (a minimal Keras sketch follows this list).


    b - Go into the details with Lasagne (still with feedforward networks) -> Lasagne

    • Requirements: Python + outcomes (1), (2) and (3)
    • Outcomes: Understand how the Lasagne framework can be used and therefore implement any dense feedforward network you like. Understand the differences with Keras.

  5. Dig into Recurrent Networks -> Resources

    • Requirements: outcomes (1) and (2) (strong)
    • Outcomes: Understand the core concepts and usage of recurrent nets, and get a sense of the variety of structures.
  6. Spend some time understanding the handling of dimensions in recurrent nets -> Recurrent

    • Requirements: outcomes (1), (2), one of (4), (5)
    • Outcomes: Be able to create the appropriate dataset and format your data according to the task at hand.
  7. Get back to code

    • Requirements:
    • Outcomes:
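
To make step 4.a concrete, here is a minimal, hedged sketch of a dense feedforward network in Keras; the data, layer sizes and hyperparameters below are placeholders, not the tutorial's actual code.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy data: 1000 samples with 20 features and a binary label (placeholders only).
X = np.random.rand(1000, 20)
y = (X.sum(axis=1) > 10).astype(int)

# A dense feedforward network: input -> hidden layer -> output.
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, nb_epoch=10, batch_size=32, validation_split=0.1)  # nb_epoch was renamed to epochs in Keras 2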

The Toothbrush Technique

The Toothbrush technique is used to debug code. The concept is simple: pick up your toothbrush, a pen or a spoon and walk it through your code as if it understood it. Better yet, use a friend or coworker: you won't need their brain, just their ears.

The thing is that debugging can be hard and the error might very well be silly. However, reading someone else's code is often hard and/or laborious, so asking a friend/coworker to debug it is rarely practical.
On the other hand, explaining it to your toothbrush makes you rethink the whole coding process you went through and hopefully find that (silly?) mistake or incoherence.

Contact Clément to learn more, or see the Feynman technique, the Nobel laureate's famous technique for understanding and remembering things.

Disclaimer

I am currently an MSc student, so I don't have much time to improve this tutorial regularly and solve issues, but I'll do my best. If you have some experience and would like to contribute, get in touch, and in any case feel free to send pull requests. See more at vict0rsch.github.io


Icons made by Freepik and Gregor Cresnar from Flaticon licensed by CC BY 3.0 and Favicon basic license

deep_learning's People

Contributors

callofdutyops, cwmoo740, mehrdadscomputer, nategeorge, slasnista, stiger104, vict0rsch


deep_learning's Issues

Observation of plot figure

Did you notice that the prediction looks like a time-shifted copy of the original time series? Is this a doomed pattern when applying this network to a time series?

Syntax error in np.random.seed

I'm getting the error below while executing the program. Please help me solve it:
./recurrent_keras_power.py: line 8: syntax error near unexpected token `1234'
./recurrent_keras_power.py: line 8: `np.random.seed(1234)'
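
This looks like a bash error rather than a Python one: the shell tried to interpret the script itself, which happens when ./recurrent_keras_power.py is run without a Python shebang. A hedged fix is to run it as python recurrent_keras_power.py, or to add a shebang as the very first line of the file:

#!/usr/bin/env python
# With this shebang first (and the file marked executable, e.g. via chmod +x),
# "./recurrent_keras_power.py" is interpreted by Python instead of bash.
import numpy as np

np.random.seed(1234)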

Predicting X_t as X_{t-1} Gives MSE 0.07

Thanks for making your LSTM time series code available; it reports an MSE of 0.07. I tried to create a baseline for comparison by simply predicting X_t as X_{t-1}. This also gives an MSE of 0.07. Is this odd, does the LSTM maybe generalize better, or is my MSE computation faulty?

Thanks,

import pandas as pd
import numpy as np

# Load the dataset and keep only the target column.
df = pd.read_csv('household_power_consumption.txt', sep=';')
df = df[['Global_active_power']]
df = df[df.Global_active_power != '?']  # drop missing values marked with '?'

# Naive baseline: predict X_t as the previous value X_{t-1}.
df['G2'] = df['Global_active_power'].shift(1)
print(df.head())
df = df.astype(float)

# Mean squared error of the baseline.
df['err'] = df['G2'] - df['Global_active_power']
df['err'] = np.power(df['err'], 2)
print(df.err.sum() / len(df))
# print(np.sqrt(df['err'].sum()) / len(df))

low accuracy

I tried your code. I think it has very low accuracy.
Let me try to explain:

  1. You only predict the next single value (1 minute ahead). Loss: 0.0854 - val_loss: 0.0721. The error is quite large.
    If you start predicting a few steps forward, based on previously predicted steps, it becomes obvious.
    [attached plot: "predict 50"]
    Have you tried to predict a few steps ahead?
  2. Using a simple and fast multi-layer perceptron provides a significant increase in accuracy and performance, for example a "dense" 100-200-1 network. I tested it on my task.

dfdff

@vict0rsch Could you please delete this issue completely? I will post the question from my other account. 😃

Thank you!

show accuracy is deprecated

Thanks, very well explained for a beginner like me.

Minor note: I get

The show accuracy argument is deprecated, instead you should pass the accuracy metric to the model at compile time: model.compile(optimizer, loss, metrics=["accuracy"])
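
For reference, the replacement the warning points to looks roughly like this (the layer, optimizer and loss below are placeholders):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1, input_dim=10))

# Instead of show_accuracy=True in fit()/evaluate(), pass the metric at compile time:
model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])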

Issues in keras installation via anaconda on ubuntu

Hi,

Can you please help me solve the error below? I want to install Keras via Anaconda on Ubuntu.

UnsatisfiableError: The following specifications were found to be in conflict:

  • keras -> python 2.7* -> openssl 1.0.1*
  • python 3.6*
    Use "conda info " to see the dependencies for each package.

Hyperparameters and Error Metric

Just a few basic questions:

  1. Why did you choose this particular network structure? Since you have two LSTM layers, would you get better performance from using the "relu" activation function?
  2. Why are you only running 1 epoch? Is that just for testing purposes?
  3. How are you measuring errors for forecasting, what is your chosen error metric? Most other results use RMSE, however, is there a better metric? RMSE feels like a necessary but not sufficient condition for measuring the "learning quality" of a neural net.
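
For what it's worth, RMSE is just the square root of the MSE that Keras reports as the loss; a minimal NumPy computation with made-up arrays y_true and y_pred:

import numpy as np

# Hypothetical ground truth and predictions.
y_true = np.array([0.2, 0.5, 0.1])
y_pred = np.array([0.25, 0.40, 0.15])

mse = np.mean((y_true - y_pred) ** 2)
rmse = np.sqrt(mse)
print(mse, rmse)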

how to predict the next n steps not one?

The tutorial's output is the next single value, but how can the same time series data be used to predict the next n steps?
How should the model be designed for that? I know that TimeDistributed() can produce a 3D output, but the output shape is the same as the input shape; I can only change output_dim, so the problem is how to change the time_step parameter of the output.
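
One common approach, sketched below as an assumption rather than something from this repo, is iterative forecasting: predict one step, append that prediction to the input window, and repeat n times. The model and window below are hypothetical, with the usual (nb_samples, timesteps, features) input shape.

import numpy as np

def predict_n_steps(model, window, n_steps):
    # window: array of shape (timesteps, 1) holding the last observed values.
    # Each one-step prediction is fed back in as the newest input value.
    window = window.copy()
    preds = []
    for _ in range(n_steps):
        next_val = model.predict(window[np.newaxis, :, :])[0, 0]  # (1, timesteps, 1) -> scalar
        preds.append(next_val)
        window = np.vstack([window[1:], [[next_val]]])  # slide the window one step forward
    return np.array(preds)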

the problem in running

Thanks, your tutorial and code explain LSTMs very well for a beginner like me. However, when I run it in PyCharm (Python 3.6, Keras 1.0.7), it fails as follows:
File "C:/Users/Guo/Desktop/household_power_consumption/predict.py", line 132, in
run_network()
File "C:/Users/Guo/Desktop/household_power_consumption/predict.py", line 105, in run_network
model = build_model()
File "C:/Users/Guo/Desktop/household_power_consumption/predict.py", line 67, in build_model
return_sequences=True))
TypeError: Expected int32, got <tf.Variable 'lstm_1_W_i:0' shape=(1, 50) dtype=float32_ref> of type 'Variable' instead.

I wonder if it's a version-compatibility issue; could you give me some advice?
Really, thank you again.

Diagram of architecture

Could you please update your time-series README with a diagram of the architecture? You explain in great detail what return_sequences does, but a simple diagram would be more helpful.

Cheers

image inputs, conv w LSTM

Thanks for your examples and explanation regarding use of RNN with a LSTM.

I have a robotics application that takes as input an image and a vector of IMU measurements. I'm wondering if you have an idea about how to incorporate a time sequence of images and IMU readings into a network that uses conv2d layers and dense layers as input?

train error

I used 1,000 samples to test your example, but I get the following error.
What does "too many indices for array" mean?

Compilation Time :  0.00799989700317
Train on 418 samples, validate on 23 samples
Epoch 1/1
418/418 [==============================] - 228s - loss: 0.6930 - val_loss: 0.5179
too many indices for array
Training duration (s) :  279.796000004

why you reshape the X into 3D shape

Hi, the code runs well, thanks.
I have one question that confuses me: why do you reshape X into 3 dimensions, X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))? Do these represent nb_samples, timesteps and features?
Many thanks
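
Yes: Keras recurrent layers expect a 3D input of shape (nb_samples, timesteps, features), so the reshape just adds a trailing feature axis of size 1. A minimal illustration with made-up data:

import numpy as np

# 100 windows of 50 consecutive values each: shape (nb_samples, timesteps).
X_train = np.random.rand(100, 50)

# Keras LSTMs expect (nb_samples, timesteps, features); here there is a single feature.
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
print(X_train.shape)  # (100, 50, 1)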

Original Dataset availability

I would love to recreate that example with the same data that was used.

However, the dataset is no longer available for download.
Is there any way to retrieve it, or could you upload it somewhere I can download it?

Thanks :)
