Comments (4)
Odd. I use the TensorFlow backend for Keras and, whilst it shouldn't matter, I was able to reproduce your exception using the Theano backend. Given that the actual RNNs (GRU / LSTM) work, this suggests the issue lies with:
SumEmbeddings = keras.layers.core.Lambda(lambda x: K.sum(x, axis=1))
Turns out that Theano requires output_shape to be provided to the Lambda layer, whilst TensorFlow can infer it.
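For anyone hitting the same exception, here is a minimal sketch of what supplying output_shape could look like. The helper name `sum_embeddings_output_shape` is my own (hypothetical), and the Lambda call is shown only as a comment because it assumes the Keras 1.x API used in this thread, where output_shape may be a function of the full input shape:

```python
def sum_embeddings_output_shape(input_shape):
    """Shape produced by summing over axis 1 (the word/timestep axis).

    input_shape is assumed to be (batch, timesteps, embedding_dim);
    K.sum(x, axis=1) drops the timesteps dimension.
    """
    batch, _timesteps, embedding_dim = input_shape
    return (batch, embedding_dim)


# Hypothetical usage with the layer quoted above (Keras 1.x, not run here):
#   SumEmbeddings = keras.layers.core.Lambda(
#       lambda x: K.sum(x, axis=1),
#       output_shape=sum_embeddings_output_shape)
```

Passing a function rather than a fixed tuple keeps the layer independent of the embedding size, which seemed the safer choice for an example that people will adapt.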
I'd also suggest trying both backends. I may have a buggy setup, but the Theano backend is far slower for the summation of word embeddings, which is strange given the relative simplicity of the operation.
Thanks for noting the bug! Hugely important if I want to merge this in as a Keras example! :)
from keras_snli.
Confirmed working now on my setup. Will try both backends as you suggest. Thanks so much for the quick turnaround; this example is very useful (to me at least) from a pedagogical point of view and would be great merged into Keras.
Just to follow up, FWIW I ran both Sum and GRU on my setup (AWS EC2 g2.2xlarge, Bitfusion Ubuntu 14 Theano AMI, Theano 0.8.2, TensorFlow 0.10.0rc0, Keras 1.0.8) using both backends. Results were:
| Metric | sum(word vectors) using Theano | sum(word vectors) using TensorFlow | GRU using Theano | GRU using TensorFlow |
|---|---|---|---|---|
| average secs/epoch | 86 | 47.5 | 547 | 442.5 |
| test accuracy | 0.8313 | 0.8243 | 0.8165 | 0.8118 |
... which is consistent with your observation, with the caveat that, due to an incompatibility between Keras 1.0.8 and the package structure of TensorFlow 0.7.1, I had to use the bleeding edge TensorFlow build (0.10.0rc0) in comparison with the current but older Theano 0.8.2 release.
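To put a number on the gap, here is a quick sketch that computes the Theano-to-TensorFlow time ratio from the secs/epoch figures in the table. The dictionary layout and names are just my own illustration:

```python
# Average seconds per epoch, copied from the table above.
secs_per_epoch = {
    "sum(word vectors)": {"theano": 86.0, "tensorflow": 47.5},
    "GRU": {"theano": 547.0, "tensorflow": 442.5},
}

# Ratio > 1 means Theano took longer per epoch than TensorFlow.
slowdown = {
    model: times["theano"] / times["tensorflow"]
    for model, times in secs_per_epoch.items()
}

for model, ratio in slowdown.items():
    print("%s: Theano takes %.2fx the TensorFlow time" % (model, ratio))
```

On these numbers the gap is roughly 1.8x for the summed embeddings versus roughly 1.2x for the GRU, so the relative slowdown is larger for the simpler model.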
Thanks for the follow up - and glad it's working for you now! ^_^
I'd be curious where the slowdown comes from. For sum(word vectors), I could easily imagine that the way Keras's Lambda is implemented between backends is the difference.
Also, +1 for the 83.1 test accuracy! I got some higher numbers like that for sum(word vectors) on a few runs but decided to report 82.4 as that was the consistent average over runs.