Comments (1)
(I assume you are talking about the output projection layer, i.e. the softmax layer.)
`embedding_size` is the dimensionality of the word vectors, not the number of different words the decoder can produce. In the end you need to output token IDs, not embeddings, so you have to get from the decoder's hidden state to a probability distribution over the possible words, i.e. the vocabulary. That is what the softmax layer does.
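To make the shapes concrete, here is a minimal NumPy sketch of such an output projection. It is not the tutorial's code: the names `project_to_vocab`, `hidden_units`, `W`, and `b`, and all the sizes, are assumptions chosen for illustration.

```python
import numpy as np

def project_to_vocab(hidden, W, b):
    """Map decoder hidden states to a probability distribution over the vocabulary."""
    logits = hidden @ W + b                                # [batch, vocab_size]
    logits = logits - logits.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)           # softmax over the vocabulary axis

# Hypothetical sizes, for illustration only.
batch, hidden_units, vocab_size = 2, 8, 10

rng = np.random.default_rng(0)
hidden = rng.normal(size=(batch, hidden_units))   # stand-in for decoder hidden states
W = rng.normal(size=(hidden_units, vocab_size))   # projection weights
b = np.zeros(vocab_size)                          # projection bias

probs = project_to_vocab(hidden, W, b)            # one distribution per batch element
token_ids = probs.argmax(axis=-1)                 # greedy choice of token IDs
```

Note that the projection's output width is `vocab_size`; `embedding_size` does not appear in it at all.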
I imagine you could try decoding embeddings directly, then searching for a word via cosine similarity against the whole vocabulary, but this would be way more computationally expensive than softmax. Softmax is already the most expensive operation in this architecture.
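For comparison, the embedding-decoding alternative could look roughly like the sketch below. The helper `nearest_tokens` and the tiny hand-written embedding matrix are hypothetical, not part of the tutorial.

```python
import numpy as np

def nearest_tokens(predicted, embeddings, eps=1e-8):
    """Return, for each predicted vector, the ID of the most cosine-similar embedding row."""
    pred_norm = predicted / (np.linalg.norm(predicted, axis=-1, keepdims=True) + eps)
    emb_norm = embeddings / (np.linalg.norm(embeddings, axis=-1, keepdims=True) + eps)
    sims = pred_norm @ emb_norm.T          # [batch, vocab_size] cosine similarities
    return sims.argmax(axis=-1)

# Hypothetical 4-word vocabulary with 3-dimensional embeddings.
embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])

# Stand-ins for embeddings produced by a decoder.
predicted = np.array([
    [0.9, 0.1, 0.0],   # closest in direction to row 0
    [2.0, 2.1, 0.1],   # closest in direction to row 3
])

ids = nearest_tokens(predicted, embeddings)   # array([0, 3])
```

Every decoding step still has to compare the prediction against every row of the vocabulary, so the search does not avoid a pass over all words.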
from tensorflow-seq2seq-tutorials.