shtoshni / g2p
Code for SLT 2016 paper on Grapheme-to-Phoneme conversion using attention based encoder-decoder models
Excuse me!
I cannot find any handling for UNK (unknown) tokens in the code.
Could you explain how UNK is processed?
Thank you very much!
Best regards!
Although the CMUDict setup doesn't raise an exception, I tried it with another dataset and I believe there is a bug in the get_batch(self, data, bucket_id=None) method in seq2seq_model.py. Specifically, I believe decoder_pad_size can become negative when self.isTraining is false, at the following line:
decoder_pad_size = max_len_target - (len(decoder_input) + 1)
When decoder_pad_size is negative, the following error is raised:
decoder_inputs = np.asarray(decoder_inputs, dtype=np.int32).T
File "/home/XXXX/.local/share/virtualenvs/G2P-MASTER-mSKoJG47/lib/python2.7/site-packages/numpy/core/numeric.py", line 538, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
I believe the following line is the culprit; it should add 1:
seq_len_target[i] = decoder_size # Original Code
seq_len_target[i] = decoder_size + 1 # Fixed Code
For your information, here are the conditions when ValueError occurs:
With the above conditions, the original code creates decoder_inputs with a shape of 256 [FLAGS.batch_size] x 36, when it should be 256 [FLAGS.batch_size] x 37 ([data_utils.GO_ID] + decoder_input + [data_utils.EOS_ID] + [data_utils.PAD_ID] * 0).
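To illustrate the failure mode, here is a minimal sketch that assumes rows are built as GO + target + EOS + PAD * decoder_pad_size, as in the expression above. The helper name build_decoder_rows and the token id values are made up for this example and are not taken from seq2seq_model.py. In Python, multiplying a list by a negative number gives an empty list, so the over-long row is not padded and ends up one element longer than the rest. np.asarray then raises the ValueError shown above on such ragged rows.

```python
# Placeholder token ids for illustration only; the real values live in data_utils.py.
GO_ID, EOS_ID, PAD_ID = 1, 2, 0


def build_decoder_rows(targets, max_len_target):
    """Hypothetical helper: pad each target as GO + target + EOS + PAD * pad_size."""
    rows = []
    for decoder_input in targets:
        decoder_pad_size = max_len_target - (len(decoder_input) + 1)
        # A negative multiplier yields an empty list, so no padding is added
        # and this row stays longer than the properly padded ones.
        rows.append([GO_ID] + decoder_input + [EOS_ID] + [PAD_ID] * decoder_pad_size)
    return rows


targets = [[7, 8, 9], [7, 8, 9, 10, 11]]  # target lengths 3 and 5
longest = max(len(t) for t in targets)

ragged = build_decoder_rows(targets, longest)     # pad sizes 1 and -1
fixed = build_decoder_rows(targets, longest + 1)  # pad sizes 2 and 0

ragged_lengths = [len(row) for row in ragged]
fixed_lengths = [len(row) for row in fixed]
print(ragged_lengths)  # rows differ in length, so they cannot form a 2-D array
print(fixed_lengths)   # all rows have the same length
```

With the + 1, the longest target gets a pad size of exactly 0, which matches the [data_utils.PAD_ID] * 0 case described above.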
As extra information, my G2P dataset is not English and has a maximum input sequence length of 65 and a maximum output sequence length of 97. While the above fix of + 1 seems to do the trick (no more ValueError), should I be concerned about other parameters (e.g. _buckets = [(35, 35)] in data_utils.py)? I read your comment regarding the buckets, but the link you mention is broken: http://goo.gl/d8ybpl.
Buckets are useful to limit the amount of padding required since we use
minibatch processing. For more detail refer this: http://goo.gl/d8ybpl
Bucket sizes are specified as a list with each entry of the form -
(Max input sequence length, Max output sequence length)
Since this project is about Grapheme-to-Phoneme conversion, where the input
sequence is characters in a word and output sequence is phonemes in word
pronunciation, we use a single bucket to merely denote the max word length
and max pronunciation length
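As a rough sketch of what a single bucket implies for a dataset with longer sequences, the snippet below checks whether a (graphemes, phonemes) pair fits a bucket and derives the smallest single bucket that covers a dataset. The function names are hypothetical. Whether GO/EOS tokens count toward the bucket lengths, and what the repository does with pairs that exceed the bucket, are not stated here, so treat the numbers as lower bounds.

```python
def fits_bucket(graphemes, phonemes, bucket):
    """Return True if the pair fits within (max input length, max output length)."""
    max_in, max_out = bucket
    return len(graphemes) <= max_in and len(phonemes) <= max_out


def smallest_single_bucket(pairs):
    """Smallest (max input length, max output length) covering every pair."""
    max_in = max(len(graphemes) for graphemes, _ in pairs)
    max_out = max(len(phonemes) for _, phonemes in pairs)
    return (max_in, max_out)


# Toy data for illustration only.
pairs = [
    (list("cat"), ["K", "AE", "T"]),
    (list("through"), ["TH", "R", "UW"]),
]

bucket = smallest_single_bucket(pairs)
print(bucket)
print(all(fits_bucket(g, p, (35, 35)) for g, p in pairs))
```

By the same reasoning, a dataset with a maximum input length of 65 and a maximum output length of 97 would not fit a (35, 35) bucket and would need at least (65, 97).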
There is a link in the recent paper titled "Jointly Learning to Align and Convert Graphemes to Phonemes with Neural Attention Models" that connects to this repository, but it seems that the code was not committed.
Hi!
I'm trying to replicate the results from your paper for the CMUDict dataset, but I can't get better than 44% WER and 12% PER.
You mention the cmusphinx version, but did you use this dataset unprocessed? In the paper it seems like the stress suffixes are removed from the vowels, which would make the task much easier.
Thanks!
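For reference, CMUDict marks lexical stress with a digit (0, 1, or 2) appended to each vowel phoneme. The sketch below shows what removing those suffixes would look like. The function name strip_stress is hypothetical, and whether the paper applied this preprocessing is exactly the open question above.

```python
import re

# Trailing digits on a phoneme symbol, e.g. the "0" in "AH0".
_STRESS_SUFFIX = re.compile(r"\d+$")


def strip_stress(pronunciation):
    """Split a space-separated pronunciation and drop stress digits from each phoneme."""
    return [_STRESS_SUFFIX.sub("", phoneme) for phoneme in pronunciation.split()]


stripped = strip_stress("HH AH0 L OW1")
print(stripped)
```

Stripping the digits merges the stressed and unstressed variants of each vowel into one symbol, which shrinks the output vocabulary and would be expected to lower both error rates.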
Hey, I can't seem to find the splits you used to train on the CMU dictionary. Wondering where I might find them/if you can provide them.
Excuse me!
When will you release the code for the attention mechanism?
Thank you very much!