The g2p from shtoshni

UNK problem

Excuse me!
I cannot find any handling of UNK (unknown) words in the code.

Could you tell me how UNK is processed?

Thank you very much!

Best Regards!

seq2seq_model.py's get_batch(...) raises "ValueError: setting an array element with a sequence."

Although the CMUDict setup doesn't raise an exception, I tried the code with another dataset, and I believe there is a bug in the get_batch(self, data, bucket_id=None) method of seq2seq_model.py. Specifically, I believe decoder_pad_size can become negative when self.isTraining is false, at the following line:

decoder_pad_size = max_len_target - (len(decoder_input) + 1)

When decoder_pad_size is negative the following error is raised:

decoder_inputs = np.asarray(decoder_inputs, dtype=np.int32).T
File "/home/XXXX/.local/share/virtualenvs/G2P-MASTER-mSKoJG47/lib/python2.7/site-packages/numpy/core/numeric.py", line 538, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

I believe the following line is the culprit; it should add 1:

seq_len_target[i] = decoder_size        # Original Code
seq_len_target[i] = decoder_size + 1    # Fixed Code

For your information, here are the conditions under which the ValueError occurs:

decoder_size = 35
max_len_target = 35
self.isTraining = false
len(decoder_input) = 35
decoder_pad_size = -1

Under the above conditions, the original code creates decoder_inputs with a shape of 256 [FLAGS.batch_size] x 36, when it should be 256 [FLAGS.batch_size] x 37 ([data_utils.GO_ID] + decoder_input + [data_utils.EOS_ID] + [data_utils.PAD_ID] * 0).
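
The arithmetic above can be reproduced with a minimal pure-Python sketch. The helpers pad_decoder_row and row_lengths and the ID values are hypothetical stand-ins, not the repository's code; only the padding expression mirrors the line quoted in this report. Because [PAD_ID] * -1 is an empty list, a full-length target yields a row one element longer than the others, and calling np.asarray with an integer dtype on rows of unequal length raises the ValueError shown in the traceback.

```python
# Hypothetical symbol ids; the real values live in data_utils.py.
GO_ID, EOS_ID, PAD_ID = 1, 2, 0


def pad_decoder_row(decoder_input, max_len_target):
    """Build one decoder row using the padding expression from the report."""
    decoder_pad_size = max_len_target - (len(decoder_input) + 1)
    # A negative pad size silently yields an empty padding list.
    return [GO_ID] + decoder_input + [EOS_ID] + [PAD_ID] * decoder_pad_size


def row_lengths(target_lengths, max_len_target):
    """Return the row length produced for each target length."""
    return [len(pad_decoder_row([3] * n, max_len_target)) for n in target_lengths]


# Targets of length 10 and 35 with max_len_target = 35: the rows are ragged.
ragged = row_lengths([10, 35], max_len_target=35)
# With max_len_target = 36 (decoder_size + 1), every row has the same length.
uniform = row_lengths([10, 35], max_len_target=36)
print(ragged, uniform)
```

In this sketch, raising the padded length by one makes every row the same length, which is consistent with the + 1 fix proposed in this report.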

As extra information, my G2P dataset is not English and has a maximum input sequence length of 65 and a maximum output sequence length of 97. While the above + 1 fix seems to do the trick (no more ValueError), should I be concerned about other parameters, e.g. _buckets = [(35, 35)] in data_utils.py? I read your comment regarding buckets, but the link you mention (http://goo.gl/d8ybpl) is broken:

Buckets are useful to limit the amount of padding required since we use
minibatch processing. For more detail refer this: http://goo.gl/d8ybpl
Bucket sizes are specified as a list with each entry of the form -
(Max input sequence length, Max output sequence length)
Since this project is about Grapheme-to-Phoneme conversion, where the input
sequence is characters in a word and output sequence is phonemes in word
pronunciation, we use a single bucket to merely denote the max word length
and max pronunciation length

Code not available?

There is a link in the recent paper titled "Jointly Learning to Align and Convert Graphemes to Phonemes with Neural Attention Models" that connects to this repository, but it seems that the code was not committed.

Dataset preparation

Hi!

I'm trying to replicate the results from your paper on the CMUDict dataset, but I can't get better than 44% WER and 12% PER.
You mention the cmusphinx version, but did you use this dataset unprocessed? In the paper it seems like the stress suffixes are removed from the vowels, which would make the task much easier.

Thanks!
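
For reference, CMUDict marks lexical stress with a trailing digit (0, 1, or 2) on vowel phonemes, e.g. AH0 or OW1. A minimal sketch of the preprocessing step asked about here, stripping those digits, might look like the following. The helper strip_stress is hypothetical, and whether the paper applied this step is exactly the open question in this post.

```python
import re

# Matches an uppercase phoneme symbol followed by a single stress digit.
_STRESS = re.compile(r"^([A-Z]+)[0-2]$")


def strip_stress(phonemes):
    """Remove the trailing stress digit from each vowel phoneme."""
    return [_STRESS.sub(r"\1", p) for p in phonemes]


# "hello" as it appears in CMUDict, with stress markers on the vowels.
print(strip_stress(["HH", "AH0", "L", "OW1"]))
```

Removing the markers shrinks the output vocabulary, since each vowel collapses from up to three stressed variants into one symbol.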

Splits for CMU dictionary

Hey, I can't seem to find the splits you used to train on the CMU dictionary. Could you tell me where I might find them, or provide them?

Attention problem

Excuse me!
When will you release the code for the attention mechanism?

Thank you very much!
