alexa / alexa-end-to-end-slu
This setup allows training end-to-end neural models for spoken language understanding (SLU).
License: Apache License 2.0
Hi, could you send me the dataset file complete.csv? I cannot run the code without it.
Running the code with the '--distributed' flag raises an error because experiments/experiment_triplet.py wraps 'model' in the default DataParallel (lines 67-68):
if args.distributed and torch.cuda.is_available() and torch.cuda.device_count() > 1:
    self.model = torch.nn.DataParallel(self.model)
However, when self.model.bert is accessed later, the DataParallel object does not expose the wrapped model's attributes. I think a custom wrapper has to be implemented that calls self.model.module.bert instead.
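A minimal sketch of that workaround, assuming the only change needed is to reach through the wrapper: torch.nn.DataParallel keeps the wrapped network in its .module attribute, so a small helper can return the underlying model whether or not it was wrapped. The names unwrap_model, FakeModel and FakeDataParallel are hypothetical and not part of the repository; the two stand-in classes only mimic the attribute layout so the snippet runs without PyTorch or a GPU.

```python
def unwrap_model(model):
    """Return the underlying model, looking through a DataParallel-style wrapper.

    Assumption: the wrapper stores the original model in `.module`
    (as torch.nn.DataParallel does) and the model itself has no
    attribute of that name.
    """
    return getattr(model, "module", model)


class FakeModel:
    """Hypothetical stand-in for the SLU model; only the attribute matters."""

    def __init__(self):
        self.bert = "bert-submodule"


class FakeDataParallel:
    """Hypothetical stand-in that hides attributes the way a wrapper would."""

    def __init__(self, module):
        self.module = module


plain = FakeModel()
wrapped = FakeDataParallel(plain)

# Direct attribute access fails on the wrapper but works after unwrapping.
assert not hasattr(wrapped, "bert")
assert unwrap_model(wrapped).bert == "bert-submodule"
assert unwrap_model(plain).bert == "bert-submodule"
```

In the experiment code this would mean replacing self.model.bert with unwrap_model(self.model).bert wherever submodules are accessed directly, which keeps the single-GPU path unchanged.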
Hi,
I am running your code with some custom-created splits for the FSC dataset. Running lines 123 and 135 of forward() and forward_text(), respectively, in the models/model.py file causes the following error message:
Traceback (most recent call last):
  File "train.py", line 35, in <module>
    runner.train()
  File ".../alexa-end-to-end-slu/experiments/experiment_base.py", line 68, in train
    train_loss, train_acc = self.train_step(batch)
  File ".../alexa-end-to-end-slu/experiments/experiment_base.py", line 121, in train_step
    metrics = self.compute_loss(batch)
  File ".../alexa-end-to-end-slu/experiments/experiment_triplet.py", line 111, in compute_loss
    output_pos = self.model(input_text=batch['encoded_text2'],
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File ".../alexa-end-to-end-slu/models/model.py", line 108, in forward
    return self.forward_text(input_text, text_lengths)
  File ".../alexa-end-to-end-slu/models/model.py", line 137, in forward_text
    text_logits = self.classifier(text_embedding)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/functional.py", line 1368, in linear
    if input.dim() == 2 and bias is not None:
AttributeError: 'str' object has no attribute 'dim'
This is because the BertModel forward() returns a dict-like output here, so tuple unpacking assigns the key string 'pooler_output' to text_embedding instead of the pooled tensor. Perhaps these lines should be:
_, text_embedding = self.bert(input_ids=input_text, attention_mask=attn_mask)[:2]
Thanks in advance.
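To illustrate why the unpacking goes wrong, here is a small sketch using a stand-in class. FakeModelOutput is hypothetical and only mimics a dict-like output that also supports integer and slice indexing; whether the installed transformers version behaves exactly this way is an assumption based on the traceback above.

```python
from collections import OrderedDict


class FakeModelOutput(OrderedDict):
    """Hypothetical stand-in for a dict-like model output (not the library's class)."""

    def __getitem__(self, key):
        # Integer or slice access indexes into the values, like a tuple would.
        if isinstance(key, (int, slice)):
            return tuple(self.values())[key]
        # String access behaves like a normal dictionary lookup.
        return super().__getitem__(key)


output = FakeModelOutput(
    last_hidden_state=[[0.1, 0.2]],
    pooler_output=[0.3, 0.4],
)

# Unpacking iterates over the dictionary KEYS, so this yields a string.
_, broken = output

# Slicing first converts to a tuple of VALUES, so this yields the embedding.
_, fixed = output[:2]

# Looking the value up by name avoids relying on the field order at all.
by_name = output["pooler_output"]
```

Under that assumption, broken is the string that later reaches the classifier and triggers the AttributeError, while both fixed and by_name hold the pooled embedding.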
We are a group of NYU MS in Data Science students who are working on developing an end-to-end speech-to-intent model. We have read your paper and replicated your code and would love to ask you some questions.
Paper vs. GitHub Results Discrepancy
We notice that the final test accuracies for both FSC and SNIPS differ between your paper (i.e., 97.65% for FSC, 73.49% for SNIPS) and the GitHub repo (i.e., 95.65% for FSC, 69.88% for SNIPS). Can you share some thoughts on the difference between the numbers in the paper and those in the repo?
SNIPS Data Partition Ambiguity
In prepare_snips.py, we notice that you split complete.csv into train, validation, and test sets. However, since we don’t have the complete.csv that you used, we can’t replicate the exact same partitions. Our results from running your code on our SNIPS dataset with our own splits are significantly higher on average: over 4 runs (each using our own split of a shuffled complete.csv), the average accuracy is 81.17%, even though we use the same environment mentioned in your repo. We’d love to double check with you on these points.
Which subsets of the SNIPS dataset did you use to create complete.csv? Our guess is that you used smartLight close-field and far-field (3320 observations) for your experiments (i.e., these are the data listed in your complete.csv). Please let us know if that’s incorrect.
Would you mind sharing your complete.csv and intents.json for SNIPS with us? We believe having the input data in the same format and split is important for a fair comparison between your work and our future work.
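For reference, this is roughly how we make our own splits repeatable, as a minimal sketch. The function name split_rows, the 80/10/10 ratios, and the seed are our own assumptions and not values taken from prepare_snips.py; the row count of 3320 is the guess stated above.

```python
import random


def split_rows(rows, seed=0, train_frac=0.8, val_frac=0.1):
    """Shuffle rows with a fixed seed and cut them into train/val/test lists."""
    shuffled = list(rows)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # Whatever remains after train and val becomes the test set.
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder rows standing in for the lines of complete.csv.
rows = [f"utterance_{i}" for i in range(3320)]
train, val, test = split_rows(rows, seed=13)
```

Calling split_rows again with the same rows and seed returns the same three lists, so sharing the seed together with the file would be enough to reproduce a partition.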
BERT Embeddings: Fine-tuned or Not?
Section 2.1 of your paper says “we back-propagate the embedding and SLU task losses only to the acoustic branch” because you think fine-tuning BERT will lead to overfitting. From this line, our understanding was that the BERT embeddings would be frozen. However, we’ve noticed this piece in the code where the parameters were passed into the Adam optimizer with the learning rate 2e-5 (line 63 in experiment_triplet.py), implying that BERT embeddings would be fine-tuned.
self.optimizer = torch.optim.Adam([
    {'params': self.model.bert.parameters(), 'lr': args.learning_rate_bert},
    {'params': self.model.speech_encoder.parameters()},
    {'params': self.model.classifier.parameters()}
], lr=args.learning_rate)
We would appreciate clarification on whether BERT is fine-tuned and, if so, why you chose to fine-tune it. Furthermore, if BERT’s parameters are not frozen, could you share some thoughts on fine-tuning BERT for 20 epochs (the default in the code), which may lead to overfitting and hurt the text embeddings? As mentioned in other papers about BERT, fine-tuning typically runs for at most 5 epochs.
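To make the question concrete, here is a sketch of what we would have expected if BERT were meant to stay frozen. It is not a claim about what your code does: build_param_groups and the freeze_bert switch are hypothetical, and the stand-in objects only mimic the .parameters() and requires_grad interface so the snippet runs without PyTorch.

```python
from types import SimpleNamespace


def build_param_groups(model, lr_bert, freeze_bert):
    """Build optimizer parameter groups, optionally leaving BERT out."""
    groups = [
        {"params": list(model.speech_encoder.parameters())},
        {"params": list(model.classifier.parameters())},
    ]
    bert_params = list(model.bert.parameters())
    if freeze_bert:
        # Frozen: no gradients are computed and BERT is not given to the optimizer.
        for param in bert_params:
            param.requires_grad = False
    else:
        # Fine-tuned: BERT gets its own group with a separate learning rate.
        groups.insert(0, {"params": bert_params, "lr": lr_bert})
    return groups


def fake_module(n_params):
    """Hypothetical stand-in exposing parameters() like a torch module."""
    params = [SimpleNamespace(requires_grad=True) for _ in range(n_params)]
    return SimpleNamespace(parameters=lambda: iter(params))


model = SimpleNamespace(
    bert=fake_module(3),
    speech_encoder=fake_module(2),
    classifier=fake_module(1),
)

tuned_groups = build_param_groups(model, lr_bert=2e-5, freeze_bert=False)
frozen_groups = build_param_groups(model, lr_bert=2e-5, freeze_bert=True)
```

With a real model, the returned list would be passed to torch.optim.Adam(groups, lr=args.learning_rate) in place of the hard-coded list.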