sosuperic / meansum
License: Other
Hi, when I train without the pre-trained language model, why is the NLL loss sometimes NaN?
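A minimal, repo-agnostic sketch for tracking down a NaN loss (the `model` and `batch` below are stand-ins, not identifiers from this repo): anomaly detection points at the op that produced the NaN, and gradient clipping is a common mitigation.

```python
import torch
import torch.nn as nn

# Trace the operation that produced a NaN/Inf in the backward pass.
torch.autograd.set_detect_anomaly(True)

model = nn.Linear(8, 1)                 # stand-in for the summarization model
batch = torch.randn(4, 8)               # stand-in batch
loss = nn.functional.mse_loss(model(batch), torch.zeros(4, 1))

if torch.isnan(loss):                   # catch the bad step early
    raise RuntimeError("NaN loss: check inputs or lower the learning rate")
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```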
Hello, I have a question. I can't find the model path in project_settings.py, and the class Discriminator that appears in train_sum.py can't be found either.
Is there a script for evaluating against the Yelp reference summaries?
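Not a script from this repo, but a generic way to score generated summaries against the Yelp references, assuming you have (prediction, reference) string pairs, using the rouge-score package:

```python
from rouge_score import rouge_scorer  # pip install rouge-score

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(
    target="the food was great and the staff were friendly",  # reference summary
    prediction="great food and friendly staff",               # model output
)
print(scores["rougeL"].fmeasure)
```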
Hello, I want to train this model on my custom data, which is not as large as the Yelp one, so I wanted to use fastText to build the word embeddings instead of initialising them with zeros. How can I do that in this code?
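A minimal sketch of one way to do this, assuming a fastText .vec file (word2vec text format) and a word-to-id dict; `vocab` and `emb_size` here are placeholders, not identifiers from this repo:

```python
import numpy as np
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "good": 1, "food": 2}   # stand-in vocabulary
emb_size = 300
weights = np.zeros((len(vocab), emb_size), dtype=np.float32)

# Fill rows from the fastText vectors; words not in the .vec file stay zero.
with open("cc.en.300.vec", encoding="utf-8") as f:
    next(f)                                  # skip the "num_words dim" header
    for line in f:
        word, *vec = line.rstrip().split(" ")
        if word in vocab:
            weights[vocab[word]] = np.asarray(vec, dtype=np.float32)

embedding = nn.Embedding.from_pretrained(torch.from_numpy(weights), freeze=False)
```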
When I change batch_size = 16, emb_size = 32, and hidden_size = 64, text_cnn.py throws an error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [128, 1, 3, 32], but got 5-dimensional input of size [16, 1, 150, 31688, 32] instead
How can I solve it?
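For reference, nn.Conv2d expects a 4-D (batch, channels, height, width) input, as sketched below. The 31688 in the 5-D shape looks like a vocabulary-sized dimension that was never collapsed (e.g. one-hot rows fed where embedded ids were expected); that diagnosis is a guess, not confirmed from the repo.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=128, kernel_size=(3, 32))

ok = torch.randn(16, 1, 150, 32)      # (batch, 1, seq_len, emb_size) works
print(conv(ok).shape)                 # torch.Size([16, 128, 148, 1])

bad = torch.randn(16, 1, 150, 4, 32)  # extra dimension reproduces the error
# conv(bad)  # RuntimeError: Expected 4-dimensional input ...
```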
How do I run the Docker image?
Hi, I am trying to run MeanSum on a cluster where CUDA is not available, so when loading the language model it fails while deserializing and gives `AttributeError: module 'torch._C' has no attribute '_cuda_getDevice'`, which I guess is expected since CUDA is not enabled. Is it possible to run MeanSum with just CPUs?
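A common PyTorch workaround (not specific to this repo) is to remap CUDA-saved tensors onto the CPU at load time; the checkpoint path below is a placeholder:

```python
import torch

# map_location forces tensors saved on a GPU to deserialize onto the CPU.
state = torch.load("checkpoints/lm.pt", map_location=torch.device("cpu"))
```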
Hi @sosuperic, according to the paper, when we backpropagate through the model there should be an autoencoder reconstruction loss and an average summary similarity loss, but I couldn't find the code for autoenc_loss in train_sum.py (I only see the backward part, and the file path of the autoencoder is None in project_settings.py). Please tell me where they are. Thank you.
Another problem: I think cycle_loss should be the average similarity loss, but I found the parameter
self.cycle_loss = 'enc'  # When 'rec', reconstruct original texts. When 'enc', compare rev_enc and sum_enc embs
in project_settings.py. Now I am confused about the meaning of cycle loss. Does it mean we can only choose one of those two losses to backpropagate, or what is the actual meaning of this parameter? Looking forward to your reply! Thank you again.
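For what it's worth, here is how I read the two settings; this is an interpretation of the comment above, not the repo's actual code, and every name below is a placeholder:

```python
import torch
import torch.nn.functional as F

rev_enc = torch.randn(8, 256)  # encodings of the input reviews
sum_enc = torch.randn(8, 256)  # encodings of the re-encoded summaries

# cycle_loss == 'enc': average similarity loss between the two embeddings.
enc_loss = (1 - F.cosine_similarity(rev_enc, sum_enc, dim=1)).mean()

# cycle_loss == 'rec' would instead be a token-level reconstruction NLL,
# e.g. F.nll_loss(log_probs.view(-1, vocab_size), target_ids.view(-1)).
```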
I got an error that says
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/yelp_dataset/business.json'
Where can I get business.json?
Hi, many thanks for sharing your implementation.
I tried to apply this model to a Chinese corpus without labels. I reused the default parameter settings but decreased the batch size to 8, because an out-of-memory error occurred with larger batch sizes. But the generated summaries turned out to be poor after training for 20 epochs. By the way, I pretrained the language model with the pretrained_lm code.
Is there anything I need to pay special attention to for training, such as the number of documents, the length of each document, or the size of the vocabulary?
Looking forward to your reply.
Thanks for the interesting work and code.
I was trying to get my head around the code and there is something I couldn't understand. When training the mlstm model, if we try the following set of parameters:
At mlstm #L291, the logits_to_prob function will return a strict one-hot vector, according to the Torch Gumbel-softmax implementation F.gumbel_softmax. Afterwards, this probability vector is sent to the prob_to_vocab_id method, which is supposed to apply either torch.topk (beam search) or torch.multinomial (top-k sampling).
Implementation-wise this shouldn't raise any errors in beam search, because torch.topk can handle ties; however, the top k you get aren't the actual top-k probabilities. But if you try to sample with torch.multinomial from a one-hot vector where k > 1, you get a runtime error.
Am I missing something here?
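A standalone sketch reproducing the behaviour described above (this is not the repo's code): hard Gumbel-softmax yields a strict one-hot vector, topk still runs by breaking ties among the zero entries, and multinomial without replacement fails because only one category has non-zero probability.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 10)
one_hot = F.gumbel_softmax(logits, tau=1.0, hard=True)  # exactly one 1.0 entry

vals, ids = one_hot.topk(3)  # runs, but ranks 2-3 are arbitrary zero-prob ids
print(vals)                  # tensor([[1., 0., 0.]])

# Only one non-zero category, so sampling 3 without replacement raises:
# torch.multinomial(one_hot, num_samples=3, replacement=False)  # RuntimeError
```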
Hello, I have a question. Is the language model provided for download generated by pretrain_lm.py?
Hey!
I'm testing your model on another dataset. WO and Sent Acc were used as metrics in your paper. Did you provide them in this repo?
'datasets/amazon_dataset/processed/subwordenc_32000_secondpass.pkl' not found
There is a bug in the implementation of the extractive baseline you are using, mentioned here: gaetangate/text-summarizer#5. Did you fix that bug before evaluating this model?
Hello, I want to reduce the size of the vocabulary. Can you provide the Python file that produced "subwordenc_32000_maxrevs260_fixed.pkl"?
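I couldn't find that script either; as a generic alternative (not the code that built subwordenc_32000_maxrevs260_fixed.pkl), sentencepiece can train a smaller subword vocabulary over the review text, with reviews.txt as a placeholder input:

```python
import sentencepiece as spm  # pip install sentencepiece

spm.SentencePieceTrainer.train(
    input="reviews.txt",      # one review per line
    model_prefix="subword8k",
    vocab_size=8000,          # reduced from 32000
    model_type="bpe",
)
sp = spm.SentencePieceProcessor(model_file="subword8k.model")
print(sp.encode("the food was great", out_type=str))
```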
The model has been trained. Which file is used to measure performance? Is it run_evaluations.py?