
meansum's People

Contributors

ayushoriginal, sosuperic

meansum's Issues

About the NLL loss

Hi, when I train without the pre-trained language model, why is the NLL loss sometimes NaN?
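One common way to work around this kind of instability is to clip gradients and skip batches whose loss has gone NaN. A minimal sketch, assuming a standard PyTorch training step; the function below is illustrative and not taken from the MeanSum code:

```python
import torch

def training_step(model, batch, optimizer, max_grad_norm=5.0):
    optimizer.zero_grad()
    loss = model(batch)  # hypothetical forward pass returning the NLL loss
    if torch.isnan(loss):
        # Skip the update entirely rather than propagate NaN gradients
        print("NaN loss encountered; skipping this batch")
        return None
    loss.backward()
    # Clip gradients, a common fix when an un-pretrained LM diverges early
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```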

About the Discriminator model

Hello, I have a question. I can't find the discriminator model path in project_settings.py, and the class Discriminator that appears in train_sum.py is also missing.

Use word embeddings

Hello, I want to train this model on my own custom data, which is not as large as the Yelp corpus. So I would like to use fastText to build the word embeddings instead of initialising them with zeros. How can I do that in this code?
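A minimal sketch of one way to do this, assuming the model's word embeddings live in an nn.Embedding layer and that the vectors come from the official fasttext package (via fasttext.load_model); build_embedding and its arguments are hypothetical, not MeanSum's actual API:

```python
import numpy as np
import torch
import torch.nn as nn

def build_embedding(vocab, ft_model, emb_size):
    """Copy pretrained fastText vectors into an nn.Embedding (hypothetical helper)."""
    weights = np.zeros((len(vocab), emb_size), dtype=np.float32)
    for idx, token in enumerate(vocab):
        # fastText can produce a vector for any token, including OOV ones,
        # via its subword n-grams
        weights[idx] = ft_model.get_word_vector(token)
    embedding = nn.Embedding(len(vocab), emb_size)
    embedding.weight.data.copy_(torch.from_numpy(weights))
    return embedding
```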

About the file: text_cnn.py

When I change batch_size = 16, emb_size = 32, and hidden_size = 64, text_cnn.py throws an error: RuntimeError: Expected 4-dimensional input for 4-dimensional weight [128, 1, 3, 32], but got 5-dimensional input of size [16, 1, 150, 31688, 32] instead

How can I solve it?
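One guess from the shapes alone: the axis of size 31688 matches a vocabulary-sized dimension, which suggests the CNN is being handed per-token vocabulary distributions rather than embedded tokens. A hedged sketch of the collapse that the Conv2d weight [128, 1, 3, 32] expects, with all tensors fabricated for illustration:

```python
import torch

# Hypothetical shapes taken from the error message
batch_size, seq_len, vocab_size, emb_size = 16, 150, 31688, 32

# A soft distribution over the vocabulary per token (e.g. a softmax output)
soft_tokens = torch.softmax(torch.randn(batch_size, seq_len, vocab_size), dim=-1)
emb_weight = torch.randn(vocab_size, emb_size)

# Soft embedding lookup: collapse the vocab axis before the CNN
embedded = soft_tokens @ emb_weight   # (16, 150, 32)
cnn_input = embedded.unsqueeze(1)     # (16, 1, 150, 32): the 4-D shape Conv2d expects
print(cnn_input.shape)
```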

Running MeanSum without CUDA

Hi, I am trying to run MeanSum on a cluster where CUDA is not available. When loading the language model, deserialization fails with `AttributeError: module 'torch._C' has no attribute '_cuda_getDevice'`, which I guess is expected since CUDA is not enabled. Is it possible to run MeanSum with just CPUs?
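This particular deserialization failure can usually be avoided by remapping CUDA tensors to the CPU at load time with torch.load's map_location argument; the checkpoint path below is a placeholder:

```python
import torch

# Remap every tensor in a GPU-saved checkpoint onto the CPU while loading
checkpoint = torch.load("path/to/lm_checkpoint.pt",  # placeholder path
                        map_location=torch.device("cpu"))
```

Any explicit .cuda() calls elsewhere in the code would still need to be guarded with torch.cuda.is_available().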

Can't find the reconstruction part

Hi @sosuperic, according to the paper, when we backpropagate through the model there should be an autoencoder reconstruction loss and an average-summary similarity loss, but I can't find the autoenc_loss code in train_sum.py (I only see the backward part, and the autoencoder file path in project_settings.py is None). Please tell me where they are, thank you.

Another problem: I think cycle_loss should be the average similarity loss, but I found this parameter

self.cycle_loss='enc' # When 'rec', reconstruct original texts. When 'enc', compare rev_enc and sum_enc embs

in project_settings.py. Now I am confused about the meaning of cycle_loss. Does it mean we can only choose one of those two losses to backpropagate, or what is the actual meaning of the parameter? Looking forward to your reply! Thank you again.
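Going only by the comment quoted above, the two modes would look roughly like this sketch; every name and distance choice here is an assumption, not MeanSum's actual implementation:

```python
import torch.nn.functional as F

def cycle_loss(mode, rev_enc, sum_enc, decoded_logits=None, orig_ids=None):
    if mode == "rec":
        # 'rec': reconstruct the original review texts, scored as a
        # token-level NLL against the original token ids.
        # decoded_logits: (batch, seq_len, vocab), orig_ids: (batch, seq_len)
        return F.cross_entropy(decoded_logits.transpose(1, 2), orig_ids)
    else:  # "enc"
        # 'enc': compare the review encodings with the summary encoding
        # directly in embedding space (cosine distance, as one choice)
        return 1 - F.cosine_similarity(rev_enc, sum_enc, dim=-1).mean()
```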

Where can I get business.json?

I got an error which says
`FileNotFoundError: [Errno 2] No such file or directory: 'datasets/yelp_dataset/business.json'`
Where can I get business.json?

Issues about training a model on Chinese corpus

Hi, thanks very much for sharing your implementation.
I tried to apply this model to a Chinese corpus without labels. I reused the default parameter settings but decreased the batch size to 8, because an "out of memory" error occurred with larger batch sizes. Even so, the generated summaries were poor after training for 20 epochs. By the way, I pretrained the language model with the pretrain_lm code.
Is there anything I need to pay special attention to for training, such as the number of documents, the length of each document, or the size of the vocabulary?
Looking forward to your reply.

Possible bug when setting k > 1 & Gumbel_hard = True

Thanks for the interesting work and code.

I was trying to get my head around the code and I couldn't understand something:

When training the mlstm model, if we try the following set of parameters:

  • gumbel_hard = true
  • sampling method = "greedy" or "sample"
  • k > 1

At mlstm #L291, the logits_to_prob function will return a strict one-hot vector, according to the PyTorch Gumbel-softmax implementation F.gumbel_softmax.

Afterwards, this probability vector is sent to the prob_to_vocab_id method, which is supposed to apply either torch.topk (beam search) or torch.multinomial (top-k sampling).

Implementation-wise, this shouldn't raise any errors in beam search, because torch.topk can handle ties; however, the top k you get aren't the actual top-k probabilities (a screenshot of example output was attached).

But if you try to sample with multinomial from a one-hot vector where k > 1, you get a runtime error (screenshot of the RuntimeError attached).

Am I missing something here?
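The reported behaviour can be reproduced in a few lines, assuming the multinomial call samples without replacement (which matches the RuntimeError shown):

```python
import torch
import torch.nn.functional as F

# hard=True makes gumbel_softmax return a strictly one-hot vector
logits = torch.randn(1, 10)
probs = F.gumbel_softmax(logits, hard=True)

# topk "works", but every entry beyond the first is an arbitrary zero,
# so these are not the true top-k probabilities
print(torch.topk(probs, k=3))

# multinomial with k > 1 and replacement=False raises a RuntimeError:
# only one category has nonzero probability
torch.multinomial(probs, num_samples=3, replacement=False)
```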

About the language model

Hello, I have a question: is the language model provided for download generated by pretrain_lm.py?

How to evaluate the model?

The model has been trained. Which file is used to measure performance? Is it run_evaluations.py?
