
meansum's People

Contributors

ayushoriginal, sosuperic

meansum's Issues

About the NLL loss

Hi, when I train without the pre-trained language model, why is the NLL loss sometimes NaN?
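One common way to work around this kind of instability is to clip gradients and skip batches whose loss has gone NaN. A minimal sketch, assuming a standard PyTorch training step; the function below is illustrative and not taken from the MeanSum code:

```python
import torch

def training_step(model, batch, optimizer, max_grad_norm=5.0):
    optimizer.zero_grad()
    loss = model(batch)  # hypothetical forward pass returning the NLL loss
    if torch.isnan(loss):
        # Skip the update entirely rather than propagate NaN gradients
        print("NaN loss encountered; skipping this batch")
        return None
    loss.backward()
    # Clip gradients, a common fix when an un-pretrained LM diverges early
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```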

About the Discriminator model

Hello, I have a question. I can't find the discriminator model path in project_settings.py, and the class Discriminator that appears in train_sum.py is also missing.

Use word embeddings

Hello, I want to train this model on my own custom data, which is not as large as the Yelp corpus. So I would like to use fastText to build the word embeddings instead of initialising them with zeros. How can I do that in this code?
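A minimal sketch of one way to do this, assuming the model's word embeddings live in an nn.Embedding layer and that the vectors come from the official fasttext package (via fasttext.load_model); build_embedding and its arguments are hypothetical, not MeanSum's actual API:

```python
import numpy as np
import torch
import torch.nn as nn

def build_embedding(vocab, ft_model, emb_size):
    """Copy pretrained fastText vectors into an nn.Embedding (hypothetical helper)."""
    weights = np.zeros((len(vocab), emb_size), dtype=np.float32)
    for idx, token in enumerate(vocab):
        # fastText can produce a vector for any token, including OOV ones,
        # via its subword n-grams
        weights[idx] = ft_model.get_word_vector(token)
    embedding = nn.Embedding(len(vocab), emb_size)
    embedding.weight.data.copy_(torch.from_numpy(weights))
    return embedding
```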

About the file: text_cnn.py

When I change batch_size = 16, emb_size = 32, and hidden_size = 64, text_cnn.py throws an error: RuntimeError: Expected 4-dimensional input for 4-dimensional weight [128, 1, 3, 32], but got 5-dimensional input of size [16, 1, 150, 31688, 32] instead

How can I solve it?
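One guess from the shapes alone: the axis of size 31688 matches a vocabulary-sized dimension, which suggests the CNN is being handed per-token vocabulary distributions rather than embedded tokens. A hedged sketch of the collapse that the Conv2d weight [128, 1, 3, 32] expects, with all tensors fabricated for illustration:

```python
import torch

# Hypothetical shapes taken from the error message
batch_size, seq_len, vocab_size, emb_size = 16, 150, 31688, 32

# A soft distribution over the vocabulary per token (e.g. a softmax output)
soft_tokens = torch.softmax(torch.randn(batch_size, seq_len, vocab_size), dim=-1)
emb_weight = torch.randn(vocab_size, emb_size)

# Soft embedding lookup: collapse the vocab axis before the CNN
embedded = soft_tokens @ emb_weight   # (16, 150, 32)
cnn_input = embedded.unsqueeze(1)     # (16, 1, 150, 32): the 4-D shape Conv2d expects
print(cnn_input.shape)
```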

Running MeanSum without CUDA

Hi, I am trying to run MeanSum on a cluster where CUDA is not available. When loading the language model, deserialization fails with `AttributeError: module 'torch._C' has no attribute '_cuda_getDevice'`, which I guess is expected since CUDA is not enabled. Is it possible to run MeanSum with just CPUs?
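This particular deserialization failure can usually be avoided by remapping CUDA tensors to the CPU at load time with torch.load's map_location argument; the checkpoint path below is a placeholder:

```python
import torch

# Remap every tensor in a GPU-saved checkpoint onto the CPU while loading
checkpoint = torch.load("path/to/lm_checkpoint.pt",  # placeholder path
                        map_location=torch.device("cpu"))
```

Any explicit .cuda() calls elsewhere in the code would still need to be guarded with torch.cuda.is_available().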

Can't find the reconstruction part

Hi @sosuperic, according to the paper, when we backpropagate through the model there should be an autoencoder reconstruction loss and an average-summary similarity loss, but I can't find the autoenc_loss code in train_sum.py (I only see the backward part, and the autoencoder file path in project_settings.py is None). Please tell me where they are, thank you.

Another problem: I think cycle_loss should be the average similarity loss, but I found this parameter

self.cycle_loss='enc' # When 'rec', reconstruct original texts. When 'enc', compare rev_enc and sum_enc embs

in project_settings.py. Now I am confused about the meaning of cycle_loss. Does it mean we can only choose one of those two losses to backpropagate, or what is the actual meaning of the parameter? Looking forward to your reply! Thank you again.
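Going only by the comment quoted above, the two modes would look roughly like this sketch; every name and distance choice here is an assumption, not MeanSum's actual implementation:

```python
import torch.nn.functional as F

def cycle_loss(mode, rev_enc, sum_enc, decoded_logits=None, orig_ids=None):
    if mode == "rec":
        # 'rec': reconstruct the original review texts, scored as a
        # token-level NLL against the original token ids.
        # decoded_logits: (batch, seq_len, vocab), orig_ids: (batch, seq_len)
        return F.cross_entropy(decoded_logits.transpose(1, 2), orig_ids)
    else:  # "enc"
        # 'enc': compare the review encodings with the summary encoding
        # directly in embedding space (cosine distance, as one choice)
        return 1 - F.cosine_similarity(rev_enc, sum_enc, dim=-1).mean()
```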

Where can I get business.json?

I got an error which says
`FileNotFoundError: [Errno 2] No such file or directory: 'datasets/yelp_dataset/business.json'`
Where can I get business.json?

Issues about training a model on Chinese corpus

Hi, thanks very much for sharing your implementation.
I tried to apply this model to a Chinese corpus without labels. I reused the default parameter settings but decreased the batch size to 8, because an "out of memory" error occurred with larger batch sizes. Even so, the generated summaries were poor after training for 20 epochs. By the way, I pretrained the language model with the pretrain_lm code.
Is there anything I need to pay special attention to for training, such as the number of documents, the length of each document, or the size of the vocabulary?
Looking forward to your reply.

Possible bug when setting k > 1 & Gumbel_hard = True

Thanks for the interesting work and code.

I was trying to get my head around the code and I couldn't understand something:

When training the mlstm model, if we try the following set of parameters:

  • gumbel_hard = true
  • sampling method = "greedy" or "sample"
  • k > 1

At mlstm #L291, the logits_to_prob function will return a strict one-hot vector, according to the PyTorch Gumbel-softmax implementation F.gumbel_softmax.

Afterwards, this probability vector is sent to the prob_to_vocab_id method, which is supposed to apply either torch.topk (beam search) or torch.multinomial (top-k sampling).

Implementation-wise, this shouldn't raise any errors in beam search, because torch.topk can handle ties; however, the top k you get aren't the actual top-k probabilities (a screenshot of example output was attached).

But if you try to sample with multinomial from a one-hot vector where k > 1, you get a runtime error (screenshot of the RuntimeError attached).

Am I missing something here?
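The reported behaviour can be reproduced in a few lines, assuming the multinomial call samples without replacement (which matches the RuntimeError shown):

```python
import torch
import torch.nn.functional as F

# hard=True makes gumbel_softmax return a strictly one-hot vector
logits = torch.randn(1, 10)
probs = F.gumbel_softmax(logits, hard=True)

# topk "works", but every entry beyond the first is an arbitrary zero,
# so these are not the true top-k probabilities
print(torch.topk(probs, k=3))

# multinomial with k > 1 and replacement=False raises a RuntimeError:
# only one category has nonzero probability
torch.multinomial(probs, num_samples=3, replacement=False)
```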

About the language model

Hello, I have a question: is the language model provided for download generated by pretrain_lm.py?

How to evaluate the model?

The model has been trained. Which file is used to measure performance? Is it run_evaluations.py?
