
dyle's People

Contributors

chenwu98, jinhyeong-lim, maoziming, shichaosun

dyle's Issues

Questions about window size settings

Hi, thanks for your outstanding work!
I noticed that when preprocessing data for the generator, the content of a conversation turn is determined by the window_size and max_source_len. You set the window size to 0, which means the content is limited to the turn itself, without any contextual information (assuming the turn contains fewer than the maximum number of words).
May I ask why the window size is set to 0? Is adding some contextual information not beneficial for performance?
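
For reference, a minimal sketch of how such window-based context construction typically works; the function name and everything other than window_size and max_source_len are illustrative, not taken from the repo:

    def build_source(turns, idx, window_size, max_source_len):
        # Gather the turn at position idx plus window_size neighbouring turns on each side.
        lo = max(0, idx - window_size)
        hi = min(len(turns), idx + window_size + 1)
        tokens = []
        for turn in turns[lo:hi]:
            tokens.extend(turn.split())
        # With window_size = 0, only turns[idx] itself contributes tokens.
        return tokens[:max_source_len]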

Cannot achieve the reported results

Thank you for your outstanding work. When I downloaded the QMSum checkpoint you provided, I could not reproduce the results from the paper on my server; my scores were 1-2 points lower than those reported. May I ask what the reason might be? The GPU is a 3080.

About the arXiv oracles

Hello, sorry to bother you. I am having difficulty processing the oracles for the arXiv training set. I simply run arxiv_oracle.py and wait; it has been running for 4 days and has only produced 4300 files in index_train. Could you check this Python file? I'm really confused now.
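
For anyone hitting the same wall: a greedy ROUGE oracle search is usually the bottleneck and can be parallelised across documents. Below is a hedged, self-contained sketch, not the repository's arxiv_oracle.py; the rouge_score dependency and the toy data are illustrative assumptions:

    from multiprocessing import Pool
    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

    def greedy_oracle(example, max_sents=5):
        # Greedily pick source sentences that maximise ROUGE-1 against the summary.
        sents, summary = example
        selected, best = [], 0.0
        for _ in range(max_sents):
            gains = []
            for i in range(len(sents)):
                if i in selected:
                    continue
                cand = " ".join(sents[j] for j in selected + [i])
                gains.append((scorer.score(summary, cand)["rouge1"].fmeasure, i))
            if not gains:
                break
            score, idx = max(gains)
            if score <= best:
                break
            best, selected = score, selected + [idx]
        return sorted(selected)

    if __name__ == "__main__":
        # Toy data; replace with the real arXiv training examples.
        data = [(["a first sentence .", "an unrelated one .", "the key finding ."],
                 "the key finding")]
        with Pool(processes=8) as pool:
            for i, oracle in enumerate(pool.imap(greedy_oracle, data, chunksize=4)):
                print(i, oracle)  # in practice, write one file per example to index_train/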

Minimum GPU requirement

I was wondering what the minimum amount of VRAM is just to test the models, and what is needed for training? I have two 16 GB cards, but I couldn't make it work even with batch size 1.

CUDA out of memory.

Thanks for your outstanding work!

When I was training your model, I encountered an out-of-memory issue. I am using a Tesla V100S 32 GB GPU, and even after reducing the batch size to 1 the problem persists. Is there any way to reduce memory consumption during training?
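
A few generic memory-saving options that often help with Transformer training; this is a hedged sketch assuming a HuggingFace-style model, loader, and optimizer already exist in the loop, and it is not the repository's Experiment.py:

    from torch.cuda.amp import autocast, GradScaler

    scaler = GradScaler()
    accum_steps = 8  # effective batch size = per-step batch size * accum_steps

    model.gradient_checkpointing_enable()  # trade extra compute for activation memory

    for step, batch in enumerate(loader):
        with autocast():                       # mixed-precision forward pass
            loss = model(**batch).loss / accum_steps
        scaler.scale(loss).backward()
        if (step + 1) % accum_steps == 0:      # gradient accumulation
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()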

About dynamic weight and irrelevant snippets

Hello, I have read your paper and I am trying to run your code. You say the model can denoise by down-weighting irrelevant snippets, but I can't find this filtering in your code. So how does it recognize irrelevant snippets? By scoring their similarity?
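
For what it's worth, a minimal sketch of how score-based down-weighting (rather than hard filtering) can work: the extractor assigns a score to each snippet, the scores are softmax-normalized, and the generator's per-snippet distributions are mixed with those weights, so irrelevant snippets contribute little. Names and shapes below are illustrative, not taken from the repo:

    import torch
    import torch.nn.functional as F

    def marginalize(token_logits, snippet_scores):
        # token_logits: (num_snippets, vocab); snippet_scores: (num_snippets,)
        weights = F.softmax(snippet_scores, dim=-1)           # low scores -> small weight
        token_probs = F.softmax(token_logits, dim=-1)         # per-snippet next-token distribution
        return (weights.unsqueeze(-1) * token_probs).sum(0)   # weighted mixture over snippets

    logits = torch.randn(4, 10)                      # 4 snippets, toy vocabulary of 10
    scores = torch.tensor([2.0, -3.0, 0.5, -2.0])    # snippets 2 and 4 are down-weighted
    print(marginalize(logits, scores))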

About detach_generator_consistency

Hi, thanks for your outstanding work!
I saw that detach_generator_consistency is set to False everywhere in the source code. In this setting, the dynamic_mlp can only be optimized through the seq_loss path after marginalizing.

I wonder why we do not set detach_generator_consistency to True. Is that because the actual training results were worse? I did not see it reported in the paper.

If it got worse, maybe the model collapsed because the dynamic scores and doc scores are easily predicted as all zeros, or maybe there were other reasons. Could you give me some explanation?
Thanks a lot! @MaoZiming
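
For readers following along, a hedged illustration of what a detach_generator_consistency flag typically controls: whether gradients from a consistency loss can flow back through the generator-side scores. The loss form and names below are illustrative, not the repository's exact implementation:

    import torch.nn.functional as F

    def consistency_loss(dynamic_scores, doc_scores, detach_generator_consistency=False):
        # Encourage the extractor's doc_scores to match the generator-side dynamic_scores.
        target = dynamic_scores.detach() if detach_generator_consistency else dynamic_scores
        return F.kl_div(F.log_softmax(doc_scores, dim=-1),
                        F.softmax(target, dim=-1),
                        reduction="sum")

With the flag set to False, this loss also back-propagates into whatever produces dynamic_scores; with it set to True, only doc_scores receives gradients from this term.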

About the method of learning rate decay

Hello! Sorry to bother you again. In Experiment.py, the code uses "no_improvement = self.seq_evaluate_gen(test=False, beam_size=beam_size)" and "if no_improvement and self.iter_num > config.start_decay:" to implement learning rate decay. But self.seq_evaluate_gen doesn't return anything. I personally set no_improvement = True when the evaluation ROUGE scores don't improve, i.e. no_improvement = False when "metric > self.best_metric". Am I right?
I am trying to achieve the scores reported in your paper, but my scores are much lower and do not improve much over the training iterations.
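
A hedged sketch of the patience-style decay described in the question: seq_evaluate_gen returns whether the metric failed to improve, and the caller decays the learning rate. self.best_metric, config.start_decay, and the commented caller lines mirror the issue's wording; compute_rouge and config.lr_decay are illustrative placeholders:

    def seq_evaluate_gen(self, test=False, beam_size=4):
        metric = self.compute_rouge(test=test, beam_size=beam_size)  # placeholder for the real evaluation
        if metric > self.best_metric:
            self.best_metric = metric
            return False   # improved: keep the current learning rate
        return True        # no improvement: caller may decay

    # Caller side, mirroring the lines quoted from Experiment.py:
    # no_improvement = self.seq_evaluate_gen(test=False, beam_size=beam_size)
    # if no_improvement and self.iter_num > config.start_decay:
    #     for group in self.optimizer.param_groups:
    #         group["lr"] *= config.lr_decay   # hypothetical decay factor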
