k2t's Introduction

Keyword2Text

This repository contains the code for the paper "A Plug-and-Play Method for Controlled Text Generation". If you find it useful and use it in your own research, please cite us.

Setup

  1. Download and unzip the repository.
  2. Create a new conda environment and install the required libraries from the requirements.txt file.
     conda create -n k2t python=3.6
     conda activate k2t
     pip install -r requirements.txt

A GPU is required to run the experiments. Make sure a results folder exists in the repository root before running.
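The two prerequisites above can be checked with a short script before launching an experiment. This is a minimal sketch, not part of the repository: the helper names `ensure_results_dir` and `gpu_available` are hypothetical, and the GPU check assumes that PyTorch is among the packages installed from requirements.txt (if it is not importable, the check simply reports no GPU).

```python
from pathlib import Path


def ensure_results_dir(repo_root="."):
    """Create the results folder under repo_root if it is missing.

    Hypothetical helper: the README only says the folder must exist.
    """
    results = Path(repo_root) / "results"
    results.mkdir(parents=True, exist_ok=True)
    return results


def gpu_available():
    """Return True if PyTorch can see a CUDA device.

    Assumption: the experiments use PyTorch. If torch is not installed,
    this returns False instead of raising.
    """
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())
```

Calling `ensure_results_dir()` from the repository root followed by `gpu_available()` should be enough to confirm that the environment is ready.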

Run Model

Hyperparameter Study

Uncomment the appropriate lines of run.sh to run the hyperparameter experiments from the paper. For example,

python main.py -mode='next' -file_name=/data/50_keywordsets_eval/word_sets.txt -results_subfolder=guide_vs_no_guide_beams -weight=10.0 -top_p=0.9 -n_generated_sentences=90 -do_guarantee=True

runs K2T with ordered guide words (mode='next') on the random keywords dataset, with lambda = weight = 10, nucleus sampling with top-p = 0.9, 90 generated tokens, and no weight annealing to guarantee word appearance. The results are saved in the results folder, under the subfolder given by -results_subfolder.
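To make the roles of weight and top-p more concrete, here is a small, self-contained sketch. It rests on an assumption that is not confirmed by this README: that guidance adds weight times the non-negative cosine similarity between each vocabulary word and the guide word to that word's logit. The function names, toy embeddings, and toy logits are all hypothetical; see main.py and the paper for the actual method.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def shift_logits(logits, embeddings, keyword_vec, weight):
    """Assumed guidance rule: logit + weight * max(0, cos(word, keyword))."""
    return [
        logit + weight * max(0.0, cosine(emb, keyword_vec))
        for logit, emb in zip(logits, embeddings)
    ]


def top_p_filter(logits, top_p):
    """Nucleus filtering: keep the most probable tokens until their
    cumulative probability reaches top_p, then renormalize."""
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Toy vocabulary of three words with 2-d embeddings (made-up values).
toy_logits = [1.0, 1.0, 1.0]
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
guide_word = [1.0, 0.0]

shifted = shift_logits(toy_logits, toy_embeddings, guide_word, weight=10.0)
nucleus = top_p_filter(shifted, top_p=0.9)
```

In this toy case only the first word is aligned with the guide word, so its logit is shifted by the full weight and it alone ends up in the nucleus. A larger weight pushes sampling harder toward words similar to the guide word, while top-p limits sampling to the most probable tokens.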

ROC Story dataset

Uncomment the appropriate line of run.sh to run the model on the ROC story dataset:

python main.py -mode='max' -file_name=/data/ROC/ROCStories_20_storylines_500_0.txt -results_subfolder=final4_ -weight=5.0 -top_p=0.9 -n_generated_sentences=-7 -n_beams=4 -do_guarantee=True -task='ROC'

News Article dataset

Uncomment the appropriate line of run.sh to run the model on the News Article dataset:

python main_DBS.py -mode='max' -file_name=/data/keyword_to_articles -results_subfolder=tmp -weight=5.0 -top_p=0.9 -n_generated_sentences=-15 -n_beams=4 -do_guarantee=True -task='key2article'
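The commands above all pass options in the form -name=value. The following is a hypothetical sketch of how flags of this shape could be parsed with argparse; it is not the repository's parser. The flag names are taken from the commands above, but the types and default values are assumptions. One detail worth noting is that a plain `type=bool` would treat the string "False" as true, so the sketch uses an explicit string-to-boolean converter for -do_guarantee.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False'-style strings to bool (bool('False') is True)."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser; types and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="K2T-style flags (sketch)")
    parser.add_argument("-mode", type=str, default="next")
    parser.add_argument("-file_name", type=str, default=None)
    parser.add_argument("-results_subfolder", type=str, default="tmp")
    parser.add_argument("-weight", type=float, default=5.0)
    parser.add_argument("-top_p", type=float, default=0.9)
    parser.add_argument("-n_generated_sentences", type=int, default=90)
    parser.add_argument("-n_beams", type=int, default=1)
    parser.add_argument("-do_guarantee", type=str2bool, default=False)
    parser.add_argument("-task", type=str, default=None)
    return parser


# The shell strips the quotes in -mode='max', so the program sees -mode=max.
roc_args = build_parser().parse_args([
    "-mode=max",
    "-file_name=/data/ROC/ROCStories_20_storylines_500_0.txt",
    "-results_subfolder=final4_",
    "-weight=5.0",
    "-top_p=0.9",
    "-n_generated_sentences=-7",
    "-n_beams=4",
    "-do_guarantee=True",
    "-task=ROC",
])
```

Because each value is attached to its flag with =, a negative value such as -7 is read as the argument of -n_generated_sentences rather than as a separate option.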

Contents

├── data
│   ├── 50_keywordsets_eval
│   │   └── word_sets.txt
│   ├── keyword_to_articles
│   │   ├── test_10.txt
│   │   ├── test_12.txt
│   │   ├── test_13.txt
│   │   ├── test_14.txt
│   │   ├── test_15.txt
│   │   ├── test_16.txt
│   │   ├── test_4.txt
│   │   ├── test_5.txt
│   │   ├── test_8.txt
│   │   └── test_9.txt
│   └── ROC
│       └── ROCStories_20_storylines_500_0.txt
├── encode_keywords.py
├── encode_keywords_word2vec.py
├── main.py
├── metrics_degen.py
├── metrics_degen_run.sh
├── perplexity.py
├── README.md
├── requirements.txt
├── results
├── run.sh
└── utility_gpt.py


k2t's People

Contributors

dapascual

k2t's Issues

Generate with context sentence as input

Hi, thank you for the great work.

I have a question about how to generate text with a context sentence as input to your model, as you did in the appendix of your paper.

For example:

Text 1

Context: Johnny Depp will be asked to make a fifth Pirates Of The Caribbean film if the fourth
instalment is a success. Producer Jerry Bruckheimer said he already has a screenplay

Keywords: fifth, audience, embraces, character, original, trilogy, continue, story, digital, cameras,
pirates, fresh, new, carry

OURS + GPT2-774M: Johnny Depp will be asked to make a fifth Pirates Of The Caribbean film if
the fourth instalment is a success. Producer Jerry Bruckheimer said he already has a screenplay with
Pirates new trilogy character Johnny Depp and digital animation studio Story Studio. The fifth film
will continue the original story from the first three films and will be set in the Caribbean. "We are very
excited to carry on the story of the audience’s favourite Pirates of the Caribbean characters," said
Bruckheimer. "We are fresh off the success of Pirates of the Caribbean: On Stranger Tides and we
are looking forward to embracing the new generation of fans with a new film that will be even more
exciting than the first three films." The Pirates of the Caribbean: On Stranger Tides cameras have been
filming in Barranquilla since April 11. Filming will continue for two months to reach completion by
May 23

Thank you again for your great work!

Inference time

Hi, I am testing your code and the results look very good. However, when I tried it (several times), your code needs around 1 to 4 minutes to generate text for one article, "data/keyword_to_articles/test_5.txt" (similar times for other articles). Is this expected? I use an old GPU (NVIDIA 1080), but I don't think a newer GPU would make a big difference.

License

Hi,
thanks for the clean and easy-to-read implementation! I would like to use and modify the code and add custom conditions for deleting branches, masking logits, etc.
As far as I can see, no license is applied to your code, so officially I would not be allowed to do any of this, and my university is strict in this regard. Would I be allowed to do the above and present our work, of course with proper citation? It would be a building block in a larger model.
