Giter Club home page Giter Club logo

scpn's Introduction

scpn

Code to train models from "Adversarial Example Generation with Syntactically Controlled Paraphrase Networks".

The code is written in python and requires Pytorch 3.1.

To get started, download trained models (scpn.pt and parse_generator.pt) from https://drive.google.com/file/d/1AuH1aHrE9maYttuSJz_9eltYOAad8Mfj/view?usp=sharing and place them in the models directory.

To train, download the training data from https://drive.google.com/file/d/1x1Xz3KQP_Ncu3DVPhOCsAwwOlFgcre5H/view?usp=sharing and move it to the data folder. Check train_scpn.py for training command line options. To train a model from scratch with default settings run train.sh.

There is also a demo script (run demo.sh) that will generate paraphrases from a set of templates (check the script to see available choices).

run your own data through our pretrained model

To run your own sentences through the pretrained model, first parse them using Stanford CoreNLP. We used CoreNLP version 3.7.0 for the results in the paper, along with the Shift-Reduce Parser models (https://nlp.stanford.edu/software/stanford-srparser-2014-10-23-models.jar). More concretely, the command we ran to parse the ParaNMT dataset used in our paper is:

java -Xmx12g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -threads 1 -annotators tokenize,ssplit,pos,parse -ssplit.eolonly -filelist filenames.txt -outputFormat text -parse.model edu/stanford/nlp/models/srparser/englishSR.ser.gz -outputDirectory /outputdir/

After parsing, run the extract_parses function in read_paranmt_parses.py and write the results to a CSV. There are also some alternate parse labeling functions in read_paranmt_parses.py if you are interested in exploring them. We plan to clean up the code in this file in the future :)

If you use our code or models for your work please cite:

@inproceedings{iyyer-2018-controlled, author = {Iyyer, Mohit and Wieting, John and Gimpel, Kevin and Zettlemoyer, Luke}, title = {Adversarial Example Generation with Syntactically Controlled Paraphrase Networks}, booktitle={Proceedings of NAACL}, year = {2018} }

If you use the data please cite the above and:

@inproceedings{wieting-17-millions, author = {John Wieting and Kevin Gimpel}, title = {Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations}, booktitle = {Proceedings of ACL}, year = {2018} }

scpn's People

Contributors

jwieting avatar miyyer avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.