
wice's Introduction

I am a Ph.D. student at Penn State University advised by Dr. Rui Zhang. I'm interested in building reliable and trustworthy NLP systems.

[Personal Website] [Google Scholar] [Semantic Scholar]

Datasets

Other Resources

wice's People

Contributors

ryokamoi

Forkers

stanleyjacob

wice's Issues

Train data

Hello!
It seems that no training dataset is included. Should I run "run_dataset_preprocessing.sh" to generate the training data myself?

Also, where is the test data for the "paired bootstrap test" described in the paper? Do I need to sample the data myself?

Finally, what does the code below do, and why does it "add oracle sentences to training data"? Does Table 5 in the paper use this code to obtain more training data?

##add oracle sentences to training data
if "train" in args.split:
    chunks_output_list.append(chunks_output_dict)

Thanks!
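
For reference, a paired bootstrap test is generally a significance test computed by resampling an existing test set, rather than a separate dataset. The following is a minimal generic sketch, assuming per-example 0/1 correctness flags for two systems on the same test examples. The function name `paired_bootstrap`, its parameters, and the toy inputs are hypothetical, and the exact procedure used in the paper may differ.

```python
import random


def paired_bootstrap(correct_a, correct_b, n_resamples=1000, seed=0):
    """Approximate p-value for the claim "system A is more accurate than system B".

    correct_a / correct_b: lists of 0/1 flags, aligned on the same test examples.
    Returns the fraction of resamples in which A does NOT beat B.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both systems must be scored on the same examples")
    rng = random.Random(seed)
    n = len(correct_a)
    not_better = 0
    for _ in range(n_resamples):
        # Resample example indices with replacement; the same indices are
        # used for both systems, which is what makes the test "paired".
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        if acc_a <= acc_b:
            not_better += 1
    return not_better / n_resamples


# Toy inputs (made up for illustration only).
p_identical = paired_bootstrap([1, 0, 1, 1], [1, 0, 1, 1])
p_dominant = paired_bootstrap([1] * 10, [0] * 10)
```

With identical systems, A never beats B in any resample, so the returned value is 1.0; when A is correct everywhere and B nowhere, it is 0.0.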

finetune T5 model on WICE's subclaim data

Hello!
I have some questions about the process of fine-tuning T5 on the subclaim data:

  1. Train on the three-way classification task.
  2. Use the MAX entailment strategy to get the classification probabilities for subclaims such as "test00561-0", "test00561-1", ...
  3. Use the harmonic mean to calculate the classification probabilities for claims such as "test00561". (Or do the results in Table 5 not involve claim-level probabilities at all, only steps 1 and 2?)
  4. For the results in Table 5: if the T5 classification probability is > 0.5, is the example classified as "e" (supported claim)?

Because my subclaim results are lower than those in the paper (85.1, 82.7 in Table 5), I would like to know whether the choices above are correct.
Thanks!
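
To make the question concrete, the aggregation described in steps 2 to 4 can be sketched as below. This is only one reading of those steps, not the repository's implementation: the function names, the per-chunk probabilities, and the 0.5 threshold are assumptions for illustration, and whether Table 5 uses the claim-level step at all is exactly what is being asked.

```python
from statistics import harmonic_mean


def subclaim_score(chunk_probs):
    """MAX strategy: score a subclaim by its best-supporting evidence chunk."""
    return max(chunk_probs)


def claim_score(subclaim_scores):
    """Harmonic mean of subclaim scores; it is pulled toward the weakest subclaim."""
    return harmonic_mean(subclaim_scores)


def is_supported(score, threshold=0.5):
    """Assumed decision rule: predict "e" (supported) when the score exceeds the threshold."""
    return score > threshold


# Hypothetical entailment probabilities per evidence chunk (made-up numbers).
chunk_probs = {
    "test00561-0": [0.10, 0.80, 0.30],
    "test00561-1": [0.40, 0.20, 0.05],
}

sub_scores = {sid: subclaim_score(p) for sid, p in chunk_probs.items()}
sub_labels = {sid: is_supported(s) for sid, s in sub_scores.items()}
claim = claim_score(list(sub_scores.values()))
claim_label = is_supported(claim)
```

In this toy case the subclaim scores are 0.8 and 0.4, so only the first subclaim passes the threshold, while the claim-level harmonic mean is about 0.53.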

Request for Release of Remaining Dataset

Hi there,

I would like to express my appreciation for your work and for the release of WICE. I believe it is a great resource for the research community.

I'm also interested in the remaining parts (the sub-claims of VitaminC, PAWS, and FRANK). Would you mind releasing them as well?

Thank you for your attention, and I look forward to hearing back from you soon.

subclaim test data

Hello!
Does the subclaim JSON file at "code_and_resources/model_outputs/entailment_classification/oracle_chunks/subclaim/retrieval=oracle,entailment=t5-3b-anli-wice/test.entailment_score.json" contain only part of the test data?
Is this a mistake?
The claim-level test data has 100 samples, so the subclaim-level data should have more samples than that, but this file also has 100 samples.
In the paper, does the subclaim-level experiment use the same 100 samples as the claim-level data, or only the data shown in this file (which I think may be wrong)?
Thanks!
