Peer Review and Rebuttal Counter-Arguments (PRRCA) Dataset

Dataset for the paper Incorporating Peer Reviews and Rebuttal Counter-Arguments for Meta-Review Generation.

Content

Folder Structure
ICLR_Submission
- Dataset Structure
- Instance
MetaReview_Generation_Corpus

Folder Structure

├── ICLR_Submission (separated by years)
│     ├── ICLR_2017.json
│     ├── ICLR_2018.json
│     ├── ICLR_2019.json
│     ├── ICLR_2020.json
│     ├── ICLR_2021.json
│     └── ICLR_2022.json
│
├── MetaReview_Generation_Corpus (one file per submission, named by submission year and forum id)
│     ├──  2020_H1gBhkBFDH.json
│     ├──  2019_rket4i0qtX.json
│     ├──           ...
│     ├──           ...
│     ├──           ...
│     └── 2021_ASAJvUPWaDI.json

ICLR_Submission is not uploaded to this repository because it exceeds the maximum file size limit. It can be downloaded via the Google Drive link.

ICLR_Submission

This folder contains all the submission-related data we crawled from the OpenReview platform.

The raw data is available as JSON files, one per submission year.

Each submission can be accessed by its forum id.
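
As a minimal sketch of this lookup, the snippet below loads one yearly file and fetches a submission by forum id. It assumes that the top level of each ICLR_<year>.json file is a dictionary keyed by forum id; the helper names `load_year` and `get_submission` are hypothetical and not part of the dataset.

```python
import json
from pathlib import Path


def load_year(folder, year):
    """Load ICLR_<year>.json from `folder` and return the parsed JSON."""
    path = Path(folder) / f"ICLR_{year}.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def get_submission(submissions, forum_id):
    """Return the submission stored under `forum_id`, or None if absent.

    Assumption: `submissions` is a dict keyed by forum id.
    """
    return submissions.get(forum_id)
```

For example, `get_submission(load_year("ICLR_Submission", 2018), "Sy0GnUxCb")` would return the item for that forum id if it is present in the 2018 file.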

Dataset Structure

Instance

Example of the fields available for one item in the JSON dictionary:

├── forum (Sy0GnUxCb - Unique id from OpenReview)
│
├── submission_title (paper title)
│
├── reviews (subdict access with review id)
│    │ 
│    ├──(key) Sy0GnUxCb - 0
│    │    ├── review_id (Sy0GnUxCb - 0)
│    │    ├── review_title
│    │    ├── review (review content)
│    │    ├── rating (review score from 0 to 9)
│    │    │
│    │    ├── first_reply (rebuttal content)
│    │    │     ├── title
│    │    │     ├── tcdate (create time)
│    │    │     ├── tmdate (last modified time)
│    │    │     ├── number (thread order sorted by tcdate)
│    │    │     ├── id (thread id)
│    │    │     ├── replyto (id of the post being replied to)
│    │    │     ├── writer
│    │    │     ├── content
│    │    │     └── aspect_labels
│    │    │
│    │    ├── tcdate (review create time)
│    │    ├── tmdate (review last modified time)
│    │    │
│    │    ├── discussion_thread (list of discussion posts on the review)
│    │    │      └── each item has the same structure as first_reply
│    │    │
│    │    ├── conformity (review quality; list of conformity scores ranging from 1 to 4)
│    │    │     ├── WorkerId
│    │    │     └── rating (1 to 4)
│    │    │
│    │    ├── aspect_labels (list of aspect polarity)
│    │    │     ├── start position (character index)
│    │    │     ├── end position (character index)
│    │    │     └── polarity (e.g. motivation_positive)
│    │    │
│    │    ├── has_RR_pair (True or False; whether the review has a review-rebuttal (RR) alignment pair)
│    │    │
│    │    ├── Review_ADU (List of Review's ADUs with label)
│    │    │     ├── start (start index of ADU)
│    │    │     ├── end (end index of ADU)
│    │    │     ├── label (ADU label, aligned with the Reply_ADU label)
│    │    │     └── sent (ADU span)
│    │    │
│    │    └── Reply_ADU (List of Reply's ADUs with label similar to Review_ADU)
│    ├──
│    
├── Decision (one of four)
│     ├──Accept (Poster)
│     ├──Accept (Spotlight)
│     ├──Accept (Oral)
│     └──Reject
│    
└── MetaReview
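
The Review_ADU and Reply_ADU fields can be combined to recover aligned spans. The following is a rough sketch, not the authors' procedure. It assumes that `review` is one entry of the `reviews` sub-dictionary, that Reply_ADU items carry the same keys as Review_ADU items (`start`, `end`, `label`, `sent`), that `has_RR_pair` is stored as a JSON boolean, and that equal labels on both sides mark an aligned review-rebuttal pair. The function name `aligned_adu_pairs` is hypothetical.

```python
from collections import defaultdict


def aligned_adu_pairs(review):
    """Group review and reply ADU spans that share the same label.

    Assumption: equal labels in Review_ADU and Reply_ADU mark aligned spans.
    Returns {label: {"review": [spans], "reply": [spans]}}.
    """
    # Skip reviews that are flagged as having no alignment pair.
    if not review.get("has_RR_pair"):
        return {}
    groups = defaultdict(lambda: {"review": [], "reply": []})
    for adu in review.get("Review_ADU", []):
        groups[adu["label"]]["review"].append(adu["sent"])
    for adu in review.get("Reply_ADU", []):
        groups[adu["label"]]["reply"].append(adu["sent"])
    # Keep only labels that occur on both the review and the reply side.
    return {
        label: spans
        for label, spans in groups.items()
        if spans["review"] and spans["reply"]
    }
```

If the label set includes a value for non-argumentative text, that value would need to be filtered out before grouping; the source does not list the label values.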

MetaReview_Generation_Corpus

The data we used to generate meta-reviews.

For each submission, we collect the reviews, rebuttal content, reviewers' ratings, and the final decision.

The raw data is available as JSON files, one per submission, each named by its submission year and forum id.
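
Going by the file names shown in the folder structure (for example 2020_H1gBhkBFDH.json), the year and forum id can be read back from a file name. The sketch below assumes the `<year>_<forum id>.json` pattern holds for every file; `parse_corpus_filename` and `index_by_year` are hypothetical helper names.

```python
from collections import defaultdict
from pathlib import Path


def parse_corpus_filename(name):
    """Split '<year>_<forum id>.json' into (year, forum_id).

    Only the first underscore is used as the separator, since a forum id
    may itself contain underscores.
    """
    year, forum_id = Path(name).stem.split("_", 1)
    return int(year), forum_id


def index_by_year(filenames):
    """Map each submission year to the forum ids found in `filenames`."""
    index = defaultdict(list)
    for name in filenames:
        year, forum_id = parse_corpus_filename(name)
        index[year].append(forum_id)
    return dict(index)
```

The file names can be collected with, for instance, `[p.name for p in Path("MetaReview_Generation_Corpus").glob("*.json")]`.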

Corpus Instance

key	value
year	2020 (Submission year)
forum	HkxlcnVFwB (Unique id from OpenReview)
title	GenDICE: Generalized Offline Estimation of Stationary Values (Submission paper title)
decision	Accept (Oral)
meta_review	The authors develop a framework for off-policy value estimation for infinite horizon RL tasks, for estimating the stationary distribution of a Markov chain. Reviewers were uniformly impressed by the work, and satisfied by the author response. Congratulations!

reviews

key	value
review_id	`HkxlcnVFwB-0` (Review index)
review_text	`This paper proposes a new estimator to infer the stationary distribution of a Markov chain, with data from another Markov chain. This paper tackles an interesting problem with an increasing number of studies in the reinforcement learning community and gives a practical algorithm with strong empirical justification, as well as theoretical justification. I think this paper should be accepted.`
reply_text	`Thanks for the encouraging comments. We will keep improving the draft. We have refined the paper as listed above in the summary of revisions.Best,Authors.`
rating	`8: Accept`
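
In the corpus instance the rating is a string such as `8: Accept`, so the numeric score has to be split off before it can be aggregated. The following sketch shows one way to do that. It assumes every rating follows the `<score>: <label>` pattern and that `reviews` is either a list of review entries or a dictionary of them; the source shows only a single entry, so the container type is a guess. The function names are hypothetical.

```python
def parse_rating(rating):
    """Split a rating string such as '8: Accept' into (8, 'Accept')."""
    score, _, label = rating.partition(":")
    return int(score.strip()), label.strip()


def mean_rating(instance):
    """Average the numeric ratings over all reviews of one corpus instance.

    Assumption: instance['reviews'] is a list of dicts, or a dict whose
    values are those dicts. Returns None when there are no reviews.
    """
    reviews = instance["reviews"]
    if isinstance(reviews, dict):
        reviews = list(reviews.values())
    scores = [parse_rating(r["rating"])[0] for r in reviews]
    return sum(scores) / len(scores) if scores else None
```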

Dataset Analytic

Year	# Submissions	Avg Rating	Acceptance	Avg Meta-review Len
2017	293	5.94	45.39%	114.83
2018	677	5.70	43.72%	104.56
2019	1153	5.69	41.63%	147.11
2020	1807	4.68	34.86%	128.92
2021	2208	5.62	35.73%	182.96
Total	6138	5.40	37.93%	148.42
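
As a quick consistency check on the table, the Total row can be recomputed from the yearly rows. The sketch below copies the yearly values from the table and assumes the totals for acceptance and meta-review length are averages weighted by the number of submissions; the source does not state how the totals were derived.

```python
# (year, submissions, acceptance %, average meta-review length), from the table
rows = [
    (2017, 293, 45.39, 114.83),
    (2018, 677, 43.72, 104.56),
    (2019, 1153, 41.63, 147.11),
    (2020, 1807, 34.86, 128.92),
    (2021, 2208, 35.73, 182.96),
]

# Total number of submissions across all years.
total = sum(n for _, n, _, _ in rows)

# Submission-weighted averages of the two per-year columns.
weighted_acceptance = sum(n * acc for _, n, acc, _ in rows) / total
weighted_length = sum(n * length for _, n, _, length in rows) / total

print(total, round(weighted_acceptance, 2), round(weighted_length, 2))
```

Under this assumption the submission count matches the Total row exactly, and the two weighted averages land within rounding distance of the reported 37.93% and 148.42.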

Cited Corpus

Aspect Typology Label
- Paper - Can We Automate Scientific Reviewing?
- Link - Dataset (Aspect Tagger)
Argumentative Structure
- Paper - APE: Argument Pair Extraction from Peer Review and Rebuttal via Multi-task Learning
- Link - Review Rebuttal Dataset

Repository: ntunlplab / prrca