ReviewData Dataset
This is a relational dataset of academic peer review. It consists of four main tables and the relations between them. The data covers two thousand submissions to ten computer science conferences and workshops from 2017 to 2019. Importantly, it contains both accepted and rejected submissions.
ReviewData was created by compiling data from OpenReview, Scopus and the Shanghai University Rankings.
If you use this dataset, please cite the paper “Causal Relational Learning” as follows:
@inproceedings{DBLP:conf/sigmod/SalimiPKGRS20,
author = {Babak Salimi and
Harsh Parikh and
Moe Kayali and
Lise Getoor and
Sudeepa Roy and
Dan Suciu},
title = {Causal Relational Learning},
booktitle = {{SIGMOD} Conference},
pages = {241--256},
publisher = {{ACM}},
year = {2020}
}
Download .xz format (2.9 MiB zipped, 18.4 MiB unzipped)
Download .zip format (4.3 MiB zipped)
Unzipped SHA1 checksum: 8288413d7e5f9803708ea2244ee3c742e1df6176
Data description
Data is provided in SQLite 3 format. Schemas for the four main tables (Authors, Conferences, Papers, and Reviews) and for the Contributed relation table are provided below.
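Because the data ships as a SQLite 3 database, the schema can also be inspected directly. The sketch below lists every table with its column names and declared types using Python's built-in `sqlite3` module. The helper name `describe_tables` and the file name in the usage comment are placeholders, not names defined by the dataset.

```python
import sqlite3


def describe_tables(conn):
    """Map each user table name to a list of (column name, declared type)."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(column[1], column[2]) for column in columns]
    return schema


# Usage (the file name is a placeholder for your unzipped download):
# conn = sqlite3.connect("reviewdata.sqlite")
# for table, columns in describe_tables(conn).items():
#     print(table, columns)
```

The declared types reported by SQLite may differ in spelling from the types listed in the schemas (for example `TEXT` for string).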
Authors
Table Schema
Attribute | Description | Type |
---|---|---|
aid | Author ID, primary key | integer |
name | Full name of the author | string |
email | Author’s email | string |
inst_guess | Guess of the author’s main affiliation (whois lookup against email domain) | string |
world_rank | Ranking of the author’s main affiliation | integer |
document_count | Count of papers the author has published | integer |
citation_count | Sum of citations this author has received | integer |
h_index | h-index of the author | integer |
coauthor_count | Total count of lifetime collaborators | integer |
year_experience | Length of academic publication career | integer |
Conferences
Table Schema
Attribute | Description | Type |
---|---|---|
cid | Conference ID, primary key | integer |
name | Name and year of the conference | string |
accept_count | Count of papers accepted at the conference | integer |
reject_count | Count of papers rejected from the conference | integer |
selectivity | Synthetic, accept_count / reject_count | real |
is_workshop | True if a workshop, false if a conference | bool |
double_blind | True if double-blind reviewing used, false if single-blind reviewing used | bool |
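Note that `selectivity` is defined as the ratio of accepted to rejected papers, which is not the same as the acceptance rate (accepted over all submissions). The sketch below recomputes both quantities from the two count columns so they can be compared. The function name is illustrative, and the query assumes the table and column names match the schema above.

```python
import sqlite3


def conference_rates(conn):
    """Return {conference name: (selectivity, acceptance_rate)} from the counts."""
    rows = conn.execute(
        "SELECT name, accept_count, reject_count FROM Conferences"
    )
    rates = {}
    for name, accepted, rejected in rows:
        total = accepted + rejected
        # selectivity as documented: accept_count / reject_count
        selectivity = accepted / rejected if rejected else None
        # acceptance rate: accepted papers over all submissions
        acceptance_rate = accepted / total if total else None
        rates[name] = (selectivity, acceptance_rate)
    return rates
```

For a venue with selectivity s, the acceptance rate equals s / (1 + s), so the two orderings of venues agree even though the values differ.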
Contributed
Table Schema
Attribute | Description | Type |
---|---|---|
ctr_aid | ID of author that contributed, primary key (1/2) | integer |
ctr_pid | ID of paper that was contributed to, primary key (2/2) | string |
Papers
Table Schema
Attribute | Description | Type |
---|---|---|
pid | Paper ID, primary key | string |
title | Paper title | string |
abstract | Paper abstract | string |
decision | True if accepted, false if rejected | bool |
submitted_to | Conference ID of the venue the paper was submitted to | integer |
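The Contributed table links Authors to Papers through its two key columns, so listing an author's submissions takes a single join. The following is a minimal sketch, assuming the table and column names match the schemas above. The function name is illustrative, and how `decision` is encoded (SQLite has no native boolean type, so most likely 0 or 1) is an assumption worth checking against the actual file.

```python
import sqlite3


def papers_by_author(conn, aid):
    """Return (pid, title, decision) for every paper the author contributed to."""
    query = """
        SELECT p.pid, p.title, p.decision
        FROM Contributed AS c
        JOIN Papers AS p ON p.pid = c.ctr_pid
        WHERE c.ctr_aid = ?
        ORDER BY p.pid
    """
    # The author ID is passed as a bound parameter, not formatted into the SQL.
    return conn.execute(query, (aid,)).fetchall()
```

Joining further on `Papers.submitted_to = Conferences.cid` would attach the venue to each row.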
Reviews
Table Schema
Attribute | Description | Type |
---|---|---|
rid | Review ID, primary key | integer |
review_of | ID of the paper this review is about | string |
title | Review title | string |
review | Review body text | string |
rating | Rating on [0, 1] of paper quality, where 1 is a perfect score, normalized across conferences | real |
confidence | Rating on [0, 1] of reviewer confidence, where 1 is total certainty, normalized across conferences | real |
raw_rating | Raw rating string, not comparable across conferences | string |
raw_confidence | Raw confidence string, not comparable across conferences | string |
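Since `rating` is normalized across conferences, it can be aggregated over papers from different venues, unlike `raw_rating`. As one possible illustration, the sketch below averages the normalized rating per paper and then compares accepted and rejected submissions. The function name is hypothetical, and the query assumes the table and column names match the schemas above.

```python
import sqlite3


def mean_rating_by_decision(conn):
    """Return (decision, mean of per-paper mean ratings, paper count) rows."""
    query = """
        SELECT p.decision, AVG(per_paper.mean_rating), COUNT(*)
        FROM (
            -- First average the reviews of each paper, skipping missing ratings.
            SELECT review_of, AVG(rating) AS mean_rating
            FROM Reviews
            WHERE rating IS NOT NULL
            GROUP BY review_of
        ) AS per_paper
        JOIN Papers AS p ON p.pid = per_paper.review_of
        GROUP BY p.decision
        ORDER BY p.decision
    """
    return conn.execute(query).fetchall()
```

Averaging per paper first keeps papers with many reviews from dominating the comparison.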