Importantly, some events invoke highly selective commonsense anticipations, while others invoke much more diverse ones. Knowledge about this varying degree of uncertainty (i.e., a relatively flat distribution over diverse inferences) is a natural and important part of our commonsense knowledge. Thus, for some events, it is correct to see diverse annotations.
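One way to make "selective" versus "diverse" concrete is to measure the Shannon entropy of the annotations collected for an event. The sketch below is only an illustration, not part of the ATOMIC pipeline: the function name `annotation_entropy` and both example annotation lists are hypothetical, and the annotations are assumed to be already normalized to comparable strings.

```python
import math
from collections import Counter


def annotation_entropy(annotations):
    """Shannon entropy (in bits) of the empirical distribution over annotations.

    Higher values mean a flatter distribution, i.e. more diverse inferences.
    """
    counts = Counter(annotations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical annotations, made up for illustration (not taken from ATOMIC).
selective = ["to eat", "to eat", "to eat", "to eat", "to cook"]
diverse = ["to relax", "to travel", "to celebrate", "to shop", "to rest"]

print(f"selective event: {annotation_entropy(selective):.2f} bits")
print(f"diverse event:   {annotation_entropy(diverse):.2f} bits")
```

Under this measure, an event whose annotators mostly agree scores close to zero, while an event with five distinct answers reaches the maximum for five annotations, log2(5) bits.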
Yes! This is possible for the same reason that it is possible to train a "language model". Despite the high variation in language, its generalizable patterns can be learned with probabilistic models. We view commonsense as a stochastic modeling problem as well.
To shed light on data quality for all dimensions, we run a separate data quality verification study on a random subset of 100 events, asking five MTurkers to validate whether an individual annotation is correct given an event and dimension. We find that on average, annotations are valid 86% of the time, with a breakdown per dimension shown below.
Disclaimer/Content warning: the events in ATOMIC have been automatically extracted from blogs, stories and books written at various times. The events might depict violent or problematic actions, which we left in the corpus for the sake of learning the (probably negative but still important) commonsense implications associated with the events. We removed a small set of truly outdated events, but might have missed some, so please email us if you have any concerns.