Computer Science > Computation and Language

arXiv:1803.05355 (cs)

[Submitted on 14 Mar 2018 (v1), last revised 18 Dec 2018 (this version, v3)]

Title: FEVER: a large-scale dataset for Fact Extraction and VERification

Authors: James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Arpit Mittal

Abstract: In this paper we introduce a new publicly available dataset for verification against textual sources, FEVER: Fact Extraction and VERification. It consists of 185,445 claims generated by altering sentences extracted from Wikipedia and subsequently verified without knowledge of the sentence they were derived from. The claims are classified as Supported, Refuted or NotEnoughInfo by annotators achieving 0.6841 in Fleiss $\kappa$. For the first two classes, the annotators also recorded the sentence(s) forming the necessary evidence for their judgment. To characterize the challenge of the dataset presented, we develop a pipeline approach and compare it to suitably designed oracles. The best accuracy we achieve on labeling a claim accompanied by the correct evidence is 31.87%, while if we ignore the evidence we achieve 50.91%. Thus we believe that FEVER is a challenging testbed that will help stimulate progress on claim verification against textual sources.
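
The abstract reports inter-annotator agreement as Fleiss $\kappa$. As a rough illustration of what that statistic measures, here is a minimal sketch of the standard Fleiss $\kappa$ formula. The function name `fleiss_kappa` and the toy counts are hypothetical and are not taken from the paper or its released data; the sketch also assumes every item was labeled by the same number of annotators.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.

    Assumes every item received the same number of ratings.
    """
    if not counts:
        raise ValueError("need at least one item")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two ratings per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Observed agreement: for each item, the fraction of annotator pairs
    # that agree, averaged over all items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    totals = [sum(row[j] for row in counts) for j in range(n_categories)]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: 4 claims, 5 annotators each, columns are
# (Supported, Refuted, NotEnoughInfo). These numbers are made up.
toy_counts = [
    [5, 0, 0],
    [1, 4, 0],
    [0, 1, 4],
    [3, 1, 1],
]
print(f"Fleiss kappa on toy data: {fleiss_kappa(toy_counts):.4f}")
```

A value of 1 means perfect agreement, and a value near 0 means agreement no better than chance given the overall label distribution.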

Comments: Updated version of NAACL2018 paper. Data is released on this http URL
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1803.05355 [cs.CL]
(or arXiv:1803.05355v3 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.1803.05355 arXiv-issued DOI via DataCite

Submission history

From: James Thorne
[v1] Wed, 14 Mar 2018 15:30:37 UTC (2,165 KB)
[v2] Mon, 16 Apr 2018 23:08:25 UTC (2,497 KB)
[v3] Tue, 18 Dec 2018 10:58:20 UTC (2,497 KB)

URL: https://arxiv.org/abs/1803.05355