Fact-checking with Generative AI: A Systematic Cross-Topic Examination of LLMs Capacity to Detect Veracity of Political Information

Bibliographic Details
Title: Fact-checking with Generative AI: A Systematic Cross-Topic Examination of LLMs Capacity to Detect Veracity of Political Information
Authors: Kuznetsova, Elizaveta, Vitulano, Ilaria, Makhortykh, Mykola, Stolze, Martha, Nagy, Tomas, Vziatysheva, Victoria
Publication Year: 2025
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Computers and Society
More Details: The purpose of this study is to assess how large language models (LLMs) can be used for fact-checking and to contribute to the broader debate on the use of automated means for veracity identification. To achieve this purpose, we use an AI auditing methodology that systematically evaluates the performance of five LLMs (ChatGPT 4, Llama 3 (70B), Llama 3.1 (405B), Claude 3.5 Sonnet, and Google Gemini) using prompts about a large set of 16,513 statements fact-checked by professional journalists. Specifically, we use topic modeling and regression analysis to investigate which factors (e.g., the topic of the prompt or the type of LLM) affect evaluations of true, false, and mixed statements. Our findings reveal that while ChatGPT 4 and Google Gemini achieved higher accuracy than the other models, overall performance across models remains modest. Notably, the results indicate that the models are better at identifying false statements, especially on sensitive topics such as COVID-19, American political controversies, and social issues, suggesting that guardrails may be in place that enhance accuracy on these topics. The major implication of our findings is that there are significant challenges in using LLMs for fact-checking, including substantial variation in performance across LLMs and unequal output quality for specific topics, which can be attributed to deficits in training data. Our research highlights the potential and limitations of LLMs in political fact-checking and suggests avenues for further improvements in guardrails as well as fine-tuning.
Comment: 15 pages, 2 figures
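
Illustrative sketch: The abstract describes an audit in which each of the 16,513 fact-checked statements is submitted to an LLM and the model's verdict (true, false, or mixed) is compared against the professional fact-checkers' label. The record does not include the study's actual prompts or code; the Python sketch below is a minimal, hypothetical illustration of that per-statement classification step, assuming an OpenAI-style chat API. The prompt wording, the model name, and the classify_statement helper are illustrative assumptions, not the authors' protocol.

# Minimal, hypothetical sketch of the per-statement audit step described
# in the abstract; NOT the authors' actual prompts or pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

VALID_LABELS = {"true", "false", "mixed"}

def classify_statement(statement: str, model: str = "gpt-4") -> str:
    """Ask one LLM for a three-way veracity verdict on a single statement."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # reduce sampling noise across repeated audit runs
        messages=[
            {"role": "system",
             "content": ("You are a fact-checker. Answer with exactly one "
                         "word: true, false, or mixed.")},
            {"role": "user", "content": f"Statement: {statement}"},
        ],
    )
    label = response.choices[0].message.content.strip().lower()
    # Outputs that do not match a valid label are flagged rather than guessed.
    return label if label in VALID_LABELS else "unparseable"

# Usage: compare the model's verdict with the journalists' label.
print(classify_statement("The Eiffel Tower is located in Berlin."))  # "false"

In the study itself, such per-statement verdicts would then feed the topic-modeling and regression analysis summarized above.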
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2503.08404
Accession Number: edsarx.2503.08404
Database: arXiv
FullText Text:
  Availability: 0
CustomLinks:
  – Url: http://arxiv.org/abs/2503.08404
    Name: EDS - Arxiv
    Category: fullText
    Text: View this record from Arxiv
    MouseOverText: View this record from Arxiv
Header DbId: edsarx
DbLabel: arXiv
An: edsarx.2503.08404
AccessLevel: 3
PubType: Report
PubTypeId: report
RecordInfo BibRecord:
  BibEntity:
    Subjects:
      – SubjectFull: Computer Science - Computation and Language
        Type: general
      – SubjectFull: Computer Science - Computers and Society
        Type: general
    Titles:
      – TitleFull: Fact-checking with Generative AI: A Systematic Cross-Topic Examination of LLMs Capacity to Detect Veracity of Political Information
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Kuznetsova, Elizaveta
      – PersonEntity:
          Name:
            NameFull: Vitulano, Ilaria
      – PersonEntity:
          Name:
            NameFull: Makhortykh, Mykola
      – PersonEntity:
          Name:
            NameFull: Stolze, Martha
      – PersonEntity:
          Name:
            NameFull: Nagy, Tomas
      – PersonEntity:
          Name:
            NameFull: Vziatysheva, Victoria
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 11
              M: 03
              Type: published
              Y: 2025