Improving Reasoning Performance in Large Language Models via Representation Engineering

Bibliographic Details
Title: Improving Reasoning Performance in Large Language Models via Representation Engineering
Authors: Højer, Bertram; Jarvis, Oliver; Heinrich, Stefan
Publication Year: 2025 (published on arXiv 2025-04-28)
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
More Details: Recent advancements in large language models (LLMs) have resulted in increasingly anthropomorphic language concerning the ability of LLMs to reason. Whether reasoning in LLMs should be understood as inherently different from other forms of information processing is, however, widely debated. We propose a representation engineering approach in which model activations are read from the residual stream of an LLM while it processes a reasoning task. These activations are used to derive a control vector that is applied to the model as an inference-time intervention, modulating the model's representational space to improve performance on the specified task. We publish the code for deriving control vectors and analyzing model representations. The method allows us to improve performance on reasoning benchmarks and to assess how control vectors influence a model's final logit distribution via metrics such as KL divergence and entropy. We apply control vectors to Mistral-7B-Instruct and a range of Pythia models on an inductive, a deductive, and a mathematical reasoning task. We show that an LLM can, to a certain degree, be steered to improve its perceived reasoning ability by modulating activations. The intervention depends on reliably extracting the model's typical state when it correctly solves a task. Our results suggest that reasoning performance can be modulated in the same manner as other information-processing tasks performed by LLMs, and demonstrate that performance on specific tasks can be improved via a simple intervention on the residual stream with no additional training. (A minimal code sketch of this procedure follows the record details below.)
Comment: Accepted at the Thirteenth International Conference on Learning Representations (ICLR 2025). Link to publication: https://openreview.net/forum?id=IssPhpUsKt
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2504.19483
Accession Number: edsarx.2504.19483
Database: arXiv
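
Method Sketch: The following is a minimal, illustrative Python sketch of the control-vector procedure the abstract describes, not the authors' released implementation. It assumes a Hugging Face GPT-NeoX-style model (Pythia-160m here, standing in for the paper's Pythia models); the layer index, steering strength, and toy prompt pairs are placeholder assumptions, and the difference-of-means derivation is one common way to obtain such a vector.

import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-160m"  # small stand-in for the paper's Pythia models
LAYER = 4    # residual-stream block to read and steer (assumption)
ALPHA = 4.0  # steering strength (assumption)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

# Toy contrast sets standing in for correctly vs. incorrectly solved task items.
pos_prompts = ["2 + 2 = 4. 3 + 3 =", "All cats are animals. Tom is a cat, so Tom is an animal."]
neg_prompts = ["2 + 2 = 5. 3 + 3 =", "All cats are animals. Tom is a cat, so Tom is a plant."]

@torch.no_grad()
def mean_last_token_residual(prompts, layer):
    # Mean residual-stream activation (output of block `layer`) at the final token.
    acts = []
    for p in prompts:
        out = model(**tok(p, return_tensors="pt"), output_hidden_states=True)
        acts.append(out.hidden_states[layer + 1][0, -1, :])
    return torch.stack(acts).mean(dim=0)

# Difference-of-means control vector between the two activation sets.
control = mean_last_token_residual(pos_prompts, LAYER) - mean_last_token_residual(neg_prompts, LAYER)
control = control / control.norm()

def steer(module, inputs, output):
    # Forward hook: add the scaled control vector to the block's output.
    if isinstance(output, tuple):
        return (output[0] + ALPHA * control,) + output[1:]
    return output + ALPHA * control

@torch.no_grad()
def next_token_logits(prompt):
    return model(**tok(prompt, return_tensors="pt")).logits[0, -1, :]

prompt = "7 + 5 ="
base = next_token_logits(prompt)
handle = model.gpt_neox.layers[LAYER].register_forward_hook(steer)
steered = next_token_logits(prompt)
handle.remove()

# The abstract's evaluation metrics: KL divergence between the base and
# steered next-token distributions, and the entropy of each.
kl = F.kl_div(F.log_softmax(steered, -1), F.softmax(base, -1), reduction="sum")  # KL(base || steered)
entropy = lambda l: torch.distributions.Categorical(logits=l).entropy()
print(f"KL(base || steered) = {kl.item():.4f}")
print(f"entropy: base = {entropy(base).item():.4f}, steered = {entropy(steered).item():.4f}")

In the paper, such vectors are derived from the model's activations when it correctly solves task items and are applied with varying scaling factors; the KL and entropy readings indicate how strongly the intervention reshapes the output distribution.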