Improving Reasoning Performance in Large Language Models via Representation Engineering
| Field | Value |
|---|---|
| Title | Improving Reasoning Performance in Large Language Models via Representation Engineering |
| Authors | Højer, Bertram; Jarvis, Oliver; Heinrich, Stefan |
| Publication Year | 2025 |
| Collection | Computer Science |
| Subject Terms | Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
| Document Type | Working Paper |
| Access URL | http://arxiv.org/abs/2504.19483 |
| Accession Number | edsarx.2504.19483 |
| Database | arXiv |

Abstract:

Recent advancements in large language models (LLMs) have resulted in increasingly anthropomorphic language concerning the ability of LLMs to reason. Whether reasoning in LLMs should be understood as inherently different from other forms of information processing is, however, widely debated. We propose a representation engineering approach in which model activations are read from the residual stream of an LLM while it processes a reasoning task. These activations are used to derive a control vector that is applied to the model as an inference-time intervention, modulating its representational space to improve performance on the specified task. We publish the code for deriving control vectors and analyzing model representations. The method allows us to improve performance on reasoning benchmarks and to assess how control vectors influence the final logit distribution of a model via metrics such as KL divergence and entropy. We apply control vectors to Mistral-7B-Instruct and a range of Pythia models on an inductive, a deductive, and a mathematical reasoning task. We show that an LLM can, to a certain degree, be controlled to improve its perceived reasoning ability by modulating activations. The intervention depends on the ability to reliably extract the model's typical state when it correctly solves a task. Our results suggest that reasoning performance can be modulated in the same manner as other information-processing tasks performed by LLMs, and demonstrate that performance on specific tasks can be improved via a simple intervention on the residual stream with no additional training.

Comment: Accepted at "The Thirteenth International Conference on Learning Representations (ICLR 2025)". Link to publication: https://openreview.net/forum?id=IssPhpUsKt
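For readers unfamiliar with the technique, the sketch below illustrates the control-vector recipe the abstract describes: read residual-stream activations while the model processes task instances, take a difference of means between states from correctly and incorrectly solved instances, and add the resulting vector back into the residual stream at inference time. The authors publish their own code (linked above); this is a minimal independent sketch using PyTorch hooks on a Hugging Face model, and the checkpoint name, `layer_idx`, `alpha`, and the placeholder prompt lists are illustrative assumptions, not values from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed checkpoint; the abstract names Mistral-7B-Instruct
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

layer_idx = 15  # hypothetical residual-stream layer to read from and steer

# Placeholder contrast sets; the paper derives states from reasoning benchmarks.
correct_prompts = ["2, 4, 6, 8 -> the next number is 10"]
incorrect_prompts = ["2, 4, 6, 8 -> the next number is 11"]

def mean_residual(prompts):
    """Average last-token hidden state after block `layer_idx` over prompts."""
    states = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # hidden_states[0] is the embedding output, so block k's output is index k + 1
        states.append(out.hidden_states[layer_idx + 1][0, -1])
    return torch.stack(states).mean(dim=0)

# Difference-of-means control vector: typical "correct" state minus typical "incorrect" state
control_vector = mean_residual(correct_prompts) - mean_residual(incorrect_prompts)

alpha = 4.0  # steering strength; a tunable hyperparameter

def steer(module, inputs, output):
    """Forward hook that adds the scaled control vector to the residual stream."""
    hidden = output[0] if isinstance(output, tuple) else output
    hidden += alpha * control_vector.to(hidden.dtype)
    return output

handle = model.model.layers[layer_idx].register_forward_hook(steer)
# ... generate on the reasoning task with the intervention active ...
handle.remove()
```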
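The abstract also mentions assessing how the intervention shifts the model's final logit distribution via KL divergence and entropy. A minimal sketch of such diagnostics, assuming `baseline_logits` and `steered_logits` are final-position logit vectors from two forward passes (hook disabled and enabled):

```python
import torch
import torch.nn.functional as F

def entropy(logp):
    """Shannon entropy (in nats) of a log-probability vector."""
    return -(logp.exp() * logp).sum()

def next_token_metrics(baseline_logits, steered_logits):
    """Compare two [vocab_size] final-position logit vectors."""
    p = F.log_softmax(baseline_logits.float(), dim=-1)  # baseline log-probs
    q = F.log_softmax(steered_logits.float(), dim=-1)   # steered log-probs
    # KL(baseline || steered): how far the intervention moves the distribution
    kl = F.kl_div(q, p, reduction="sum", log_target=True)
    return {"kl": kl.item(),
            "entropy_baseline": entropy(p).item(),
            "entropy_steered": entropy(q).item()}

# Example with random logits standing in for real model outputs:
vocab_size = 32000
print(next_token_metrics(torch.randn(vocab_size), torch.randn(vocab_size)))
```

A drop in entropy under steering would indicate the intervention concentrates probability mass on fewer next tokens; the KL term quantifies how much the intervention perturbs the distribution overall.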