A modular protein language modelling approach to immunogenicity prediction.

Bibliographic Details
Title: A modular protein language modelling approach to immunogenicity prediction.
Authors: Hugh O'Brien, Max Salm, Laura T Morton, Maciej Szukszto, Felix O'Farrell, Charlotte Boulton, Laurence King, Supreet Kaur Bola, Pablo D Becker, Andrew Craig, Morten Nielsen, Yardena Samuels, Charles Swanton, Marc R Mansour, Sine Reker Hadrup, Sergio A Quezada
Source: PLoS Computational Biology, Vol 20, Iss 11, p e1012511 (2024)
Publisher Information: Public Library of Science (PLoS), 2024.
Publication Year: 2024
Collection: LCC:Biology (General)
Subject Terms: Biology (General), QH301-705.5
More Details: Neoantigen immunogenicity prediction is a highly challenging problem in the development of personalised medicines. Low reactivity rates in called neoantigens result in a difficult prediction scenario with limited training datasets. Here we describe ImmugenX, a modular protein language modelling approach to immunogenicity prediction for CD8+ reactive epitopes. ImmugenX comprises a pMHC encoding module trained on three pMHC prediction tasks, an optional TCR encoding module and a set of context-specific immunogenicity prediction head modules. Compared with state-of-the-art models for each task, ImmugenX's encoding module performs comparably or better on pMHC binding affinity, eluted ligand prediction and stability tasks. ImmugenX outperforms all compared models on pMHC immunogenicity prediction (area under the receiver operating characteristic curve: 0.619; average precision: 0.514), with a 7% increase in average precision compared to the next best model. ImmugenX shows further improved performance on immunogenicity prediction with the integration of TCR context information. An interpretability analysis of ImmugenX locates areas of weakness found across existing immunogenicity models and highlights possible biases in public datasets.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1553-734X; 1553-7358
Relation: https://doaj.org/toc/1553-734X; https://doaj.org/toc/1553-7358
DOI: 10.1371/journal.pcbi.1012511
Access URL: https://doaj.org/article/d0648697accf4d3db851dc1ee5b36eb3
Accession Number: edsdoj.0648697accf4d3db851dc1ee5b36eb3
Database: Directory of Open Access Journals