Large indel detection in region-based phased diploid assemblies from linked-reads

Bibliographic Details
Title: Large indel detection in region-based phased diploid assemblies from linked-reads
Authors: Can Luo, Brock A. Peters, Xin Maizie Zhou
Source: BMC Genomics, Vol 26, Iss S2, Pp 1-13 (2025)
Publisher Information: BMC, 2025.
Publication Year: 2025
Collection: LCC:Biotechnology
LCC:Genetics
Subject Terms: Structural variants, Phasing, Region-based, Diploid assembly, Linked-reads, Biotechnology, TP248.13-248.65, Genetics, QH426-470
More Details: Abstract. Background: Linked-reads improve de novo assembly, haplotype phasing, structural variant (SV) detection, and other applications through highly multiplexed genome partitioning and barcoding. Whole-genome assembly and assembly-based variant detection from linked-reads are often computationally intensive and therefore poorly suited to large population studies. Here we propose RegionIndel, an efficient region-based diploid assembly pipeline for characterizing large indel SVs. The pipeline restricts itself to target regions (50 kb by default), extracts the barcoded reads in each region as input, and then combines a haplotyping algorithm with local assembly to generate phased diploid contiguous sequences (contigs). Finally, it detects variants in the contigs through a pairwise contig-to-reference comparison. Results: We applied RegionIndel to two linked-reads libraries of sample HG002, one using 10x and the other stLFR. HG002 is a well-studied sample, and the Genome in a Bottle (GiaB) community provides a gold standard SV set for it. RegionIndel outperformed several assembly-based and alignment-based SV callers in our benchmark experiments. After assembling all indel SVs, RegionIndel achieved an overall F1 score of 74.8% in deletions and 61.8% in insertions for 10x linked-reads, and 64.3% in deletions and 36.7% in insertions for stLFR linked-reads. Furthermore, it achieved an overall genotyping accuracy of 83.6% and 80.8% for 10x and stLFR linked-reads, respectively. Conclusions: RegionIndel can achieve diploid assembly and detect indel SVs in each target region. The phased diploid contigs further allow us to investigate indel SVs together with nearby linked single nucleotide polymorphisms (SNPs) and small indels in the same haplotype.
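Note on the F1 scores quoted in the abstract: F1 is the harmonic mean of precision and recall. The sketch below shows that standard definition only; it is not taken from RegionIndel or from the benchmarking tool the authors used. The function name and the counts in the usage line are hypothetical and chosen purely for illustration.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    All three arguments are counts of SV calls compared against a truth set.
    Returns 0.0 when precision or recall is undefined or both are zero.
    """
    called = true_positives + false_positives   # calls made by the tool
    truth = true_positives + false_negatives    # calls present in the truth set
    if called == 0 or truth == 0:
        return 0.0
    precision = true_positives / called
    recall = true_positives / truth
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the paper:
# 80 matching calls, 20 spurious calls, 20 missed calls.
example = f1_score(80, 20, 20)
```

With the hypothetical counts above, precision and recall are both 0.8, so the F1 score is also 0.8.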
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1471-2164
Relation: https://doaj.org/toc/1471-2164
DOI: 10.1186/s12864-025-11398-z
Access URL: https://doaj.org/article/475b87d106c94b1cacbf19de2c60e457
Accession Number: edsdoj.475b87d106c94b1cacbf19de2c60e457
Database: Directory of Open Access Journals