Academic Journal
Large indel detection in region-based phased diploid assemblies from linked-reads
Title: | Large indel detection in region-based phased diploid assemblies from linked-reads |
---|---|
Authors: | Can Luo, Brock A. Peters, Xin Maizie Zhou |
Source: | BMC Genomics, Vol 26, Iss S2, Pp 1-13 (2025) |
Publisher Information: | BMC, 2025. |
Publication Year: | 2025 |
Collection: | LCC:Biotechnology; LCC:Genetics |
Subject Terms: | Structural variants, Phasing, Region-based, Diploid assembly, Linked-reads, Biotechnology, TP248.13-248.65, Genetics, QH426-470 |
More Details: | Abstract. Background: Linked-reads improve de novo assembly, haplotype phasing, structural variant (SV) detection, and other applications through highly multiplexed genome partitioning and barcoding. However, whole-genome assembly and assembly-based variant detection from linked-reads are computationally intensive and therefore poorly suited to large population studies. Here we propose RegionIndel, an efficient region-based diploid assembly pipeline for characterizing large indel SVs. The pipeline extracts barcoded reads only from target regions (50 kb by default), then integrates a haplotyping algorithm with local assembly to generate phased diploid contiguous sequences (contigs). Finally, it detects variants in the contigs through pairwise contig-to-reference comparison. Results: We applied RegionIndel to two linked-read libraries of sample HG002, one prepared with 10x and the other with stLFR. HG002 is a well-studied sample for which the Genome in a Bottle (GiaB) community provides a gold-standard SV set. RegionIndel outperformed several assembly-based and alignment-based SV callers in our benchmark experiments. After assembling all indel SVs, RegionIndel achieved an overall F1 score of 74.8% for deletions and 61.8% for insertions with 10x linked-reads, and 64.3% for deletions and 36.7% for insertions with stLFR linked-reads. It also achieved an overall genotyping accuracy of 83.6% and 80.8% for 10x and stLFR linked-reads, respectively. Conclusions: RegionIndel can achieve diploid assembly and detect indel SVs in each target region. The phased diploid contigs further allow investigation of indel SVs together with nearby linked single nucleotide polymorphisms (SNPs) and small indels on the same haplotype. |
Document Type: | article |
File Description: | electronic resource |
Language: | English |
ISSN: | 1471-2164 |
Relation: | https://doaj.org/toc/1471-2164 |
DOI: | 10.1186/s12864-025-11398-z |
Access URL: | https://doaj.org/article/475b87d106c94b1cacbf19de2c60e457 |
Accession Number: | edsdoj.475b87d106c94b1cacbf19de2c60e457 |
Database: | Directory of Open Access Journals |
Published in: | BMC Genomics |
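
The abstract describes restricting analysis to fixed-size target regions (50 kb by default) and reports results as F1 scores. The sketch below illustrates both ideas in Python under stated assumptions: it is not RegionIndel's code, the function names `target_regions` and `f1_score` are hypothetical, and the tiling scheme (non-overlapping, half-open windows) is an assumption, since the record does not say how regions are chosen. The F1 function uses the standard definition as the harmonic mean of precision and recall.

```python
from typing import Iterator, Tuple


def target_regions(chrom_length: int, region_size: int = 50_000) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows that tile [0, chrom_length).

    Assumption: regions are non-overlapping and the last one may be shorter.
    The 50 kb default mirrors the default region size quoted in the abstract.
    """
    if region_size <= 0:
        raise ValueError("region_size must be positive")
    for start in range(0, chrom_length, region_size):
        yield start, min(start + region_size, chrom_length)


def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the harmonic mean of precision and recall (0.0 if undefined)."""
    called = true_pos + false_pos   # all calls made by the tool
    truth = true_pos + false_neg    # all variants in the truth set
    precision = true_pos / called if called else 0.0
    recall = true_pos / truth if truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, `list(target_regions(120_000))` produces three windows, the last one covering the remaining 20 kb, and `f1_score(8, 2, 2)` combines a precision and a recall of 0.8 each. The counts here are illustrative only and are not taken from the paper's benchmark.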