Robust Benchmark Structural Variant Calls of An Asian Using State-of-the-art Long-read Sequencing Technologies

Bibliographic Details
Title: Robust Benchmark Structural Variant Calls of An Asian Using State-of-the-art Long-read Sequencing Technologies
Authors: Xiao Du, Lili Li, Fan Liang, Sanyang Liu, Wenxin Zhang, Shuai Sun, Yuhui Sun, Fei Fan, Linying Wang, Xinming Liang, Weijin Qiu, Guangyi Fan, Ou Wang, Weifei Yang, Jiezhong Zhang, Yuhui Xiao, Yang Wang, Depeng Wang, Shoufang Qu, Fang Chen, Jie Huang
Source: Genomics, Proteomics & Bioinformatics, Vol 20, Iss 1, Pp 192-204 (2022)
Publisher Information: Oxford University Press, 2022.
Publication Year: 2022
Collection: LCC:Biology (General); LCC:Computer applications to medicine. Medical informatics
Subject Terms: Asian benchmark, Reference material, Structural variation, Haplotype-resolved, Sanger validation, Biology (General), QH301-705.5, Computer applications to medicine. Medical informatics, R858-859.7
More Details: The importance of structural variants (SVs) for human phenotypes and diseases is now recognized. Although a variety of SV detection platforms and strategies that vary in sensitivity and specificity have been developed, few benchmarking procedures are available to confidently assess their performance in biological and clinical research. To facilitate the validation and application of these SV detection approaches, we established an Asian reference material by characterizing the genome of an Epstein-Barr virus (EBV)-immortalized B lymphocyte line, along with identified benchmark regions and high-confidence SV calls. We established a high-confidence SV callset with 8938 SVs by integrating four alignment-based SV callers, applied to 109× Pacific Biosciences (PacBio) continuous long reads (CLRs), 22× PacBio circular consensus sequencing (CCS) reads, 104× Oxford Nanopore Technologies (ONT) long reads, and 114× Bionano optical mapping data, and one de novo assembly-based SV caller using CCS reads. A total of 544 randomly selected SVs were validated by PCR amplification and Sanger sequencing, demonstrating the robustness of our SV calls. Combining trio-binning-based haplotype assemblies, we established an SV benchmark for identifying false negatives and false positives by constructing the continuous high-confidence regions (CHCRs), which covered 1.46 gigabase pairs (Gb) and 6882 SVs supported by at least one diploid haplotype assembly. Establishing high-confidence SV calls for a benchmark sample that has been characterized by multiple technologies provides a valuable resource for investigating SVs in human biology, disease, and clinical research.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1672-0229
Relation: http://www.sciencedirect.com/science/article/pii/S1672022921000462; https://doaj.org/toc/1672-0229
DOI: 10.1016/j.gpb.2020.10.006
Access URL: https://doaj.org/article/479691735ac349b781de159587e94e9d
Accession Number: edsdoj.479691735ac349b781de159587e94e9d
Database: Directory of Open Access Journals
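The abstract describes using the benchmark to identify false negatives and false positives within the continuous high-confidence regions (CHCRs). A minimal sketch of that kind of comparison follows. It is not the authors' pipeline: the `SV` record, the `in_regions`, `matches`, and `evaluate` helpers, the greedy one-to-one matching, and the 500 bp distance and 0.7 size-ratio thresholds are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a format defined by the paper)."""
    chrom: str
    pos: int      # start coordinate
    svtype: str   # e.g. "DEL" or "INS"
    length: int   # absolute SV length in bp, assumed > 0


def in_regions(sv, regions):
    """True if the SV start lies in any (chrom, start, end) half-open interval."""
    return any(c == sv.chrom and start <= sv.pos < end
               for c, start, end in regions)


def matches(a, b, max_dist=500, min_size_ratio=0.7):
    """Assumed matching rule: same chromosome and type, nearby, similar size."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    longest = max(a.length, b.length)
    return longest > 0 and min(a.length, b.length) / longest >= min_size_ratio


def evaluate(calls, truth, regions, **match_kwargs):
    """Count TP/FP/FN inside the benchmark regions with greedy 1:1 matching."""
    calls = [c for c in calls if in_regions(c, regions)]
    unmatched = [t for t in truth if in_regions(t, regions)]
    tp = fp = 0
    for call in calls:
        hit = next((t for t in unmatched if matches(call, t, **match_kwargs)), None)
        if hit is None:
            fp += 1                # call with no benchmark counterpart
        else:
            unmatched.remove(hit)  # each benchmark SV can be matched only once
            tp += 1
    fn = len(unmatched)            # benchmark SVs that no call recovered
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}
```

Restricting both callsets to the benchmark regions before counting is what keeps a caller from being penalized for calls, or credited for misses, in regions the benchmark does not claim to characterize.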