How to handle high subgenome sequence similarity in allopolyploid Fragaria x ananassa: linkage disequilibrium based variant filtering.

Bibliographic Details
Title: How to handle high subgenome sequence similarity in allopolyploid Fragaria x ananassa: linkage disequilibrium based variant filtering.
Authors: Koorevaar, Tim (1,2; tim.koorevaar@wur.nl); Willemsen, Johan H. (1); Hildebrand, Dominic (1); Visser, Richard G.F. (2); Arens, Paul (2); Maliepaard, Chris (2)
Source: BMC Genomics. 11/28/2024, Vol. 25 Issue 1, p1-15. 15p.
Subject Terms: *WHOLE genome sequencing, *LINKAGE disequilibrium, *STRAWBERRIES, *GENE mapping, *ALLELES
Abstract: Background: The allo-octoploid Fragaria x ananassa follows disomic inheritance, yet the high sequence similarity among its subgenomes can cause short sequencing reads (150 bp) to be misaligned. This misalignment increases the number of erroneous variants produced during variant calling. To associate traits with the correct subgenome, these erroneous variants must be filtered out. Classifying variants into correct (type 1) and erroneous types (type 2, homoeologous variants, and type 3, multi-locus variants) can improve the reliability of downstream analyses. Results: Our analysis reveals that although erroneous variant types often display skewed average allele balances (AAB) for heterozygous calls, this measure alone is insufficient to identify them. To remove more of the erroneous variants, we employed a linkage disequilibrium (LD) based filtering method that correlates highly (99%) with an approach that uses a genetic map from a biparental population. The combined filtering strategy, using both the LD-based and average allele balance methods, resulted in the lowest switch error rate (0.037). Notably, our best filtering approach decreased phasing switch error rates by 44% while preserving 72% of the original dataset. Conclusions: The results indicate that erroneous variants caused by subgenome similarity can be identified effectively without extensive genotyping of mapping populations. Implementing the LD-based filtering method improved phasing accuracy, which in turn improves the traceability of important alleles in the germplasm and paves the way for a better understanding of trait associations in F. x ananassa. [ABSTRACT FROM AUTHOR]
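Illustrative note: the abstract names two filters (average allele balance and LD) but does not give their formulas or thresholds. The following is a minimal Python sketch of how such a combined filter could look, not the authors' implementation. It assumes diploid-style dosages (0, 1, 2), which is consistent with the disomic inheritance stated above. The function names, the AAB window of 0.3 to 0.7, and the r-squared cutoff of 0.8 are all hypothetical choices made for illustration.

```python
from statistics import mean


def average_allele_balance(het_depths):
    """Mean alternate-read fraction over heterozygous calls.

    het_depths: list of (ref_depth, alt_depth) tuples, one per heterozygous sample.
    Returns None when no call has any read coverage.
    """
    fractions = [alt / (ref + alt) for ref, alt in het_depths if ref + alt > 0]
    return mean(fractions) if fractions else None


def r_squared(dosages_a, dosages_b):
    """Squared Pearson correlation between two per-sample dosage vectors."""
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic variant carries no LD information
    return cov * cov / (var_a * var_b)


def keep_variant(het_depths, dosages, neighbour_dosages,
                 aab_range=(0.3, 0.7), min_r2=0.8):
    """Keep a variant only if it passes both filters.

    aab_range and min_r2 are assumed values, not taken from the paper.
    neighbour_dosages: dosage vectors of nearby variants on the same chromosome.
    """
    aab = average_allele_balance(het_depths)
    if aab is None or not (aab_range[0] <= aab <= aab_range[1]):
        return False  # skewed allele balance suggests a misaligned (type 2 or 3) variant
    # Require LD with at least one neighbouring variant.
    return any(r_squared(dosages, other) >= min_r2 for other in neighbour_dosages)
```

In this sketch a variant with balanced heterozygous read depths that is strongly correlated with a neighbouring variant is kept, while a variant whose heterozygous calls carry mostly reference reads is dropped regardless of its LD.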
Database: Academic Search Complete
ISSN: 1471-2164
DOI: 10.1186/s12864-024-10987-8
Published in: BMC Genomics
Language: English