Abstract

Identifying haplotype patterns is fundamental to genome-based inference and prediction. Widely used linkage disequilibrium (LD) measures are calculated for the whole population and therefore yield relatively short blocks, neglecting patterns over longer segments of linkage. In contrast, we define haplotypes for segments that are similar within subgroups of the population (“haplotype blocks”). Unlike common definitions, we conceptualize a haplotype block as a sequence of alleles, and only haplotypes carrying that sequence belong to the block. From these haplotype blocks we construct a haplotype library that represents a large proportion of the genetic variability with a limited number of blocks. The algorithm first computes a cluster that allows efficient screening for longer shared segments, from which the actual haplotype blocks are identified. The haplotype library is then compiled by iteratively merging and extending blocks while eliminating less important ones. Depending on the application, the haplotype library can be optimized toward different goals (e.g., the identification of shared segments between different breeds). Our methods are implemented in the R package HaploBlocker (unpublished so far). Applying this method, we reduce a dataset comprising 64 trios of commercial brown layers (102k SNPs, chromosome 1 only) to 878 haplotype blocks representing 92.5% of the dataset. The average size of a haplotype block is 1621 SNPs, whereas the blocks identified with default settings in HaploView (Barrett et al., 2005) have an average size of 7.2 SNPs in the same dataset. By using haplotype blocks instead of SNPs, the typical p >> n problem of genetic datasets can be reduced, allowing the application of a wide variety of new methods.

Keywords: haplotype blocks, variable reduction, population genetics, genomic prediction, R package
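
The block definition used above (a block is an allele sequence over a segment, and only haplotypes carrying exactly that sequence belong to it) can be illustrated with a minimal Python sketch. This is not the HaploBlocker implementation or its API: the names `Block`, `block_members`, and `coverage`, the toy haplotypes, and the way coverage is counted (share of haplotype-by-SNP cells that fall inside at least one block) are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Set


class Block(NamedTuple):
    """Hypothetical block representation: a segment plus its allele sequence."""
    start: int    # index of the first SNP in the segment (inclusive)
    end: int      # index one past the last SNP in the segment (exclusive)
    alleles: str  # allele sequence that defines the block


def block_members(haplotypes: List[str], block: Block) -> Set[int]:
    """Indices of haplotypes that carry exactly the block's allele sequence."""
    return {
        i
        for i, hap in enumerate(haplotypes)
        if hap[block.start:block.end] == block.alleles
    }


def coverage(haplotypes: List[str], library: List[Block]) -> float:
    """Share of (haplotype, SNP) cells covered by at least one block (assumed measure)."""
    n_snps = len(haplotypes[0])
    covered = [[False] * n_snps for _ in haplotypes]
    for block in library:
        for i in block_members(haplotypes, block):
            for j in range(block.start, block.end):
                covered[i][j] = True
    return sum(sum(row) for row in covered) / (len(haplotypes) * n_snps)


# Toy data (assumed): four haplotypes over eight biallelic SNPs, coded 0/1.
haplotypes = ["00110101", "00110011", "11110101", "11001011"]
library = [Block(0, 4, "0011"), Block(4, 8, "0101")]

members_first = block_members(haplotypes, library[0])
library_coverage = coverage(haplotypes, library)
```

In this toy example each block is shared by only a subgroup of the haplotypes, which is the contrast with population-wide LD blocks drawn in the abstract.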

Torsten Pook, Martin Schlather, Gustavo de los Campos, David Cavero, Henner Simianer

Proceedings of the World Congress on Genetics Applied to Livestock Production, Volume Methods and Tools - Imputation, 622, 2018

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.