Royal College of Surgeons in Ireland

Marked variation in predicted and observed variability of tandem repeat loci across the human genome.

journal contribution
posted on 2019-11-22, 16:21; authored by Colm T. O'Dushlaine, Denis C. Shields

BACKGROUND: Tandem repeat (TR) variants in the human genome play key roles in a number of diseases. However, current models predicting variability are based on limited training sets. We conducted a systematic analysis of TRs of unit lengths 2-12 nucleotides in Whole Genome Shotgun (WGS) sequences to define the extent of variation of 209,214 unique repeat loci throughout the genome.

RESULTS: We applied a multivariate statistical model to predict TR variability. Predicted heterozygosity correlated with heterozygosity in the CEPH polymorphism database (rho = 0.29, p < 0.0005) more strongly than the CEPH and WGS data correlated with each other (rho = 0.17), presumably because the model smooths noise from small sample sizes. A multivariate logistic model of 8 parameters accounted for 36% of the variation in the WGS data. Validation studies of 70 experimentally investigated TRs revealed high concordance with the model's predictions (p < 0.0001).

CONCLUSION: Variability among 2-12-mer TRs in the genome can be modeled by a few parameters, which do not markedly differ according to unit length, consistent with a common mechanism for the generation of variability among such TRs. Analysis of the distributions of observed and predicted variants across the genome showed a general concordance, indicating that the repeat variation dataset does not exhibit strong regional ascertainment biases. This analysis revealed a deficit of variant repeats in chromosomes 19 and Y, likely to reflect a reduction in 2-mer repeats in the former and a reduced level of recombination in the latter, and excesses in chromosomes 6, 13, 20 and 21.

Comments

This article is also available from www.biomedcentral.com

Published Citation

O'Dushlaine CT, Shields DC. Marked variation in predicted and observed variability of tandem repeat loci across the human genome. BMC Genomics. 2008;9:175.

Publication Date

2008-04-16

PubMed ID

18416815

Department/Unit

  • School of Pharmacy and Biomolecular Sciences
