Journal of Probability and Statistics
Volume 2012 (2012), Article ID 375935, 19 pages
Research Article

Predicting Disease Onset from Mutation Status Using Proband and Relative Data with Applications to Huntington's Disease

1 Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY 10032, USA
2 Department of Statistics, Texas A&M University, College Station, TX 77843, USA
3 Departments of Neurology and Psychiatry and Sergievsky Center and the Taub Institute, Columbia University Medical Center, New York, NY 10032, USA
4 Department of Psychiatry and Biostatistics (Secondary), University of Iowa, Iowa City, IA 52242, USA

Received 15 December 2011; Accepted 22 February 2012

Academic Editor: Yongzhao Shao

Copyright © 2012 Tianle Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Huntington's disease (HD) is a progressive neurodegenerative disorder caused by an expansion of CAG repeats in the IT15 gene. The age-at-onset (AAO) of HD is inversely related to the CAG repeat length, and the minimum length thought to cause HD is 36. Accurate estimation of the AAO distribution as a function of CAG repeat length is important for genetic counseling and for the design of clinical trials. In the Cooperative Huntington's Observational Research Trial (COHORT) study, the CAG repeat length is known for the proband participants, but it is unknown whether a family member shares the proband's huntingtin gene status (CAG expanded or not). In this work, we use the expectation-maximization (EM) algorithm to handle the missing huntingtin gene information in first-degree family members in COHORT, assuming that a family member who carries a huntingtin gene mutation has the same CAG length as the proband. We perform simulation studies to examine the performance of the proposed method and apply it to the combined COHORT proband and family data. Our analyses reveal that the estimated cumulative risk of HD symptom onset obtained from the combined data is slightly lower than the risk estimated from the proband data alone.
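To illustrate how an EM algorithm can handle unobserved mutation status in relatives, the following is a minimal toy sketch, not the model used in this article. It rests on several assumptions that are not taken from the text: each relative carries the mutation with a known prior probability of 0.5, onset age among carriers follows an exponential distribution with a single rate (CAG repeat length is ignored), non-carriers never develop HD, and relatives are right-censored at their last observed age. The function names `em_exponential_onset` and `simulate_relatives` are hypothetical.

```python
import math
import random


def em_exponential_onset(times, events, carrier_prior=0.5, n_iter=200, tol=1e-10):
    """Toy EM estimate of the onset rate among carriers when carrier
    status is unobserved (assumed exponential onset, known carrier prior)."""
    n_events = sum(events)
    # Starting value: naive rate that treats every relative as a carrier.
    rate = n_events / sum(times)
    for _ in range(n_iter):
        # E-step: posterior probability that each relative is a carrier.
        weights = []
        for t, d in zip(times, events):
            if d == 1:
                # Observed onset implies carrier status under this toy model.
                weights.append(1.0)
            else:
                surv = math.exp(-rate * t)  # carrier still onset-free at age t
                weights.append(
                    carrier_prior * surv
                    / (carrier_prior * surv + (1.0 - carrier_prior))
                )
        # M-step: weighted exponential MLE (events / weighted time at risk).
        new_rate = n_events / sum(w * t for w, t in zip(weights, times))
        converged = abs(new_rate - rate) < tol
        rate = new_rate
        if converged:
            break
    return rate


def simulate_relatives(n, true_rate, carrier_prior=0.5, seed=0):
    """Simulate (observed age, onset indicator) pairs for hypothetical relatives."""
    rng = random.Random(seed)
    times, events = [], []
    for _ in range(n):
        censor_age = rng.uniform(10.0, 60.0)  # assumed age at last observation
        is_carrier = rng.random() < carrier_prior
        onset = rng.expovariate(true_rate) if is_carrier else math.inf
        times.append(min(onset, censor_age))
        events.append(1 if onset <= censor_age else 0)
    return times, events


times, events = simulate_relatives(n=5000, true_rate=0.05)
estimate = em_exponential_onset(times, events)
print(f"estimated onset rate among carriers: {estimate:.4f}")
```

In this sketch the E-step down-weights relatives who remain onset-free at older ages, since they are increasingly likely to be non-carriers, and the M-step re-estimates the rate from the weighted time at risk. Treating every relative as a carrier (the starting value) would understate the rate, because non-carriers contribute time at risk without ever being at risk.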