LEADER 02898ctm a22002897 4500
008 120216s2011 xx a b 000 0 eng d
z| 9781267026354 q| (electronic)
a| PU c| PU
a| He, Jing.
a| Statistical methods in mapping complex diseases / c| Jing He.
a| xii, 85 p. : b| ill. ; c| 29 cm.
a| Advisers: Mingyao Li; Hongzhe Li.
a| Thesis (Ph.D. in Epidemiology and Biostatistics) -- University of Pennsylvania, 2011.
a| Includes bibliographical references.
a| Penn dissertations x| Epidemiology and Biostatistics.
a| Epidemiology and Biostatistics x| Penn dissertations.
a| Epidemiology and Biostatistics.
a| Dissertations, Academic.
a| Li, Mingyao, e| advisor.
a| Li, Hongzhe, e| advisor.
a| University of Pennsylvania.
a| Genome-wide association studies have become a standard tool for disease gene discovery over the past few years. These studies have successfully identified genetic variants associated with complex diseases such as cardiovascular disease, diabetes, and cancer. Various statistical methods have been developed with the goal of improving power to find disease-causing variants. The major focus of this dissertation is to develop statistical methods for gene mapping studies and to apply them to real datasets to identify genetic markers associated with complex human diseases.
a| In my first project, I developed a method to detect gene-gene interactions by incorporating linkage disequilibrium (LD) information provided by external datasets such as the International HapMap Project or the 1000 Genomes Project. The next two projects in my dissertation are related to the analysis of secondary phenotypes in case-control genetic association studies. In these studies, a set of correlated secondary phenotypes that may share common genetic factors with disease status is often collected. However, due to unequal sampling probabilities between cases and controls, the standard regression approach for examining these secondary phenotypes can yield inflated type I error rates when the test SNPs are associated with the disease. To solve this issue, I propose a Gaussian copula approach to jointly model the disease status and the secondary phenotype. In my second project, I consider only one marker in the model and perform a test to assess whether the marker is associated with the secondary phenotype in the Gaussian copula framework. In my third project, I extend the copula-based approach to include a large number of candidate SNPs in the model. I propose a variable selection approach that selects markers associated with the secondary phenotype by applying a lasso penalty to the log-likelihood function.