Areas of Interest
My research focuses on practical, accurate, and efficient methods for big data genome science. Methodologically, my primary focus is on developing statistical methods and computational tools for large-scale genetic studies. Scientifically, my research aims to understand the etiology of complex disease traits, including type 2 diabetes, bipolar disorder, cardiovascular diseases, and glomerular diseases. I have comprehensive expertise in the analysis of ultra-high-throughput sequence data for studying the genetics of complex traits. I have developed widely used statistical and computational methods for association analysis under hidden sample structure (EMMA, EMMAX), for analysis of high-throughput DNA sequence data (GotCloud, EPACTS, verifyBamID, cleanCall), for haplotyping and genotype imputation (EMINIM, thunderVCF), and for analysis of expression data (ICE, MMC). This integrative expertise in statistical and computational methods for sequence-based genetic studies, combined with analytic experience from large-scale sequencing projects, enables me to develop practical and scalable methods for the analytic challenges of genetic studies using ultra-high-throughput sequence reads, which will be produced at an unprecedented scale in the next several years.