Understanding the genetic and molecular architecture of human disease is accelerated by robust model development and large-scale molecular profiling. I will present two studies that leverage big data analytics or automated machine learning to dissect the complexities of human disease: (1) Molecular and clinical signatures of SARS-CoV-2 infection in US Marines. This analysis revealed a strong antiviral innate immune set point in females that contributes to sex differences in both the molecular and clinical responses to SARS-CoV-2 infection. A set of accurate biomarkers capable of detecting PCR false-negative infections was implemented on small-footprint devices. (2) Automated machine learning to interpret the effects of genetic variants. An automated framework, AMBER, was developed to efficiently search neural network architectures for modeling genomic sequences. AMBER is useful in various biological applications, including fine-mapping variants, partitioning genetic heritability, and personalized medicine enabled by CRISPR/Cas9 genome editing. Together, these efforts demonstrate that quantitative methods coupled with large-scale biomedical data are an effective avenue for decoding human regulatory and disease biology.
Frank Zhang has been a Flatiron Research Fellow with Olga Troyanskaya at the Simons Foundation and Princeton University since 2019. Prior to that, he obtained his PhD at UCLA with Yi Xing. His research focuses on developing machine learning and statistical methods for predicting and interpreting human molecular and genetic variation using biological big data. Recently, he has been working on adopting and developing cutting-edge neural architecture search methods to automate the design of deep neural networks in genomics. He is also interested in making deep learning in biomedicine more interpretable and equitable.