"Methods for Analyzing the 4D Nucleome, with Application to Cellular Reprogramming"
The dynamical relationship between chromatin structure, gene transcription, and cellular phenotype is referred to as the 4D Nucleome (4DN). 4DN data analysis has gained significant research attention in recent years for its ability to illuminate the principles of cell regulation. Despite its benefits, 4DN analysis can be difficult: data sets are large, analysis methods are underdeveloped, and the work requires an advanced understanding of biology, computer science, and mathematics. In this dissertation, I present novel analysis methods that address these issues. Using these methods, I help uncover 4DN relationships in the cell cycle, the circadian rhythm, and cellular reprogramming.
First, I investigate methods for analyzing time-series cellular images. In this work, I demonstrate how the detection of a "canonical framework," a genomic coordinate system that is consistent both within and between time points, can help researchers elucidate latent patterns within cellular imaging data. Next, I examine 4DN data collected on proliferating human fibroblasts. Dynamical structure-function relationships between genes are uncovered in known gene modules, including cell cycle, circadian rhythm, and wound healing gene networks. Building upon this data set, I help develop an algorithm for cellular reprogramming. This algorithm models the dynamics of human fibroblast proliferation and determines where and when to input control using transcription factors (TFs) to achieve optimal reprogramming efficiency. The algorithm recovers TFs known to successfully reprogram between cell types, including MYOD1 for fibroblast-to-muscle-cell reprogramming, which validates its predictions. Next, I analyze 4DN data collected on fibroblasts undergoing MYOD1-mediated reprogramming to muscle cells to reveal previously unknown relationships between structure, function, and biological processes. A connection between MYOD1 and the core circadian clock gene network is also uncovered in this work. I continue with a 4DN analysis of cancer cells, including colorectal and breast cancer cells, where changes in structure and function show a clear relationship with disease state. I conclude with the description of a software toolbox containing novel methods for the analysis of 4DN data. Collectively, this work provides a comprehensive guide for the analysis of 4DN data, spanning a range of experimental methods and applications.