"Predicting genome-wide transcription factor binding and residence time from chromatin accessibility data"
The majority of genetic variants implicated in complex diseases and traits by genome-wide association studies (GWAS) are non-coding. Rather than affecting protein structure, these variants are therefore likely to disrupt regulatory modules in the cell and dysregulate the expression of downstream genes. GWAS variants are expected to act by creating or disrupting transcription factor (TF) binding sites at key regulatory loci. Because chromatin immunoprecipitation sequencing (ChIP-seq) can evaluate the genome-wide binding patterns of only one TF at a time, we instead focus on chromatin accessibility data, using the assay for transposase-accessible chromatin followed by sequencing (ATAC-seq) to predict binding for many TFs simultaneously. My presentation will discuss some of the available methods for predicting TF binding from chromatin accessibility data, their strengths and weaknesses, and our own approaches to this problem, as well as their application to understanding the genetic mechanisms of type 2 diabetes in muscle and pancreatic islets. Additionally, I will discuss how ATAC-seq data can be used to estimate biophysical properties of TFs, such as their residence times, and how these parameters relate to the performance of TF binding prediction methods.