Datta Mellacheruvu

Datta Mellacheruvu, Ph.D.
15

Ph.D. Program
Sr. Bioinformatics Scientist
Personalis, Inc.

Chair

Dissertation Title

A Computational and Informatics Framework for the Analysis of Affinity Purification Mass Spectrometry Data and Reconstruction of Protein Interaction Networks

Research Interest

Affinity purification coupled with mass spectrometry (AP-MS) is a high-throughput approach to detecting protein interactions: a protein complex is affinity-purified using an antibody targeted against a clonally modified member of the complex that expresses an epitope tag (the bait), and the purified material is analyzed by protein mass spectrometry. In most cases, the co-purifying proteins (the prey) include a large number of non-specific interactors, so methods that discern bona fide interactions from non-specific background are critical to the successful application of AP-MS technology.
In this thesis, I present a computational and informatics framework for streamlined analysis of AP-MS data, which includes (i) a pipeline for scoring and detecting potentially bona fide protein interactions (SPrInt), (ii) an integrated network reconstruction tool (PInt), (iii) a database of standardized negative control experiments that assist in scoring interactions (the CRAPome), and (iv) a protein interaction database that aggregates uniformly scored interactions (the RePrInt).
SPrInt implements two novel scoring functions that are complementary to previously published models, along with several visualizations that assist in filtering data. PInt facilitates systematic network reconstruction and analysis by integrating prior knowledge from protein interaction databases and providing tools to dissect large networks into constituent sub-networks. Together, SPrInt and PInt are versatile tools for analyzing a wide variety of AP-MS data sets.
Small-scale AP-MS studies may not capture the complete set of non-specific interactions because few negative controls are available. Fortunately, negative controls are largely bait-independent and can be aggregated to increase the coverage and characterization of the background. Accordingly, we created the first large-scale, publicly accessible repository of negative controls (the CRAPome, www.crapome.org) and demonstrated its utility using a benchmark dataset.
Current protein interaction databases are built from curated lists of protein interactions. Manual curation is limited in scope, while computational curation often produces a high false-positive rate. We present an alternative, data-driven approach to creating protein interaction databases: aggregating raw data and making uniformly scored interactions available (the RePrInt). We also present a novel pipeline for comprehensive network reconstruction that includes a model for merging evidence from multiple datasets.
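The idea that negative controls are largely bait-independent and can be pooled to characterize the background can be illustrated with a minimal sketch. This is not the scoring implemented in SPrInt or the CRAPome; the function names, the pseudocount, the protein identifiers, and the spectral counts below are all hypothetical and chosen only for illustration. The sketch pools several control runs, records how often each protein appears in them, and compares a prey's count in a bait purification against its average count in the pooled controls.

```python
from collections import defaultdict


def aggregate_controls(controls):
    """Pool negative-control runs into a per-protein background summary.

    controls: list of dicts, one per control run, mapping a protein
    identifier to its spectral count in that run (hypothetical format).
    """
    n_runs = len(controls)
    totals = defaultdict(int)    # summed spectral counts per protein
    detected = defaultdict(int)  # number of runs in which the protein was seen
    for run in controls:
        for protein, count in run.items():
            if count > 0:
                totals[protein] += count
                detected[protein] += 1
    return {
        protein: {
            "frequency": detected[protein] / n_runs,
            "mean_count": totals[protein] / n_runs,
        }
        for protein in totals
    }


def fold_change(prey_count, background, protein, pseudocount=1.0):
    """Ratio of a prey's count to its mean background count.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for proteins never observed in any control.
    """
    mean_bg = background.get(protein, {}).get("mean_count", 0.0)
    return (prey_count + pseudocount) / (mean_bg + pseudocount)


# Four hypothetical negative-control runs with made-up spectral counts.
controls = [
    {"HSPA8": 12, "TUBB": 5},
    {"HSPA8": 8},
    {"HSPA8": 10, "TUBB": 3, "ACTB": 4},
    {},
]
background = aggregate_controls(controls)

# A prey seen with 20 spectra in a bait purification.
common_contaminant = fold_change(20, background, "HSPA8")
unseen_in_controls = fold_change(20, background, "BRCA2")
```

With more control runs in the pool, a protein that is a frequent contaminant is more likely to be represented in the background, which is the coverage argument made above. A prey that is abundant in the controls receives a low ratio, while one that never appears in the controls receives a high one.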

Current Placement

Personalis, Inc.