Thursday, October 18, 2018

BISTRO - Chengxin Zhang

4:00 PM

2036 Palmer Commons

BISTRO is restricted to U-M Bioinformatics Graduate Program students and faculty.

"C-QUARK: Accurate De Novo Protein Structure Prediction by Deep Multiple Sequence Alignment Construction and Comprehensive Conformation Sampling"


Prediction of protein structures from sequences remains a challenging problem, especially for hard protein targets with little sequence homology to experimentally solved structures in the PDB. To address this problem, we developed C-QUARK, a de novo protein structure prediction pipeline for folding non-homologous proteins. For a given sequence, C-QUARK first builds a deep multiple sequence alignment (MSA), from which contacts (i.e., residue pairs whose Cβ atoms are less than 8 Å apart) are predicted by coevolution analysis and deep learning. These predicted contacts, together with other knowledge-based energy terms, guide the assembly of structure fragments into a full-length protein structure by Replica Exchange Monte Carlo (REMC) simulation. In the REMC simulation, conformations are sampled by a diverse set of 12 movements, which comprise atomic-, residue-, and topology-level changes. C-QUARK participated in rounds 12 and 13 of the community-wide Critical Assessment of protein Structure Prediction (CASP) challenge and was shown to achieve state-of-the-art prediction accuracy. In particular, its performance was on par with I-TASSER, our lab’s template-based protein structure prediction server, which has been ranked as the most accurate pipeline in CASP competitions over the last decade. The C-QUARK webserver is available at