Statistical theory of combinatorial library of proteins with negative design and biased Monte Carlo method for protein sequence optimization and sampling [electronic resource].

Zou, Jinming.
179 p.
Contained In:
Dissertation Abstracts International 63-11B.

Chemistry, Physical and theoretical.
System Details:
Mode of access: World Wide Web.
Combinatorial experiments provide a way to study a large number (10^4 to 10^12) of protein sequences at the same time. Sequences are created with a desired degree of diversity and screened for evidence of folding to a predetermined structure or of specific functional properties. The exponentially large number of possible sequences (10^130 for a 100-residue protein), however, complicates combinatorial experiments. A statistical theory for combinatorial library design of folding proteins is developed. The theory addresses the whole space of available compositions, not just the small fraction that is accessible to experiment and to computational enumeration and sampling. The theory takes as input a target backbone structure and a scoring or energy function for quantifying sequence-structure compatibility, and it yields the site-specific amino acid probabilities. The theory is formulated to include not only the energy of the target structure but also elements of negative design. The theory is tested using a simple lattice model, and excellent agreement with exact enumeration results is observed. The theory is then applied to an all-atom protein. Atomistic and simplified potentials for protein design are examined. A mean field biased Monte Carlo (MFBMC) method is developed that uses the identity probabilities from the statistical theory for protein sequence optimization and sampling. Compared with the classic Monte Carlo and configurational bias Monte Carlo methods, the MFBMC method is more efficient for sequence design and sampling in most cases. Using a cluster variational method, the statistical theory is also formulated to directly address correlations among residue sites. In an application using a higher order foldability criterion, superposition approximations for three-body and four-body probabilities give excellent results.
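The idea of using site-specific identity probabilities to bias sequence sampling can be illustrated with a minimal sketch. This is not the dissertation's MFBMC algorithm: it is a generic Metropolis-Hastings sampler in which trial residues at a site are drawn from per-site probabilities and the acceptance rule corrects for that bias. The toy energy (one-body terms plus nearest-neighbor pair terms), the use of one-body Boltzmann weights as a stand-in for the theory's probabilities, and all function names are assumptions made for illustration.

```python
import math
import random


def site_probabilities(one_body, beta):
    """Per-site residue probabilities from one-body energies only.

    Assumption: a crude stand-in for the site-specific probabilities
    that the statistical theory would supply.
    """
    probs = []
    for energies in one_body:
        weights = [math.exp(-beta * e) for e in energies]
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs


def energy(seq, one_body, pair):
    """Toy energy: one-body terms plus nearest-neighbor pair terms."""
    e = sum(one_body[i][a] for i, a in enumerate(seq))
    e += sum(pair[seq[i]][seq[i + 1]] for i in range(len(seq) - 1))
    return e


def biased_mc(one_body, pair, beta, n_steps, rng):
    """Sample sequences with biased single-site moves.

    A trial residue at a random site is drawn from that site's
    probabilities; the Metropolis-Hastings acceptance ratio includes
    the factor probs[old] / probs[new] so that the Boltzmann
    distribution is still the stationary distribution.
    Returns the lowest-energy sequence visited and its energy.
    """
    n_sites = len(one_body)
    n_types = len(one_body[0])
    probs = site_probabilities(one_body, beta)
    seq = [rng.randrange(n_types) for _ in range(n_sites)]
    e_cur = energy(seq, one_body, pair)
    best_seq, best_e = list(seq), e_cur
    for _ in range(n_steps):
        i = rng.randrange(n_sites)
        old = seq[i]
        new = rng.choices(range(n_types), weights=probs[i])[0]
        if new == old:
            continue  # null move, nothing to evaluate
        seq[i] = new
        e_new = energy(seq, one_body, pair)
        log_ratio = (-beta * (e_new - e_cur)
                     + math.log(probs[i][old]) - math.log(probs[i][new]))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            e_cur = e_new
            if e_cur < best_e:
                best_seq, best_e = list(seq), e_cur
        else:
            seq[i] = old  # reject: restore the previous residue
    return best_seq, best_e
```

Calling `biased_mc(one_body, pair, beta, n_steps, random.Random(seed))` with `one_body` as a sites-by-types table and `pair` as a types-by-types table returns the best sequence found as a list of residue indices. When the bias probabilities resemble the true marginals, fewer proposals are rejected than with uniform trial moves, which is the general motivation for biased schemes of this kind.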
Source: Dissertation Abstracts International, Volume: 63-11, Section: B, page: 5272.
Supervisor: J. G. Saven.
Thesis (Ph.D.)--University of Pennsylvania, 2002.
Local notes:
School code: 0175.
University of Pennsylvania.
Access Restriction:
Restricted for use by site license.