Where does that Protein’s Active Site Lie? What is its Structure?

Date

2011-01

Authors

Steinberger, Joseph M.

Publisher

Yeshiva College

Abstract

Predicting biologically important processes requires effective algorithms that model the biological phenomena emerging from an organism’s DNA. Our aims are to improve the methodology for identifying active sites, the regions where interactions occur, and to develop a program for predicting detailed side chain conformations. We evaluate the probability that a given region is the active site by combining traditional methods with our novel method. In 2005, the Fiser lab at the Albert Einstein College of Medicine developed a modified method for determining the active sites in a pair of structurally related proteins. On the training set, we recover on average 84.4 percent of active residues while identifying only 7.9 percent of the total number of residues as potentially belonging to the active site. Once the active site region is identified, it is useful to predict its detailed structure. The 4D term scores the geometric relationship between two neighboring amino acids based on the frequency of observed orientations among several thousand experimentally determined protein structures. We have studied the effect of defining side chain-side chain (sc-sc) geometric relationships with a ‘3D model’, as opposed to a ‘4D model’ (Figure 1.2.2), on the prediction of side chain conformations. This ‘3D model’ reduces the dimensionality but increases the definition. Besides evaluating the efficacy of the 3D versus the 4D model, we also propose a different method for optimizing the weights of the different terms: the weights are optimized for groups of similar amino acids rather than using the same set of weight factors for all amino acids. We group amino acids by measuring the relative importance of adding an additional term, taken as the ratio of the new RMSD to the initial RMSD, and then clustering the amino acids along the range of these values. Our results indicate that the 4D model is more appropriate than the 3D model. The preliminary results for individualizing the HUNTER terms for clusters of similar residue-residue pairs encourage further research, as the range of the RMSD ratios was quite wide.
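
The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not name a clustering procedure, so a simple gap-based split of the sorted ratios is assumed here, and the function names, the `max_gap` threshold, and all RMSD values are hypothetical.

```python
from typing import Dict, List


def rmsd_ratios(initial: Dict[str, float], updated: Dict[str, float]) -> Dict[str, float]:
    """Ratio of the RMSD after adding a term to the RMSD before, per residue type."""
    return {aa: updated[aa] / initial[aa] for aa in initial}


def cluster_by_gap(ratios: Dict[str, float], max_gap: float) -> List[List[str]]:
    """Group residue types whose neighboring sorted ratios differ by at most max_gap.

    Assumption: a 1D gap-based split stands in for whatever clustering
    the thesis actually used.
    """
    ordered = sorted(ratios, key=ratios.get)
    clusters: List[List[str]] = []
    previous = None
    for aa in ordered:
        value = ratios[aa]
        if previous is None or value - previous > max_gap:
            clusters.append([aa])      # gap too wide: start a new cluster
        else:
            clusters[-1].append(aa)    # close to the previous ratio: same cluster
        previous = value
    return clusters


# Made-up RMSD values (in angstroms), for illustration only.
initial_rmsd = {"LEU": 2.0, "ILE": 2.0, "ARG": 4.0, "LYS": 4.0}
updated_rmsd = {"LEU": 1.0, "ILE": 1.1, "ARG": 3.6, "LYS": 3.8}

ratios = rmsd_ratios(initial_rmsd, updated_rmsd)
clusters = cluster_by_gap(ratios, max_gap=0.1)
print(clusters)
```

A ratio well below 1 means the added term improved the prediction substantially for that residue type, while a ratio near 1 means it changed little. Each resulting cluster could then receive its own set of weight factors.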

Description

The file is restricted for YU community access only.

Keywords

Proteins --Structure --Research --Methodology., Proteins --Structure-activity relationships --Research --Methodology., DNA --Structure., Proteins --Analysis.
