BIOINFORMATICS<-->STRUCTURE
Jerusalem, Israel, November 17-21, 1996

Abstract


Position specific sequence characteristics of alpha helices in globular proteins

Sandeep Kumar and Manju Bansal

Molecular Biophysics Unit, Indian Institute of Science, Bangalore - 560012, INDIA


Using a Calpha-based structural definition of alpha helices, we obtained a database of 1131 alpha helices with lengths >= 9 residues and non-identical sequences from 205 non-homologous (sequence identity < 25%) globular protein chains in high-resolution (< 2.5 Angstroms) protein crystal structures deposited with the PDB. We have analyzed this database to study the sequence determinants at 15 positions within and around an alpha helix, viz. N", N', Ncap, N1, N2, N3, N4, Mid, C1, C2, C3, C4, Ccap, C', and C". Chi-square tests show that the differences in the distribution of amino acid residues at each of these positions, with respect to the overall distribution in the 1131 helices as well as with respect to one another, are highly significant, even at the 99.9% level of confidence. Euclidean and Hamming distances in 20-D amino acid composition space for each position have large values, indicating that each of these 15 helical positions has unique sequence characteristics. Four statistical parameters, viz. propensity, preference, frequency of occurrence (%) and chi-square (representing the significance of the change in proportion of a given residue with reference to its overall proportion in the 1131 helices), were used to study the sequence characteristics at each of the 15 positions. Asn is almost equally favored at both the Ncap and Ccap positions. Ser, Asp and Thr are found to be even more favored than Asn at the Ncap position. Besides being the most favored residue at the Ccap position, Gly is also highly favored at all positions preceding the N-terminus and succeeding the C-terminus of an alpha helix. The imino acid Pro is the most avoided residue in the main body and C-terminus, including Ccap, of an alpha helix, but is favored at some positions near both termini, especially N1 and C". The observation that Pro is the most avoided residue at the Ccap position but is highly favored at the C' and C" positions indicates that Pro is not a good helix breaker by itself.
Hydrophobic contribution to helix stability comes largely from the main body of an alpha helix, as the apolar aliphatic residues Ala, Leu, Val and Ile are highly favored at Mid and nearby positions. Furthermore, apolar aliphatic residues are avoided at several positions at and near the helix termini. Changes in the proportions of these residues at all these positions are highly significant even at the 99.9% level of confidence, as indicated by their respective chi-square values. The lessons learnt from this database analysis may have important implications for prediction and de novo protein design.
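
The position-specific statistics named above can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: propensity is taken as the common ratio of a residue's fraction at one helix position to its fraction over all helix residues, the chi-square value is the standard goodness-of-fit statistic against the overall composition, and the function names (composition, propensity, euclidean_distance, chi_square) are hypothetical helpers, not the authors' code.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(residues):
    """Fractional composition of a residue string over the 20 amino acids."""
    counts = Counter(residues)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return {a: counts[a] / total for a in AMINO_ACIDS}


def propensity(position_residues, all_residues):
    """ASSUMED definition: fraction at the position / fraction overall.

    Values above 1 suggest the residue is favored at that position,
    values below 1 that it is avoided.
    """
    pos = composition(position_residues)
    ref = composition(all_residues)
    return {a: pos[a] / ref[a] for a in AMINO_ACIDS if ref[a] > 0}


def euclidean_distance(comp_a, comp_b):
    """Euclidean distance between two points in 20-D composition space."""
    return sqrt(sum((comp_a[a] - comp_b[a]) ** 2 for a in AMINO_ACIDS))


def chi_square(position_residues, all_residues):
    """Goodness-of-fit chi-square of one position against the overall composition."""
    observed = Counter(position_residues)
    ref = composition(all_residues)
    n = sum(observed[a] for a in AMINO_ACIDS)
    return sum(
        (observed[a] - n * ref[a]) ** 2 / (n * ref[a])
        for a in AMINO_ACIDS
        if ref[a] > 0
    )
```

In use, position_residues would hold the residues observed at a single position (for example, every Ncap residue in the database) and all_residues the residues of all helices pooled together. The Hamming distance and the "preference" parameter are left out because their definitions cannot be inferred from the abstract.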

