Molecular Genetics, Shmuel Pietrokovski's Lab

Shmuel Pietrokovski

Hermann and Lilly Schilling Foundation Chair

Room: 419A

Building: Arnold R. Meyer Institute of Biological Sciences

Tel: 972-8-934-2747

Fax: 972-8-934-4108


My Other Web Page

Proteins can be grouped together in families where all members have the same, or similar, function and are descended from a common ancestor. Proteins from the same family are usually composed of highly conserved sequence regions separated by regions of different lengths and little sequence similarity. These conserved sequence regions, called motifs, are typically 6-30 amino acids long and correspond to active sites, substrate or ligand binding sites, and structurally important segments of proteins.

Motifs can be identified by multiple-sequence alignments of proteins. Multiple alignments of motifs, called blocks, are extremely useful in various areas of protein research. In particular, they are effective in identifying new family members through database sequence searches, in predicting protein function and structure, and in designing PCR primers to amplify genes of unknown family members.
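As an illustration of how a block summarizes a motif, the following minimal sketch turns an ungapped block into per-column residue frequencies and reports the strongly conserved columns. The example segments, the function names and the 0.8 threshold are assumptions made for this illustration only; they are not taken from the Blocks Database or from any of the group's software.

```python
from collections import Counter


def column_frequencies(block):
    """Return one {residue: frequency} dict per column of an ungapped block."""
    if not block:
        raise ValueError("block is empty")
    width = len(block[0])
    if any(len(segment) != width for segment in block):
        raise ValueError("all segments in a block must have the same length")
    n_segments = len(block)
    profile = []
    for column in zip(*block):
        counts = Counter(column)
        profile.append({aa: count / n_segments for aa, count in counts.items()})
    return profile


def conserved_positions(profile, threshold=0.8):
    """Indices of columns whose most common residue reaches the threshold."""
    return [i for i, column in enumerate(profile)
            if max(column.values()) >= threshold]


# Hypothetical block: four made-up aligned segments of width 6.
example_block = [
    "GDGSHK",
    "GDGTHR",
    "GDASHK",
    "GDGSHK",
]

profile = column_frequencies(example_block)
conserved = conserved_positions(profile)
print(conserved)
```

Real block-based searches typically also weight the sequences and add pseudocounts before scoring; both steps are left out here to keep the sketch short.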

My group is interested in developing methods for identifying and using protein motifs, and in in-depth studies of particular protein families. The methods include LAMA (Local Alignment of Multiple Alignments), which has been shown to be very sensitive in identifying structural, functional and distant homologous relations beyond the range of sequence-vs-sequence and even blocks-vs-sequence methods. A number of the method's predictions were experimentally verified. Among the group's objectives are:

further explore the usefulness of LAMA in structural and functional predictions,
develop the LAMA score statistics,
extend LAMA's input range.
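The general idea of comparing one multiple alignment with another, rather than with a single sequence, can be sketched as follows. This is a simplified illustration and not the LAMA program: it assumes raw residue frequencies as column profiles, a Pearson correlation as the column similarity measure, the mean correlation over the overlap as the alignment score, and a hypothetical `min_overlap` parameter. The two blocks are made-up data.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_vectors(block):
    """Convert an ungapped block into one 20-element frequency vector per column."""
    n_segments = len(block)
    vectors = []
    for column in zip(*block):
        counts = Counter(column)
        vectors.append([counts.get(aa, 0) / n_segments for aa in AMINO_ACIDS])
    return vectors


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_ungapped_alignment(block_a, block_b, min_overlap=3):
    """Slide block_b along block_a without gaps.

    Column i of block_a is paired with column i - offset of block_b.
    Returns (mean column correlation, offset, overlap width) for the best
    offset, or None if no overlap of at least min_overlap columns exists.
    """
    vec_a, vec_b = column_vectors(block_a), column_vectors(block_b)
    best = None
    for offset in range(-(len(vec_b) - min_overlap), len(vec_a) - min_overlap + 1):
        pairs = [(i, i - offset) for i in range(len(vec_a))
                 if 0 <= i - offset < len(vec_b)]
        if len(pairs) < min_overlap:
            continue
        score = sum(pearson(vec_a[i], vec_b[j]) for i, j in pairs) / len(pairs)
        if best is None or score > best[0]:
            best = (score, offset, len(pairs))
    return best


# Hypothetical blocks: block_b repeats columns 2-6 of block_a.
block_a = ["MKGDGSHK", "LRGDGTHR", "MKGDASHK", "IKGDGSHK"]
block_b = ["GDGSH", "GDGTH", "GDASH", "GDGSH"]

score, offset, overlap = best_ungapped_alignment(block_a, block_b)
print(offset, overlap, round(score, 3))
```

Because the score is a mean, short overlaps can score as well as long ones; a practical method would need score statistics to judge significance, which is one of the objectives listed above.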

The protein families we currently study are inteins and homing endonucleases. Inteins are proteins that catalyze their excision out of host proteins, ligating the host flanks with a peptide bond. This protein splicing activity is autoproteolytic and is not dependent on any host-specific factors. Homing endonucleases usually mediate the recombination of the element in which they are encoded into alleles lacking the element. Some of these enzymes are only 18 kDa or smaller, but most have very long recognition sites (10-40 bp). Both inteins and each of the different types of homing endonucleases are characterized by a few short sequence motifs. Our goals are:

identify key residues and their function,
predict the activity of putative family members,
find ways to control the activity of these proteins.
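To give a sense of why recognition sites of 10-40 bp count as very long, the short calculation below estimates how often one exact site of a given length would occur by chance. It assumes random DNA with equal base frequencies and exact matching, both simplifications made for this illustration; real enzymes may tolerate some variation in their sites, which would make chance occurrences more frequent than this estimate. The function name is hypothetical.

```python
def expected_spacing_bp(site_length):
    """Expected distance in bp between chance occurrences of one exact site.

    Assumes random DNA with equal base frequencies, so a site of length n
    matches at any given position with probability (1/4) ** n.
    """
    if site_length < 1:
        raise ValueError("site_length must be a positive integer")
    return 4 ** site_length


for n in (10, 20, 40):
    print(f"{n} bp site: about one chance occurrence per {expected_spacing_bp(n):.3g} bp")
```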

Inteins and hedgehog (HH) proteins perform similar reactions. Inteins autocatalyze their removal from inside other proteins, together with the joining of their flanks by a peptide bond. The C-terminal domain of HH proteins autocatalyzes its cleavage from the N-terminal domain, together with the covalent attachment of a cholesterol molecule to the cleavage point on the N-terminal domain.

The reactions are chemically similar: the cleaved peptide bond (the N-terminal one in inteins) is first converted to an ester or thioester bond, which is then cleaved by a nucleophilic attack from the C-terminal flank in inteins or from the cholesterol molecule in HH. However, these two types of proteins are found in different types of organisms (single-cell organisms and organelles for inteins, metazoa for HH) and have very low sequence similarity.

Using LAMA, I predicted similar functions for a number of specific regions in the two types of proteins, an overall structural similarity between them, and their composition from a duplicated subdomain. The subsequent solution of an HH structure confirmed all these predictions.

The figure shows the crystal structures of the Mxe gyrA intein in grey (1AM2) and of the C-terminal end of the Dm hedgehog protein in cyan (1AT0). Side chains of residues suggested to correspond and to be part of the active sites are shown in red (intein) and blue (HH). The two structures were superimposed in three dimensions using the alpha-carbon backbone of a short motif identified in both types of proteins. An excellent fit was then found for the other parts of the structures as well. More details on this relation can be found at this page.


Selected Publications

Pietrokovski, S. "Conserved sequence features of inteins (protein introns) and their use in identifying new inteins and related proteins" Protein Science 3, 2340-2350 (1994).

Henikoff, S., Henikoff, J. G., Alford, W. J. & Pietrokovski, S. "Automated construction and graphical presentation of protein blocks from unaligned sequences" Gene 163, GC17-26 (1995).

Pietrokovski, S. "A new intein in Cyanobacteria and its significance for the spread of inteins" Trends in Genetics 12 (8), 287-288 (1996).

Pietrokovski, S. "Searching Databases of Conserved Sequence Regions by Aligning Protein Multiple-Alignments" Nucleic Acids Research 24 (19), 3836-3845 (1996).

Pietrokovski, S. & Henikoff, S. "A helix-turn-helix DNA-binding motif predicted for transposases of DNA transposons" Molecular and General Genetics 254, 689-695 (1997).

Pietrokovski, S. "Modular organization of inteins and C-terminal autocatalytic domains" Protein Science 7, 64-71 (1998).

Henikoff, S., Greene, E. A., Pietrokovski, S., Bork, P., Attwood, T. K. & Hood, L. "Gene families: the taxonomy of protein paralogs and chimeras" Science 278, 609-614 (1997).

Rose, T. M., Schultz, E. R., Henikoff, J. G., Pietrokovski, S., McCallum, C. M. & Henikoff, S. "Consensus-degenerate hybrid oligonucleotide primers for amplification of distantly-related sequences" Nucleic Acids Research 26, 1628-1635 (1998).

Pietrokovski, S. "Identification of a virus intein and a possible variation in the protein-splicing reaction" Current Biology 8, R634-R635 (1998).

Henikoff, J. G., Henikoff, S. & Pietrokovski, S. "New features of the Blocks Database servers" Nucleic Acids Research 27, in press (1998).

Keywords: computational biology, protein motifs, blocks, protein structure and function prediction, inteins, homing endonucleases


Department of Molecular Genetics
Weizmann Institute of Science

Tel: 972-8-934-3970
Fax: 972-8-934-4108


Last Updated: 1 August 2002