Posters list (Poster abstracts)

P1. High-Throughput Measurement of RNA Structure
Michael Kertesz, Yue Wan, Elad Mazor, Howard Chang, Eran Segal

P2. Paying the entropic cost: peptides and proteins in a bind.
Nir London, Dana Attias, Ora Schueler-Furman

P3. Accurate Refinement of Coarse Peptide-Protein Complexes with Backbone Flexibility.
Barak Raveh, Nir London, Ora Schueler-Furman

P4. Analyses of Conservation and Context of Dockerin Repeats.
Vered Fishbain, Ora Schueler-Furman

P5. Towards realistic codon models: among site variability and dependency of synonymous and nonsynonymous rates
Adi Doron-Faigenboim, Itay Mayrose, Eran Bacharach, Tal Pupko

P6. Analysis of Chromosomal Alterations in Cancer
Michal Ozery-Flato, Chaim Linhart, Lior Mechlovich, and Ron Shamir

P7. MicroRNA Target Predictions Based Exclusively On Positive Examples
Waleed Khalifa, Naim Najami, Malik Yousef

P8. Metabolic-network driven analysis of bacterial ecological strategies
Shiri Freilich, Anat Kreimer, Elhanan Borenstein, Nir Yosef, Roded Sharan, Uri Gophna & Eytan Ruppin

P9. A computational approach for genome-wide mapping of splicing factor binding sites
Martin Akerman, Hilda David-Eden, Ron Y Pinter and Yael Mandel-Gutfreund

P10. RNA Sequence Design by Reconstruction from Shape and Guiding Observables
Assaf Avihoo, Nir Dromi, Danny Barash

P11. BlockMaster: Partitioning protein structures into semi-rigid blocks and flexible regions using normal mode analysis
Marina Shudler and Masha Y. Niv

P12. Predicting the Role of Alternative Splicing in Modulating the Gene Expression Network
Idit Kosti, Predrag Radivojac and Yael Mandel-Gutfreund

P13. Proteins: Coexistence of Stability and Flexibility
Shlomi Reuveni, Rony Granek and Joseph Klafter

P14. Conservation of domain repeats in cellulosome scaffoldins and other proteins
Dan Reshef, Ed Bayer, Ora Schueler-Furman

P15. Mounting computational evidence for functionality of Fantom non-coding RNA
Ilana Lebenthal and Ron Unger

P16. Computational Methods for Dissection of MicroRNA Function
Igor Ulitsky, Louise Laurent, Franz-Josef Muller, Jeanne F. Loring and Ron Shamir

P17. A new library of surface patches - Design and applications
Roi Gamliel, Chen Keasar and Klara Kedem

P18. Towards the first genome-scale model of human liver metabolism
Livnat Jerby, Tomer Shlomi, Eytan Ruppin

P19. Selecting Gene Expression Markers for Cancer Prognosis and Treatment
Ofer Lavi, Michael Gutkin, Gideon Dror, Ron Shamir

P20. Ab initio Construction of a Eukaryotic Transcriptome by Massively Parallel mRNA Sequencing.
Moran Yassour, Tommy Kaplan, Hunter B. Fraser, Joshua Z. Levin, Jenna Pfiffner, Xian Adiconis, Gary Schroth, Shujun Luo, Irina Khrebtukova, Andreas Gnirke, Chad Nusbaum, Dawn-Anne Thompson, Nir Friedman, and Aviv Regev

P21. Expander - A Comprehensive Platform for Expression Profile Analysis
Adi Maron-Katz, Ran Elkon, Seagull Shavit, Igor Ulitsky, Chaim Linhart, Amos Tanay, Roded Sharan, Dorit Sagir, Israel Steinfeld, Yosef Shiloh, Ron Shamir

P22. A Comparative Genome-wide Study of ncRNAs in Trypanosomatids
Tirza Doniger, Chaim Wachtel, Rodolfo Katz, Shulamit Michaeli, Ron Unger

P23. S2G - Candidate gene finder and OMIM search utility
Avitan Gefen, Raphael Cohen (EC) and Ohad S. Birk

P24. BIOINFORMATICS ANALYSIS OF SIGMA C PROTEIN OF AVIAN REOVIRUS ISRAELI ISOLATES, TOWARDS NOVEL VACCINE PRODUCTION
Yeheskel, M. Pasmanik-Chor, D. Goldenberg and J. Pitcovski

P25. What is the difference between Prokineticin Receptors 1 and 2? Bioinformatics and experimental study
Anat Levit, Helena Safrian, Rina Meidan and Masha Y. Niv

P26. Myasthenia Gravis Ig gene lineage trees and mutation analysis
Neta S. Zuckerman, Wendy Howard, Sonia Berrih-Aknin, Jacky Bismuth, Hanna Edelman, Kate Gibson, Deborah Dunn-Walters, Ramit Mehr

P27. mtDNA genetic landscapes formed in tumors and during human evolution are shaped by similar selective constraints
Ilia Zhidkov, Erez A. Livneh, Eitan Rubin and Dan Mishmar

P28. Phosphorylation- and protonation-dependent protein kinase B (PKB) dynamics
Shu Cheng and Masha Y. Niv

P29. Nucleosomes Mark Exons
Schraga Schwartz, Eran Meshorer and Gil Ast

P30. Genome-wide approach of combined genomic and transcriptomic analyses of Parkinson's disease in the Ashkenazi population
Merav Kedmi, Anat Bar-Shira, Ziv Gan-Or, Nir Giladi and Avi Orr-Urtreger

P31. A model structure of the human potassium channel Kv7.2 in complex with a potent selective opener
Yana Gofman, Asher Peretz, Liat Pell, Yoni Haitin, Bernard Attali, Nir Ben-Tal

P32. Model-structure, mutagenesis and functional characteristics of the NHA2 transporter
Maya Schushan, Minghui Xiang, Etana Padan, Rajini Rao and Nir Ben-Tal

P33. A Feature-Based approach to Modeling Protein-DNA Interactions
Eilon Sharon, Shai Lubliner, Eran Segal

P34. Lateral gene transfer and convergent evolution lead to the emergence of eukaryotic-like genes in Legionella pneumophila
M. N. Lurie-Weinberger, L. Gomez-Valero, N. Merault, G. Glockner, C. Buchrieser and U. Gophna

P35. Selection constraints on selfish homing endonucleases facilitate their use in gene therapy
Eyal Privman, Adi Barzel, Michael Pe'eri, David Burstein, Uri Gophna, Martin Kupiec, Tal Pupko

P36. A Neuro-Immune Gene Ontology: a subset of GO directed for neurological and immunological systems
Nophar Geifman, Alon Monsonego and Eitan Rubin

P37. Inference and Characterization of Horizontally Transferred Gene Families using Probabilistic Mixture Models
Ofir Cohen and Tal Pupko

P38. The Complexity Hypothesis revisited
Ofir Cohen, Uri Gophna and Tal Pupko

P39. Variability of the substrate-binding groove in the human kinome
Mor Rubinstein, Masha Niv

P40. Analysing the origin of long-range interactions in proteins using lattice models
Orly Noivirt, Ron Unger, Amnon Horovitz

P41. A SYSTEM-LEVEL VIEW OF VIRAL MIMICRY OBTAINED VIA STRUCTURAL HOMOLOGY SCREENING
Nir Drayman and Ariella Oppenheim

P42. Unraveling nucleosome DNA repeat structure of C. elegans
Idan Gabdank, Danny Barash, Edward N. Trifonov

P43. Investigating structure and formation of the 5-HT1A serotonin receptor dimer
Noga Kowalsman, Ute Renner, Evgeni Ponimaskin, Masha Y. Niv

P44. Large-scale analysis of Arabidopsis transcription reveals a basal co-regulation network
Osnat Atias, Benny Chor, Daniel A. Chamovitz

P45. Genome-Scale Identification of Legionella pneumophila Effectors using a Machine Learning Approach
David Burstein, Tal Zusman, Elena Degtyar, Ram Viner, Gil Segal, and Tal Pupko

P46. The DNA-encoded nucleosome organization of a eukaryotic genome
Noam Kaplan, Irene K Moore, Yvonne Fondufe-Mittendorf, Andrea Gossett, Desiree Tillo, Yair Field, Emily M LeProust, Timothy R Hughes, Jason Lieb, Jonathan Widom, Eran Segal

P47. Receptor Specificity of the Hemagglutinin from the H5N1 Strain of the Influenza Virus
Daphna Meroz, Tomer Hertz and Nir Ben-Tal

P48. NetAge: an online network database for biogerontological research
Robi Tacutu, Arie Budovsky, Vadim Fraifeld

P49. Contact Map Prediction: Neighbours are Important When You're Close
Haim Ashkenazy, Ron Unger, Yossef Kliger

P50. Using a Differential Geometry Approach for Characterizing Biological Relevant Interfaces
Shula Shazman, Gershon Elber, Yael Mandel-Gutfreund

P51. GENECARDS: HUMAN GENOME PARALOG HUNTING, SET DISTILLATION, FUNCTIONAL SCORING AND FAST SEARCHES
Arye Harel, Gil Stelzer, Irina Dalah, Naomi Rosen, Justin Alexander, Michael Shmoish, Tsippi Iny Stein, Alexandra Sirota, Asaf Madi, Tsviya Olender, Aron Inger, Marilyn Safran, Doron Lancet.

P52. Topology-Free Querying of Protein Interaction Networks
Sharon Bruckner, Falk Huffner, Richard M. Karp, Ron Shamir, Roded Sharan

P53. Deriving Enzymatic Signatures from Short Read Data
Uri Weingart, Yair Lavi, Erez Persi, Uri Gophna, and David Horn

P54. An application of "divide-and-conquer" algorithm to analysis of whole genome tiling array data
Brodsky Leonid, BenTal Nir, BenJacob Eshel, and Nevo Eviatar

P55. MetaboStat: post ion-identification analysis of LC/GC-MS data
Brodsky Leonid, Rogachev Ilana, Venger Ilya, Malitsky Sergey and Aharoni Asaph

P56. Structural Signature of Antibiotic Binding Sites on the Ribosome
Hilda David-Eden and Yael Mandel-Gutfreund

P57. Gene Translation in Humans is Efficient
Yedael Y. Waldman, Tamir Tuller, Tomer Shlomi, Roded Sharan, Eytan Ruppin

P58. Metazoan operons accelerate transcription and recovery rates
Alon Zaslaver, L. Ryan Baugh and Paul W. Sternberg

P59. Fractured Genes: A Novel Genomic Arrangement Involving New Split-Inteins and homing Endonuclease Family
B. Dassa, N. London, B. Stoddard, O. Schueler-Furman and S. Pietrokovski

P60. Deep transcriptome sequencing of the Sulfolobus solfataricus in a single nucleotide resolution
Omri Wurtzel, Rajat Sapra, Blake A. Simmons, Rotem Sorek

P61. Insertion hotspots and nepotism of DNA parasites: large-scale analysis of human nested transposed elements
Asaf Levy, Schraga Schwartz, Gil Ast

P62. Inferential optimization for simultaneous fitting of multiple components into a cryoEM map of their assembly
Keren Lasker, Maya Topf, Andrej Sali, Haim J. Wolfson

P63. Evolutionary Modeling of Rate-Shifts Reveals Specificity Determinants in HIV-1 Subtypes
Osnat Penn, Adi Stern, Nimrod D. Rubinstein, Tal Pupko

P64. Predicting the Affinity between Proteins and Small Molecules using a Random Forest Regression Model
Tammy Menasherov, Ron Unger, Yossef Kliger

P65. Predicting Synthetic Lethality in the Human Protein Interaction Network
Aron Inger, Tsviya Olender and Doron Lancet

P66. Converting promiscuous proteins into specific ones: design of calmodulin mutants with up to 900-fold enhancement in binding specificity towards CaMKII.
Eliyahu Yosef, Regina Politi, and Julia M. Shifman

P67. Computational design of protein-protein interactions: affinity enhancement at the fasciculin-AChE interface.
Oz Sharabi and Julia M. Shifman

P68. The Development of the Immune Network from Birth to Adulthood.
Asaf Madi, Sharron Bransburg-Zabary, Dror Y. Kenett, Alfred I. Tauber, Irun R. Cohen and Eshel Ben-Jacob

P70. A variability map of the olfactory receptor subgenome
Yehudit Hasin, Tsviya Olender, Ifat Keidar, Dina Leshkovitz, Miriam Khen and Doron Lancet

Poster abstracts

P1
High-Throughput Measurement of RNA Structure
Michael Kertesz 1, Yue Wan 2, Elad Mazor 1, Howard Chang 2, Eran Segal 1

1 Computer Science and Applied Mathematics Department, Weizmann Institute of Science. 2 Program in Epithelial Biology - Stanford University School of Medicine.

RNA structure plays a key role in many biological processes, yet there are no experimental techniques for high-throughput measurement of RNA structure. Here we devise a method, termed PARS (Parallel Analysis of RNA Structure), that uses high-throughput parallel sequencing technology to measure the secondary structure of thousands of RNA species. PARS starts by treating a pool of different RNA species with structure-specific enzymes: either RNase V1, which cleaves 3' of double-stranded RNA, or RNases T1+A, which cleave single-stranded RNA. Thus, fragments resulting from this cleavage step provide structural evidence regarding the double- or single-strandedness of the cleaved nucleotide, depending on whether V1 or T1+A was used, respectively. Next, we use high-throughput sequencing to measure and map to the genome many nucleotides from the cleaved RNA sample, and use these data to compute the double-stranded probability of each nucleotide in the input transcriptome. PARS allows the field of RNA structure probing to move from its low-throughput limitations into the realm of high-throughput, genome-wide analyses. By applying PARS to the entire transcriptome of yeast, we provide experimental evidence for several structural properties of yeast transcripts.
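
The final step described above, turning mapped cleavage reads into a per-nucleotide double-stranded probability, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual PARS scoring formula, so the pseudocount-smoothed ratio below, the function name and the example counts are all hypothetical.

```python
def double_stranded_probability(v1_counts, ss_counts, pseudocount=1.0):
    """Estimate, per nucleotide, the probability of being double-stranded.

    v1_counts: reads supporting cleavage by RNase V1 (double-stranded evidence).
    ss_counts: reads supporting cleavage by RNases T1+A (single-stranded evidence).
    The smoothed ratio used here is an assumed stand-in, not the published score.
    """
    if len(v1_counts) != len(ss_counts):
        raise ValueError("count vectors must have the same length")
    probabilities = []
    for v1, ss in zip(v1_counts, ss_counts):
        # The pseudocount keeps positions with no reads at an uninformative 0.5.
        probabilities.append((v1 + pseudocount) / (v1 + ss + 2 * pseudocount))
    return probabilities


# Hypothetical read counts for three nucleotides of one transcript.
v1 = [9, 0, 4]
ss = [0, 9, 4]
probs = double_stranded_probability(v1, ss)
```

With these made-up counts the first position leans double-stranded, the second single-stranded, and the third is ambiguous.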

P2
Paying the entropic cost: peptides and proteins in a bind.
Nir London 1, Dana Attias 1, Ora Schueler-Furman 1

1 Department of Molecular Genetics and Biotechnology, Faculty of Medicine, The Hebrew University, Jerusalem, Israel. POB 12272, Jerusalem, 91120 Israel.

Peptide-protein interactions are among the most prevalent interactions in the cell. They mediate important processes, such as signal transduction and protein trafficking. How can peptides overcome the entropic cost involved in association, switching from an unstructured, flexible peptide to a rigid, well-defined structure at the interface? A structure-based analysis of peptide-protein interactions reveals that peptides use a number of strategies to compensate for this entropy loss. In particular, most peptides do not induce conformational changes on their partner upon binding, thereby minimizing the entropic cost of binding. Furthermore, peptides optimize the binding enthalpy of the interaction: they display interfaces that are better packed than protein-protein interfaces, with significantly more hydrogen bonds (per constant interface size). In addition, they utilize their flexibility to the fullest in creating more interactions that involve main chain atoms. The distribution of binding energy along the peptide is not uniform; we find that on average one "hotspot" residue is required per three peptide residues. Finally, we show that peptides tend to bind in the largest pockets available on the protein surface. In addition to improved understanding of basic principles that underlie peptide-protein interactions, our findings have direct implications for the development of protocols for the structural modeling, design and manipulation of these interactions. This analysis is based on peptiDB, a new and comprehensive dataset of high-resolution peptide-protein complex structures.


P3
Accurate Refinement of Coarse Peptide-Protein Complexes with Backbone Flexibility.
Barak Raveh 1, Nir London 1, Ora Schueler-Furman 1

1 Department of Molecular Genetics and Biotechnology, Faculty of Medicine, The Hebrew University, Jerusalem, Israel. POB 12272, Jerusalem, 91120 Israel.

A wide range of regulatory processes in the cell are mediated by a flexible peptide that folds upon binding to a globular protein, involving substantial conformational changes. We present Rosetta FlexPepDock, a tool for refining low-resolution peptide-protein complexes, based on existing coarse level descriptions. The modeling problem involves a search in a high-dimensional space, comprising peptide backbone perturbations, side-chain modeling and rigid body moves. Therefore, our protocol allows significant changes in peptide internal degrees of freedom. We examined the performance of the protocol over sets of randomly perturbed native structures, and we show that accurate models (<1 Å from native backbone) are obtained from coarse structures, starting up to 3 Å from the native peptide backbone, and in some cases even up to 10 Å from the native backbone. Importantly, the modeling accuracy is comparable when using either the unbound or bound protein structures. We show that our protocol achieves accurate models in realistic settings of cross-docking using complexes of alternate peptides with the same receptor, or when starting from an ideal extended conformation. Recent studies show significant advancement in coarse prediction of peptide binding regions, and we believe our protocol is a significant step forward towards high-resolution peptide docking ab-initio. Since high-resolution models are essential for understanding binding mechanism in detail, our protocol may also facilitate rational peptide and drug design, where backbone flexibility plays an important role.


P4
Analyses of Conservation and Context of Dockerin Repeats.
Vered Fishbain, Ora Schueler-Furman

Department of microbiology and molecular genetics, institute for medical research, Hadassah Medical School, the Hebrew University in Jerusalem, Israel

The cellulosome is a multi-protein complex that can efficiently degrade cellulose. It is found in anaerobic cellulolytic bacteria living in various environments, such as the rumen and soil. Cellulosomes are complex multimers, whose composition is determined by many cohesin modules that interact with dockerin modules connected to a variety of hydrolases. This study focuses on the evolution and role of dockerins as part of the cellulosome, and beyond. The dockerin module consists in general of two repeats, where either one or both can bind to cohesin. We investigate the conservation pattern between these two repeats, as well as among different dockerin-containing proteins in various organisms. By comparing intra- and inter-sequence conservation, we hope to gain new insights into the contribution of symmetry to the function and stability of this domain. In addition, we have investigated the domain architecture of dockerin-containing proteins. We have located specific functional modules that are associated with dockerins, which can shed new light on additional putative functions of this domain.


P5
Towards realistic codon models: among site variability and dependency of synonymous and nonsynonymous rates
Adi Doron-Faigenboim 1, Itay Mayrose 2, Eran Bacharach 1, Tal Pupko 1

1 Department of Cell Research and Immunology, Tel-Aviv University, Israel. 2 Department of Zoology, University of British Columbia, BC, Canada

Codon evolutionary models are widely used to infer the selection forces acting on a protein. The nonsynonymous to synonymous rate ratio (Ka/Ks) is used to infer specific sites that are under purifying or positive selection. Current evolutionary models usually assume that only the nonsynonymous rates vary among sites while the synonymous substitution rates are constant. This assumption ignores the possibility of selection forces acting at the DNA or mRNA levels. Towards a more realistic description of sequence evolution, we present a model that accounts for among-site variation in both synonymous and nonsynonymous rates. Furthermore, we relax the widespread assumption that sites evolve independently of each other. Thus, possible sources of bias caused by random fluctuations in either the synonymous or nonsynonymous rate estimations at a single site are removed. Our model is based on two hidden Markov models that operate on the spatial dimension: one describes the dependency between adjacent nonsynonymous rates while the other describes the dependency between adjacent synonymous rates. Using both simulations and real data analyses, we illustrate that accounting for synonymous rate variability and dependency greatly increases the accuracy of Ka/Ks estimation and in particular of the inference of positively selected sites. We studied the selection pressure across the HIV-1 genome and discuss the applicability of the model to infer the selection forces in regulatory and overlapping regions.
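
To see why the assumption of a constant synonymous rate matters, consider a toy calculation of the Ka/Ks ratio at one site. The sketch below is not the hidden Markov model of this work; the function name, the tolerance band around 1 and the rate values are hypothetical, and a real analysis would rely on statistical tests rather than a fixed cutoff.

```python
def classify_site(ka, ks, tolerance=0.05):
    """Return (Ka/Ks, label) for a single site.

    A ratio above 1 suggests positive selection and a ratio below 1 suggests
    purifying selection; the tolerance band is an arbitrary illustrative choice.
    """
    if ks <= 0:
        raise ValueError("synonymous rate must be positive")
    omega = ka / ks
    if omega > 1 + tolerance:
        return omega, "positive"
    if omega < 1 - tolerance:
        return omega, "purifying"
    return omega, "neutral"


# Same nonsynonymous rate, two different assumptions about the synonymous rate.
constant_ks_call = classify_site(ka=0.5, ks=1.0)   # Ks fixed at the gene-wide value
variable_ks_call = classify_site(ka=0.5, ks=0.25)  # Ks estimated for this site
```

In this made-up case the site looks purifying when Ks is held constant, but positively selected once its own low synonymous rate is taken into account.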


P6
Analysis of Chromosomal Alterations in Cancer
Michal Ozery-Flato 1, Chaim Linhart 1, Lior Mechlovich 1, and Ron Shamir 1

1 The Blavatnik School of Computer Science, Tel Aviv University, Israel

Chromosomal aberrations are a hallmark of cancer. Our study seeks to computationally reveal strong links between specific aberrations and different cancers. Uncovering such links requires a large number of samples. For our analysis we used the Mitelman database, which is the largest data depository of chromosomal aberrations in cancer, with over 57,000 tumor karyotypes from the scientific literature. We developed an effective algorithm that seeks a most plausible sequence of chromosomal aberrations that led to a given tumor karyotype, allowing 12 types of events. Our analysis implies that tumor karyotypes evolve predominantly via four principal events: chromosome gains and losses, translocations, and deletions. Moreover, we show that the frequencies of these events differ significantly between tumor types, and in particular between solid tumors and hematological disorders. We present a comprehensive list of significantly associated tumor-aberration pairs, where tumor types are defined by tissue morphology and topography. We also identify tumors that manifest similar recurrent aberrations, and show that these similarities are observed primarily within three categories: solid tumors, lymphomas, and non-lymphatic hematological disorders. Finally, we report on aberrations that tend to co-occur in different tumor types. Our results assign solid statistical foundations to many findings reported in the literature, and also reveal novel observations that merit further research.


P7
MicroRNA Target Predictions Based Exclusively On Positive Examples
Waleed Khalifa 1,2, Naim Najami 1,4, Malik Yousef 1,2,3

1 The Galilee Society Institute of Applied Research, Shefa-Amr, Israel. 2 Computer Science, The College of Sakhnin, Sakhnin, Israel. 3 Al-Qasemi Academic College, Baqa Algharbiya, Israel. 4 The Academic Arab College of Education, Haifa, Israel

The application of one-class machine learning is gaining attention in the computational biology community. Different studies have described the use of two-class machine learning to predict microRNA (miRNA) gene targets. Most of these methods require the generation of an artificial negative class that might yield biased results. This study presents the use of one-class machine learning for miRNA target discovery and compares one-class to two-class approaches using naive Bayes. Most of the one-class methods tested gave similar accuracies, ranging from 0.81 to 0.89, while the two-class naive Bayes gave an accuracy of 0.99. Both one-class and two-class methods can give useful classification accuracies. The advantage of one-class methods is that they do not require any additional effort for choosing the best way of generating the negative class.
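
The idea of learning from positive examples only can be illustrated with a very simple one-class model. The sketch below is an assumption for illustration, not one of the methods evaluated in this study: it accepts a candidate if it lies within a radius of the centroid of the known positives, and the two numeric features are hypothetical.

```python
import math


def fit_one_class(positives, quantile=0.9):
    """Fit a centroid-and-radius model using positive examples only."""
    dim = len(positives[0])
    centroid = [sum(row[i] for row in positives) / len(positives) for i in range(dim)]
    distances = sorted(math.dist(row, centroid) for row in positives)
    # The radius is the distance at the chosen quantile of the training positives.
    index = min(len(distances) - 1, int(quantile * len(distances)))
    return centroid, distances[index]


def predict(model, candidate):
    """Return True if the candidate falls inside the learned radius."""
    centroid, radius = model
    return math.dist(candidate, centroid) <= radius


# Hypothetical feature vectors for known miRNA-target pairs (positives only).
known_targets = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.0]]
model = fit_one_class(known_targets)
near_call = predict(model, [1.0, 1.05])
far_call = predict(model, [3.0, 3.0])
```

No artificial negative class is generated at any point; the cost is that the decision boundary depends entirely on how well the positives cover the feature space.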


P8
Metabolic-network driven analysis of bacterial ecological strategies
Shiri Freilich 1,2 *^, Anat Kreimer 3 *, Elhanan Borenstein 5,6, Nir Yosef 1, Roded Sharan 1, Uri Gophna 4 & Eytan Ruppin 1,2

1 The Blavatnik School of Computer Sciences, 2 School of Medicine, 3 School of Mathematical Science, 4 Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Ramat Aviv 69978, Israel. 5 Department of Biological Sciences, Stanford University, Stanford, CA 94305-5020 and 6 Santa Fe Institute, Santa Fe, NM 87501. * These authors contributed equally to this work. ^ Corresponding author.

The growth rate of an organism is an important phenotypic trait, directly affecting its ability to survive in a given environment. Here we present the first large scale computational study of the association between ecological strategies and growth rate across 113 bacterial species, occupying a variety of metabolic habitats. Genomic data (first-order information) is used to generate second-order metabolic knowledge through the reconstruction of metabolic networks and third-order environmental knowledge through the reconstruction of habitable metabolic environments. These reconstructions are then used to model the typical ecological strategies taken by organisms in terms of two basic species-specific measures: metabolic variability - the ability of a species to survive in a variety of different environments, and co-inhabitation score vector - the distribution of other species which co-inhabit each environment. We find that growth rate is significantly correlated with metabolic variability and the level of co-inhabitation (i.e., competition) encountered by an organism. Most bacterial organisms adopt one of two main ecological strategies: (a) a specialized niche with little co-inhabitation, associated with a typical slow rate of growth versus (b) ecological diversity with intense co-inhabitation, associated with a typical fast rate of growth. This suggests a universal principle where metabolic flexibility is associated with a need to grow fast, possibly in the face of competition.


P9
A computational approach for genome-wide mapping of splicing factor binding sites
Martin Akerman 1, Hilda David-Eden 1, Ron Y Pinter 2 and Yael Mandel-Gutfreund 1

1 Faculty of Biology, Technion. 2 Computer Science Department, Technion

Alternative splicing is regulated by splicing factors that serve as positive or negative effectors, interacting with regulatory elements along exons and introns. Here we present a novel computational method for genome-wide mapping of splicing factor binding sites which considers both the genomic environment and the evolutionary conservation of the regulatory elements. The method was applied to study the regulation of different alternative splicing events uncovering an interesting network of interactions among splicing factors.


P10
RNA Sequence Design by Reconstruction from Shape and Guiding Observables
Assaf Avihoo 1, Nir Dromi 2, Danny Barash 1

1 Department of Computer Science, Ben-Gurion University. 2 Rosetta Genomics, Weizmann Science Park, Rehovot

The process of designing novel RNA sequences by inverse RNA folding, as implemented in RNAinverse (Hofacker et al., 1994), can be thought of as a reconstruction of RNA sequences from secondary structure. To link between the inverse RNA folding problem and physical and evolutionary perspectives (Higgs, 2000), taking into consideration possible observables such as thermodynamic stability, mutational robustness, and linguistic complexity as constraints, an extension of the reconstruction problem was suggested in (Dromi et al., 2008) by which the starting point is an RNA shape. Such an extension is justified, for example, in cases where a functional stem-loop structure of a natural sequence should be strictly kept in the designed sequences but a distant motif in the rest of the structure may contain one more or less nucleotide at the expense of another as long as the global shape is preserved. This allows the insertion of physical observables as constraints to the problem, in addition to local sequence and structure rigid ones. In (Dromi et al., 2008), the problem was solved by a parallel evolutionary algorithm without considering computational cost. In practice, an efficient method should be developed for a uniprocessor server that solves this problem using an RNAinverse-like approach.


P11
BlockMaster: Partitioning protein structures into semi-rigid blocks and flexible regions using normal mode analysis
Marina Shudler and Masha Y. Niv

Institute of Biochemistry, Food Science and Nutrition, Hebrew University, Rehovot

Protein kinases are key signaling enzymes which are dysregulated in many health disorders and are major targets of extensive drug-discovery efforts. Their regulation in the cell is exerted via various mechanisms, including control of the 3D conformation of their catalytic domains. The accepted annotation of the protein kinase catalytic domain partitions it into an N-lobe and C-lobe. During activation the lobes undergo closure; when inactivated, lobes exhibit motion in the opposite direction. This fact led us to the idea that protein kinase motion may be described in terms of semi-rigid blocks contained in the lobes. We developed a procedure, BlockMaster, for partitioning protein structures into semi-rigid blocks and flexible regions based on normal mode analysis. It provided correct partitioning into domains and subdomains of several expert-annotated test set proteins. When applied to representative structures of protein kinases, BlockMaster indeed identified semi-rigid blocks within either the N-lobe or the C-lobe of the kinase domain. In addition, two blocks were identified which spanned both lobes. The first, termed "pivot" due to its potential role as a pivot point in lobe opening, appeared in both the active and inactive kinase conformations. The second, termed "loop" since it contains activation loop residues, differed between the active and inactive conformations. This novel inactive "loop" block may stabilize the inactive conformation and thus downregulate kinase activity.


P12
Predicting the Role of Alternative Splicing in Modulating the Gene Expression Network
Idit Kosti 1, Predrag Radivojac 2 and Yael Mandel-Gutfreund 1

1 Faculty of Biology, Technion - Israel Institute of Technology, Haifa 32000, Israel. 2 School of Informatics, Indiana University, Bloomington, IN 47408, USA

Alternative splicing (AS) is a post-transcriptional process which is considered to be responsible for the huge diversity of human proteins. AS can insert sequence segments into the coding region or delete them from it, presumably affecting the protein function. It has been shown that genes that undergo AS tend to encode fully or partially disordered regions. Disordered regions in proteins are characterized as unstructured and flexible regions located at the protein surface area and were shown to be involved in different cell functions, such as transcription regulation. In this study we analyzed a unique set of human-mouse conserved AS events. We show that these AS events are located in predicted disordered protein regions. Accordingly, these regions are predicted to be highly exposed regions, located at the protein surface. This phenomenon was more apparent for the subset of regulatory proteins, specifically those related to transcription regulation. Furthermore, we found that these proteins are predicted to have a significantly higher density of conserved phosphorylation sites compared to the control set. Overall, our results suggest that AS plays an important role in regulation of the gene expression pathway by specifically modifying the regulatory regions of the proteins involved in the process. Further, we studied the relationship between splicing factors and transcription regulation, uncovering an interesting network of interactions between splicing regulation and transcription regulation.


P13
Proteins: Coexistence of Stability and Flexibility
Shlomi Reuveni 1, Rony Granek 2 and Joseph Klafter 1

1 School of Chemistry, Tel-Aviv University, Tel-Aviv 69978, Israel. 2 Department of Biotechnology Engineering, Ben-Gurion University, Beer Sheva 84105, Israel

We introduce an equation of state for the native topology of proteins, based on recent analysis of data from the Protein Data Bank and on a generalization of the Landau-Peierls instability criterion for fractals. The equation relates the number of amino acids with the fractal and spectral dimensions describing the protein fold. Over 500 proteins have been analysed and found to obey this equation of state. Two seemingly conflicting properties of native proteins, such as enzymes and antibodies, are known to coexist. While proteins need to keep their specific native fold structure thermally stable, the native fold displays the ability to perform flexible motions that allow proper function. This conflict cannot be bridged by compact objects which are characterized by small amplitude vibrations and by a Debye density of low frequency modes. Recently, however, it became clear that proteins can be described as fractals; namely, geometrical objects that possess self similarity. Adopting the fractal point of view for proteins makes it possible to describe within the same framework essential information regarding topology and dynamics using three parameters: the number of amino acids along the protein backbone (N), the spectral dimension (ds) and the fractal dimension (df). Based on a generalization of the Landau-Peierls instability criterion and on a melting criterion for proteins, we derive a relation between the spectral dimension, the fractal dimension and the number of amino acids along the protein backbone: 2/ds + 1/df = 1 + b/ln(N). Deviations from this equation may render a protein unfolded. The fractal nature of proteins is shown to bridge their seemingly conflicting properties of stability and flexibility.
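
The relation 2/ds + 1/df = 1 + b/ln(N) can be explored numerically. The sketch below solves it for the spectral dimension and checks a candidate set of values against it. The abstract does not report the value of b, so the number used here is purely a placeholder, as are the function names and the example protein.

```python
import math


def equation_residual(ds, df, n_residues, b):
    """Left side minus right side of 2/ds + 1/df = 1 + b/ln(N)."""
    return 2.0 / ds + 1.0 / df - (1.0 + b / math.log(n_residues))


def spectral_dimension(df, n_residues, b):
    """Solve the relation for ds given the fractal dimension, N and b."""
    if n_residues <= 1:
        raise ValueError("N must exceed 1 so that ln(N) is positive")
    denominator = 1.0 + b / math.log(n_residues) - 1.0 / df
    if denominator <= 0:
        raise ValueError("no positive spectral dimension satisfies the relation")
    return 2.0 / denominator


B_PLACEHOLDER = 2.0  # hypothetical value; not taken from the abstract

# Hypothetical protein: 200 residues with fractal dimension 2.5.
ds = spectral_dimension(df=2.5, n_residues=200, b=B_PLACEHOLDER)
residual = equation_residual(ds, 2.5, 200, B_PLACEHOLDER)
```

A residual far from zero for measured values of ds, df and N would correspond to the deviations from the equation of state mentioned above.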


P14
Conservation of domain repeats in cellulosome scaffoldins and other proteins
Dan Reshef 1, Ed Bayer 2, Ora Schueler-Furman 1

1 Microbiology and Molecular Genetics Dept. Hebrew University. 2 Biological Chemistry Dept. Weizmann Institute of Science

The cellulosome is a multi-protein complex aimed at efficient degradation of cellulose. It is composed of a scaffoldin protein that contains a number of different cohesins, each binding to a dockerin module that is connected to a hydrolase. An analysis of the sequence conservation between cohesin repeats reveals that cohesin clusters of exceptionally high sequence identity occur in the scaffoldin. Other, less similar cohesin domains are separated from such a cluster by a distinct domain. This suggests that the duplication of the adjacent cohesin domains is a very recent event. The high sequence identity also offers a way to track the duplication process. Previous studies of domain repeats have suggested that adjacent repeated domains differ from one another, in order to prevent aggregation. A comprehensive analysis of repeat conservation in proteins reveals that indeed in most cases repeated domains show low sequence identity. However, this is not the case for the cohesin scaffoldin: we found very few examples of domains that behave similarly to cohesin. Most of these are related to binding activity or to extracellular localization. Cellulosome diversity is thought to be achieved by the ability of the dockerin to bind in two different orientations, as well as the promiscuity of dockerin-cohesin interactions. We suggest that the mechanism of homologous recombination can create within a population various numbers of repeats, thus further enhancing cellulosome diversity.


P15
Mounting computational evidence for functionality of Fantom non-coding RNA
Ilana Lebenthal and Ron Unger

The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 52900, Israel

A surprising observation in large-scale studies of mammalian genome transcription is the large percentage of the genomic DNA that is transcribed to RNA. The meaning of the transcription of so many genomic regions is still under debate, and it remains an open question whether most of the identified transcripts are in fact functional. Here we look for indirect computational evidence that can support the functionality of the 34,030 non-coding RNA (ncRNA) transcripts that were found in the Fantom3 project. We show that as a group this set of sequences is more conserved with human and rat than control sets of sequences taken randomly from the mouse genome. In particular, there are some Fantom sequences that show very high sequence conservation with the other species. We demonstrate that homologs of the Fantom ncRNA sequences in human and rat have more matches to ESTs in these organisms than homologs of the control sets. We show that the conserved subgroup of sequences is differentially expressed, and exhibits elevated expression levels in brain tissues. In addition, we were able to show that on average the Fantom ncRNA sequences have lower minimal free energy of folding than the control sets, partially because of statistically distinct dinucleotide composition. Taking these observations together, it is clear that as a group the Fantom ncRNA set is distinct from random sets from the genome. Therefore we conclude that many of these transcripts may indeed have biological function.


P16
Computational Methods for Dissection of MicroRNA Function
Igor Ulitsky 1, Louise Laurent 2, Franz-Josef Muller 2, Jeanne F. Loring 2 and Ron Shamir 1

1 Blavatnik School of Computer Science, Tel Aviv University, Israel. 2 Center for Regenerative Medicine, The Scripps Research Institute, San Diego, California, USA.

The function of the vast majority of microRNAs (miRNAs) is still unknown and even for relatively well studied miRNAs, only a handful of their computationally predicted targets have been rigorously characterized. We developed two computational methods that can assist in the discovery of miRNA function and prioritization of their targets. The first tool compares a given set of the predicted miRNA targets with gene sets pertinent to a specific function, such as co-annotated or co-expressed genes. Our method improves upon extant statistical methods by correcting the bias in 3'UTR length in specific functional classes. In addition, the method can predict function of groups of co-located or co-expressed miRNAs. The second method is based on the notion that coordinated binding of miRNAs enhances their repressive activity. We developed an algorithm for detecting groups of miRNAs that target the same pathway, utilizing information on protein interactions, putative miRNA targets and miRNA transcript levels. Application of this algorithm to a large miRNA expression dataset focused on stem cell biology identified over two hundred pathways regulated by miRNA groups. These miRNA groups manifest co-expression in other datasets and the targeted pathways are functionally coherent. In certain, but not all, cases we identify a significant anti-correlation between the miRNAs and target mRNA levels.
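
The comparison of predicted miRNA targets with a functional gene set can be illustrated with a plain over-representation test. The abstract does not name the extant statistical methods it improves upon; the sketch below assumes a one-sided hypergeometric test as a common baseline and does not implement the 3'UTR length correction described above. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, targets: int,
                           gene_set: int, overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe: number of genes considered in total
    targets:  number of predicted targets of the miRNA among them
    gene_set: size of the functional gene set being tested
    overlap:  number of gene-set members that are predicted targets
    """
    max_k = min(targets, gene_set)
    total = comb(universe, gene_set)
    tail = sum(
        comb(targets, k) * comb(universe - targets, gene_set - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total
```

A test of this kind treats every gene as equally likely to be a predicted target, which is the assumption that a 3'UTR length bias violates: functional classes with long 3'UTRs accumulate more predicted sites by chance.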


P17
A new library of surface patches - Design and applications
Roi Gamliel, Chen Keasar and Klara Kedem

Department of Computer Science Ben Gurion University of the Negev Israel

Protein surfaces serve as an interface with their environment, where shape and chemical complementarity to other molecules provide interaction specificity. Thus, surface motifs that are essential to protein function may be more conserved than sequence or chain topology. Furthermore, non-homologous proteins sometimes share similar structure motifs due to convergent evolution. Despite their importance, protein surfaces have been studied less than protein sequence and topology. Moreover, most of the research on protein surfaces focuses on regions of specific interest such as ligand binding and docking sites. Our approach is more general, as we wish to study the entire surface regardless of its functional roles. In this work we characterize protein surfaces using surface patches, which are small continuous fractions of the surface of proteins. We defined a measure of distance between patches and used it to cluster recurring structural motifs from a large set of non-redundant proteins. The resulting set of cluster centroids constitutes our surface patch library. Preliminary results show a significant difference between the patches of decoys and native structures, suggesting that our library indeed captures some aspect of native protein surfaces. Further, the library is useful in ranking protein structure predictions. Thus, we expect that a high-quality, generic description of the surface will improve protein predictions and the ability to determine the quality of models.


P18
Towards the first genome-scale model of human liver metabolism
Livnat Jerby 1, Tomer Shlomi 3, Eytan Ruppin 1,2

1 School of Computer Sciences Tel Aviv University, 2 School of Medicine Tel Aviv University, 3 Computer Science Dept. Technion

The understanding of human hepatic metabolism is crucial for the research and treatment of various clinical conditions, ranging from rare inborn errors of metabolism to metabolic malfunctions such as hyperlipidemia and obesity that are becoming epidemically frequent. Our research objective is to generate the first genome-scale stoichiometric network model of human liver metabolism by integrating an available generic network model with various liver-specific molecular data sources. These data sources include literature-based knowledge on liver-specific metabolic pathways, as well as transcriptomic, proteomic, metabolomic and phenotypic liver data. The reconstruction process is based on a novel computational method that produces an intact model, satisfying stoichiometric mass-balance and thermodynamic constraints, consisting of the known liver-specific enzymes given as input, as well as additional enzymes required to fulfill these constraints. The resulting liver model would provide a tool for simulating and identifying the metabolic alterations imposed by both inheritable and acquired metabolism-associated disorders, and for discovering potential diagnostic biomarkers and treatment targets.


P19
Selecting Gene Expression Markers for Cancer Prognosis and Treatment
Ofer Lavi 1, Michael Gutkin 1, Gideon Dror 2, Ron Shamir 1

1 Blavatnik School of Computer Science, Tel Aviv University, Israel. 2 Department of Computer Science, Tel Aviv College.

The first diagnostics that are based on gene expression profiles have already been FDA approved and are in clinical use for predicting the effectiveness of chemotherapy against various cancers. The diagnostics, combining transcription level measurement of marker genes with computational analysis, help medical professionals make treatment decisions with significantly higher success rates than previous, more traditional methods. Given a gene expression profile, the basic computational problem is to classify it as one of two predefined categories, e.g. case/control, or poor/good prognosis in response to treatment. While microarrays can measure the levels of thousands of genes, most of them do not contribute to the classification task and actually make it less accurate and less efficient. Therefore, an initial step of feature selection is needed in order to determine the markers that give the best classification. We developed SlimPLS, a novel method for multivariate feature selection based on Partial Least Squares. We compared the method with common feature selection techniques across a large number of real case-control datasets, most of which are cancer related, using several classifiers. We demonstrate the advantage of the method and study the preferable combinations of classifier and feature selection technique.


P20
Ab initio Construction of a Eukaryotic Transcriptome by Massively Parallel mRNA Sequencing.
Moran Yassour 1,2,*, Tommy Kaplan 1,3,*, Hunter B. Fraser 2, Joshua Z. Levin 2, Jenna Pfiffner 2, Xian Adiconis 2, Gary Schroth 4, Shujun Luo 4, Irina Khrebtukova 4, Andreas Gnirke 2, Chad Nusbaum 2, Dawn-Anne Thompson 2, Nir Friedman 1 ^, and Aviv Regev 2,5 ^

1 School of Computer Science and Engineering, The Hebrew University, Jerusalem, 91904, Israel. 2 Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA. 3 Department of Molecular Genetics and Biotechnology, Faculty of Medicine, The Hebrew University, Jerusalem 91120, Israel. 4 Illumina, Inc., 25861 Industrial Boulevard, Hayward, CA 94545. 5 Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02142. * These authors contributed equally to this work, ^ Corresponding Authors

Defining the transcriptome, the repertoire of transcribed regions encoded in the genome, is a challenging experimental task. Current approaches, relying on sequencing of expressed sequence tags (ESTs) or cDNA libraries, are expensive and labor-intensive. Consequently, we know little about the transcriptome of most sequenced species. Advances in massively parallel sequencing can revolutionize the study of transcriptomes. Here, we present a novel approach for ab initio discovery of the complete transcriptome of the budding yeast, based only on the (unannotated) genome sequence and millions of short reads from a single sequencing run. Using novel algorithms, we automatically construct a highly accurate transcript catalogue, including most known transcripts, and adding 160 novel transcripts and 25 introns. Our results demonstrate that massively parallel sequencing provides accurate definition of a eukaryotic transcriptome without any prior knowledge. This framework can be applied to poorly understood organisms, for which only the genomic sequence is known.


P21
Expander - A Comprehensive Platform for Expression Profile Analysis
Adi Maron-Katz(1), Ran Elkon(2), Seagull Shavit(1), Igor Ulitsky(1), Chaim Linhart(1), Amos Tanay(3), Roded Sharan(1), Dorit Sagir(2), Israel Steinfeld(1), Yosef Shiloh(2), Ron Shamir(1)

(1) School of Computer Science, Tel-Aviv University, Israel (2) The David and Inez Myers Laboratory for Genetic Research, Department of Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University (3) School of Computer Science, Weizmann Institute

A major challenge in the analysis of microarray gene expression data is to extract meaningful biological knowledge out of the huge volume of raw data. The EXPANDER (EXPression ANalyzer and DisplayER) software package is an integrative platform for the analysis of gene expression data, designed as a 'one-stop shop' tool that implements various data analysis algorithms. EXPANDER is available with pre-compiled up-to-date data supporting analysis of over 10 species. Typical analysis using EXPANDER starts from basic normalization and filtering of the gene expression data. Then, the data can be analyzed using diverse clustering and biclustering algorithms. The gene sets identified (as well as user-provided gene sets), can then be tested for enrichment of functionally related genes (based on Gene Ontology), co-regulated genes (using promoter sequences and miRNA target predictions) or co-localized genes. In addition, protein interaction or signaling networks can be used for detection and analysis of gene modules. The integrated analysis capabilities provided by EXPANDER and its built-in support of multiple organisms make it unique among the many tools available for microarray data analysis. Download from: http://acgt.cs.tau.ac.il/expander


P22
A Comparative Genome-wide Study of ncRNAs in Trypanosomatids
Tirza Doniger 1, Chaim Wachtel 1, Rodolfo Katz 1, Shulamit Michaeli 1, Ron Unger 1

1 The Mina and Everard Goodman Faculty of Life Science, Bar-Ilan University

Recent studies have provided extensive evidence for multitudes of non-coding RNA (ncRNA) transcripts in a wide range of eukaryotic genomes. ncRNAs are emerging as key players in multiple layers of cellular regulation. With the availability of many whole genome sequences, comparative analysis has become a powerful tool to identify ncRNAs. We undertook a systematic genome-wide in silico screen to search for novel ncRNAs in the genome of Trypanosoma brucei by comparative genomics. T. brucei was compared to six other sequenced Trypanosomatids. A total of 8,877 and 15,141 sequences were found to be conserved in six genomes and in at least four genomes, respectively. Almost one third of the known ncRNAs were found in six genomes, and about half were found in four genomes. Annotated sequences were then filtered out, yielding a total of 57 conserved unannotated sequences in six genomes and 126 in at least four genomes. Among this collection we identified tRNA-sec, previously annotated incorrectly in the T. brucei genome annotation, as well as snoRNAs and several novel ncRNAs of unknown function. Many of the predicted ncRNAs were validated experimentally and categorized into their families.


P23
S2G - Candidate gene finder and OMIM search utility
Avitan Gefen, Raphael Cohen (EC) and Ohad S. Birk

The Morris Kahn Laboratory of Human Genetics, National Institute for Biotechnology in the Negev (NIBN), Ben-Gurion University.

We have produced novel web-based software that allows efficient search for candidate genes in a genomic locus, using known genes that cause phenotypically similar syndromes. The software includes two components. The first is an OMIM-based phenotype search engine that alleviates many of the problems in the existing OMIM search engine. The second is a gene-prioritizing engine that uses a novel algorithm to integrate information from 17 databases. The software prioritizes a list of genes from within a genomic locus, based on their association with genes whose defects are known to underlie similar clinical syndromes. When the detailed phenotype of a syndrome is entered into the software, S2G offers a complete, improved search of the OMIM database for similar syndromes. Thus, S2G provides clinicians with an efficient tool for diagnosis and researchers with a candidate gene prediction tool based on phenotypic data and a wide range of gene data resources.


P24
BIOINFORMATICS ANALYSIS OF SIGMA C PROTEIN OF AVIAN REOVIRUS ISRAELI ISOLATES, TOWARDS NOVEL VACCINE PRODUCTION
A. Yeheskel (1), M. Pasmanik-Chor (1), D. Goldenberg (2) and J. Pitcovski (2)

1 Bioinformatics Unit, The G.S. Wise Faculty of Life Science, Tel-Aviv University, Ramat Aviv, Israel. 2 Department of Virology and Immunology, MIGAL - Galilee Technology Center, Kiryat-Shmona, Israel.

Avian reovirus (ARV) causes severe losses in the avian industry. Common syndromes include viral arthritis, tendosynovitis, liver, heart and intestinal infections, and immunosuppression. Birds are sensitive mainly at young ages. The vaccines presently in use are based on the attenuated strain (S1133), or on inactivated virulent isolates. However, despite vaccination, many flocks are still infected by reovirus. In this study, the sigma C protein from 28 Israeli viral isolates from various regions was cloned and sequenced. Molecular analysis of sigma C was performed, since it is known as the most variable protein of the virus. Sigma C is known to induce neutralizing antibodies and may bind the host receptor. Sequence analysis of the protein reveals that all variants may be clustered into four distinct groups. The current vaccine strain was classified into one of the groups, and differs gradually from current Israeli field isolates, and therefore is not efficient. The aim of this study is to apply bioinformatic tools in order to better predict novel epitopes that would serve as better vaccination options. For this purpose, we have built a model structure for sigma C of one of the Israeli strains, in which exposed and buried residues were localized. In addition, various sequence-based tools were applied to predict antigenicity. Sigma C sequences of the various strains were aligned, and epitopes were chosen based on conservation. Further laboratory studies are currently being conducted to confirm the efficiency of anti-epitope antibodies in detection and neutralization of ARV. These newly designed vaccines, based on bioinformatics analysis, are expected to provide immunity to multiple strains in Israel and world-wide.


P25
What is the difference between Prokineticin Receptors 1 and 2? Bioinformatics and experimental study
Anat Levit 1,2, Helena Safrian1, Rina Meidan1 and Masha Y. Niv2

1. Department of Animal Sciences 2. Institute of Biochemistry, Food Science and Nutrition The Hebrew University, Rehovot

Prokineticin (PK) 1 and 2 are two novel secreted proteins with diverse biological functions. PKs are the cognate ligands for two G protein-coupled receptors (PKR1 & PKR2). We have shown that both GPCRs are involved in luteal endothelial cell (EC) proliferation and survival, but only PKR2 mediates increased permeability. The diversity in signaling between PKR1 and PKR2 and the molecular mechanisms involved are still unknown. We hypothesized that coupling to G-proteins as well as distinct phosphorylation sites may be responsible for these differences. PKRs were predicted to couple to Gs, Gi and Gq. PKs activated the ERK cascade and elevated COX2 and eNOS mRNA levels in ECs, indicating possible Gq coupling. PKs alone elevated basal cAMP levels; however, when cells were stimulated with the Beta2-adrenergic agonist Isoproterenol, PKs decreased cAMP production. Thus, at basal conditions PKRs may couple to Gs and activate AC, but with massive Gs recruitment, PKs may signal via Gi or Gq, as predicted. Sequence analysis of PKRs in various species and consensus phosphorylation sites revealed that PKR1 and PKR2 differ in putative phosphorylation sites, mainly in the first and second intracellular loops and in the cytoplasmic tail. The question of whether different G protein coupling determines the distinct biological functions of PKR1 vs. PKR2, and the role of phosphorylation, are currently under study. Podlovni et al. Cell Physiol Biochem 2006; 18:315-326


P26
Myasthenia Gravis Ig gene lineage trees and mutation analysis
Neta S. Zuckerman1,*, Wendy Howard3,*, Sonia Berrih-Aknin2, Jacky Bismuth2, Hanna Edelman1, Kate Gibson3, Deborah Dunn-Walters3, Ramit Mehr1. *Equal contributors.

1 Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel. 2 Centre Chirurgical Marie Lannelongue, France. 3 King's College, London Medical School, UK.

Much information about hypermutation and selection during the immune response is contained in the shape of immunoglobulin (Ig) gene mutational lineage trees deduced from responding B cell clones. We microdissected germinal centers (GCs) from pathological ectopic GCs seen in the thymus of Myasthenia Gravis (MG) patients to see whether Ig gene characteristics are similar to those of normal GCs. Lineage tree analysis showed similar diversification and mutations per cell compared to the normal control trees. Mutation analyses revealed that most replacement (R) mutations were observed in the framework regions, responsible for structural integrity of the B cell receptor (BCR); these mutations were mostly conservative or neutral, confirming a conserved functional BCR in MG. Analysis of R and S (silent) mutations revealed selection against R mutations in the complementarity determining regions (CDRs), responsible for antigen binding, which may indicate that MG clones are highly specific for the antigen, and thus most CDR R mutations, which may be affinity-reducing, disappear during selection. Somatic hypermutation targeting motifs in MG were similar to those of the normal controls. However, the mutation spectrum in the MG in-frame genes slightly deviated from that of MG out-of-frame genes and normal controls. Overall, B cells in MG ectopic GCs seem to undergo normal diversification and selection in spite of the chronic nature of the response.


P27
mtDNA genetic landscapes formed in tumors and during human evolution are shaped by similar selective constraints
Ilia Zhidkov1,2, Erez A. Livneh2, Eitan Rubin2,3 and Dan Mishmar1,2

(1) Department of Life Sciences, (2) National Institute of Biotechnology in the Negev, (3) Department of Microbiology and Immunology, Ben-Gurion University of the Negev, Beer-Sheva, Israel

Genomic landscapes in humans reveal signatures of natural selection in mutations generated either during evolution or with the progress of cancer. The unique features of the mutational landscape in cancer have been the focus of extensive study. However, little attention has been paid to similarities in the signatures of selection in tumor cells and human populations. We hypothesized that such similarities could provide novel insight into the functional constraints acting on both systems. Here, we examined the rapidly evolving mitochondrial genome (mtDNA) to compare the de novo mutational landscapes in a cancer compendium (98 sequence pairs) with mutations fixed in mtDNA over the course of human evolution (2400 sequences). Nucleotide positions that underwent de novo changes in the cancer compendium preferentially co-localized with ancient mutations in human mtDNA phylogeny. An even stronger pattern was observed when recurrent combinations of mutations (COMs) were analyzed in the cancer compendium, revealing non-random COMs of up to seven mutations in length, longer than observed in reshuffling simulations (p < 2.9 × 10^-4). Strikingly, 23/25 positions comprising the COMs co-localized with positions that define major human mtDNA lineages. Our results reveal significant similarities in the mutational landscapes of mtDNA in cancer and normal human populations suggesting similar selective constraints. This implies a functional potential for specific positions, as well as combinations of positions in both de novo and inherited mtDNA mutations. Our findings thus offer new approaches to understanding principles in cancer and natural genomic evolution.


P28
Phosphorylation- and protonation-dependent protein kinase B (PKB) dynamics
Shu Cheng and Masha Y. Niv

Institute of Biochemistry, Food Science and Nutrition, Hebrew University, Rehovot

Protein kinase B (PKB) is a novel therapeutic target for cancer. One of the mechanisms involved in regulating the activity levels of this kinase is phosphorylation of Thr309 in the activation loop. The dynamics of PKB phosphorylation and dephosphorylation were studied using explicit solvent molecular dynamics (MD) simulations. The activation loop of the dephosphorylated (Thr309) form displayed increased fluctuations compared with that of the phosphorylated (pThr309) form, indicating the beginning of the conformational changes induced by a change in phosphorylation state. We hypothesized that the protonation states of the residues that coordinate pThr309, such as His196, might influence the activation loop dynamics. We therefore carried out MD simulations under different protonation states of His196. The protonation state of His196 influenced the distribution of the χ1 angle of Thr309, but not of pThr309. Throughout the simulations, pThr309 maintained bidentate hydrogen bonds with Arg274 in the conserved catalytic loop independent of the protonation state of His196, and created hydrogen bonds with the HSE and HSP, but not HSD forms of His196. The protonation state of His196 also influenced the conformational dynamics of Lys308, an activation loop residue that is hydrogen-bonded to pThr309 (but not to Thr309). These results provide initial insights into the mechanism and pH dependence of kinase activation and the roles of individual residues in this orchestrated process.


P29
Nucleosomes Mark Exons
Schraga Schwartz1, Eran Meshorer2 and Gil Ast1

1Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel-Aviv University, Ramat Aviv 69978, Israel; 2Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Givat Ram, Jerusalem 91904, Israel

An increasing body of evidence indicates that transcription and splicing are coupled and it is accepted that chromatin organization regulates transcription. Little is known, however, about cross-talk between chromatin structure and the exon-intron architecture. By analysis of genome-wide nucleosome positioning datasets from human and C. elegans, we found that exons harbor a well-positioned nucleosome. We show that the differential nucleosome positioning landscape between exons and introns is conserved throughout metazoan evolution and we attribute this to differential GC content and differential distribution of nucleosome disfavoring elements within exons and introns. Based on genome-wide chromatin immunoprecipitation (ChIP-seq) data in human and mouse, we identified four specific post-translational histone modifications biased to occur along exons. Our findings suggest that there is an RNA-polymerase II mediated cross-talk between chromatin structure and exon-intron architecture, raising the possibility that processes such as exon selection are modulated by chromatin structure. Moreover, we suggest that the pressure to fold around mono-nucleosomes may be one of the evolutionary forces restraining metazoan exons to lengths of approximately 150 nucleotides.


P30
Genome-wide approach of combined genomic and transcriptomic analyses of Parkinson's disease in the Ashkenazi population
Merav Kedmi1, Anat Bar-Shira1, Ziv Gan-Or1,3, Nir Giladi2,3 and Avi Orr-Urtreger1,3

1Genetic Institute & 2Movement Disorders Unit, Parkinson Center, Department of Neurology, Tel-Aviv Sourasky Medical Center, 3Sackler Faculty of Medicine, Tel-Aviv University, Israel.



P31
A model structure of the human potassium channel Kv7.2 in complex with a potent selective opener
Yana Gofman 1, Asher Peretz 2, Liat Pell 2, Yoni Haitin 2, Bernard Attali 2, Nir Ben-Tal 1

1 Biochemistry Dept. Tel-Aviv University, 2 Physiology & Pharmacology Dept. Tel-Aviv University

Voltage-gated potassium channels are ubiquitously expressed transmembrane proteins, crucial for maintaining electrical potential in living cells. Mutations in these proteins are related to various inherited diseases, from cardiac arrhythmias to epilepsy. The voltage sensing domain (VSD) has rarely been targeted for drug action, as opposed to the pore and the gate regions. In this work we focused on the human voltage-gated potassium channel Kv7.2, which is associated with benign familial neonatal convulsions when mutated, and its potent opener NH29, which binds to the VSD region. We produced a 3D model-structure of Kv7.2. The model was used to suggest molecular explanations for the effect of most known disease-causing mutations. Docking simulations correlated well with the previously available experimental data and defined the binding site of NH29. Based on the computational results, additional residues were detected in the binding site. Their involvement in NH29 binding was then confirmed by mutagenesis studies. In general, NH29 docks to the groove formed by the interface between the S1, S2 and S4 helices, stabilizing the open state. Mutants at S4 are notably less sensitive to the activating effect of NH29 compared to the WT channels, while other mutants displayed increased sensitivity to the opener. Our data provide a structural framework for the future design of gating-modifiers targeted to the VSD region of voltage-gated ion channels.


P32
Model-structure, mutagenesis and functional characteristics of the NHA2 transporter
Maya Schushan1, Minghui Xiang2, Etana Padan3, Rajini Rao2 and Nir Ben-Tal1

1Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel-Aviv University, Israel 2Department of Physiology, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore MD 21205 3 Department of Biological Chemistry, Alexander Silberman Institute of Life Sciences, Hebrew University of Jerusalem, 91904 Jerusalem, Israel.

Human NHA2 is a transmembrane cation/proton antiporter of the monovalent cation/proton antiporters-2 transporter family. We utilized the crystal structure of NhaA, a distant bacterial homologue, as a template for producing a three-dimensional model of NHA2. Because of the low sequence identity between the two transporters we used a unique modeling approach, involving various secondary structure prediction and fold recognition tools, to align their sequences correctly. The model-structure is supported by evolutionary conservation analysis and site-directed mutagenesis. The model guided mutagenesis experiments that revealed functional residues, possibly involved in transport or structure stabilization. Similar to the X-ray structure of NhaA and model structure of the human transporter NHE1, the NHA2 model discloses a cluster of highly conserved titratable residues, located in the so-called assembly region, made of the two discontinuous helices TM4 and TM11. Nevertheless, the NHA2 assembly region has unique properties, as it does not exhibit a negatively charged residue to compensate for the positive dipoles in the TM4-TM11 assembly. Another unique feature of NHA2 is that it possesses a novel conserved and negatively charged residue, located at a different structural location towards the assembly. Combining the structural data, conservation analysis and mutagenesis, we propose that this region directly participates in ion and also suggest an alternate-access transport mechanism.


P33
A Feature-Based approach to Modeling Protein-DNA Interactions
Eilon Sharon, Shai Lubliner, Eran Segal

Department of Computer Science and Applied Mathematics, Weizmann Institute

Commonly used motif finding software produces a PSSM (Position Specific Scoring Matrix) model that assumes independence between different motif positions. We present an alternative, richer model, called FMM (Feature Motif Model). Within the FMM formulation, a variety of sequence features may be represented, capturing dependencies between binding site positions. We show that FMMs describe binding data better than PSSMs on both synthetic and real data. We also present a motif finder that learns FMM motifs from unaligned promoter sequences, and show how de novo FMMs, learned from binding data of the human TFs c-Myc and CTCF, reveal intriguing insights about their binding specificities. Our FMM learning and motif finder software are available at: http://genie.weizmann.ac.il/pubs/fmm08/fmm08_learn_unalign.html
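
The independence assumption of a PSSM, and the idea of adding features that couple positions, can be sketched as follows. This is a toy illustration and not the FMM software linked above: the function names, the representation of a pairwise feature as an (i, base_i, j, base_j) key with a weight, and the unnormalized score are all assumptions made for this example.

```python
import math
from typing import Dict, List, Tuple

PairFeature = Tuple[int, str, int, str]  # (position i, base at i, position j, base at j)


def pssm_log_score(site: str, pssm: List[Dict[str, float]]) -> float:
    """Log-probability of a site under a PSSM.

    Each position contributes independently, so the score is a plain sum
    of per-position log-probabilities.
    """
    if len(site) != len(pssm):
        raise ValueError("site length must match the number of PSSM columns")
    return sum(math.log(column[base]) for base, column in zip(site, pssm))


def feature_log_score(site: str, pssm: List[Dict[str, float]],
                      pair_features: Dict[PairFeature, float]) -> float:
    """Unnormalized score: PSSM terms plus weights of matching pairwise features.

    A pairwise feature fires only when both of its positions carry the
    specified bases, which is how a dependency between positions enters.
    """
    score = pssm_log_score(site, pssm)
    for (i, base_i, j, base_j), weight in pair_features.items():
        if site[i] == base_i and site[j] == base_j:
            score += weight
    return score
```

With an empty `pair_features` dictionary the second function reduces to the PSSM score, which makes the independence assumption explicit: any difference between the two scores comes from features spanning more than one position.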


P34
Lateral gene transfer and convergent evolution lead to the emergence of eukaryotic-like genes in Legionella pneumophila
M. N. Lurie-Weinberger1, L. Gomez-Valero2, N. Merault2, G. Glockner3, C. Buchrieser2 and U. Gophna1

1Department of Molecular Microbiology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel, 69978. 2Institut Pasteur, UP Biologie des Bacteries Intracellulaires and CNRS URA 2171, 75724 Paris, Cedex 15, France. 3 Leibniz Institute for Age Research Fritz Lipmann Institute, Jena, Germany

Legionella pneumophila, the causative agent of Legionnaires' disease, is known to be an intracellular pathogen of multiple species of protozoa, and is assumed to have coevolved with these organisms for millions of years. Genome sequencing of Legionella pneumophila strains has revealed an unparalleled abundance of eukaryotic-like proteins (ELPs). Here we investigate the evolution of these ELPs and identify their origin. 34 new ELPs were identified, based on a higher similarity to eukaryotic proteins than to bacterial ones, bringing the total of known ELPs to 103. Phylogenetic analysis demonstrated that both lateral gene transfer from eukaryotic hosts and bacterial genes that became eukaryotic-like by convergent evolution contributed to the existing repertoire of ELPs, which comprise over 3% of the putative proteome of L. pneumophila strains. A PCR survey of 72 L. pneumophila strains showed that most ELPs were conserved in nearly all of these strains, indicating that they are likely to play important roles in this species. Genes of different evolutionary origin have distinct patterns of selection, as reflected by their ratio of non-synonymous vs. synonymous mutations. Several ELPs that contain residues under diversifying selection were identified, and a large fraction of those have C-termini that are typical of type IV-secreted effectors of Legionella, and represent promising targets for future study.


P35
Selection constraints on selfish homing endonucleases facilitate their use in gene therapy
Eyal Privman 1, Adi Barzel 2, Michael Pe'eri 1, David Burstein 1, Uri Gophna 2, Martin Kupiec2, Tal Pupko1

1 Cell Research Dept., Life Sciences Faculty, Tel Aviv University, 2 Microbiology and Biotechnology Dept, Life Sciences Faculty, Tel Aviv University

Homing endonuclease genes (HEGs) usually reside in selfish transposable introns and inteins. They code for nucleases that cleave DNA at a highly specific site within homologs of their hosting gene that are lacking the intron/intein, thereby inducing homologous recombination, effectively copying the intron/intein into the homolog. In this so-called "homing" process HEGs facilitate the horizontal propagation of the intron/intein. Here we characterize the selection constraints arising from coevolution of the parasitic HEG with its target sequence in the hosting gene, and use them to infer the range of sequences that may be recognized and cleaved by the HEG. HEGs have been demonstrated to be a potential tool for gene targeting in mammalian cells (replacing an endogenous gene sequence). Thus, HEGs can be used to correct mutated genes and serve as a powerful tool for gene therapy. Our inference of a wide target range can be harnessed to detect potential HEG targets in the human genome. Thereby, the potential for finding a HEG in the natural repertoire that will be applicable in the treatment of a specific disease-related human mutation is increased by several orders of magnitude. This study was supported in part by a fellowship from the Edmond J. Safra Bioinformatics program at Tel-Aviv University.


P36
A Neuro-Immune Gene Ontology: a subset of GO directed for neurological and immunological systems
Nophar Geifman, Alon Monsonego and Eitan Rubin

The Shraga Segal Dept. of Microbiology and Immunology, Faculty of Medical Sciences and The National Institute of Biotechnology in the Negev, Ben Gurion University

The Gene Ontology (GO) is used to describe genes and gene products from many organisms. In using GO for functional annotation of microarray data, GO is often slimmed by editing so that only higher level terms remain. This practice is aimed at improving the statistical power of GO term enrichment analysis and the ability to summarize experimental results by grouping high level terms. Here we propose a new approach to editing the gene ontology, clipping, which is the editing of GO according to biological relevance. Creation of a GO subset by clipping is done by removing terms (from all hierarchal levels) if they are not functionally relevant to a given domain of interest. Terms that are located in levels higher to relevant terms are kept, thus, biologically irrelevant terms are only removed if they are not parental to terms that are relevant. Using this approach, we have created the Neuro-Immune Gene Ontology (NIGO) subset of GO directed for neurological and immunological systems. We tested the performance of NIGO in extracting knowledge from microarray experiments by conducting functional analysis and comparing the results to those produced using the full GO and a generic GO slim. NIGO not only improved statistical scores given to relevant terms, but was also able to retrieve functionally relevant terms that did not pass statistical cutoffs when using the full GO or the slim subset.
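
The clipping rule described above can be sketched as a small graph traversal: keep every relevant term together with all of its ancestors, and drop everything else. This is an illustrative sketch only; the function name, the dictionary representation of the ontology and the toy term names are assumptions, not the NIGO implementation.

```python
def clip_ontology(parents, relevant):
    """Return the terms kept after clipping.

    parents  -- dict mapping each term to the set of its parent terms
    relevant -- set of terms judged relevant to the domain of interest
    A term is kept if it is relevant or is an ancestor of a relevant term.
    """
    kept = set()
    stack = list(relevant)
    while stack:
        term = stack.pop()
        if term in kept:
            continue
        kept.add(term)
        stack.extend(parents.get(term, ()))  # walk up towards the root
    return kept

# Toy ontology with hypothetical term names.
toy_parents = {
    "root": set(),
    "signaling": {"root"},
    "neuron signaling": {"signaling"},
    "photosynthesis": {"root"},
}
kept_terms = clip_ontology(toy_parents, {"neuron signaling"})
```

In the toy example, "signaling" and "root" survive because they are parental to a relevant term, while "photosynthesis" is removed.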


P37
Inference and Characterization of Horizontally Transferred Gene Families using Probabilistic Mixture Models
Ofir Cohen and Tal Pupko

Cell Research and Immunology Dept. Tel Aviv University.

Macro genomic events, in which genes are gained and lost, play a pivotal evolutionary role in microbial evolution. Nevertheless, probabilistic evolutionary models describing such macro genomic events are considerably less developed than models describing site-specific sequence evolution. Accurate modeling of the dynamics of such macro-evolutionary events is important, for example, to reliably infer instances of Horizontal Gene Transfer (HGT). We present novel likelihood-based models for analyzing the evolutionary dynamics of gains and losses of gene families. In these models gains and losses are represented by the transition between presence and absence, given an underlying phylogeny. We employ a mixture-model approach in which we allow both the gain rate and the loss rate to vary among gene families. We further developed a stochastic mapping method to infer HGT events and utilized it to quantify the prevalence of these events. This enables us to rank various gene families and lineages according to their tendency to undergo gains and losses. The novel mixture models describe the observed variability in gene family content among microbes significantly better than previous models. The model-based tool for the inference of gain and loss events was further shown to be highly accurate based on simulations. Our methodology suggests that at least 35% of the 4,869 gene families analyzed have experienced HGT at least once during their evolution.
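
As an illustration of a presence/absence model, the sketch below computes transition probabilities along a branch for a standard two-state continuous-time Markov chain with a gain rate and a loss rate, and averages them over a discrete mixture of rate categories. The parametrization, the function names and the example rates are assumptions made for illustration; the abstract does not specify the exact form of the authors' models.

```python
import math

def transition_matrix(gain, loss, t):
    """2x2 matrix P[start][end] for states 0 (absent) and 1 (present)
    after branch length t, for a two-state chain with the given rates."""
    total = gain + loss
    decay = math.exp(-total * t)
    p_gain = gain / total * (1.0 - decay)  # absent -> present
    p_loss = loss / total * (1.0 - decay)  # present -> absent
    return [[1.0 - p_gain, p_gain], [p_loss, 1.0 - p_loss]]

def mixture_probability(start, end, t, categories):
    """Average the transition probability over (weight, gain, loss) categories."""
    return sum(w * transition_matrix(g, l, t)[start][end] for w, g, l in categories)

# Hypothetical mixture: two equally weighted rate categories.
example_categories = [(0.5, 0.2, 0.8), (0.5, 0.5, 0.5)]
```

For very long branches the probability of presence approaches gain / (gain + loss) in each category, so the mixture value is the weighted average of those stationary frequencies.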


P38
The Complexity Hypothesis revisited
Ofir Cohen 1, Uri Gophna 2 and Tal Pupko 1

1 Cell Research and Immunology Dept. Tel Aviv University. 2 Molecular Microbiology and Biotechnology Dept. Tel Aviv University

With the increasing availability of microbial genomes, it has become increasingly evident that Horizontal Gene Transfer (HGT) is a prevalent and important mechanism in microbial species evolution. One of the important challenges in HGT research is to better understand the factors that determine the tendency of genes to be successfully transferred (i.e., transferability). The so-called 'complexity hypothesis' suggested by Jain and Rivera proposes that the transferability of genes depends on how many interacting partners their proteins have. The phyletic pattern of 4,869 gene families across 50 species was assembled using the COG database. This large corpus of data was used to test on a large scale the association between number of interactions and transferability. The inference of HGT events was conducted using a stochastic mapping method that utilizes novel likelihood-based evolutionary models. The number of interacting proteins for each gene family was extracted from the STRING database. In agreement with the complexity hypothesis, we found that proteins that are part of multi-protein complexes in the network are less likely to undergo HGT. As expected, a similar negative correlation is observed regarding the tendency of gene families to be lost. We further partitioned gene families into separate functional categories and tested whether the negative correlation between number of interactions and transferability is still observed within functionally homogeneous groups of proteins. Surprisingly, we found that for some functional categories, including metabolism-related genes, there is no significant association between number of interactions and transferability.


P39
Variability of the substrate-binding groove in the human kinome
Mor Rubinstein 1, Masha Niv 2

1 Institute of Biochemistry, Food Science and Nutrition. Hebrew University

Protein kinases constitute one of the largest gene families in the human genome and are major targets for anti-cancer drug discovery and development. The substrate binding groove molds kinase specificity which is crucial for signal pathways integrity. This specificity can be utilized for derivation of specific kinase inhibitors. Using structural data and a multiple sequence alignment of the human kinome, we have calculated similarity scores for residues that participate in binding of peptidic substrates. Our main results indicate that both the P0 (phosphoacceptor-binding) and the P-1 pockets can differentiate between the Ser/Thr and the Tyr kinases, and are highly conserved within each super-family. The rest of the substrate binding groove is hyper-variable, in contrast to the highly conserved ATP-binding pocket. Substrate-binding groove similarities do not necessarily correlate with the overall similarities of the kinase domain. For example, PKA sub-family of Ser/Thr kinases is phylogenetically distant from the ULK sub-family, but their P-5, P-2 and P+1 pockets are surprisingly similar, suggesting potential cross-reactivity. These findings are beneficial for the rational design of highly specific inhibitors of kinase-substrate interactions.


P40
Analysing the origin of long-range interactions in proteins using lattice models
Orly Noivirt 1, Ron Unger 2, Amnon Horovitz 1

1 Structural Biology Dept. Weizmann Institute , 2 Faculty of Life Sciences Bar-Ilan University

Background: Long-range communication is very common in proteins but the physical basis of this phenomenon remains unclear. In order to gain insight into this problem, we decided to explore whether long-range interactions exist in lattice models of proteins. Lattice models of proteins have proven to capture some of the basic properties of real proteins and, thus, can be used for elucidating general principles of protein stability and folding. Results: Using a computational version of double-mutant cycle analysis, we show that long-range interactions emerge in lattice models even though they are not an input feature of them. The coupling energy of both short- and long-range pairwise interactions is found to become more positive (destabilizing) in a linear fashion with increasing "contact-frequency", an entropic term that corresponds to the fraction of states in the conformational ensemble of the sequence in which the pair of residues is in contact. A mathematical derivation of the linear dependence of the coupling energy on "contact-frequency" is provided. Conclusions: Our work shows how "contact-frequency" should be taken into account in attempts to stabilize proteins by introducing (or stabilizing) contacts in the native state and/or through "negative design" of non-native contacts.
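
The two quantities discussed above can be sketched as follows. The coupling energy uses the standard double-mutant cycle definition (free energies of the wild type, the two single mutants and the double mutant), and the contact frequency is computed here as the unweighted fraction of conformations in which two residues occupy adjacent lattice sites. The 2D square lattice, the unweighted ensemble and the function names are assumptions for illustration and may differ from the authors' setup.

```python
def coupling_energy(g_wt, g_m1, g_m2, g_m12):
    """Double-mutant cycle coupling energy; zero when the two mutations
    affect stability independently (additively)."""
    return g_wt - g_m1 - g_m2 + g_m12

def in_contact(conformation, i, j):
    """True if residues i and j are lattice neighbours but not chain neighbours."""
    (xi, yi), (xj, yj) = conformation[i], conformation[j]
    return abs(i - j) > 1 and abs(xi - xj) + abs(yi - yj) == 1

def contact_frequency(ensemble, i, j):
    """Fraction of conformations in the ensemble where i and j are in contact."""
    return sum(in_contact(c, i, j) for c in ensemble) / len(ensemble)

# Toy ensemble of a 4-residue chain on a 2D square lattice (assumed example).
toy_ensemble = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],  # folded: residues 0 and 3 touch
    [(0, 0), (1, 0), (2, 0), (3, 0)],  # extended: no contacts
]
```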


P41
A SYSTEM-LEVEL VIEW OF VIRAL MIMICRY OBTAINED VIA STRUCTURAL HOMOLOGY SCREENING
Nir Drayman and Ariella Oppenheim

Department of Hematology, The Hebrew University-Hadassah Medical School, Jerusalem, Israel.

Mimicry of host proteins is a wide-spread phenomenon which enables pathogens, including viruses, to "trick" host systems and use them to their advantage. In the present study we have used the "DALI-server" web-tool to identify host homologs of SV40 major capsid protein VP1, hypothesizing that the virus might mimic host proteins to activate multiple signaling pathways. Over 500 hits were found, mainly other viral coat proteins and sugar binding proteins. However, 7 groups of eukaryotic proteins were identified, including ligands of 3 different receptors families - TNF-like receptors, TAM (Tyro-3, Axl and Mer) receptors and EphrinB2. The known signaling cascades activated by these receptors coincide with known signaling events following SV40 infection. Furthermore, SV40 binding and activation of these receptors may explain how previously characterized major signaling pathways are initiated by the virus. Structural homology of VP1 to intra-cellular proteins such as M-calpain and the complement protein C1q was also found, suggesting a role for the viral capsid proteins as competitive inhibitors of host proteins.


P42
Unraveling nucleosome DNA repeat structure of C. elegans
Idan Gabdank 1, Danny Barash 1, Edward N. Trifonov 2 & 3

1 Computer Science Dept. Ben Gurion University of the Negev, 2 Genome Diversity Center Institute of Evolution University of Haifa, 3 Division of Functional Genomics and Proteomics Masaryk University Brno Czech Republic

An original signal extraction procedure is applied to a database of 146-base nucleosome core DNA sequences from C. elegans (Johnson SM et al., Genome Research 16, 1505-1516, 2006). The positional preferences of various dinucleotides within the 10.4-base nucleosome DNA repeat are calculated, resulting in the derivation of a nucleosome DNA bendability matrix of 16x10 elements. All 6 chromosomes of C. elegans conform to the bendability pattern. The strongest affinity to their respective positions is displayed by the dinucleotides AT and CG, separated within the repeat by 5 bases. The derived pattern provides a basis for sequence-directed mapping of nucleosome positions in the genome of C. elegans. As the first complete matrix of bendability available, the pattern may serve for iterative calculations of species-specific matrices of bendability applicable to other genomic sequences.


P43
Investigating structure and formation of the 5-HT1A serotonin receptor dimer
Noga Kowalsman 1, Ute Renner 2, Evgeni Ponimaskin 3, Masha Y. Niv 1

1 Institute of Biochemistry, Food Science and Nutrition, Hebrew University, 2 Institute for Physiology University of Gottingen Germany, 3 Cellular Neurophysiology dep. Hannover Medical School Germany

Serotonin (5-hydroxytryptamine) plays an important role in the modulation of body temperature, mood and sleep and is targeted for drug development. Serotonin receptors belong to the G protein-coupled receptor (GPCR) family. In recent years it has become clear that many GPCRs act as dimers or larger complexes. In particular, oligomerization of 5-HT1A receptors in cells has been established, but its functional role is still unknown. To elucidate the role of 5-HT1A oligomerization we aim to design and construct receptors with impaired ability to dimerize and to study these intrinsically monomeric mutants in cells. For that purpose we modeled the structure of the 5-HT1A serotonin receptor monomer and used several approaches to predict putative residues in the 5-HT1A dimer interface. Namely, we looked for residues that may alter the electrostatic potential on the monomer surface; residues that are equivalent to those found to be proximate in published cross-linking experiments of related GPCRs; residues that appear in a dimeric interface obtained by analogy to an atomic force microscopy-based model of the Rhodopsin dimer; and residues in the interface of 5-HT1A dimer models that we obtained by protein-protein docking. Several residues on helices 1, 4 and 5 consistently emerged as interesting candidates for site-directed mutagenesis. These mutants are being tested by FRET microscopy for their ability to disrupt dimer formation.


P44
Large-scale analysis of Arabidopsis transcription reveals a basal co-regulation network
Osnat Atias 1, Benny Chor 2, Daniel A. Chamovitz 1

1 Department of Plant Sciences, The George S. Wise Faculty of Life Sciences, Tel Aviv University, 2 School of Computer Science, The Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University

Analysis of gene expression data from microarray experiments has become a central tool for identifying functional gene modules. A crucial aspect of such analysis is the integration of data from different experiments and laboratories. How to weigh the contribution of different experiments is an important point influencing the final outcomes. We have developed a novel method for this integration, and applied it to genome-wide data from multiple Arabidopsis microarray experiments performed under a variety of experimental conditions. The goal of this study is to identify functional, globally co-regulated gene modules in the Arabidopsis genome. We have analyzed 21,000 Arabidopsis genes in 43 datasets. The analysis reveals that at least 10% of the Arabidopsis transcriptome is globally co-regulated, and can be further divided into known as well as novel functional gene modules. Two types of modules were identified in the regulatory network: stable and unstable modules, which we further show to pertain to general and specialized modules, respectively. These modules were validated by comparison with the Genevestigator compendium of microarray experiments.


P45
Genome-Scale Identification of Legionella pneumophila Effectors using a Machine Learning Approach
David Burstein 1, Tal Zusman 2, Elena Degtyar 2, Ram Viner 2, Gil Segal 2, and Tal Pupko 1

1 Cell Research and Immunology Dept., Tel Aviv University, 2 Molecular Microbiology and Biotechnology Dept. Tel Aviv University

Many pathogenic bacteria exert their function by translocating a set of proteins, termed effectors, into the cytoplasm of their host cell. These effectors subvert host cell processes for the benefit of the bacteria. We developed a computational approach for the detection of new effectors in Legionella pneumophila, the causative agent of Legionnaires' disease, a severe pneumonia-like disease. Our goal in this study was to identify novel effectors on a genomic scale, towards a better understanding of the molecular mechanisms of the pathogenicity pathways in this important intracellular pathogen. The novelty of our approach for detecting effectors is that it is systems-biology based and utilizes state-of-the-art machine learning classification algorithms. Applying this method, we detected and experimentally validated dozens of new effectors. Notably, our computational predictions had an exceedingly high accuracy of over 90%. Analyzing these effectors, we were able to obtain new insights into the molecular mechanism of the pathogenesis system. Our results also suggest, for the first time, that over 10% of the Legionella genome is dedicated to pathogenesis. Finally, our approach is general and can be utilized to study effectors in many other human pathogens.


P46
The DNA-encoded nucleosome organization of a eukaryotic genome
Noam Kaplan 1, Irene K Moore 2, Yvonne Fondufe-Mittendorf 2, Andrea Gossett 3, Desiree Tillo 4, Yair Field 1, Emily M LeProust 5, Timothy R Hughes 4,6,7, Jason Lieb 3 , Jonathan Widom 2, Eran Segal 1,8

1 Computer Science and Applied Mathematics Dept. Weizmann Institute of Science, 2 Biochemistry Molecular Biology and Cell Biology Dept. Northwestern University, 3 Biology Dept. University of North Carolina, 4 Molecular Genetics Dept. University of Toronto, 5 Agilent Technologies Inc, 6 Terrence Donnelly Centre for Cellular & Biomolecular Research, 7 Banting and Best Department of Medical Research, 8 Molecular Cell Biology Dept. Weizmann Institute of Science

In living cells, nucleosome organization is determined by multiple factors, including the action of chromatin remodelers, competition with site-specific DNA-binding proteins, and the DNA sequence preferences of the nucleosomes themselves. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo, because in vivo nucleosome maps reflect the combined action of all influencing factors. Here, we determine the importance of the nucleosome sequence preferences experimentally by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map, in which nucleosome occupancy is governed only by the intrinsic sequence preferences of nucleosomes, is remarkably similar to in vivo nucleosome maps we generated in three different growth conditions. In vitro, nucleosome depletion is evident at many transcription factor binding sites and around gene start and end sites, suggesting that nucleosome depletion at these sites in vivo is partially encoded in the genome. Using our in vitro data, we devise a computational model of the nucleosome sequence preferences that is significantly correlated with in vivo nucleosome occupancy in C. elegans. Our results indicate that the intrinsic DNA sequence preferences of nucleosomes play a central role in determining the organization of nucleosomes in vivo.


P47
Receptor Specificity of the Hemagglutinin from the H5N1 Strain of the Influenza Virus
Daphna Meroz1, Tomer Hertz2 and Nir Ben-Tal1

1 Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel-Aviv University, Israel. 2 Vaccine and Infectious Disease Institute, Fred Hutch Cancer Research Center, Seattle, WA

Hemagglutinin (HA) is the principal antigen on the influenza virus surface. It is responsible for viral binding to host receptors, thus enabling entry into the host cell. There are 16 known avian and mammalian HA subtypes (H1-H16). One possible candidate for the next influenza pandemic is the H5N1 strain. This strain has become endemic in wild waterfowl and in domestic poultry in many parts of Southeast Asia, and has been spreading across Asia into Europe and Africa. H5N1 remains at present largely a disease of birds. Human and avian influenza A viruses differ in their recognition of host cell receptors; the former prefer terminal sialic acids of glycoprotein and glycolipid receptors in α2,6 linkage (SAα2,6Gal) to galactose, while the latter favor the α2,3 linkage (SAα2,3Gal). An alteration from SAα2,3Gal to SAα2,6Gal recognition is thought to be one of the changes that must take place before avian influenza viruses can replicate efficiently in humans and obtain the capability to cause a pandemic. Our goal is to identify specificity determinants in the hemagglutinin receptor-binding domain that are responsible for a shift in receptor specificity between avian and human hosts. The sequence identity between avian and human H5N1 HA sequences is remarkably high (~90%-98%), and a single residue alteration in the receptor-binding domain is sufficient to cause a change in receptor recognition. With the use of phylogenetic analysis, we gained new insights into the evolutionary behavior of H5N1, which may reinforce its potential for causing the next disastrous influenza pandemic. Through structural analysis, sequence analysis and machine learning, we were able to create a model that is capable of distinguishing between avian and human HA sequences. Through this model, we identified residues that may be responsible for removing the species barrier.


P48
NetAge: an online network database for biogerontological research
Robi Tacutu, Arie Budovsky, Vadim Fraifeld

The Shraga Segal Dept. of Microbiology and Immunology, Center for Multidisciplinary Research on Aging, Ben-Gurion University of the Negev, Beer-Sheva, Israel

The increasing amount of data on genes associated with aging, longevity, and age-related diseases (ARD) calls for a common platform for their integration and analysis. We have recently shown that the human longevity-associated genes and the genes involved in major ARDs act in a cooperative manner and could be organized as scale-free protein-protein interaction (PPI) networks. With this in mind, we developed the NetAge database which includes a repository for networks associated with aging, longevity, and ARDs, and special tools for their analysis. First, we extended our previous PPI network model by including microRNA-regulated gene expression. Using this model we created highly annotated networks for aging, longevity, and major ARDs. Then, we developed YABNA (Yet Another Biological Networks Analyzer), a software program which will aid the user to create, modify, analyze and manage the networks. Finally, the constructed networks have been organized as a database and included in the NetAge website. NetAge offers visualization and analytic tools including node browsing, orthology information, miRNA and gene expression data, topology analysis as well as different kinds of simulations. Altogether, the NetAge database will promote incorporation of a network-based approach into biogerontological studies, thus contributing to our understanding of the systems biology of aging. A network-based approach could be especially useful for predicting longevity-promoting targets.



P49
Contact Map Prediction: Neighbours are Important When You're Close
Haim Ashkenazy 1,2, Ron Unger 2, Yossef Kliger 1

1 Compugen LTD, Tel Aviv, 69512, Israel. 2 The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Ramat-Gan, 52900, Israel

A contact map is a coarse, but helpful, representation of a protein structure. We developed two complementary contact map predictors: (1) An ab initio method that is an ensemble of 9 Random Forest classifiers that use a large set of features, including various measures of correlated mutations. The relevant classifier is selected according to the nature of the data available for the relevant position pair, and their separation on the sequence. A beta version of this predictor, AK_RF_2, participated in the recent CASP8 and was ranked amongst the best predictors. Further analysis revealed that the information regarding the neighboring residues dramatically improves the predictor accuracy for positions that are separated by up to 12 residues, whereas it has only a marginal effect for more separated pairs. (2) A multi-template based predictor for targets that have homologous proteins with experimentally solved 3D structures. This method weights templates according to their evolutionary proximity to the target protein. In addition, it considers possible multiple conformations that may be cloaked in highly similar templates. The contact maps predicted by this method are as good as, and in many cases more reliable than, those derived from models created by state-of-the-art 3D structure prediction algorithms, and our method is much simpler. Together, these two predictors provide state-of-the-art tools for determining whether two amino acid residues interact or not.


P50
Using a Differential Geometry Approach for Characterizing Biological Relevant Interfaces
Shula Shazman 1, Gershon Elber 2, Yael Mandel-Gutfreund 1

1 Biology Dept. Technion, 2 Computer Science Dept.Technion

Inferring protein function from structure is a challenging task. Since protein interactions are central to various biological processes, studying the underlying principles of these interactions is critical for understanding living cells. Proteins interact with their various partners via distinct regions on their surface. These surfaces are expected to be characterized by unique chemical, physical and geometrical properties. Previous studies have focused on characterizing the unique features of protein binding interfaces. However, distinguishing between DNA and RNA binding interfaces is still an enigma. In this study we present a novel differential geometric method for characterizing protein binding interfaces, specifically DNA and RNA interfaces. The method is based primarily on geometric surface properties, specifically the Gaussian and Mean curvatures, using the IRIT software package http://www.cs.technion.ac.il/~irit. In this approach each surface point is assigned to one of several local geometric shape classes based on the magnitudes of the Gaussian and the Mean curvatures. Using an unsupervised clustering approach we could clearly classify the different interfaces according to the relative distribution of the curvature at the surface points, associated with the different geometric shapes. Consequently, we applied a Support Vector Machine (SVM) to automatically distinguish double-stranded DNA (dsDNA) from single-stranded RNA (ssRNA) binding interfaces based on the geometric properties of the interfaces, achieving a relatively high accuracy.
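
One common way to assign a local shape class from the two curvatures is by the sign of the Gaussian curvature K and whether the Mean curvature H vanishes; the sketch below uses that textbook scheme and then summarizes an interface as the fraction of points in each class. The class names, the tolerance and the function names are assumptions for illustration; the abstract does not list the authors' exact classes or thresholds, and the code does not use the IRIT package.

```python
from collections import Counter

def classify_point(gaussian_k, mean_h, eps=1e-6):
    """Assign a local shape class from Gaussian (K) and Mean (H) curvature."""
    if gaussian_k > eps:
        return "elliptic"      # dome- or cup-like
    if gaussian_k < -eps:
        return "saddle"        # hyperbolic
    if abs(mean_h) > eps:
        return "cylindrical"   # K ~ 0, curved in one direction only
    return "planar"            # K ~ 0 and H ~ 0

def shape_distribution(points):
    """Fraction of surface points per shape class; points are (K, H) pairs."""
    counts = Counter(classify_point(k, h) for k, h in points)
    return {shape: n / len(points) for shape, n in counts.items()}

# Hypothetical curvature values for four surface points.
toy_points = [(1.0, 1.0), (-1.0, 0.0), (0.0, 0.5), (0.0, 0.0)]
toy_distribution = shape_distribution(toy_points)
```

A vector of such fractions per interface is the kind of feature that could then be passed to a clustering method or an SVM.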


P51
GENECARDS: HUMAN GENOME PARALOG HUNTING, SET DISTILLATION, FUNCTIONAL SCORING AND FAST SEARCHES
Arye Harel 1, Gil Stelzer 1, Irina Dalah 1, Naomi Rosen 1, Justin Alexander1, Michael Shmoish 1, Tsippi Iny Stein 1, Alexandra Sirota 1, Asaf Madi 1, Tsviya Olender 1, Aron Inger 1, Marilyn Safran 2, Doron Lancet 1.

Departments of 1 Molecular Genetics 2 Biological Services (Bioinformatics Unit), The Weizmann Institute of Science, Rehovot, Israel

GeneCards (www.genecards.org) is a comprehensive gene-centric compendium of annotative information about human genes, automatically mined from over 80 data sources. It is a powerful system, which facilitates genetic data retrieval, dissemination and sharing for research projects dealing with biological complexity. GeneALaCart (www.genecards.org/BatchQueries), a member of the GeneCards suite, allows retrieval of information about multiple genes in a batch query, which is particularly useful for analyzing expression microarray results. The recently developed GeneDecks tool highlights shared descriptors of genes. GeneDecks has two modes: 1) Paralog Hunter, which seeks functional paralogs for a query gene based on combinatorial similarity of attributes; 2) Set Distiller, which ranks attributes by their degree of being shared among members of a given gene set, allowing the discovery of collective biological patterns. The improved GeneCards Inferred Functionality Scores (GIFtS) algorithm produces a quantitative assessment of the degree of knowledge about the functionality of a given gene. Each GeneCards entry is associated with a GIFtS value, corresponding to the sum of binary scores, indicating presence or absence of data in each of 67 selected GeneCards fields. The new GeneCards version 3 alpha (www.genecards.org/v3) is powered by a relational database enabling complex queries; its new search engine allows the remarkably fast retrieval of ranked results.
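
The scoring idea described above, a sum of binary presence/absence indicators over a fixed list of fields, can be sketched in a few lines. The field names and the example entry are hypothetical, only three fields are shown instead of 67, and any normalization or weighting applied by the actual GIFtS algorithm is not represented.

```python
def presence_score(entry, fields):
    """Count how many of the selected fields hold data in this entry.

    entry  -- dict mapping field name to its content (empty or missing = no data)
    fields -- list of field names that contribute one binary score each
    """
    return sum(1 for field in fields if entry.get(field))

# Hypothetical field list and gene entry.
selected_fields = ["aliases", "pathways", "expression"]
example_entry = {"aliases": ["GENE1"], "pathways": [], "expression": "liver"}
example_score = presence_score(example_entry, selected_fields)
```

Here the empty "pathways" list counts as absent, so the entry scores 2 out of a possible 3.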


P52
Topology-Free Querying of Protein Interaction Networks
Sharon Bruckner 1, Falk Huffner 1, Richard M. Karp 2, Ron Shamir 1, Roded Sharan 1

1 The Blavatnik School of Computer Science, Tel Aviv University, 2 International Computer Science Institute, Berkeley, CA.

In the network querying problem, one is given a protein complex or pathway of species A and a protein-protein interaction network of species B; the goal is to identify subnetworks of B that are similar to the query. Existing approaches mostly depend on knowledge of the interaction topology of the query in the network of species A; however, in practice, this topology is often not known. To address this problem, we develop a topology-free querying algorithm, which we call Torque. Given a query, represented as a set of proteins, Torque seeks a matching set of proteins that are sequence-similar to the query proteins and span a connected region of the network, while allowing both insertions and deletions. The algorithm uses either dynamic programming or integer linear programming for the search task. We test Torque with queries from yeast, fly, and human, where we compare it to the QNet topology-based approach, and with queries from less studied species, where only topology-free algorithms apply. Torque detects many more matches than QNet, while in both cases giving results that are highly functionally coherent.


P53
An application of "divide-and-conquer" algorithm to analysis of whole genome tiling array data
Brodsky Leonid 1, BenTal Nir 2, BenJacob Eshel 3, and Nevo Eviatar 1

1 Institute of Evolution, University of Haifa. 2 Faculty of Life Science, Tel Aviv University. 3 School of Physics and Astronomy, Tel Aviv University.

The sequence-analysis-oriented "divide-and-conquer" algorithm was adapted into a sensitive and accurate tool for processing whole-genome tiling array data. The advantage of the algorithm over previous methods is that it can detect both short and long genome fragments enriched in upregulated signals, determining the margins of short and long fragments with equal accuracy. The score of an enriched genome fragment reflects the discrepancy between the actual concentration of upregulated signals in the fragment and the concentration expected under the null hypothesis of a uniform distribution of upregulated signals over the chromosome. The algorithm detects a series of non-intersecting fragments of different lengths with optimal scores. After the baseline of each detected fragment is recalculated, the procedure is applied to each of these fragments in a nested manner. As a result, a nested whole-genome landscape of activation is generated that is quite accurate in both its micro and macro details. The algorithm was applied to (i) Arabidopsis expression data; (ii) Arabidopsis histone H3 lysine 27 trimethylation ChIP-on-chip data; (iii) Arabidopsis DNA methylation data; (iv) ChIP-on-chip data on the binding sites of the chromatin remodeling factor ISW2 in the S. cerevisiae genome; and (v) spliced intron data in the S. cerevisiae genome. The results demonstrate the power of the algorithm to identify both short upregulated fragments (such as genes, exons, and transcription factor binding sites) and long, even moderately upregulated, zones of the genome at their precise genome margins. The algorithm generates whole-genome nested landscapes that could serve as a helpful tool for cross-comparison of different signals across the same genome. This may lead to the detection of important regulatory genome elements.


P54
MetaboStat: post ion-identification analysis of LC/GC-MS data
Brodsky Leonid 1,2, Rogachev Ilana 1, Venger Ilya 1, Malitsky Sergey 1 and Aharoni Asaph 1

1 Department of Plant Sciences, Weizmann Institute of Science, 2 Institute of Evolution, University of Haifa

Peak-picking/peak-alignment programs such as XCMS or MarkerLynx identify ion profiles, where each profile is a series of ion intensities across biological samples. After identifying the ions and following the behavior of their intensities across samples, the researcher has to interpret the results from several perspectives: quality control (QC), biostatistics, and structural identification of metabolites. Based on the profiles of ion intensities, our MetaboStat program helps with the following tasks:
- Detection of trustworthy ions: ion profiles that are robust under random perturbation of the parameters of the peak-picking/peak-alignment procedure.
- Detection of "metabolites". Each metabolite is a group of ions that (i) elute within the same narrow retention time interval; (ii) have highly correlated profiles; and (iii) form a group enriched with the isotope series.
- Estimation of the statistically significant effects of the applied biological factors and their interactions for each ion and each metabolite.

As input, MetaboStat takes multiple detections of ion profiles by any peak-picking/peak-alignment program (XCMS by default) under variations of the program's parameters. The algorithmic core of MetaboStat exploits the following ideas:
- Quantile normalization of ion intensities inside each group of biological replicates.
- Clustering of ion profiles in the PCA space. The nested clustering is generated by a recursive identification of local neighborhoods in the PCA space that are enriched in ion profiles. Under different initial limitations, this procedure is applied both to the identification of trustworthy ions and to the detection of metabolites as ion groups.
- ANOVA-equivalent multiple linear regression analysis of ion intensities against orthogonal contrasts, used for the detection of per-ion significant biological effects and their interactions.
- The significant effects for metabolites (groups of ions) are identified via Wald-statistics-based "voting" of the individual ion profiles of the metabolite either "for" or "against" the significance of every effect (contrast) of the ANOVA model.
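
The quantile normalization step mentioned above can be sketched generically as follows. This is the textbook procedure applied to one group of replicates, not MetaboStat's own code; the function name and the toy intensities are assumptions made for illustration.

```python
def quantile_normalize(replicates):
    """Quantile-normalize a group of replicates.

    replicates: list of equal-length lists, one list of ion intensities per
    replicate. Each value is replaced by the mean, across replicates, of the
    values sharing its rank. Ties are broken by position (stable sort).
    """
    n = len(replicates[0])
    sorted_values = [sorted(r) for r in replicates]
    # Mean of the k-th smallest intensity across all replicates.
    rank_means = [
        sum(values[k] for values in sorted_values) / len(replicates)
        for k in range(n)
    ]
    normalized = []
    for r in replicates:
        order = sorted(range(n), key=lambda i: r[i])  # ion indices by rank
        out = [0.0] * n
        for rank, ion_index in enumerate(order):
            out[ion_index] = rank_means[rank]
        normalized.append(out)
    return normalized

# Toy example: two replicates, three ions each (made-up intensities).
group = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(group)
print(result)
```

After normalization every replicate in the group has the same intensity distribution, while each ion keeps its rank within its replicate.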


P55
Structural Signature of Antibiotic Binding Sites on the Ribosome
Hilda David-Eden and Yael Mandel-Gutfreund

Faculty of Biology, The Technion - Israel Institute of Technology, Haifa, Israel

The ribosome represents a major target for antibacterial drugs. Being a complex molecular machine, it offers many potential sites for functional interference. The high-resolution structures of the ribosome in complex with various antibiotics provide an exclusive data set for understanding the unique features of drug binding pockets on the ribosome. In this work we analyzed the structural and evolutionary properties of 33 antibiotic binding sites in the ribosome identified by crystallography. We compared these sites to similar putative small-molecule-binding pockets present in the small and large ribosomal subunits. On the basis of this analysis we defined properties of the known drug binding sites which constitute an RNA signature of a 'druggable' pocket in the ribosome. The most noticeable property that defines true drug-binding sites was the prevalence of non-paired bases. In addition, in these sites we observed a strong bias towards the unusual syn conformations of the RNA bases and the C2' endo sugar pucker. We propose that despite the different geometric and chemical properties of diverse antibiotics, their binding sites tend to have common attributes, possibly reflecting the potency of the pocket for binding small organic molecules. The identified properties can be used to find new ribosomal sites which could be targeted by small molecular weight ligands, including new antibiotics.


P56
Gene Translation in Humans is Efficient
Yedael Y. Waldman* 1, Tamir Tuller* 1-3, Tomer Shlomi 1, 4, Roded Sharan 1, Eytan Ruppin 1, 3

1 Blavatnik School of Computer Science, 2 Department of Molecular Microbiology and Biotechnology and 3 School of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel. 4 Current address: Computer Science Department, Technion - Israel Institute of Technology, Haifa 32000, Israel. *These authors contributed equally to this work.

It is believed that in many unicellular organisms codon bias has evolved to optimize translation efficiency (TE). Previous studies, however, have left the question of TE in humans an intriguingly open one. We perform the first large-scale tissue-specific analysis of TE in human tissues, using the tRNA Adaptation Index (tAI) as a measure of a gene's TE, together with tissue-specific gene expression levels. We find that a gene's TE is correlated with its expression levels. Different tissues have fairly different overall correlation levels, with the heart, lung and liver having the largest scores. A (still small-scale) study indicates that adult tissues have higher correlation scores than fetal tissues, suggesting that the genomic tRNA pool is adapted to the adult period. Additionally, we find marked correlations between a gene's TE and its functional importance, assessed via its expression breadth across tissues, its evolutionary rate and its degree in the protein interaction network. Optimization-based analysis shows that the co-adaptation of the tRNA pool and codon bias is driven globally rather than tissue-specifically. Taken together, our results indicate that codon bias has an important role in humans, making gene translation efficient.
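
For readers unfamiliar with the tAI, a gene's index is commonly defined as the geometric mean of per-codon adaptiveness weights. The sketch below assumes that definition; the weights shown are made-up placeholders rather than values derived from human tRNA gene copy numbers, and the function name is hypothetical.

```python
import math

def tai(codons, weights):
    """Geometric mean of the adaptiveness weights of a gene's codons.

    Codons without a weight (e.g. stop codons) are skipped.
    """
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))

# Placeholder weights in (0, 1]; real weights depend on the tRNA pool.
weights = {"GCT": 1.0, "GCC": 0.25, "AAA": 0.5, "AAG": 1.0}
gene = ["GCT", "AAA", "GCC", "AAG"]

value = tai(gene, weights)
print(round(value, 4))
```

A gene built mostly from codons with high weights gets a tAI close to 1, which is the sense in which the index measures translation efficiency.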


P57
Metazoan operons accelerate transcription and recovery rates
Alon Zaslaver, L. Ryan Baugh and Paul W. Sternberg

Howard Hughes Medical Institute and California Institute of Technology, Division of Biology, 1200 E. California Blvd., Pasadena, California 91125

Existing theories efficiently explain why operons are advantageous in prokaryotes, but their emergence in metazoans is still an enigma. We present a combination of genomic meta-analysis, experiment and theory to explain how operons could be adaptive during metazoan evolution. Focusing first on nematodes, we show that operon genes, which typically consist of growth genes, are significantly up-regulated during recovery from multiple growth-arrested states, and that this expression pattern is anti-correlated with the expression pattern of non-operon genes. In addition, we find that transcriptional resources are initially limited during arrest recovery, and that recovering animals are extremely sensitive to any additional limitation in transcriptional resources. By clustering growth genes into operons, fewer promoters compete for limited transcriptional machinery, effectively increasing the concentration of transcriptional resources and accelerating growth during recovery. A simple mathematical model of transcription dynamics reveals how a moderate increase in transcriptional resources can lead to a substantial enhancement in transcription rate and recovery. We find evidence for this design principle in different nematodes as well as in the chordate C. intestinalis. As recovery from a growth-arrested state into a fast-growing state is a physiological feature shared by many metazoans, operons could have evolved as a solution that facilitates these processes.


P58
Fractured Genes: A Novel Genomic Arrangement Involving New Split-Inteins and Homing Endonuclease Family
B. Dassa 1, N. London 2, B. Stoddard 3, O. Schueler-Furman 2 and S. Pietrokovski 1

1 Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel. 2 Department of Molecular Genetics and Biotechnology, Hebrew University, Hadassah Medical School, Jerusalem, Israel. 3 Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.

Intein domains are inserted in-frame into protein-coding genes, and auto-catalyze their removal from the protein precursor via a protein-splicing reaction. Intein-containing genes can be split into two at the intein domain, with the split-intein parts acting in trans, creating the mature protein by ligating the intein flanks. Bioinformatic analyses of environmental metagenomic sequences revealed 26 different loci with a novel genomic arrangement. In each locus, the coding region of a conserved enzyme is broken in two by a split-intein, with a free-standing endonuclease gene inserted in between. Eight types of DNA synthesis and repair enzymes have this "fractured" organization. Some loci include apparent gene control elements brought in with the endonuclease gene. Sequence and structure features of these new natural split-intein types were analyzed. A newly predicted homing endonuclease family, related to very-short-patch repair endonucleases, was also identified. These putative homing endonuclease genes also appear in group I introns, and as stand-alone inserts. The new fractured gene organization appears to be present mainly in phages, shows how endonucleases can integrate into intein reading frames, and may represent a missing link in the evolution of gene breaking in general, and in the creation of split-inteins in particular.


P59
Deep transcriptome sequencing of Sulfolobus solfataricus at single-nucleotide resolution
Omri Wurtzel 1, Rajat Sapra 2,3, Blake A. Simmons 2,3, Rotem Sorek 1

1 Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel. 2 Sandia National Laboratories, Livermore, CA 94551. 3 Joint BioEnergy Institute, Emeryville, CA 94608.

RNA-level regulation, manifested in non-coding RNAs and cis-antisense transcripts, as well as RNA modifications and degradation, is found in all domains of life. While some knowledge has accumulated on RNA regulation in bacteria, little is known about the regulatory roles of RNAs in archaea. We have investigated the transcriptome of the cellulose-degrading extremophilic archaeon Sulfolobus solfataricus P2, which is important for the future development of biofuels, using a modified high-throughput RNA sequencing technique (RNA-Seq). The sequenced transcriptome revealed an abundance of non-coding RNAs and cis-antisense transcripts at a level never documented in any prokaryote, suggesting that antisense transcription has a fundamental regulatory role in Sulfolobus. Sequence motifs regulating transcription initiation and RNA degradation were also documented in a genome-wide manner. Our study highlights the power of RNA-Seq for unveiling RNA-level regulation in prokaryotes.


P60
Deriving Enzymatic Signatures from Short Read Data
Uri Weingart 1, Yair Lavi 1, Erez Persi 1, Uri Gophna 2, and David Horn 1

1 School of Physics and Astronomy, Tel Aviv University, Tel Aviv 69978, Israel. 2 Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv 69978, Israel

We propose a method for deriving enzymatic signatures from short read (SR) metagenomic data of unknown species. The SR data are converted to six pseudo-peptide candidates. We search for occurrences of Specific Peptides (SPs) on the latter. SPs are peptides that are indicative of enzymatic function as defined by the Enzyme Commission nomenclature. Counting their hits we define SP-bags associated with specific EC categories. The latter can be converted to estimates of numbers of enzymes associated with the given EC categories on the studied metagenome, thus defining its enzymatic spectrum. The method is developed and tested on 15 species (bacteria and archaea) for which there exist good proteomic annotations.
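
The first two steps described above (translating each short read into six pseudo-peptides and counting Specific Peptide hits) can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' pipeline: the function names are hypothetical, the standard genetic code is assumed, and the three-residue "SPs" in the example are invented and far shorter than realistic ones.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def six_frame(read):
    """Translate a read in three forward and three reverse-strand frames."""
    peptides = []
    for strand in (read, reverse_complement(read)):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons with ambiguous bases (e.g. N) become "X".
            peptides.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return peptides

def count_sp_hits(reads, specific_peptides):
    """Count, per SP, the number of (read, frame) pairs that contain it."""
    hits = {sp: 0 for sp in specific_peptides}
    for read in reads:
        for peptide in six_frame(read):
            for sp in specific_peptides:
                if sp in peptide:
                    hits[sp] += 1
    return hits

frames = six_frame("ATGGCTAAA")
hits = count_sp_hits(["ATGGCTAAA"], ["MAK", "FSH", "QQQ"])
print(frames, hits)
```

Grouping the resulting hit counts by the EC category of each SP would give the "SP-bags" mentioned in the abstract; that mapping is not shown because the SP-to-EC assignments are not given here.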


P61
Insertion hotspots and nepotism of DNA parasites: large-scale analysis of human nested transposed elements
Asaf Levy, Schraga Schwartz, Gil Ast

Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel-Aviv University

Throughout evolution, eukaryotic genomes have been invaded by mobile genetic elements known as transposable elements (TEs). We analyzed hundreds of thousands of nested TEs in the human genome. Nested TEs are an informative system, since the inserted and the targeted TE can be dated with regard to each other, and since the original DNA integration site can be inferred. Our analysis comprised three levels. We first performed a global assessment to determine the extent to which young TEs tend to nest within older transposed elements. Analysis of over 36,000 combinations of TE nesting events demonstrated a four-fold higher tendency of TEs to insert into existing TEs than to insert within non-TE intergenic regions. The major factors affecting TE nesting are the inserted and targeted TE types, their relative orientations and the integration site distribution. We next used a novel method to analyze the positions along the targeted TEs into which over 300,000 TEs were inserted, and discovered that most TEs insert at hotspots. In particular, retrotransposed Alu elements contain a single prominent non-canonical hotspot for insertion of other Alu sequences. Finally, we identified novel sequence motifs favored by the factors that regulate transposition of various important TE families. To conclude, transposed elements contributed to the genomic expansion of newer TEs. TE nesting events provide novel details of the molecular mechanisms underlying transposition.


P62
Inferential optimization for simultaneous fitting of multiple components into a cryoEM map of their assembly
Keren Lasker 1,2, Maya Topf 3, Andrej Sali 2, Haim J. Wolfson 1

1 Blavatnik School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University. 2 Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences (QB3), University of California at San Francisco. 3 Institute of Structural and Molecular Biology, School of Crystallography, Birkbeck College, University of London, Malet Street, London WC1E 7HX, UK.

Motivation: Models of macromolecular assemblies are essential for a mechanistic description of cellular processes. Such models are increasingly obtained by fitting atomic-resolution structures of components into a density map of the whole assembly. Yet, current density-fitting techniques are frequently insufficient for an unambiguous determination of the positions and orientations of all components. Examples include the 19S proteasome, the mammalian ribosome, and the ryanodine receptor isoform 1. Methods and Results: We have developed MultiFit [1], a method for simultaneously fitting atomic structures of components into their assembly density map at resolutions as low as 25 Å. The method was benchmarked on 7 large assemblies of known structure. It generally finds a near-native configuration among the 5 top-scoring solutions. The component positions and orientations are optimized with respect to a scoring function that includes the quality-of-fit of components in the map, the protrusion of components from the map envelope, as well as the shape complementarity between pairs of components. The scoring function is optimized by our exact inference optimizer DOMINO, which efficiently finds the global minimum in a discrete sampling space. The idea of the DOMINO optimizer is to decompose the set of variables into relatively uncoupled but potentially overlapping subsets that can be sampled independently from each other, followed by efficiently gathering the subset solutions into the global minimum. Implications: First, MultiFit can provide initial configurations for further refinement of many multi-component assembly structures described by electron microscopy. Second, the DOMINO optimizer can be applied to many problems in structural modeling, from low-resolution assembly modeling to sidechain refinement. Its strength derives from the junction tree algorithm, which helps reduce the size of the search space from exponential in the number of components in the whole system to exponential in the number of components in the largest subset. Third, MultiFit and DOMINO can facilitate integrative modeling of large macromolecular assemblies based in part on their density maps. [1] Lasker K, Topf M, Sali A, Wolfson HJ, J. Mol. Biol., in press.
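
To make the search-space reduction concrete, the sketch below compares brute-force enumeration with min-sum message passing on the simplest possible decomposition: a chain in which only neighbouring components interact, so the overlapping subsets are the adjacent pairs. This is not DOMINO itself; the scoring terms, names and numbers are invented for illustration.

```python
from itertools import product

def brute_force_min(unary, pair):
    """Enumerate all k**n joint assignments (exponential in n)."""
    n, k = len(unary), len(unary[0])
    best = None
    for states in product(range(k), repeat=n):
        score = sum(unary[i][states[i]] for i in range(n))
        score += sum(pair[states[i]][states[i + 1]] for i in range(n - 1))
        if best is None or score < best:
            best = score
    return best

def chain_min(unary, pair):
    """Min-sum message passing over subsets {x_i, x_i+1}: about n * k**2 steps."""
    k = len(unary[0])
    # msg[b]: best score of all components up to the current one,
    # given that the current component is placed in state b.
    msg = list(unary[0])
    for i in range(1, len(unary)):
        msg = [
            unary[i][b] + min(msg[a] + pair[a][b] for a in range(k))
            for b in range(k)
        ]
    return min(msg)

# Toy problem: 4 components, 3 candidate placements each (made-up scores).
unary = [[0, 2, 1], [3, 0, 2], [1, 1, 0], [2, 0, 3]]   # fit of each placement
pair = [[0, 1, 4], [1, 0, 1], [4, 1, 0]]               # neighbour interaction

print(brute_force_min(unary, pair), chain_min(unary, pair))
```

Both functions return the same global minimum, but the second never enumerates more than one pair of neighbouring components at a time, which is the kind of saving the junction tree algorithm generalizes to arbitrary overlapping subsets.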


P63
Evolutionary Modeling of Rate-Shifts Reveals Specificity Determinants in HIV-1 Subtypes
Osnat Penn 1, Adi Stern 1, Nimrod D. Rubinstein 1, Tal Pupko 1

1 Cell Research and Immunology Dept. Tel Aviv University.

A hallmark of the human immunodeficiency virus 1 (HIV-1) is its rapid rate of evolution. Two alternative hypotheses have been suggested to explain the sequence variability among HIV-1 subtypes. The first suggests that the functional constraints at each site remain the same across all subtypes, and the differences among subtypes are a direct reflection of random substitutions, which have occurred during the time elapsed since their divergence. The second suggests that the functional constraints themselves have evolved, and thus sequence differences among subtypes at some sites reflect shifts in function. To determine the contribution of each of these two alternatives to HIV-1 subtype evolution, we have developed a novel Bayesian method for testing and detecting site-specific rate shifts. The RAte Shift EstimatoR (RASER) method determines whether or not site-specific functional shifts characterize the evolution of a protein and, if so, points to the specific sites and lineages in which these shifts have most likely occurred. Applying RASER to a dataset composed of large samples of HIV-1 sequences from different group M subtypes, we reveal rampant evolutionary shifts throughout the HIV-1 proteome. Most of these rate shifts have occurred during the divergence of the major subtypes, establishing that subtype divergence occurred together with functional diversification. We report further evidence for the emergence of a new sub-subtype, characterized by abundant rate-shifting sites. When focusing on the rate-shifting sites detected, we find that many are associated with known functions related to the viral life cycle and drug resistance.


P64
Predicting the Affinity between Proteins and Small Molecules using a Random Forest Regression Model
Tammy Menasherov 1,2, Ron Unger 1, Yossef Kliger 2

1 The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Ramat-Gan, 52900, Israel. 2 Compugen LTD, Tel Aviv, 69512, Israel

Predicting the affinity between proteins and drug-like small molecules (SMs) has great potential in drug discovery. We present a method for predicting enzyme-SM binding affinity using a Random Forest Regression Model. Labeled data composed of all available enzyme-SM co-crystals was extracted from four publicly available interaction databases, and used to train a "Regressifier". The SMs were described using 27 traditional QSAR descriptors, including constitutional descriptors, functional group count descriptors, and molecular properties. Each enzyme is represented by a bit-vector containing local descriptors extracted from its co-crystals for training, or from its unbound structure for testing. Both the training and test sets contain enzyme-SM complexes from all 6 major enzyme classes. Parameter optimization resulted in Mtry=1 and number of trees >= 1000. The Pearson Correlation Coefficient between the predicted and the experimental affinities was 0.723. The correlation was higher for enzyme classes 1-4, where more than 50 data instances were available for each, but much lower for enzyme classes 5-6, where data was scarce. We conclude that predicting interactions across the entire biological and chemical space is an appealing field for further investigation.


P65
Predicting Synthetic Lethality in the Human Protein Interaction Network
Aron Inger, Tsviya Olender and Doron Lancet

Department of Molecular Genetics, The Weizmann Institute of Science

Synthetic lethality occurs when two otherwise non-lethal mutations together result in a non-viable cell. We analyzed 29 network properties within the yeast protein-protein interaction network (PPI) and searched for correlations with known lethality and synthetic lethality. A number of node-pair network properties correlated with synthetic lethality, e.g. number of shared interacting partners, information centrality, shortest path distance, and mean degree centrality. Neural networks were then built to predict the likelihood of lethality or synthetic lethality. The accuracy of the prediction was evaluated by comparison to known lethality data in yeast. High-scoring genes were ~70% lethal, compared with ~20% for all genes, and high-scoring gene pairs were ~50% synthetically lethal, compared with ~1% for all gene pairs. We further applied the models learned on yeast to the human PPI, and obtained a prediction for lethality and synthetic lethality in 8208 human genes. For validation, we compared the human lethality predictions to experimentally established mammalian lethality via human-mouse orthology. The study provides an inferred whole-genome human synthetic lethality map, hitherto not available, and demonstrates the viability of cross-species predictions. We are currently applying these predictions in a study of cancer therapy based on synthetic lethality. Supported by the SYNLET EU project.


P66
Converting promiscuous proteins into specific ones: design of calmodulin mutants with up to 900-fold enhancement in binding specificity towards CaMKII.
Eliyahu Yosef, Regina Politi, and Julia M. Shifman

Department of Biological Chemistry, Hebrew University of Jerusalem, Jerusalem, Israel 91904

Calmodulin (CaM) is a ubiquitous second messenger protein that regulates a variety of structurally and functionally diverse targets in response to changes in Ca2+ concentration. CaM-dependent protein kinase II (CaMKII) and calcineurin (CaN) are prominent CaM targets that play opposing roles in many cellular functions, including synaptic regulation. We used a computational protein design approach to modify CaM binding specificity for these two targets. Starting from the X-ray structure of CaM in complex with the CaM-binding domain of CaMKII, we optimized CaM interactions with CaMKII by introducing mutations into the CaM sequence. CaM optimization was performed with the protein design program ORBIT using a modified energy function that emphasized intermolecular interactions in the sequence selection procedure. Several CaM variants were experimentally constructed and tested for binding to the CaMKII and CaN peptides using the Surface Plasmon Resonance technique. Most of our CaM mutants demonstrated a small increase in affinity for the CaMKII peptide and a substantial decrease in affinity for the CaN peptide compared to WT CaM. Our best CaM design exhibited an approximately 900-fold increase in binding specificity towards the CaMKII peptide, which is the highest specificity switch achieved in any protein-protein interface through the computational protein design approach. Our results demonstrate that very specific binding partners can be designed through computational means even without explicit negative design. Optimization of the protein sequence for interactions with a specific target is often sufficient to prevent favorable interactions with unwanted targets.


P67
Computational design of protein-protein interactions: affinity enhancement at the fasciculin-AChE interface.
Oz Sharabi and Julia M. Shifman

Department of Biological Chemistry, Hebrew University of Jerusalem, Jerusalem, Israel 91904

Re-engineering of protein-protein interactions remains an interesting yet challenging task. Computational design methods have proved very successful in modulating protein binding specificity. However, the same methods are less powerful in predicting affinity-enhancing mutations at protein-protein interfaces. In particular, state-of-the-art design programs often fail to reproduce favorable hydrogen bond and electrostatic interactions across the binding interface. We developed improved computational methods that can model polar interactions between the binding partners more accurately. The performance of the improved methods was tested on the complex between the synaptic enzyme acetylcholinesterase (AChE) and its reversible inhibitor fasciculin (Fas), whose very high affinity for AChE (Kd of 10^-11 M) is largely defined by polar interactions between the two proteins. Using several slightly different computational protocols, we redesigned the Fas binding interface to enhance its affinity for AChE. A number of computationally designed Fas mutants were experimentally constructed and tested for their ability to bind to and inhibit the enzyme. While some of the Fas mutants showed unchanged or slightly worse binding to AChE when tested experimentally, our best Fas mutant exhibited a 1.2 kcal/mol improvement in the binding free energy. The experimental results were used to further refine the energy function for protein-protein interface design as well as to evaluate the range of Kds accessible through various binding interface mutations given the three-dimensional structure of the protein-protein complex.
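
As a rough guide to what a 1.2 kcal/mol improvement means in terms of Kd, the standard relation between binding free energy and dissociation constant can be applied. The temperature of 298 K is an assumption made here for illustration; the abstract does not state the measurement conditions.

```latex
\Delta\Delta G = \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{WT}}
             = RT \ln\frac{K_d^{\mathrm{mut}}}{K_d^{\mathrm{WT}}}
\quad\Longrightarrow\quad
\frac{K_d^{\mathrm{WT}}}{K_d^{\mathrm{mut}}}
  = \exp\!\left(\frac{-\Delta\Delta G}{RT}\right)
  = \exp\!\left(\frac{1.2\ \mathrm{kcal/mol}}{0.592\ \mathrm{kcal/mol}}\right)
  \approx 7.6
```

Here RT = 0.001987 kcal/(mol K) x 298 K, or about 0.592 kcal/mol, so under this assumption the best mutant would bind roughly 7 to 8 times more tightly than wild-type Fas.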


P68
The Development of the Immune Network from Birth to Adulthood
Asaf Madi 1,2, Sharron Bransburg-Zabary 1,2, Dror Y. Kenett 1, Alfred I. Tauber 3, Irun R. Cohen 4 and Eshel Ben-Jacob 1

1 School of Physics and Astronomy, Tel Aviv University, Israel. 2 Faculty of Medicine, Tel Aviv University, Israel. 3 Center for Philosophy and History of Science, Boston University, Boston, MA, USA. 4 Department of Immunology, Weizmann Institute of Science, Israel.

The concept of the immune network was first introduced by Niels Jerne in the early 1970s. He proposed that cells and molecules of the immune system not only interact with foreign substances, but also recognize, respond to and are regulated by each other. In general, the production of a given antibody elicits the production of other antibodies, which in turn elicit (or suppress) the production of other antibodies, and so on. The immune network theory describes the immune system as a complex network of molecules, cells and organs that act together to maintain and repair the body and protect it against foreign invaders. These tasks involve the sensing and processing of foreign antigens together with internal signals that disclose the state of the body, leading to immune responses, learning and memory. As the topological structure of the immune network is that of a complete graph, extracting meaningful information about the network is quite difficult. Here we address this problem by using the information embedded in the correlation matrix to extract a subgraph with an informative topological structure. We analyzed data on 305 IgM and IgG antibody reactivities of mothers and their offspring (umbilical cord) (Merbl et al., 2007; Madi et al., 2008). We propose the use of a Minimum Spanning Tree (MST) and Partial Correlation as a conceptual framework for studying the immune system as a network, and present preliminary results of such an analysis. By studying the network properties of the immune antibody repertoire, our analysis methods shed light on hidden characteristics of the human immune system: we succeeded in detecting immune hubs and antibodies' intra-connections, and in capturing physiological processes such as the cross-placental transfer of maternal IgG and the evolving nature of the personal IgM repertoire.
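
The step from a correlation matrix (a complete graph) to a Minimum Spanning Tree can be sketched as follows. The abstract does not say how correlations are converted to edge lengths; the distance d = sqrt(2(1 - rho)) used here is a common choice in correlation-network analysis and is an assumption, as are the function name and the toy matrix.

```python
import math

def mst_from_correlation(corr):
    """Prim's algorithm on distances d_ij = sqrt(2 * (1 - corr_ij)).

    Returns the MST as a list of (i, j) edges, grown from node 0.
    """
    n = len(corr)
    dist = [
        [math.sqrt(2.0 * (1.0 - corr[i][j])) for j in range(n)]
        for i in range(n)
    ]
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Shortest edge connecting the current tree to a node outside it.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

# Toy symmetric correlation matrix for 4 antibody reactivities (made up).
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]

tree = mst_from_correlation(corr)
print(tree)
```

The tree keeps only n - 1 of the n(n - 1)/2 possible links, and nodes with many tree edges are natural candidates for the "hubs" referred to in the abstract.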


P70
A variability map of the olfactory receptor subgenome
Yehudit Hasin, Tsviya Olender, Ifat Keidar, Dina Leshkovitz, Miriam Khen and Doron Lancet

Department of Molecular Biology, The Weizmann Institute of Science, Rehovot.

An important present challenge in human genomics is the large-scale identification of inter-individual variation, serving evolutionary and genetic association studies of complex phenotypes. The recent advent of next-generation sequencing allows the launch of highly ambitious sequencing projects, including the 1000 Genomes Project and the International Cancer Genome Consortium. Our laboratory aims at a comprehensive elucidation of the human genome variation picture relevant to olfactory receptors (ORs), the largest gene superfamily in the human genome. We applied next-generation sequencing to 96 intact OR coding regions in 20 individuals of different origins. We identified 623 SNPs, 256 of which were novel. Most important, 19 of the novel SNPs are deleterious events of up to 4 bp. The novel SNPs are characterized by lower minor allele frequency, and show a significantly greater number of non-synonymous SNPs than expected. An excess of non-synonymous SNPs was also observed in 36 individuals sequenced at 4X coverage as part of the 1000 Genomes Project. To assess the false positive rate of our SNP detection, we picked a sample of ~75 new SNPs and ~25 known SNPs to be validated by Sequenom SNP scoring. Our results, in combination with our previous data on high-resolution copy number variations, provide a comprehensive variability map of the olfactory sub-genome. Next, we intend to assess the phenotypes of the different types of genomic variation identified in these works.