Publications
2023
-
(2023) Cell Reports Physical Science. 4, 5, 101384. Abstract
The origin of life must have involved an unlikely transition from chaotic chemistry to self-reproducing supramolecular structures. Previous quantitative analyses of self-reproducing mutually catalytic networks made of simple molecules have led to increasing popularity of this pre-RNA scenario for life’s origin. Here, we investigate in detail the reproduction characteristic of the graded autocatalysis replication domain (GARD) computer-simulated physicochemically rigorous lipid-based model. This model displays compatibility with heterogeneous environments, addresses the network’s spatial demarcation, and portrays trans-generational compositional information transfer. However, we find that compositionally reproducing states are extremely rare, suggesting that random roaming would be a vastly inefficient path toward reproduction. Rewardingly, the present study shows that all self-reproducing states are also dynamic attractors of the catalytic network. This suggests a greatly enhanced propensity for the spontaneous emergence of reproduction and primal evolution, augmenting the likelihood of protolife appearance.
• Life’s origin may have involved self-reproducing supramolecular autocatalytic entities
• Simulated physicochemical model for lipid assemblies shows frequent self-reproduction
• Reproduction is observed only within very rare compositional states
• Self-reproducers prove to be dynamic attractors, improving the chance for life’s origin
Simulations of the dynamic behavior of spontaneously formed lipid assemblies can offer insight into the origins of life, but few assembly compositions self-reproduce, presumably necessary for life to begin. Kahana et al. show that some self-reproducing compositions are dynamic attractors, making self-reproduction, and hence life’s emergence, much more plausible.
2022
-
(2022) Life (Basel, Switzerland). 12, 7, 955. Abstract
Mixed lipid micelles were proposed to facilitate life through their documented growth dynamics and catalytic properties. Our previous research predicted that micellar self-reproduction involves catalyzed accretion of lipid molecules by the residing lipids, leading to compositional homeostasis. Here, we employ atomistic Molecular Dynamics simulations, beginning with 54 lipid monomers, tracking an entire course of micellar accretion. This was done to examine the self-assembly of variegated lipid clusters, allowing us to measure entry and exit rates of monomeric lipids into pre-micelles with different compositions and sizes. We observe considerable rate-modifications that depend on the assembly composition and scrutinize the underlying mechanisms as well as the energy contributions. Lastly, we describe the measured potential for compositional homeostasis in our simulated mixed micelles. This affirms the basis for micellar self-reproduction, with implications for the study of the origin of life.
-
(2022) Practical Guide to Life Science Databases. p. 27-56 Abstract
The GeneCards® database of human genes was launched in 1997 and has expanded since then to encompass gene-centric, disease-centric, and pathway-centric entities and relationships within the GeneCards Suite, effectively navigating the universe of human biological data—genes, proteins, cells, regulatory elements, biological pathways, and diseases—and the connections among them. The knowledgebase amalgamates information from >150 selected sources related to genes, proteins, ncRNAs, regulatory elements, chemical compounds, drugs, splice variants, SNPs, signaling molecules, differentiation protocols, biological pathways, stem cells, genetic tests, clinical trials, diseases, publications, and more and empowers the suite’s Next Generation Sequencing (NGS), gene set, shared descriptors, and batch query analysis tools.
2021
-
(2021) Chemical Society Reviews. 50, 21, p. 11741-11746 Abstract
A widespread dogma asserts that life could not have emerged without biopolymers – RNA and proteins. However, the widely acknowledged implausibility of a spontaneous appearance and proliferation of these complex molecules in primordial messy chemistry casts doubt on this scenario. A proposed alternative is “Lipid-First”, based on the evidence that lipid assemblies may spontaneously emerge in heterogeneous environments, and are shown to undergo growth and fission, and to portray autocatalytic self-copying. What seems undecided is whether lipid assemblies have protein-like capacities for stereospecific interactions, a sine qua non of life processes. This Viewpoint aims to alleviate such doubts, pointing to growing experimental evidence that lipid aggregates possess dynamic surface configurations capable of stereospecific molecular recognition. Such findings help support a possible key role of lipids in seeding life's origin.
-
(2021) Nature reviews. Chemistry. Abstract
Protocells at life’s origin are often conceived as bilayer-enclosed precursors of life, whose self-reproduction rests on the early advent of replicating catalytic biopolymers. This Perspective describes an alternative scenario, wherein reproducing nanoscopic lipid micelles with catalytic capabilities were forerunners of biopolymer-containing protocells. This postulate gains considerable support from experiments describing micellar catalysis and autocatalytic proliferation, and, more recently, from reports on cross-catalysis in mixed micelles that lead to life-like steady-state dynamics. Such results, along with evidence for micellar prebiotic compatibility, synergize with predictions of our chemically stringent computer-simulated model, illustrating how mutually catalytic lipid networks may enable micellar compositional reproduction that could underlie primal selection and evolution. Finally, we highlight studies on how endogenously catalysed lipid modifications could guide further protocellular complexification, including micelle to vesicle transition and monomer to biopolymer progression. These portrayals substantiate the possibility that protocellular evolution could have been seeded by pre-RNA lipid assemblies.
-
(2021) Journal of Molecular Biology. 166913. Abstract
Non-coding RNA (ncRNA) genes assume increasing biological importance, with growing associations with diseases. Many ncRNA sources are transcript-centric, but for non-coding variant analysis and disease decipherment it is essential to transform this information into a comprehensive set of genome-mapped ncRNA genes. We present GeneCaRNA, a new all-inclusive gene-centric ncRNA database within the GeneCards Suite. GeneCaRNA information is integrated from four community-backed data structures: the major transcript database RNAcentral with its 20 encompassed databases, and the ncRNA entries of three major gene resources HGNC, Ensembl and NCBI Gene. GeneCaRNA presents 219,587 ncRNA gene pages, a 7-fold increase from those available in our three gene mining sources. Each ncRNA gene has wide-ranging annotation, mined from >100 worldwide sources, providing a powerful GeneCards-leveraged search. The latter empowers VarElect, our disease-gene interpretation tool, allowing one to systematically decipher ncRNA variants. The combined power of GeneCaRNA with GeneHancer, our regulatory elements database, facilitates wide-ranging scrutiny of the non-coding terra incognita of gene networks and whole genome analyses.
2020
-
(2020) Cell Reports. 33, 9, 108456. Abstract
Amyotrophic lateral sclerosis (ALS) is an incurable neurodegenerative disease. CAV1 and CAV2 organize membrane lipid rafts (MLRs) important for cell signaling and neuronal survival, and overexpression of CAV1 ameliorates ALS phenotypes in vivo. Genome-wide association studies localize a large proportion of ALS risk variants within the non-coding genome, but further characterization has been limited by lack of appropriate tools. By designing and applying a pipeline to identify pathogenic genetic variation within enhancer elements responsible for regulating gene expression, we identify disease-associated variation within CAV1/CAV2 enhancers, which replicate in an independent cohort. Discovered enhancer mutations reduce CAV1/CAV2 expression and disrupt MLRs in patient-derived cells, and CRISPR-Cas9 perturbation proximate to a patient mutation is sufficient to reduce CAV1/CAV2 expression in neurons. Additional enrichment of ALS-associated mutations within CAV1 exons positions CAV1 as an ALS risk gene. We propose CAV1/CAV2 overexpression as a personalized medicine target for ALS.
-
(2020) PLoS Genetics. 16, 11, 1009163. Abstract
Circulating inflammatory markers are essential to human health and disease, and they are often dysregulated or malfunctioning in cancers as well as in cardiovascular, metabolic, immunologic and neuropsychiatric disorders. However, the genetic contribution to the physiological variation of levels of circulating inflammatory markers is largely unknown. Here we report the results of a genome-wide genetic study of blood concentration of ten cytokines, including the hitherto unexplored calcium-binding protein (S100B). The study leverages a unique sample of neonatal blood spots from 9,459 Danish subjects from the iPSYCH initiative. We estimate the SNP-heritability of marker levels as ranging from essentially zero for Erythropoietin (EPO) up to 73% for S100B. We identify and replicate 16 associated genomic regions (p
-
(2020) Nature Reviews Urology. 17, 6, p. 351-361 Abstract
Prostate Cancer Diagnosis and Treatment Enhancement Through the Power of Big Data in Europe (PIONEER) is a European network of excellence for big data in prostate cancer, consisting of 32 private and public stakeholders from 9 countries across Europe. Launched by the Innovative Medicines Initiative 2 and part of the Big Data for Better Outcomes Programme (BD4BO), the overarching goal of PIONEER is to provide high-quality evidence on prostate cancer management by unlocking the potential of big data. The project has identified critical evidence gaps in prostate cancer care, via a detailed prioritization exercise including all key stakeholders. By standardizing and integrating existing high-quality and multidisciplinary data sources from patients with prostate cancer across different stages of the disease, the resulting big data will be assembled into a single innovative data platform for research. Based on a unique set of methodologies, PIONEER aims to advance the field of prostate cancer care with a particular focus on improving prostate-cancer-related outcomes, health system efficiency by streamlining patient management, and the quality of health and social care delivered to all men with prostate cancer and their families worldwide.
-
(2020) BMC Evolutionary Biology. 20, 1, 42. Abstract
Background - Olfactory receptors (ORs) are G protein-coupled receptors with a crucial role in odor detection. A typical mammalian genome harbors ~1000 OR genes and pseudogenes; however, different gene duplication/deletion events have occurred in each species, resulting in complex orthology relationships. While the human OR nomenclature is widely accepted and based on phylogenetic classification into 18 families and further into subfamilies, for other mammals different and multiple nomenclature systems are currently in use, thus concealing important evolutionary and functional insights. Results - Here, we describe the Mutual Maximum Similarity (MMS) algorithm, a systematic classifier for assigning a human-centric nomenclature to any OR gene based on inter-species hierarchical pairwise similarities. MMS was applied to the OR repertoires of seven mammals and zebrafish. Altogether, we assigned symbols to 10,249 ORs. This nomenclature is supported by both phylogenetic and synteny analyses. The availability of a unified nomenclature provides a framework for diverse studies, where textual symbol comparison allows immediate identification of potential ortholog groups as well as species-specific expansions/deletions; for example, Or52e5 and Or52e5b represent a rat-specific duplication of OR52E5. Another example is the complete absence of OR subfamily OR6Z among primate OR symbols. In other mammals, OR6Z members are located in one genomic cluster, suggesting a large deletion in the great ape lineage. An additional 14 mammalian OR subfamilies are missing from the primate genomes. While in chimpanzee 87% of the symbols were identical to human symbols, this number decreased to ~50% in dog and cow and to ~30% in rodents, reflecting the adaptive changes of the OR gene superfamily across diverse ecological niches.
Application of the proposed nomenclature to zebrafish revealed similarity to mammalian ORs that could not be detected from the current zebrafish olfactory receptor gene nomenclature. Conclusions - We have consolidated a unified standard nomenclature system for the vertebrate OR superfamily. The new nomenclature system will be applied to cow, horse, dog and chimpanzee by the Vertebrate Gene Nomenclature Committee and its implementation is currently under consideration by other relevant species-specific nomenclature committees.
2019
-
(2019) BMC Medical Genomics. 12, 1, 200. Abstract
Background: The clinical genetics revolution ushers in great opportunities, accompanied by significant challenges. The fundamental mission in clinical genetics is to analyze genomes, and to identify the most relevant genetic variations underlying a patient's phenotypes and symptoms. The adoption of Whole Genome Sequencing requires novel capacities for interpretation of non-coding variants. Results: We present TGex, the Translational Genomics expert, a novel genome variation analysis and interpretation platform, with remarkable exome analysis capacities and a pioneering approach of non-coding variants interpretation. TGex's main strength is combining state-of-the-art variant filtering with knowledge-driven analysis made possible by VarElect, our highly effective gene-phenotype interpretation tool. VarElect leverages the widely used GeneCards knowledgebase, which integrates information from >150 automatically-mined data sources. Access to such a comprehensive data compendium also facilitates TGex's broad variant annotation, supporting evidence exploration, and decision making. TGex has an interactive, user-friendly, and easy adaptive interface, ACMG compliance, and an automated reporting system. Beyond comprehensive whole exome sequence capabilities, TGex encompasses innovative non-coding variants interpretation, towards the goal of maximal exploitation of whole genome sequence analyses in the clinical genetics practice. This is enabled by GeneCards' recently developed GeneHancer, a novel integrative and fully annotated database of human enhancers and promoters. Examining use-cases from a variety of TGex users world-wide, we demonstrate its high diagnostic yields (42% for single exome and 50% for trios in 1500 rare genetic disease cases) and critical actionable genetic findings.
The platform's support for integration with EHR and LIMS through dedicated APIs facilitates automated retrieval of patient data for TGex's customizable reporting engine, establishing a rapid and cost-effective workflow for an entire range of clinical genetic testing, including rare disorders, cancer predisposition, tumor biopsies and health screening. Conclusions: TGex is an innovative tool for the annotation, analysis and prioritization of coding and non-coding genomic variants. It provides access to an extensive knowledgebase of genomic annotations, with intuitive and flexible configuration options, allows quick adaptation, and addresses various workflow requirements. It thus simplifies and accelerates variant interpretation in clinical genetics workflows, with remarkable diagnostic yield, as exemplified in the described use cases.
-
(2019) Life. 9, 4, 77. Abstract
"The Lipid World" was published in 2001, stemming from a highly effective collaboration with David Deamer during a sabbatical year 20 years ago at the Weizmann Institute of Science in Israel. The present review paper highlights the benefits of this scientific interaction and assesses the impact of the lipid world paper on the present understanding of the possible roles of amphiphiles and their assemblies in the origin of life. The lipid world is defined as a putative stage in the progression towards life's origin, during which diverse amphiphiles or other spontaneously aggregating small molecules could have concurrently played multiple key roles, including compartment formation, the appearance of mutually catalytic networks, molecular information processing, and the rise of collective self-reproduction and compositional inheritance. This review brings back into a broader perspective some key points originally made in the lipid world paper, stressing the distinction between the widely accepted role of lipids in forming compartments and their expanded capacities as delineated above. In the light of recent advancements, we discussed the topical relevance of the lipid worldview as an alternative to broadly accepted scenarios, and the need for further experimental and computer-based validation of the feasibility and implications of the individual attributes of this point of view. Finally, we point to possible avenues for exploring transition paths from small molecule-based noncovalent structures to more complex biopolymer-containing proto-cellular systems.
-
(2019) Astrobiology. 10, p. 1263-1278 Abstract
A recent breakthrough publication has reported complex organic molecules in the plumes emanating from the subglacial water ocean of Saturn's moon Enceladus (Postberg et al., 2018, Nature 558:564-568). Based on detailed chemical scrutiny, the authors invoke primordial or endogenously synthesized carbon-rich monomers (
-
(2019) Nature. 571, 7763, p. 107-111 Abstract
Large-scale genome sequencing is poised to provide a substantial increase in the rate of discovery of disease-associated mutations, but the functional interpretation of such mutations remains challenging. Here we show that deletions of a sequence on human chromosome 16 that we term the intestine-critical region (ICR) cause intractable congenital diarrhoea in infants(1,2). Reporter assays in transgenic mice show that the ICR contains a regulatory sequence that activates transcription during the development of the gastrointestinal system. Targeted deletion of the ICR in mice caused symptoms that recapitulated the human condition. Transcriptome analysis revealed that an unannotated open reading frame (Percc1) flanks the regulatory sequence, and the expression of this gene was lost in the developing gut of mice that lacked the ICR. Percc1-knockout mice displayed phenotypes similar to those observed upon ICR deletion in mice and patients, whereas an ICR-driven Percc1 transgene was sufficient to rescue the phenotypes found in mice that lacked the ICR. Together, our results identify a gene that is critical for intestinal function and underscore the need for targeted in vivo studies to interpret the growing number of clinical genetic findings that do not affect known protein-coding genes.
-
(2019) Life. 9, 2, 38. Abstract
Systems chemistry has been a key component of origin of life research, invoking models of life's inception based on evolving molecular networks. One such model is the graded autocatalysis replication domain (GARD) formalism embodied in a lipid world scenario, which offers rigorous computer simulation based on defined chemical kinetics equations. GARD suggests that the first pre-RNA life-like entities could have been homeostatically-growing assemblies of amphiphiles, undergoing compositional replication and mutations, as well as rudimentary selection and evolution. Recent progress in molecular dynamics has provided an experimental tool to study complex biological phenomena such as protein folding, ligand-receptor interactions, and micellar formation, growth, and fission. The detailed molecular definition of GARD and its inter-molecular catalytic interactions make it highly compatible with molecular dynamics analyses. We present a roadmap for simulating GARD's kinetic and thermodynamic behavior using various molecular dynamics methodologies. We review different approaches for testing the validity of the GARD model by following micellar accretion and fission events and examining compositional changes over time. Near-future computational advances could provide empirical delineation for further system complexification, from simple compositional non-covalent assemblies towards more life-like protocellular entities with covalent chemistry that underlies metabolism and genetic encoding.
2018
-
(2018) Journal of the Royal Society Interface. 15, 144, 20180159. Abstract
Life is that which replicates and evolves, but there is no consensus on how life emerged. We advocate a systems protobiology view, whereby the first replicators were assemblies of spontaneously accreting, heterogeneous and mostly non-canonical amphiphiles. This view is substantiated by rigorous chemical kinetics simulations of the graded autocatalysis replication domain (GARD) model, based on the notion that the replication or reproduction of compositional information predated that of sequence information. GARD reveals the emergence of privileged non-equilibrium assemblies (composomes), which portray catalysis-based homeostatic (concentration-preserving) growth. Such a process, along with occasional assembly fission, embodies cell-like reproduction. GARD pre-RNA evolution is evidenced in the selection of different composomes within a sparse fitness landscape, in response to environmental chemical changes. These observations refute claims that GARD assemblies (or other mutually catalytic networks in the metabolism first scenario) cannot evolve. Composomes represent both a genotype and a selectable phenotype, anteceding present-day biology in which the two are mostly separated. Detailed GARD analyses show attractor-like transitions from random assemblies to self-organized composomes, with negative entropy change, thus establishing composomes as dissipative systems, hallmarks of life. We show a preliminary new version of our model, metabolic GARD (M-GARD), in which lipid covalent modifications are orchestrated by non-enzymatic lipid catalysts, themselves compositionally reproduced. M-GARD fills the gap of the lack of true metabolism in basic GARD, and is rewardingly supported by a published experimental instance of a lipid-based mutually catalytic network. Anticipating near-future far-reaching progress of molecular dynamics, M-GARD is slated to quantitatively depict elaborate protocells, with orchestrated reproduction of both lipid bilayer and lumenal content.
Finally, a GARD analysis in a whole-planet context offers the potential for estimating the probability of life's emergence. The invigorated GARD scrutiny presented in this review enhances the validity of autocatalytic sets as a bona fide early evolution scenario and provides essential infrastructure for a paradigm shift towards a systems protobiology view of life's origin.
-
(2018) Astrobiology. 18, 4, p. 419-430 Abstract
We studied the simulated replication and growth of prebiotic vesicles composed of 140 phospholipids and cholesterol using our R-GARD (Real Graded Autocatalysis Replication Domain) formalism that utilizes currently extant lipids that have known rate constants of lipid-vesicle interactions from published experimental data. R-GARD normally modifies kinetic parameters of lipid-vesicle interactions based on vesicle composition and properties. Our original R-GARD model tracked the growth and division of one vesicle at a time in an environment with unlimited lipids at a constant concentration. We explore here a modified model where vesicles compete for a finite supply of lipids. We observed that vesicles exhibit complex behavior including initial fast unrestricted growth, followed by intervesicle competition for diminishing resources, then a second growth burst driven by better-adapted vesicles, and ending with a final steady state. Furthermore, in simulations without kinetic parameter modifications (invariant kinetics), the initial replication was an order of magnitude slower, and vesicles' composition variability at the final steady state was much lower. The complex kinetic behavior was not observed either in the previously published R-GARD simulations or in additional simulations presented here with only one lipid component. This demonstrates that both a finite environment (inducing selection) and multiple components (providing variation for selection to act upon) are crucial for portraying evolution-like behavior. Such properties can improve survival in a changing environment by increasing the ability of early protocellular entities to respond to rapid environmental fluctuations likely present during abiogenesis both on Earth and possibly on other planets. This in silico simulation predicts that a relatively simple in vitro chemical system containing only lipid molecules might exhibit properties that are relevant to prebiotic processes.
-
(2018) European Journal of Paediatric Neurology. 22, 1, p. 93-101 Abstract
Background: AIFM1 encodes a mitochondrial flavoprotein with a dual role (NADH oxidoreductase and regulator of apoptosis), which uses riboflavin as a cofactor. Mutations in the X-linked AIFM1 were reported in relation to two main phenotypes: a severe infantile mitochondrial encephalomyopathy and an early-onset axonal sensorimotor neuropathy with hearing loss. In this paper we report two unrelated males harboring AIFM1 mutations (one of which is novel) who display distinct phenotypes including progressive ataxia which partially improved with riboflavin treatment. Methods: For both patients trio whole exome sequencing was performed. Validation and segregation were performed with Sanger sequencing. Following the diagnosis, patients were treated with up to 200 mg riboflavin/day for 12 months. Ataxia was assessed by the ICARS scale at baseline, and 6 and 12 months following treatment. Results: Patient 1 presented at the age of 5 years with auditory neuropathy, followed by progressive ataxia, vermian atrophy and axonal neuropathy. Patient 2 presented at the age of 4.5 years with severe limb and palatal myoclonus, followed by ataxia, cerebellar atrophy, ophthalmoplegia, sensorineural hearing loss, hyporeflexia and cardiomyopathy. Two deleterious missense mutations were found in the AIFM1 gene: the p.Met340Thr mutation located in the FAD dependent oxidoreductase domain and the novel p.Thr141Ile mutation located in a highly conserved DNA binding motif. Ataxia score decreased by 39% in patient 1 and 20% in patient 2 following 12 months of treatment. Conclusion: AIFM1 mutations cause childhood cerebellar ataxia, which may be partially treatable in some patients with high dose riboflavin.
2017
-
(2017) European Journal of Human Genetics. 25, 12, p. 1377-1387 Abstract
We performed whole exome or genome sequencing in eight multiply affected families with ostensibly isolated congenital anosmia. Hypothesis-free analyses based on the assumption of fully penetrant recessive/dominant/X-linked models obtained no strong single candidate variant in any of these families. In total, these eight families showed 548 rare segregating variants that were predicted to be damaging, in 510 genes. Three Kallmann syndrome genes (FGFR1, SEMA3A, and CHD7) were identified. We performed permutation-based analysis to test for overall enrichment of these 510 genes carrying these 548 variants with genes mutated in Kallmann syndrome and with a control set of genes mutated in hypogonadotrophic hypogonadism without anosmia. The variants were found to be enriched for Kallmann syndrome genes (3 observed vs. 0.398 expected, p = 0.007), but not for the second set of genes. Among these three variants, two have been already reported in genes related to syndromic anosmia (FGFR1 (p.(R250W)), CHD7 (p.(L2806V))) and one was novel (SEMA3A (p.(T717I))). To replicate these findings, we performed targeted sequencing of 16 genes involved in Kallmann syndrome and hypogonadotrophic hypogonadism in 29 additional families, mostly singletons. This yielded an additional 6 variants in 5 Kallmann syndrome genes (PROKR2, SEMA3A, CHD7, PROK2, ANOS1), two of them already reported to cause Kallmann syndrome. In all, our study suggests involvement of 6 syndromic Kallmann genes in isolated anosmia. Further, we report a yet unreported appearance of di-genic inheritance in a family with congenital isolated anosmia. These results are consistent with a complex molecular basis of congenital anosmia.
-
(2017) BioMedical Engineering Online. 16, 72. Abstract
Background: A key challenge in the realm of human disease research is next generation sequencing (NGS) interpretation, whereby identified filtered variant-harboring genes are associated with a patient's disease phenotypes. This necessitates bioinformatics tools linked to comprehensive knowledgebases. The GeneCards suite databases, which include GeneCards (human genes), MalaCards (human diseases) and PathCards (human pathways) together with additional tools, are presented with the focus on MalaCards utility for NGS interpretation as well as for large scale bioinformatic analyses. Results: VarElect, our NGS interpretation tool, leverages the broad information in the GeneCards suite databases. MalaCards algorithms unify disease-related terms and annotations from 69 sources. Further, MalaCards defines hierarchical relatedness: aliases, disease families, a related diseases network, categories and ontological classifications. GeneCards and MalaCards delineate and share a multi-tiered, scored gene-disease network, with stringency levels, including the definition of elite status: high quality gene-disease pairs, coming from manually curated trustworthy sources, that includes 4500 genes for 8000 diseases. This unique resource is key to NGS interpretation by VarElect. VarElect, a comprehensive search tool that helps infer both direct and indirect links between genes and user-supplied disease/phenotype terms, is robustly strengthened by the information found in MalaCards. The indirect mode benefits from GeneCards' diverse gene-to-gene relationships, including SuperPaths, integrated biological pathways from 12 information sources. We are currently adding an important information layer in the form of "disease SuperPaths", generated from the gene-disease matrix by an algorithm similar to that previously employed for biological pathway unification. This allows the discovery of novel gene-disease and disease-disease relationships. The advent of whole genome sequencing necessitates
-
(2017) Database-The Journal Of Biological Databases And Curation. 2017, bax028. Abstract
A major challenge in understanding gene regulation is the unequivocal identification of enhancer elements and uncovering their connections to genes. We present GeneHancer, a novel database of human enhancers and their inferred target genes, in the framework of GeneCards. First, we integrated a total of 434 000 reported enhancers from four different genome-wide databases: the Encyclopedia of DNA Elements (ENCODE), the Ensembl regulatory build, the functional annotation of the mammalian genome (FANTOM) project and the VISTA Enhancer Browser. Employing an integration algorithm that aims to remove redundancy, GeneHancer portrays 285 000 integrated candidate enhancers (covering 12.4% of the genome), 94 000 of which are derived from more than one source, and each assigned an annotation-derived confidence score. GeneHancer subsequently links enhancers to genes, using: tissue co-expression correlation between genes and enhancer RNAs, as well as enhancer-targeted transcription factor genes; expression quantitative trait loci for variants within enhancers; and capture Hi-C, a promoter-specific genome conformation assay. The individual scores based on each of these four methods, along with gene-enhancer genomic distances, form the basis for GeneHancer's combinatorial likelihood-based scores for enhancer-gene pairing. Finally, we define 'elite' enhancer-gene relations reflecting both a high-likelihood enhancer definition and a strong enhancer-gene association. GeneHancer predictions are fully integrated in the widely used GeneCards Suite, whereby candidate enhancers and their annotations are displayed on every relevant GeneCard. This assists in the mapping of non-coding variants to enhancers, and via the linked genes, forms a basis for variant-phenotype interpretation of whole-genome sequences in health and disease.
-
(2017) Nucleic Acids Research. 45, D1, p. D877-D887 Abstract
The MalaCards human disease database (http://www.malacards.org/) is an integrated compendium of annotated diseases mined from 68 data sources. MalaCards has a web card for each of ~20 000 disease entries, in six global categories. It portrays a broad array of annotation topics in 15 sections, including Summaries, Symptoms, Anatomical Context, Drugs, Genetic Tests, Variations and Publications. The Aliases and Classifications section reflects an algorithm for disease name integration across often-conflicting sources, providing effective annotation consolidation. A central feature is a balanced Genes section, with scores reflecting the strength of disease-gene associations. This is accompanied by other gene-related disease information such as pathways, mouse phenotypes and GO-terms, stemming from MalaCards' affiliation with the GeneCards Suite of databases. MalaCards' capacity to inter-link information from complementary sources, along with its elaborate search function, relational database infrastructure and convenient data dumps, allows it to tackle its rich disease annotation landscape, and facilitates systems analyses and genome sequence interpretation. MalaCards adopts a 'flat' disease-card approach, but each card is mapped to popular hierarchical ontologies (e.g. International Classification of Diseases, Human Phenotype Ontology and Unified Medical Language System) and also contains information about multi-level relations among diseases, thereby providing an optimal tool for disease representation and scrutiny.
2016
-
(2016) Database-The Journal Of Biological Databases And Curation. baw132. Abstract
We present here an exploration of the evolution of three well-established, web-based resources dedicated to the dissemination of information related to olfactory receptors (ORs) and their functional ligands, odorants. These resources are: the Olfactory Receptor Database (ORDB), the Human Olfactory Data Explorer (HORDE) and ODORactor. ORDB is a repository of genomic and proteomic information related to ORs and other chemosensory receptors, such as taste and pheromone receptors. Three companion databases closely integrated with ORDB are OdorDB, ORModelDB and OdorMapDB; these resources are part of the SenseLab suite of databases (http://senselab.med.yale.edu). HORDE (http://genome.weizmann.ac.il/horde/) is a semi-automatically populated database of the OR repertoires of human and several mammals. ODORactor (http://mdl.shsmu.edu.cn/ODORactor/) provides information related to OR-odorant interactions from the perspective of the odorant. All three resources are connected to each other via web-links.
-
(2016) Clinical Genetics. 90, 3, p. 211-219 Abstract
2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd. Congenital general anosmia (CGA) is a neurological disorder entailing a complete innate inability to sense odors. While the mechanisms underlying vertebrate olfaction have been studied in detail, there are still gaps in our understanding of the molecular genetic basis of innate olfactory disorders. Applying whole-exome sequencing to a family multiply affected with CGA, we identified three members with a rare X-linked missense mutation in the TENM1 (teneurin 1) gene (ENST00000422452:c.C4829T). In Drosophila melanogaster, TENM1 functions in synaptic-partner-matching between axons of olfactory sensory neurons and target projection neurons and is involved in synapse organization in the olfactory system. We used the CRISPR-Cas9 system to generate a Tenm1-disrupted mouse model. Tenm1(-/-) and point-mutated Tenm1(A/A) adult mice were shown to have an altered ability to locate a buried food pellet. Tenm1(A/A) mice also displayed an altered ability to sense aversive odors. The results of our study, which describes a new Tenm1 mouse, agree with the hypothesis that TENM1 has a role in olfaction. However, additional studies should be done in larger CGA cohorts, to provide statistical evidence that loss-of-function mutations in TENM1 can solely cause the disease in our and other CGA cases.
-
(2016) PLoS Genetics. 12, 5, e1006008. Abstract
Pemphigus vulgaris (PV) is a life-threatening autoimmune mucocutaneous blistering disease caused by disruption of intercellular adhesion due to auto-antibodies directed against epithelial components. Treatment is limited to immunosuppressive agents, which are associated with serious adverse effects. The propensity to develop the disease is in part genetically determined. We therefore reasoned that the delineation of PV genetic basis may point to novel therapeutic strategies. Using a genome-wide association approach, we recently found that genetic variants in the vicinity of the ST18 gene confer a significant risk for the disease. Here, using targeted deep sequencing, we identified a PV-associated variant residing within the ST18 promoter region (p
-
(2016) OMICS A Journal of Integrative Biology. 20, 3, p. 139-151 Abstract
Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs LifeMap Sciences' GeneCards suite, including GeneCards®, the human gene database; MalaCards, the human diseases database; and PathCards, the biological pathways database. Expression-based analysis in GeneAnalytics relies on LifeMap Discovery®, the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling an advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomic
-
(2016) BMC Genomics. 17, 1, p. 619 Abstract
BACKGROUND: Olfaction is a versatile sensory mechanism for detecting thousands of volatile odorants. Although the molecular basis of odorant signaling is relatively well understood, considerable gaps remain in the complete charting of all relevant gene products. To address this challenge, we applied RNAseq to four well-characterized human olfactory epithelial samples and compared the results to novel and published mouse olfactory epithelium as well as 16 human control tissues. RESULTS: We identified 194 non-olfactory receptor (OR) genes that are overexpressed in human olfactory tissues vs. controls. The highest overexpression is seen for lipocalins and bactericidal/permeability-increasing (BPI)-fold proteins, which in other species include secreted odorant carriers. Mouse-human discordance in orthologous lipocalin expression suggests different mammalian evolutionary paths in this family. Of the overexpressed genes, 36 have documented olfactory function, while for 158 there is little or no previous such functional evidence. The latter group includes GPCRs, neuropeptides, solute carriers, transcription factors and biotransformation enzymes. Many of them may be indirectly implicated in sensory function, and ~70% are also overexpressed in mouse olfactory epithelium, corroborating their olfactory role. Nearly 90% of the intact OR repertoire, and ~60% of the OR pseudogenes are expressed in the olfactory epithelium, with the latter showing a 3-fold lower expression. OR transcription levels show a 1000-fold inter-paralog variation, as well as significant inter-individual differences. We assembled 160 transcripts representing 100 intact OR genes. These include 1-4 short 5' non-coding exons with considerable alternative splicing and long last exons that contain the coding region and 3' untranslated region of highly variable length. Notably, we identified 10 ORs with an intact open reading frame but with seemingly non-functional transcripts, suggesting a yet unreported OR pseudoge
-
(2016) Database : the journal of biological databases and curation. 2016, 27048349. Abstract
The Author(s) 2016. Published by Oxford University Press. GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from 120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite's next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein-RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information. The results con
-
(2016) Current protocols in bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.]. 54, p. 1.30.1-1.30.33 27322403. Abstract
Copyright 2016 John Wiley & Sons, Inc. GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses.
-
(2016) BMC Genomics. 17, 444. Abstract
Background: Next generation sequencing (NGS) provides a key technology for deciphering the genetic underpinnings of human diseases. Typical NGS analyses of a patient depict tens of thousands of non-reference coding variants, but only one or very few are expected to be significant for the relevant disorder. In a filtering stage, one employs family segregation, rarity in the population, predicted protein impact and evolutionary conservation as a means for shortening the variation list. However, narrowing down further towards culprit disease genes usually entails laborious seeking of gene-phenotype relationships, consulting numerous separate databases. Thus, a major challenge is to transition from the few hundred shortlisted genes to the most viable disease-causing candidates. Results: We describe a novel tool, VarElect (http://ve.genecards.org), a comprehensive phenotype-dependent variant/gene prioritizer, based on the widely-used GeneCards, which helps rapidly identify causal mutations with extensive evidence. The GeneCards suite offers an effective and speedy alternative, whereby > 120 gene-centric automatically-mined data sources are jointly available for the task. VarElect capitalizes on this wealth of information, as well as on GeneCards' powerful free-text Boolean search and scoring capabilities, proficiently matching variant-containing genes to submitted disease/symptom keywords. The tool also leverages the rich disease and pathway information of MalaCards, the human disease database, and PathCards, the unified pathway (SuperPaths) database, both within the GeneCards Suite. The VarElect algorithm infers direct as well as indirect links between genes and phenotypes, the latter benefitting from GeneCards' diverse gene-to-gene data links in GenesLikeMe. Finally, our tool offers an extensive gene-phenotype evidence portrayal ("MiniCards") and hyperlinks to the parent databases. Conclusions: We demonstrate that VarElect compares favorably with several often-used NGS phen
-
(2016) European Journal of Paediatric Neurology. 20, 1, p. 69-79 Abstract
Background: TECPR2 was first described as a disease-causing gene when the c.3416delT frameshift mutation was found in five Jewish Bukharian patients with similar features. It was suggested to constitute a new subtype of complex hereditary spastic paraparesis (SPG49). Results: We report here 3 additional patients from unrelated non-Bukharian families, harboring two novel mutations (c.1319delT, c.C566T) in this gene. Accumulating clinical data clarify that, in addition to intellectual disability and evolving spasticity, the main disabling feature of this unique disorder is autonomic-sensory neuropathy accompanied by chronic respiratory disease and paroxysmal autonomic events. Conclusion: We suggest that the disease should therefore be classified as a new subtype of hereditary sensory-autonomic neuropathy. The discovery of additional mutations in non-Bukharian patients implies that this disease might be more common than previously appreciated and should therefore be considered in undiagnosed cases of intellectual disability with autonomic features and respiratory symptoms regardless of demographic origin. (C) 2015 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
2015
2014
-
(2014) Journal of Theoretical Biology. 357, p. 26-34 Abstract
Present life portrays a two-tier phenomenology: molecules compose supramolecular structures, such as cells or organisms, which in turn portray population behaviors, including selection, evolution and ecological dynamics. Prebiotic models have often focused on evolution in populations of self-replicating molecules, without explicitly invoking the intermediate molecular-to-supramolecular transition. Here, we explore a prebiotic model that allows one to relate parameters of chemical interaction networks within molecular assemblies to emergent population dynamics. We use the graded autocatalysis replication domain (GARD) model, which simulates the network dynamics within amphiphile-containing molecular assemblies, and exhibits quasi-stationary compositional states termed compotype species. These grow by catalyzed accretion, divide and propagate their compositional information to progeny in a replication-like manner. The model allows us to ask how molecular network parameters influence assembly evolution and population dynamics parameters. In 1000 computer simulations, each embodying a different parameter set of the global chemical interaction network, we observed a wide range of behaviors. These were analyzed by a multi-species logistic model often used for analyzing population ecology (r-K or Lotka-Volterra competition model). We found that compotypes with a larger intrinsic molecular repertoire show a higher intrinsic growth (r) and lower carrying capacity (K), as well as lower replication fidelity. This supports a prebiotic scenario initiated by fast-replicating assemblies with a high molecular diversity, evolving into more faithful replicators with narrower molecular repertoires. (C) 2014 Elsevier Ltd. All rights reserved.
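For orientation, the multi-species logistic (Lotka-Volterra competition) model named in this abstract is conventionally written as below. The symbols N_i (population size of species i) and alpha_ij (competition coefficient of species j on species i) are standard textbook notation assumed here; the abstract itself only names r and K, and the exact parameterization used in the paper may differ.

```latex
\frac{dN_i}{dt} = r_i N_i \left( 1 - \frac{\sum_{j} \alpha_{ij} N_j}{K_i} \right), \qquad \alpha_{ii} = 1
```

With a single species this reduces to ordinary logistic growth, dN/dt = rN(1 - N/K), where r is the intrinsic growth rate and K the carrying capacity.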
-
(2014) Journal of Proteome Research. 13, 1, p. 107-113 Abstract
The Model Organism Protein Expression Database (MOPED, http://moped.proteinspire.org) is an expanding proteomics resource to enable biological and biomedical discoveries. MOPED aggregates simple, standardized and consistently processed summaries of protein expression and metadata from proteomics (mass spectrometry) experiments from human and model organisms (mouse, worm, and yeast). The latest version of MOPED adds new estimates of protein abundance and concentration as well as relative (differential) expression data. MOPED provides a new updated query interface that allows users to explore information by organism, tissue, localization, condition, experiment, or keyword. MOPED supports the Human Proteome Project's efforts to generate chromosome- and disease-specific proteomes by providing links from proteins to chromosome and disease information as well as many complementary resources. MOPED supports a new omics metadata checklist to harmonize data integration, analysis, and use. MOPED's development is driven by the user community, which spans 90 countries and guides future development that will transform MOPED into a multiomics resource. MOPED encourages users to submit data in a simple format: they can use the metadata checklist to generate a data publication for this submission. As a result, MOPED will provide even greater insights into complex biological processes and systems, and enable deeper and more comprehensive biological and biomedical discoveries.
2013
-
(2013) Neuron. 80, 2, p. 429-441 Abstract
We analyzed four families that presented with a similar condition characterized by congenital microcephaly, intellectual disability, progressive cerebral atrophy, and intractable seizures. We show that recessive mutations in the ASNS gene are responsible for this syndrome. Two of the identified missense mutations dramatically reduce ASNS protein abundance, suggesting that the mutations cause loss of function. Hypomorphic Asns mutant mice have structural brain abnormalities, including enlarged ventricles and reduced cortical thickness, and show deficits in learning and memory mimicking aspects of the patient phenotype. ASNS encodes asparagine synthetase, which catalyzes the synthesis of asparagine from glutamine and aspartate. The neurological impairment resulting from ASNS deficiency may be explained by asparagine depletion in the brain or by accumulation of aspartate/glutamate leading to enhanced excitability and neuronal damage. Our study thus indicates that asparagine synthesis is essential for the development and function of the brain but not for that of other organs.
-
(2013) Autophagy. 9, 5, p. 801-802 Abstract
Autophagy dysfunction has been implicated in a group of progressive neurodegenerative diseases, and has been reported to play a major role in the pathogenesis of these disorders. We have recently reported a recessive mutation in TECPR2, an autophagy-implicated WD repeat-containing protein, in five individuals with a novel form of monogenic hereditary spastic paraparesis (HSP). We found that diseased skin fibroblasts had a decreased accumulation of the autophagy-initiation protein MAP1LC3B/LC3B, and an attenuated delivery of both LC3B and the cargo-recruiting protein SQSTM1/p62 to the lysosome where they are subject to degradation. The discovered TECPR2 mutation reveals for the first time a role for aberrant autophagy in a major class of Mendelian neurodegenerative diseases, and suggests mechanisms by which impaired autophagy may impinge on a broader scope of neurodegeneration.
-
(2013) Israel Journal of Chemistry. 53, 3-4, p. 185-198 Abstract
A network of biological databases is reviewed, supplying a framework for studies of human genes and the association of their genomic variations with human phenotypes. The network is composed of GeneCards, the human gene compendium, which provides comprehensive information on all known and predicted human genes, along with its suite members GeneDecks and GeneLoc. Two databases are shown that address genes and variations focusing on olfactory reception (HORDE) and transduction (GOSdb). In the realm of disease scrutiny, we portray MalaCards, a novel comprehensive database of human diseases and their annotations. Also shown is GeneKid, a tool aimed at generating novel kidney disease biomarkers using systems biology, as well as Xome, a database for whole-exome next-generation DNA sequences for human diseases in the Israeli population. Finally, we show LifeMap Discovery, a database of embryonic development, stem cell research and regenerative medicine, which links to both GeneCards and MalaCards.
-
(2013) Database-The Journal Of Biological Databases And Curation. Abstract
Comprehensive disease classification, integration and annotation are crucial for biomedical discovery. At present, disease compilation is incomplete, heterogeneous and often lacking systematic inquiry mechanisms. We introduce MalaCards, an integrated database of human maladies and their annotations, modeled on the architecture and strategy of the GeneCards database of human genes. MalaCards mines and merges 44 data sources to generate a computerized card for each of 16 919 human diseases. Each MalaCard contains disease-specific prioritized annotations, as well as inter-disease connections, empowered by the GeneCards relational database, its searches and GeneDecks set analyses. First, we generate a disease list from 15 ranked sources, using disease-name unification heuristics. Next, we use four schemes to populate MalaCards sections: (i) directly interrogating disease resources, to establish integrated disease names, synonyms, summaries, drugs/therapeutics, clinical features, genetic tests and anatomical context; (ii) searching GeneCards for related publications, and for associated genes with corresponding relevance scores; (iii) analyzing disease-associated gene sets in GeneDecks to yield affiliated pathways, phenotypes, compounds and GO terms, sorted by a composite relevance score and presented with GeneCards links; and (iv) searching within MalaCards itself, e.g. for additional related diseases and anatomical context. The latter forms the basis for the construction of a disease network, based on shared MalaCards annotations, embodying associations based on etiology, clinical features and clinical conditions. This broadly disposed network has a power-law degree distribution, suggesting that this might be an inherent property of such networks. Work in progress includes hierarchical malady classification, ontological mapping and disease set analyses, striving to make MalaCards an even more effective tool for biomedical research.
-
(2013) Human Mutation. 34, 1, p. 32-41 Abstract
Genetic variations in olfactory receptors likely contribute to the diversity of odorant-specific sensitivity phenotypes. Our working hypothesis is that genetic variations in auxiliary olfactory genes, including those mediating transduction and sensory neuronal development, may constitute the genetic basis for general olfactory sensitivity (GOS) and congenital general anosmia (CGA). We thus performed a systematic exploration for auxiliary olfactory genes and their documented variation. This included a literature survey, seeking relevant functional in vitro studies, mouse gene knockouts and human disorders with olfactory phenotypes, as well as data mining in published transcriptome and proteome data for genes expressed in olfactory tissues. In addition, we performed next-generation transcriptome sequencing (RNA-seq) of human olfactory epithelium and mouse olfactory epithelium and bulb, so as to identify sensory-enriched transcripts. Employing a global score system based on attributes of the 11 data sources utilized, we identified a list of 1,680 candidate auxiliary olfactory genes, of which 450 are shortlisted as having higher probability of a functional role. For the top-scoring 136 genes, we identified genomic variants (probably damaging single nucleotide polymorphisms, indels, and copy number deletions) gleaned from public variation repositories. This database of genes and their variants should assist in rationalizing the great interindividual variation in human overall olfactory sensitivity (http://genome.weizmann.ac.il/GOSdb). Hum Mutat 34:32-41, 2013. (C) 2012 Wiley Periodicals, Inc.
-
(2013) Bioinformatics. 29, 2, p. 255-261 Abstract
Motivation: Non-coding RNA (ncRNA) genes are increasingly acknowledged for their importance in the human genome. However, there is no comprehensive non-redundant database for all such human genes. Results: We leveraged the effective platform of GeneCards, the human gene compendium, together with the power of fRNAdb and additional primary sources, to judiciously unify all ncRNA gene entries obtainable from 15 different primary sources. Overlapping entries were clustered to unified locations based on an algorithm employing genomic coordinates. This allowed GeneCards' gamut of relevant entries to rise ~5-fold, resulting in ~80 000 human non-redundant ncRNAs, belonging to 14 classes. Such 'grand unification' within a regularly updated data structure will assist future ncRNA research.
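The coordinate-based clustering mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it is a generic interval-merging approach, and the function name `cluster_by_coordinates`, the tuple layout and the rule "same chromosome and strand, any overlap" are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_by_coordinates(entries):
    """Group entries whose genomic intervals overlap.

    Hypothetical illustration only: `entries` is an iterable of
    (entry_id, chrom, strand, start, end) tuples with inclusive
    coordinates. Entries are merged when they lie on the same
    chromosome and strand and their intervals overlap.
    """
    by_locus = defaultdict(list)
    for entry_id, chrom, strand, start, end in entries:
        by_locus[(chrom, strand)].append((start, end, entry_id))

    clusters = []
    for (chrom, strand), items in sorted(by_locus.items()):
        items.sort()  # order by start coordinate
        current = None
        for start, end, entry_id in items:
            if current is not None and start <= current["end"]:
                # Overlaps the open cluster: extend it and record the entry.
                current["end"] = max(current["end"], end)
                current["ids"].append(entry_id)
            else:
                # No overlap: open a new cluster.
                current = {"chrom": chrom, "strand": strand,
                           "start": start, "end": end, "ids": [entry_id]}
                clusters.append(current)
    return clusters


# Made-up example entries, as if drawn from different primary sources.
example = [
    ("srcA:1", "chr1", "+", 100, 200),
    ("srcB:7", "chr1", "+", 150, 300),
    ("srcA:2", "chr1", "+", 400, 500),
    ("srcC:3", "chr2", "-", 100, 200),
]
clusters = cluster_by_coordinates(example)
```

In this example the first two entries overlap and collapse into one unified location, while the other two remain separate. A production pipeline would presumably also need a minimum-overlap threshold and class-aware rules, which are not sketched here.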
-
(2013) Computational and Mathematical Methods in Medicine. Abstract
We propose an automaton, a theoretical framework that demonstrates how to improve the yield of the synthesis of branched chemical polymer reactions. This is achieved by separating substeps of the path of synthesis into compartments. We use chemical containers (chemtainers) to carry the substances through a sequence of fixed successive compartments. We describe the automaton in mathematical terms and show how it can be configured automatically in order to synthesize a given branched polymer target. The algorithm we present finds an optimal path of synthesis in linear time. We discuss how the automaton models compartmentalized structures found in cells, such as the endoplasmic reticulum and the Golgi apparatus, and we show how this compartmentalization can be exploited for the synthesis of branched polymers such as oligosaccharides. Lastly, we show examples of artificial branched polymers and discuss how the automaton can be configured to synthesize them with maximal yield.
2012
-
(2012) American Journal of Human Genetics. 91, 6, p. 1065-1072 Abstract
We studied five individuals from three Jewish Bukharian families affected by an apparently autosomal-recessive form of hereditary spastic paraparesis accompanied by severe intellectual disability, fluctuating central hypoventilation, gastroesophageal reflux disease, wake apnea, areflexia, and unique dysmorphic features. Exome sequencing identified one homozygous variant shared among all affected individuals and absent in controls: a 1 bp frameshift TECPR2 deletion leading to a premature stop codon and predicting significant degradation of the protein. TECPR2 has been reported as a positive regulator of autophagy. We thus examined the autophagy-related fate of two key autophagic proteins, SQSTM1 (p62) and MAP1LC3B (LC3), in skin fibroblasts of an affected individual, as compared to a healthy control, and found that both protein levels were decreased and that there was a more pronounced decrease in the lipidated form of LC3 (LC3II). siRNA knockdown of TECPR2 showed similar changes, consistent with aberrant autophagy. Our results are strengthened by the fact that autophagy dysfunction has been implicated in a number of other neurodegenerative diseases. The discovered TECPR2 mutation implicates autophagy, a central intracellular mechanism, in spastic paraparesis.
-
(2012) Origins of Life and Evolution of Biospheres. 42, 5, p. 469-473 Abstract
In this paper we explore the question of whether there is an optimal set up for a putative prebiotic system leading to open-ended evolution (OEE) of the events unfolding within this system. We do so by proposing two key innovations. First, we introduce a new index that measures OEE as a function of the likelihood of events unfolding within a universe given its initial conditions. Next, we apply this index to a variant of the graded autocatalysis replication domain (GARD) model (Segre et al., P Natl Acad Sci USA 97(8):4112-4117, 2000; Markovitch and Lancet, Artif Life 18(3), 2012), and use it to study, under a unified and concise prebiotic evolutionary framework, both a variety of initial conditions of the universe and the OEE of species that evolve from them.
-
(2012) Chemical Senses. 37, 7, p. 581-584 Abstract
Considerable evidence supports the idea that odorant recognition depends on specific sequence variations in olfactory receptor (OR) proteins. Much of this emerges from in vitro screens in heterologous expression systems. However, the ultimate proof should arise from measurements of odorant thresholds in human individuals harboring different OR genetic variants, a research vein that has so far been only scantly explored. The study of McRae et al., published in this issue of Chemical Senses, shows how the recognition of a grassy odorant depends on specific OR interindividual sequence changes. It provides a clear relevant example for the impact of genetics on olfaction and is an excellent portrayal of the power of human genomics to decipher olfactory perception.
-
(2012) BMC Genomics. 13, Abstract
Background: Information on nucleotide diversity along completely sequenced human genomes has increased tremendously over the last few years. This makes it possible to reassess the diversity status of distinct receptor proteins in different human individuals. To this end, we focused on the complete inventory of human olfactory receptor coding regions as a model for personal receptor repertoires. Results: By performing data-mining from public and private sources we scored genetic variations in 413 intact OR loci, for which one or more individuals had an intact open reading frame. Using 1000 Genomes Project haplotypes, we identified a total of 4069 full-length polypeptide variants encoded by these OR loci, an average of ~10 per locus, constituting a lower limit for the effective human OR repertoire. Each individual is found to harbor as many as 600 OR allelic variants, ~50% higher than the locus count. Because OR neuronal expression is allelically excluded, this has a direct effect on smell perception diversity of the species. We further identified 244 OR segregating pseudogenes (SPGs), loci showing both intact and pseudogene forms in the population, twenty-six of which are annotatively "resurrected" from a pseudogene status in the reference genome. Using a custom SNP microarray we validated 150 SPGs in a cohort of 468 individuals, with every individual genome averaging 36 disrupted sequence variations, 15 in homozygote form. Finally, we generated a multi-source compendium of 63 OR loci harboring deletion Copy Number Variations (CNVs). Our combined data suggest that 271 of the 413 intact OR loci (66%) are affected by nonfunctional SNPs/indels and/or CNVs. Conclusions: These results portray a case of unusually high genetic diversity, and suggest that individual humans have a highly personalized inventory of functional olfactory receptors, a conclusion that might apply to other receptor multigene families.
-
(2012) International Journal of Neuropsychopharmacology. 15, 4, p. 459-469 Abstract
It is well accepted that schizophrenia has a strong genetic component. Several genome-wide association studies (GWASs) of schizophrenia have been published in recent years; most of them population based with a case-control design. Nevertheless, identifying the specific genetic variants which contribute to susceptibility to the disorder remains a challenging task. A family-based GWAS strategy may be helpful in the identification of schizophrenia susceptibility genes since it is protected against population stratification, enables better accounting for genotyping errors and is more sensitive for identification of rare variants which have a very low frequency in the general population. In this project we implemented a family-based GWAS of schizophrenia in a sample of 107 Jewish-Israeli families. We found one genome-wide significant association in the intron of the DOCK4 gene (rs2074127, p value = 1.134 × 10⁻⁷) and six additional nominally significant association signals with p
-
(2012) Journal of Bacteriology. 194, 8, p. 2127-2128 Abstract
Paenibacillus dendritiformis is a Gram-positive, soil-dwelling, spore-forming social microorganism. An intriguing collective faculty of this strain is manifested by its ability to switch between different morphotypes, such as the branching (T) and the chiral (C) morphotypes. Here we report the 6.3-Mb draft genome sequence of the P. dendritiformis C454 chiral morphotype.
-
(2012) Nucleic Acids Research. 40, D1, p. D1093-D1099 Abstract
Large numbers of mass spectrometry proteomics studies are being conducted to understand all types of biological processes. The size and complexity of proteomics data hinder efforts to easily share, integrate, query and compare the studies. The Model Organism Protein Expression Database (MOPED, http://moped.proteinspire.org) is a new and expanding proteomics resource that enables rapid browsing of protein expression information from publicly available studies on humans and model organisms. MOPED is designed to simplify the comparison and sharing of proteomics data for the greater research community. MOPED uniquely provides protein level expression data, meta-analysis capabilities and quantitative data from standardized analysis. Data can be queried for specific proteins, browsed based on organism, tissue, localization and condition and sorted by false discovery rate and expression. MOPED empowers users to visualize their own expression data and compare it with existing studies. Further, MOPED links to various protein and pathway databases, including GeneCards, Entrez, UniProt, KEGG and Reactome. The current version of MOPED contains over 43 000 proteins with at least one spectral match and more than 11 million high certainty spectra.
-
(2012) PLoS ONE. 7, 1, Abstract
Many reports in different populations have demonstrated linkage of the 10q24-q26 region to schizophrenia, thus encouraging further analysis of this locus for detection of specific schizophrenia genes. Our group previously reported linkage of the 10q24-q26 region to schizophrenia in a unique, homogeneous sample of Arab-Israeli families with multiple schizophrenia-affected individuals, under a dominant model of inheritance. To further explore this candidate region and identify specific susceptibility variants within it, we performed re-analysis of the 10q24-q26 genotype data, taken from our previous genome-wide association study (GWAS) (Alkelai et al., 2011). We analyzed 2089 SNPs in an extended sample of 57 Arab-Israeli families (189 genotyped individuals), under the dominant model of inheritance, which best fits this locus according to previously performed MOD score analysis. We found significant association with schizophrenia of the TCF7L2 gene intronic SNP, rs12573128 (p = 7.01 × 10⁻⁶), and of the nearby intergenic SNP, rs1033772 (p = 6.59 × 10⁻⁶), which is positioned between TCF7L2 and HABP2. TCF7L2 is one of the best confirmed susceptibility genes for type 2 diabetes (T2D) among different ethnic groups, has a role in pancreatic beta cell function and may contribute to the comorbidity of schizophrenia and T2D. These preliminary results independently support previous findings regarding a possible role of TCF7L2 in susceptibility to schizophrenia, and strengthen the importance of integrating linkage analysis models of inheritance while performing association analyses in regions of interest. Further validation studies in additional populations are required.
-
(2012) Artificial Life. 18, 3, p. 243-266 Abstract
It is widely accepted that autocatalysis constitutes a crucial facet of effective replication and evolution (e.g., in Eigen's hypercycle model). Other models for early evolution (e.g., by Dyson, Ganti, Varela, and Kauffman) invoke catalytic networks, where cross-catalysis is more apparent. A key question is how the balance between auto- (self-) and cross- (mutual) catalysis shapes the behavior of model evolving systems. This is investigated using the graded autocatalysis replication domain (GARD) model, previously shown to capture essential features of reproduction, mutation, and evolution in compositional molecular assemblies. We have performed numerical simulations of an ensemble of GARD networks, each with a different set of lognormally distributed catalytic values. We asked what is the influence of the catalytic content of such networks on beneficial evolution. Importantly, a clear trend was observed, wherein only networks with high mutual catalysis propensity (p(mc)) allowed for an augmented diversity of composomes, quasi-stationary compositions that exhibit high replication fidelity. We have reexamined a recent analysis that showed meager selection in a single GARD instance and for a few nonstationary target compositions. In contrast, when we focused here on compotypes (clusters of composomes) as targets for selection in populations of compositional assemblies, appreciable selection response was observed for a large portion of the networks simulated. Further, stronger selection response was seen for high p(mc) values. Our simulations thus demonstrate that GARD can help analyze important facets of evolving systems, and indicate that excess mutual catalysis over self-catalysis is likely to be important for the emergence of molecular systems capable of evolution-like behavior.
2011
-
(2011) Physical Biology. 8, 6, Abstract
We present a new embodiment of the graded autocatalysis replication domain (GARD) for the growth, replication and evolution of lipid vesicles based on a semi-empirical foundation using experimentally measured kinetic values of selected extant lipid species. Extensive simulations using this formalism elucidated the details of the dependence of the replication and properties of the vesicles on the physicochemical properties and concentrations of the lipids, both in the environment and in the vesicle. As expected, the overall concentration and number of amphiphilic components strongly affect average replication time. Furthermore, variations in acyl chain length and unsaturation of vesicles also influence replication rate, as do the relative concentrations of individual lipid types. Understanding of the dependence of replication rates on physicochemical parameters opens a new direction in the study of prebiotic vesicles and lays the groundwork for future studies involving the competition between lipid vesicles for available amphiphilic monomers.
-
(2011) FASEB Journal. 25, 11, p. 4011-4023 Abstract
While the use of population-based samples is a common strategy in genome-wide association studies (GWASs), family-based samples have considerable advantages, such as robustness against population stratification and false-positive associations, better quality control, and the possibility to check for both linkage and association. In a genome-wide linkage study of schizophrenia in Arab-Israeli families with multiple affected individuals, we previously reported significant evidence for a susceptibility locus at chromosome 6q23.2-q24.1 and suggestive evidence at chromosomes 10q22.3-26.3, 2q36.1-37.3 and 7p21.1-22.3. To identify schizophrenia susceptibility genes, we applied a family-based GWAS strategy in an enlarged, ethnically homogeneous, Arab-Israeli family sample. We performed genome-wide single nucleotide polymorphism (SNP) genotyping and single SNP transmission disequilibrium test association analysis and found genome-wide significant association (best value of P = 1.22 × 10⁻¹¹) for 8 SNPs within or near highly reasonable functional candidate genes for schizophrenia. Of particular interest are a group of SNPs within and flanking the transcriptional factor LRRFIP1 gene. To determine replicability of the significant associations beyond the Arab-Israeli population, we studied the association of the significant SNPs in a German case-control validation sample and found replication of associations near the UGT1 subfamily and EFHD1 genes. Applying an exploratory homozygosity mapping approach as a complementary strategy to identify schizophrenia susceptibility genes in our Arab-Israeli sample, we identified 8 putative disease loci. Overall, this GWAS, which emphasizes the important contribution of family-based studies, identifies promising candidate genes for schizophrenia.-Alkelai, A., Lupoli, S., Greenbaum, L., Giegling, I., Kohn, Y., Sarner-Kanyas, K., Ben-Asher, E., Lancet, D., Rujescu, D., Macciardi, F., Lerer, B. Identification of new schizophrenia susceptibility
-
(2011) Proteomics Clinical Applications. 5, 5-6, p. 354-366 Abstract
Purpose: For diseases with complex phenotype such as diabetic nephropathy (DN), integration of multiple Omics sources promises an improved description of the disease pathophysiology, being the basis for novel diagnostics and therapy, but equally important personalization aspects. Experimental design: Molecular features on DN were retrieved from public domain Omics studies and by mining scientific literature, patent text and clinical trial specifications. Molecular feature sets were consolidated on a human protein interaction network and interpreted on the level of molecular pathways in the light of the pathophysiology of the disease and its clinical context defined as associated biomarkers and drug targets. Results: About 1000 gene symbols each could be assigned to the pathophysiological description of DN and to the clinical context. Direct feature comparison showed minor overlap, whereas on the level of molecular pathways, the complement and coagulation cascade, PPAR signaling, and the renin-angiotensin system linked the disease descriptor space with biomarkers and targets. Conclusion and clinical relevance: Only the combined molecular feature landscapes closely reflect the clinical implications of DN in the context of hypertension and diabetes. Omics data integration on the level of interaction networks furthermore provides a platform for identification of pathway-specific biomarkers and therapy options.
-
(2011) Bioinformatics For Omics Data. p. 71-96 (Methods in Molecular Biology). Abstract
Technological Omics breakthroughs, including next generation sequencing, bring avalanches of data which need to undergo effective data management to ensure integrity, security, and maximal knowledge-gleaning. Data management system requirements include flexible input formats, diverse data entry mechanisms and views, user friendliness, attention to standards, hardware and software platform definition, as well as robustness. Relevant solutions elaborated by the scientific community include Laboratory Information Management Systems (LIMS) and standardization protocols facilitating data sharing and managing. In project planning, special consideration has to be made when choosing relevant Omics annotation sources, since many of them overlap and require sophisticated integration heuristics. The data modeling step defines and categorizes the data into objects (e.g., genes, articles, disorders) and creates an application flow. A data storage/warehouse mechanism must be selected, such as file-based systems and relational databases, the latter typically used for larger projects. Omics project life cycle considerations must include the definition and deployment of new versions, incorporating either full or partial updates. Finally, quality assurance (QA) procedures must validate data and feature integrity, as well as system performance expectations. We illustrate these data management principles with examples from the life cycle of the GeneCards Omics project (http://www.genecards.org), a comprehensive, widely used compendium of annotative information about human genes. For example, the GeneCards infrastructure has recently been changed from text files to a relational database, enabling better organization and views of the growing data. Omics data handling benefits from the wealth of Web-based information, the vast amount of public domain software, increasingly affordable hardware, and effective use of data management and annotation principles as outlined in this chapter.
2010
-
(2010) BMC Genomics. 11, Abstract
Background: The pattern-forming bacterium Paenibacillus vortex is notable for its advanced social behavior, which is reflected in development of colonies with highly intricate architectures. Prior to this study, only two other Paenibacillus species (Paenibacillus sp. JDR-2 and Paenibacillus larvae) had been sequenced. However, no genomic data is available on the Paenibacillus species with pattern-forming and complex social motility. Here we report the de novo genome sequence of this Gram-positive, soil-dwelling, sporulating bacterium. Results: The complete P. vortex genome was sequenced by a hybrid approach using 454 Life Sciences and Illumina, achieving a total of 289x coverage, with 99.8% sequence identity between the two methods. The sequencing results were validated using a custom designed Agilent microarray expression chip which represented the coding and the non-coding regions. Analysis of the P. vortex genome revealed 6,437 open reading frames (ORFs) and 73 non-coding RNA genes. Comparative genomic analysis with 500 complete bacterial genomes revealed an exceptionally high number of two-component system (TCS) genes, transcription factors (TFs), transport and defense related genes. Additionally, we have identified genes involved in the production of antimicrobial compounds and extracellular degrading enzymes. Conclusions: These findings suggest that P. vortex has advanced faculties to perceive and react to a wide range of signaling molecules and environmental conditions, which could be associated with its ability to reconfigure and replicate complex colony architectures. Additionally, P. vortex is likely to serve as a rich source of genes important for agricultural, medical and industrial applications and it has the potential to advance the study of social microbiology within Gram-positive bacteria.
-
(2010) PLoS Computational Biology. 6, 11, e1000988. Abstract
Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95-99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ~15% and ~20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies in
-
(2010) Molecular BioSystems. 7, 1, p. 200-214 Abstract
Chemotherapy of cancer experiences a number of shortcomings including development of drug resistance. This fact also holds true for neuroblastoma utilizing chemotherapeutics such as vincristine. We performed a comparative analysis of molecular and cellular mechanisms associated with vincristine resistance utilizing cell line as well as human tissue data. Differential gene expression analysis revealed molecular features, processes and pathways afflicted with drug resistance mechanisms in general, and specifically with vincristine significantly involving actin associated features. However, the specific mode of resistance as well as the underlying genotype of parental, vincristine sensitive cells apparently exhibited significant heterogeneity. No consensus profile for vincristine resistance could be derived, but resistance-associated changes on the level of individual neuroblastoma cell lines as well as individual patient profiles became clearly evident. Based on these prerequisites we utilized the concept of synthetic lethality aimed at identifying hub proteins which when inhibited promise to induce cell death due to a synthetic lethal interaction with down-regulated, chemoresistance associated features. Our screening procedure identified synthetic lethal hub proteins afflicted with actin associated processes holding synthetic lethal interactions to down-regulated features individually found in all chemoresistant cell lines tested, therefore promising an improved therapeutic window. Verification of such synthetic lethal hub candidates in human neuroblastoma tissue expression profiles indicated the feasibility of this screening approach for addressing vincristine resistance in neuroblastoma.
-
(2010) FASEB Journal. 24, 8, p. 3066-3082 Abstract
In previous studies, we identified a locus for schizophrenia on 6q23.3 and proposed the Abelson helper integration site 1 (AHI1) as the candidate gene. AHI1 is expressed in the brain and plays a key role in neurodevelopment, is involved in Joubert syndrome, and has been recently associated with autism. The neurodevelopmental role of AHI1 fits with etiological hypotheses of schizophrenia. To definitively confirm our hypothesis, we searched for associations using a dense map of the region. Our strongest findings lay within the AHI1 gene: single-nucleotide polymorphisms rs11154801 and rs7759971 showed significant associations (P=6.23E-06; P=0.84E-06) and haplotypes gave P values in the 10E-8 to 10E-10 range. The second highest significant region maps close to AHI1 and includes the intergenic region between BC040979 and PDE7B (rs2038549 at P=9.70E-06 and rs1475069 at P=6.97E-06), and PDE7B and MAP7. Using a sample of Palestinian Arab families to confirm these findings, we found isolated signals. While these results did not retain their significance after correction for multiple testing, the joint analysis across the 2 samples supports the role of AHI1, despite the presence of heterogeneity. Given the hypothesis of positive selection of schizophrenia genes, we resequenced a 11 kb region within AHI1 in ethnically defined populations and found evidence for a selective sweep. Network analysis indicates 2 haplotype clades, with schizophrenia-susceptibility haplotypes clustering within the major clade. In conclusion, our data support the role of AHI1 as a susceptibility gene for schizophrenia and confirm it has been subjected to positive selection, also shedding light on new possible candidate genes, MAP7 and PDE7B.-Torri, F., Akelai, A., Lupoli, S., Sironi, M., Amann-Zalcenstein, D., Fumagalli, M., Dal Fiume, C., BenAsher, E., Kanyas, K., Cagliani, R., Cozzi, P., Trombetti, G., Lievers, L. S., Salvi, E., Orro, A., Beckmann, J. S., Lancet, D., Kohn, Y., Milanesi, L., Ebste
-
(2010) Schizophrenia Research. 120, 1-3, p. 159-166 Abstract
Association with schizophrenia of the Abelson Helper Integration Site 1 (AHI1) gene on chromosome 6q23 and the adjacent primate-specific gene, C6orf217, was demonstrated in an inbred, Arab Israeli family sample and replicated in an Icelandic case control sample. Further support was provided by a second replication in a large European sample and a meta-analysis that supported association with schizophrenia of all seven alleles overtransmitted to affected subjects in the original study. We examined constitutive expression of AHI1 and C6orf217 in immortalized lymphoblasts of patients from the Arab Israeli family sample in which the association with schizophrenia was originally discovered and population-matched normal controls, and in post-mortem brain of patients with schizophrenia and bipolar (BP) disorder and control subjects from the Stanley Medical Research Institute Collection. We found a significant effect of diagnostic group in the lymphoblast sample (F = 5.72; df = 2,39; p = 0.006). Patients with early age of onset had higher AHI1 expression than controls and later onset patients (p = 0.002; 0.03 respectively). C6orf217 expression in lymphoblasts was too low to measure. We found no difference in brain expression of AHI1 in schizophrenia or BP patients compared to controls. However, there was a genotypic difference in AHI1 expression for SNP rs9321501, which was strongly associated with schizophrenia in the original study. Genotypes that included the undertransmitted C allele (CC/AC) showed lower expression than the homozygous AA genotype (F = 4.73, df = 2,83; p = 0.028). There was no significant difference in brain expression of C6orf217 between patients and controls and no genotypic effect. This study provides further evidence for involvement of AHI1 in susceptibility to schizophrenia.
-
(2010) Biology Direct. 5, 38. Abstract
Background: An important facet of early biological evolution is the selection of chiral enantiomers for molecules such as amino acids and sugars. The origin of this symmetry breaking is a long-standing question in molecular evolution. Previous models addressing this question include particular kinetic properties such as autocatalysis or negative cross catalysis. Results: We propose here a more general kinetic formalism for early enantioselection, based on our previously described Graded Autocatalysis Replication Domain (GARD) model for prebiotic evolution in molecular assemblies. This model is adapted here to the case of chiral molecules by applying symmetry constraints to mutual molecular recognition within the assembly. The ensuing dynamics shows spontaneous chiral symmetry breaking, with transitions towards stationary compositional states (composomes) enriched with one of the two enantiomers for some of the constituent molecule types. Furthermore, one or the other of the two antipodal compositional states of the assembly also shows time-dependent selection. Conclusion: It follows that chiral selection may be an emergent consequence of early catalytic molecular networks rather than a prerequisite for the initiation of primeval life processes. Elaborations of this model could help explain the prevalent chiral homogeneity in present-day living cells.
-
(2010) Database-The Journal Of Biological Databases And Curation. Abstract
GeneCards (www.genecards.org) is a comprehensive, authoritative compendium of annotative information about human genes, widely used for nearly 15 years. Its gene-centric content is automatically mined and integrated from over 80 digital sources, resulting in a web-based deep-linked card for each of >73 000 human gene entries, encompassing the following categories: protein coding, pseudogene, RNA gene, genetic locus, cluster and uncategorized. We now introduce GeneCards Version 3, featuring a speedy and sophisticated search engine and a revamped, technologically enabling infrastructure, catering to the expanding needs of biomedical researchers. A key focus is on gene-set analyses, which leverage GeneCards' unique wealth of combinatorial annotations. These include the GeneALaCart batch query facility, which tabulates user-selected annotations for multiple genes and GeneDecks, which identifies similar genes with shared annotations, and finds set-shared annotations by descriptor enrichment analysis. Such set-centric features address a host of applications, including microarray data analysis, cross-database annotation mapping and gene-disorder associations for drug targeting. We highlight the new Version 3 database architecture, its multi-faceted search engine, and its semi-automated quality assurance system. Data enhancements include an expanded visualization of gene expression patterns in normal and cancer tissues, an integrated alternative splicing pattern display, and augmented multi-source SNPs and pathways sections. GeneCards now provides direct links to gene-related research reagents such as antibodies, recombinant proteins, DNA clones and inhibitory RNAs and features gene-related drugs and compounds lists. We also portray the GeneCards Inferred Functionality Score annotation landscape tool for scoring a gene's functional information status. Finally, we delineate examples of applications and collaborations that have benefited from the GeneCards suite. Database
2009
-
(2009) Omics-A Journal Of Integrative Biology. 13, 6, p. 477-487 Abstract
Sophisticated genomic navigation strongly benefits from a capacity to establish a similarity metric among genes. GeneDecks is a novel analysis tool that provides such a metric by highlighting shared descriptors between pairs of genes, based on the rich annotation within the GeneCards compendium of human genes. The current implementation addresses information about pathways, protein domains, Gene Ontology (GO) terms, mouse phenotypes, mRNA expression patterns, disorders, drug relationships, and sequence-based paralogy. GeneDecks has two modes: (1) Paralog Hunter, which seeks functional paralogs based on combinatorial similarity of attributes; and (2) Set Distiller, which ranks descriptors by their degree of sharing within a given gene set. GeneDecks enables the elucidation of unsuspected putative functional paralogs, and a refined scrutiny of various gene-sets (e.g., from high-throughput experiments) for discovering relevant biological patterns.
-
(2009) Journal of Molecular Evolution. 69, 5, p. 568-578 Abstract
The Graded Autocatalysis Replication Domain (GARD) model describes an origin of life scenario which involves non-covalent compositional assemblies, made of monomeric mutually catalytic molecules. GARD constitutes an alternative to informational biopolymers as a mechanism of primordial inheritance. In the present work, we examined the effect of mutations, one of the most fundamental mechanisms for evolution, in the context of the networks of mutual interaction within GARD prebiotic assemblies. We performed a systematic analysis analogous to single and double gene deletions within GARD. While most deletions have only a small effect on both growth rate and molecular composition of the assemblies, ~10% of the deletions caused lethality, or sometimes showed enhanced fitness. Analysis of 14 different network properties on 2,000 different GARD networks indicated that lethality usually takes place when the deleted node has a high molecular count, or when it is a catalyst for such a node. A correlation was also found between lethality and node degree centrality, similar to what is seen in real biological networks. Addressing double knockout mutations, our results demonstrate the occurrence of both synthetic lethality and extragenic suppression within GARD networks, and convey an attempt to correlate synthetic lethality to network node-pair properties. The analyses presented help establish GARD as a workable alternative prebiotic scenario, suggesting that life may have begun with large molecular networks of low fidelity, that later underwent evolutionary compaction and fidelity augmentation.
-
(2009) BMC Bioinformatics. 10, Abstract
Background: Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards® is a gene-centric compendium of rich annotative information for over 50,000 human gene entries, building upon 68 data sources, including Gene Ontology (GO), pathways, interactions, phenotypes, publications and many more. Results: We present the GeneCards Inferred Functionality Score (GIFtS) which allows a quantitative assessment of a gene's annotation status, by exploiting the unique wealth and diversity of GeneCards information. The GIFtS tool, linked from the GeneCards home page, facilitates browsing the human genome by searching for the annotation level of a specified gene, retrieving a list of genes within a specified range of GIFtS value, obtaining random genes with a specific GIFtS value, and experimenting with the GIFtS weighting algorithm for a variety of annotation categories. The bimodal shape of the GIFtS distribution suggests a division of the human gene repertoire into two main groups: the high-GIFtS peak consists almost entirely of protein-coding genes; the low-GIFtS peak consists of genes from all of the categories. Cluster analysis of GIFtS annotation vectors provides the classification of gene groups by detailed positioning in the annotation arena. GIFtS also provides measures which enable the evaluation of the databases that serve as GeneCards sources. An inverse correlation is found (for GIFtS>25) between the number of genes annotated by each source, and the average GIFtS value of genes associated with that source. Three typical source prototypes are revealed by their GIFtS distribution: genome-wide sources, sources comprising mainly highly annotated genes, and sources comprising mainly poorly annotated genes. The degree of accumulated knowledge for a given gene measured
-
(2009) American Journal Of Medical Genetics Part B-Neuropsychiatric Genetics. 150B, 7, p. 914-925 Abstract
A genome scan for schizophrenia related loci in Arab Israeli families by Lerer et al. [Lerer et al. (2003); Mol Psychiatry 8:488-498] detected significant evidence for linkage at chromosome 6q23. Subsequent fine mapping [Levi et al. (2005); Eur J Hum Genet 13:763-771], association [Amann-Zalcenstein et al. (2006); Eur J Hum Genet 14:1111-1119] and replication studies [Ingason et al. (2007); Eur J Hum Genet 15:988-991] identified AHI1 as a putative susceptibility gene. The same genome scan revealed suggestive evidence for a schizophrenia susceptibility locus in the 10q23-q26 region. Genes at these two loci may act independently in the pathogenesis of the disease in our homogeneous sample of Arab Israeli families or may interact with each other and with other factors in a common biological pathway. The purpose of our current study was to test the hypothesis of genetic interaction between these two loci and to identify the type of interaction between them. The initial stage of our study focused on the 10q23-q26 region which has not been explored further in our sample. The second stage of the study included a test for possible genetic interaction between the 6q23.3 locus and the refined 10q24.33-q26.13 locus. A final candidate region of 19.9 Mb between markers D10S222 (105.3 Mb) and D10S587 (125.2 Mb) was found on chromosome 10 by non-parametric and parametric linkage analyses. These linkage findings are consistent with previous reports in the same chromosomal region. Two-locus multipoint linkage analysis under three complex disease inheritance models (heterogeneity, multiplicative, and additive models) yielded a best maximum LOD score of 7.45 under the multiplicative model suggesting overlapping function of the 6q23.3 and 10q24.33-q26.13 loci.
-
(2009) BMC Evolutionary Biology. 9, Abstract
Background: Olfactory Receptors (ORs) form the largest multigene family in vertebrates. Their evolution and their expansion in vertebrate genomes have been the subject of many studies. In this paper we apply a motif-based approach to this problem in order to uncover evolutionary characteristics. Results: We extract deterministic motifs from ORs belonging to ten species using the MEX (Motif Extraction) algorithm, thus defining Common Peptides (CPs) characteristic of ORs. We identify species-specific CPs and show that their relative abundance is high only in fish and frog, suggesting relevance to water-soluble odorants. We estimate the origins of CPs according to the tree of life and track the gains and losses of CPs through evolution. We identify major CP gain in tetrapods and major losses in reptiles. Although the number of human ORs is less than half of the number of ORs in other mammals, the fraction of lost CPs is only 11%. By examining the positions of CPs along the OR sequence, we find two regions that expanded only in tetrapods. Using CPs we are able to establish remote homology relations between ORs and non-OR GPCRs. Selecting CPs according to their evolutionary age, we bicluster ORs and CPs for each species. Clean biclustering emerges when using relatively novel CPs. Evolutionary age is used to track the history of CP acquisition in the collection of mammalian OR families within HORDE (Human Olfactory Receptor Data Explorer). Conclusion: The CP method provides a novel perspective that reveals interesting traits in the evolution of olfactory receptors. It is consistent with previous knowledge, and provides finer details. Using available phylogenetic trees, evolution can be rephrased in terms of CP origins.
-
(2009) Trends in Genetics. 25, 4, p. 178-184 Abstract
The sense of smell is a complex molecular device, encompassing several hundred olfactory receptor proteins (ORs). These receptors, encoded by the largest human gene superfamily, integrate odorant signals into an accurate 'odor image' in the brain. Widespread phenotypic diversity in human olfaction is, in part, attributable to prevalent genetic variation in OR genes, owing to copy number variation, deletion alleles and deleterious single nucleotide polymorphisms. The development of new genomic tools, including next generation sequencing and CNV assays, provides opportunities to characterize the genetic variations of this system. The advent of large-scale functional screens of expressed ORs, combined with genetic association studies, has the potential to link variations in ORs to human chemosensory phenotypes. This promises to provide a genome-wide view of human olfaction, resulting in a deeper understanding of personalized odor coding, with the potential to decipher flavor and fragrance preferences.
-
(2009) Pharmacogenomics Journal. 9, 2, p. 103-110 Abstract
RGS2 (regulator of G-protein signaling 2) modulates dopamine receptor signal transduction. Functional variants in the gene may influence susceptibility to extrapyramidal symptoms (EPS) induced by antipsychotic drugs. To further investigate our previous report of association of the RGS2 gene with susceptibility to antipsychotic-induced EPS, we performed a replication study. EPS were rated in 184 US patients with schizophrenia (115 African Americans, 69 Caucasians) treated for at least a month with typical antipsychotic drugs (n = 45), risperidone (n = 46), olanzapine (n = 50) or clozapine (n = 43). Six single nucleotide polymorphisms (SNPs) within or flanking RGS2 were genotyped (rs1933695, rs2179652, rs2746073, rs4606, rs1819741 and rs1152746). Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated by logistic regression. Our results indicate association of SNP rs4606 with antipsychotic-induced parkinsonism (AIP), as measured by the Simpson-Angus Scale, in the overall sample and in the African-American subsample, the G (minor) allele having a protective effect. ORs for AIP among rs4606 G-allele carriers were 0.23 (95% CI 0.10-0.54, P = 0.001) in the overall sample, and 0.20 (0.07-0.57, P = 0.003) in the African-American subsample. In the previously studied Israeli sample the OR was 0.31 (0.11-0.84, P = 0.02). We completely sequenced the RGS2 gene in nine patients with AIP and nine patients without, from the Israeli sample. No common coding polymorphisms or additional regulatory variants were revealed, suggesting that association of the rs4606 C/G polymorphism with AIP is biologically meaningful and not a consequence of linkage disequilibrium with another functional variant. Taken together, the findings of the current study support the association of RGS2 with AIP and focus on a possible protective effect of the minor G allele of SNP rs4606. This SNP is located in the 3′-regulatory region of the gene, and is known to influence RGS2 mRNA levels and pr
2008
-
(2008) PLoS Genetics. 4, 11, Abstract
Olfactory receptors (ORs), which are involved in odorant recognition, form the largest mammalian protein superfamily. The genomic content of OR genes is considerably reduced in humans, as reflected by the relatively small repertoire size and the high fraction (~55%) of human pseudogenes. Since several recent low-resolution surveys suggested that OR genomic loci are frequently affected by copy-number variants (CNVs), we hypothesized that CNVs may play an important role in the evolution of the human olfactory repertoire. We used high-resolution oligonucleotide tiling microarrays to detect CNVs across 851 OR gene and pseudogene loci. Examining genomic DNA from 25 individuals with ancestry from three populations, we identified 93 OR gene loci and 151 pseudogene loci affected by CNVs, generating a mosaic of OR dosages across persons. Our data suggest that ~50% of the CNVs involve more than one OR, with the largest CNV spanning 11 loci. In contrast to earlier reports, we observe that CNVs are more frequent among OR pseudogenes than among intact genes, presumably due to both selective constraints and CNV formation biases. Furthermore, our results show an enrichment of CNVs among ORs with a close human paralog or lacking a one-to-one ortholog in chimpanzee. Interestingly, among the latter we observed an enrichment in CNV losses over gains, a finding potentially related to the known diminution of the human OR repertoire. Quantitative PCR experiments performed for 122 sampled ORs agreed well with the microarray results and uncovered 23 additional CNVs. Importantly, these experiments allowed us to uncover nine common deletion alleles that affect 15 OR genes and five pseudogenes. Comparison to the chimpanzee reference genome revealed that all of the deletion alleles are human derived, therefore indicating a profound effect of human-specific deletions on the individual OR gene content. Furthermore, these deletion alleles may be used in future genetic assoc
-
(2008) Genes Brain And Behavior. 7, 2, p. 164-172 Abstract
Previous work suggests that young women who smoke cigarettes regularly, or did so in the past, manifest a neurocognitive profile that is characterized by small but significant impairments of response inhibition and attention. The present study sought to determine whether variation in nicotinic cholinergic receptor (nAchR) genes impacts upon cognitive function in these domains by overall or differential effects on the performance of current, former and non-smokers. The study sample consisted of 100 female college students who were current or past smokers and 144 who had never smoked. All performed a computerized neurocognitive test battery and were genotyped for 39 single nucleotide polymorphisms in 11 nAchR genes. The results, derived from linear or logistic regression, show significant direct and interactive relationships between single nucleotide polymorphisms and haplotypes in several nAchR genes and performance on the Matching Familiar Figures Test (MFFT), Stroop test, Continuous Performance Task (CPT) and Tower of London (TOL) test. Response inhibition (MFFT, Stroop, CPT Loading Phase, TOL) was associated with variants in CHRNA2, CHRNA4, CHRNA5, CHRNA7, CHRNA9, CHRNA10, CHRNB2 and CHRNB3. Selective attention (Stroop) was associated with CHRNA4, CHRNA5, CHRNA9 and CHRNB2. Sustained attention (CPT Boring Phase) was associated with CHRNA4, CHRNA5, CHRNA7, CHRNA10 and CHRNB3. Up to 37% of the variance among the smokers and up to 47% of the variance among the non-smokers on the test measures was explained. Differences between smokers and non-smokers in neurocognitive function, putatively implicated in susceptibility to nicotine dependence, may be modulated by variants in nAchR genes, with potential implications for prevention and treatment.
-
(2008) American Journal Of Medical Genetics Part B-Neuropsychiatric Genetics. 147B, 2, p. 209-215 Abstract
Linkage and association studies in schizophrenia have repeatedly drawn attention to several chromosomal regions and to genes within them. Conflicting patterns of association and the lack of a clear functional significance of the associated variants limit the interpretation of these results. The use of rare pedigrees, where genes with a major effect cause the disorder, has been proven beneficial in studies of other complex disorders. Our objective was to use this advantage by performing a genome wide linkage analysis for schizophrenia in a large, multiplex Israeli Arab pedigree. We genotyped 346 microsatellite markers in 24 pedigree members affected with schizophrenia spectrum disorders and 32 unaffected relatives. Two-point linkage analysis with SUPERLINK demonstrated a LOD score of 2.47 for D20S116 on chromosome 20p13 under an autosomal dominant mode of inheritance. Further fine mapping yielded a two-point LOD score of 2.56 for the adjacent marker D20S193 and narrowed down the linked region to 2-5 cM. A haplotype containing the markers D20S193, D20S889, and D20S116, 0.7 Mb in length, was found to be shared by most affected pedigree members. Genotyping of 43 SNPs in the interval supported these results with a multipoint LOD score of 2.7 around D20S193. We were also able to better define the boundaries of the shared haplotype which contains strong candidate genes for schizophrenia. Our study exemplifies the power of rare and unique pedigrees in drawing attention to novel regions for genetic studies of schizophrenia. (c) 2007 Wiley-Liss, Inc.
2007
-
(2007) PLoS Biology. 5, 11, p. 2462-2468 Abstract
The genetic basis of odorant-specific variations in human olfactory thresholds, and in particular of enhanced odorant sensitivity (hyperosmia), remains largely unknown. Olfactory receptor (OR) segregating pseudogenes, displaying both functional and nonfunctional alleles in humans, are excellent candidates to underlie these differences in olfactory sensitivity. To explore this hypothesis, we examined the association between olfactory detection threshold phenotypes of four odorants and segregating pseudogene genotypes of 43 ORs genome-wide. A strong association signal was observed between the single nucleotide polymorphism variants in OR11H7P and sensitivity to the odorant isovaleric acid. This association was largely due to the low frequency of homozygous pseudogenized genotype in individuals with specific hyperosmia to this odorant, implying a possible functional role of OR11H7P in isovaleric acid detection. This predicted receptor-ligand functional relationship was further verified using the Xenopus oocyte expression system, whereby the intact allele of OR11H7P exhibited a response to isovaleric acid. Notably, we also uncovered another mechanism affecting general olfactory acuity that manifested as a significant inter-odorant threshold concordance, resulting in an overrepresentation of individuals who were hyperosmic to several odorants. An involvement of polymorphisms in other downstream transduction genes is one possible explanation for this observation. Thus, human hyperosmia to isovaleric acid is a complex trait, contributed to by both receptor and other mechanisms in the olfactory signaling pathway.
-
(2007) BMC Bioinformatics. 8, Abstract
Background: Improvements in genome sequence annotation revealed discrepancies in the original probeset/gene assignment in Affymetrix microarray and the existence of differences between annotations and effective alignments of probes and transcription products. In the current generation of Affymetrix human GeneChips, most probesets include probes matching transcripts from more than one gene and probes which do not match any transcribed sequence. Results: We developed a novel set of custom Chip Definition Files (CDF) and the corresponding Bioconductor libraries for Affymetrix human GeneChips, based on the information contained in the GeneAnnot database. GeneAnnot-based CDFs are composed of unique custom-probesets, including only probes matching a single gene. Conclusion: GeneAnnot-based custom CDFs solve the problem of a reliable reconstruction of expression levels and eliminate the existence of more than one probeset per gene, which often leads to discordant expression signals for the same transcript when gene differential expression is the focus of the analysis. GeneAnnot CDFs are freely distributed and fully compliant with Affymetrix standards and all available software for gene expression analysis. The CDF libraries are available from http://www.xlab.unimo.it/GA CDF, along with supplementary information (CDF libraries, installation guidelines and R code, CDF statistics, and analysis results).
-
(2007) Philosophical Transactions Of The Royal Society B-Biological Sciences. 362, 1486, p. 1813-1819 Abstract
The coevolution of environment and living organisms is well known in nature. Here, it is suggested that similar processes can take place before the onset of life, where protocellular entities, rather than full-fledged living systems, coevolve along with their surroundings. Specifically, it is suggested that the chemical composition of the environment may have governed the chemical repertoire generated within molecular assemblies, compositional protocells, while compounds generated within these protocells altered the chemical composition of the environment. We present an extension of the graded autocatalysis replication domain (GARD) model: the environment exchange polymer GARD (EE-GARD) model. In the new model, molecules, which are formed in a protocellular assembly, may be exported to the environment that surrounds the protocell. Computer simulations of the model using an infinite-sized environment showed that EE-GARD assemblies may assume several distinct quasi-stationary compositions (composomes), similar to the observations in previous variants of the GARD model. A statistical analysis suggested that the repertoire of composomes manifested by the assemblies is independent of time. In simulations with a finite environment, this was not the case. Composomes that were frequent in the early stages of the simulation disappeared, while others emerged. The change in the frequencies of composomes was found to be correlated with changes induced on the environment by the assembly. The EE-GARD model is the first GARD model to portray a possible time evolution of the composome repertoire.
-
(2007) Origins of Life and Evolution of Biospheres. 37, 4-5, p. 429-432 Abstract
Five common assumptions about the first cells are challenged by the prebiotic ecology model and are replaced by the following propositions: firstly, early cells were more complex, more varied and had a greater diversity of constituents than modern cells; secondly, the complexity of a cell is not related to the number of genes it contains, indeed, modern bacteria are as complex as eukaryotes; thirdly, the unit of early life was an 'ecosystem' rather than a 'cell'; fourthly, the early cell needed no genes at all; fifthly, early life depended on non-covalent associations and on catalysts that were not confined to specific reactions. We present here the outlines of a theory that connects findings about modern bacteria with speculations about their origins.
-
Pharmacogenetics of glatiramer acetate therapy for multiple sclerosis reveals drug-response markers (2007) Pharmacogenetics and Genomics. 17, 8, p. 657-666 Abstract
Genetic-based optimization of treatment prescription is becoming a central research focus in the management of chronic diseases, such as multiple sclerosis, which incur a prolonged drug-regimen adjustment. This study aimed to identify genetic markers that can predict response to glatiramer acetate (Copaxone) immunotherapy for relapsing multiple sclerosis. For this purpose, we genotyped fractional cohorts of two glatiramer acetate clinical trials for HLA-DRB1*1501 and 61 single nucleotide polymorphisms within a total of 27 candidate genes. Statistical analyses included single nucleotide polymorphism-by-single nucleotide polymorphism and haplotype tests of drug-by-genotype effects in drug-treated versus placebo-treated groups. We report the detection of a statistically significant association between glatiramer acetate response and a single nucleotide polymorphism in a T-cell receptor beta (TRB@) variant replicated in the two independent cohorts (odds ratio = 6.85). Findings in the Cathepsin S (CTSS) gene (P = 0.049 corrected for all single nucleotide polymorphisms and definitions tested, odds ratio = 11.59) in one of the cohorts indicate a possible association that needs to be further investigated. Additionally, we recorded nominally significant associations of response with five other genes, MBP, CD86, FAS, IL1R1 and IL12RB2, which are likely to be involved in glatiramer acetate's mode-of-action, both directly and indirectly. Each of these association signals in and of itself is consistent with the no-association null-hypothesis, but the number of detected associations is surprising vis-à-vis chance expectation. Moreover, the restriction of these associations to the glatiramer acetate-treated group, rather than the placebo group, clearly demonstrates drug-specific genetic effects. These findings provide additional progress toward development of pharmacogenetics-based personalized treatment for multiple sclerosis.
-
(2007) Pharmacogenetics and Genomics. 17, 7, p. 519-528 Abstract
Objectives: To investigate the role of genes encoding regulators of G protein signaling in early therapeutic response to antipsychotic drugs and in susceptibility to drug-induced extrapyramidal symptoms. As regulators of G protein signaling and regulators of G protein signaling-like proteins play a pivotal role in dopamine receptor signaling, genetically based, functional variation could contribute to interindividual variability in therapeutic and adverse effects. Methods: Consecutively hospitalized, psychotic patients with Diagnostic and Statistical Manual of Mental Disorders-IV schizophrenia (n=121) were included in the study if they received treatment with typical antipsychotic medication (n=72) or typical antipsychotic drugs and risperidone (n=49) for at least 2 weeks. Clinical state and adverse effects were rated at baseline and after 2 weeks. Twenty-four single nucleotide polymorphisms were genotyped in five regulators of G protein signaling genes. Results: None of the single nucleotide polymorphisms were related to clinical response to antipsychotic treatment at 2 weeks. Five out of six single nucleotide polymorphisms within or flanking the RGS2 gene were nominally associated with development or worsening of parkinsonian symptoms (PARK+) as measured by the Simpson Angus Scale, one of them after correction for multiple testing (rs4606, P=0.002). A GCCTG haplotype encompassing tagging single nucleotide polymorphisms within and flanking RGS2 was significantly overrepresented among PARK+ compared with PARK- patients (0.23 vs. 0.08, P=0.003). A second, 'protective', GTGCA haplotype was significantly overrepresented in PARK- patients (0.13 vs. 0.30, P=0.009). Both haplotype associations survive correction for multiple testing. Conclusions: Subject to replication, these findings suggest that genetic variation in the RGS2 gene is associated with susceptibility to extrapyramidal symptoms induced by antipsychotic drugs.
-
(2007) International Journal of Neuropsychopharmacology. 10, 3, p. 321-333 Abstract
Genetic variation in antipsychotic drug targets could underlie variability among patients in the time required for antipsychotic effects to be elicited. In a clinical, pharmacogenetic study we focused on the dopamine receptor interacting protein (DRIP) gene family. DRIPs are pivotally involved in regulating dopamine receptor signal transduction. Consecutively hospitalized, acutely psychotic patients with DSM-IV schizophrenia (n = 121) were included in the study if they received treatment with typical antipsychotic medication (TYP, n = 72) or TYP plus risperidone (TYP-R, n = 49) for at least 2 wk. Clinical state and adverse effects were rated at baseline and after 2 wk. Patients improved significantly on both TYP and TYP-R with no significant difference between them. Early responders were defined as patients whose PANSS change scores were greater than the median. Twenty-two single nucleotide polymorphisms (SNPs) were analysed in five DRIP-encoding genes. Two SNPs in NEF3, which encodes the DRIP neurofilament-medium (NF-M), were associated with early response (rs1457266, p = 0.01; rs1379357, p = 0.006). A 5-SNP haplotype spanning NEF3 was over-represented in early responders (p = 0.015), in the combined patient group and in the TYP group alone. These findings suggest that variation in NEF3, most likely functional variants that are in linkage disequilibrium with the SNPs that we studied, influences rate of response to TYP. Since NEF3 is primarily associated with dopamine D1 receptor function, the evidence for a complementary role of dopamine D1 receptors in antipsychotic effects is considered. The findings reported here open an interesting research avenue in the pharmacogenetics of antipsychotic effects but require replication in larger samples treated in a controlled context.
-
(2007) Proceedings of the National Academy of Sciences of the United States of America. 104, 11, p. 4524-4529 Abstract
The MDM2 protein is a ubiquitin ligase that plays a critical role in regulating the levels and activity of the p53 protein, which is a central tumor suppressor. A SNP in the human MDM2 gene (SNP309 T/G) occurs at frequencies dependent on demographic history and has been shown to have important differential effects on the activity of the MDM2 and p53 proteins and to associate with altered risk for the development of several cancers. In this report, the haplotype structure of the MDM2 gene is determined by using 14 different SNPs across the gene from three different population samples: Caucasians, African Americans, and the Ashkenazi Jewish ethnic group. The results presented in this report indicate that there is a substantially reduced variability of the deleterious SNP309 G allele haplotype in all three populations studied, whereas multiple common T allele haplotypes were found in all three populations. This observation, coupled with the relatively high frequency of the G allele haplotype in both the Caucasian and Ashkenazi Jewish population data sets, suggests that this haplotype could have undergone a recent positive selection sweep. An entropy-based selection test is presented that explicitly takes into account the correlations between different SNPs, and the analysis of MDM2 reveals a significant departure from the standard assumptions of selective neutrality.
-
(2007) Human Biology. 79, 1, p. 1-14 Abstract
The existence of osteoarthritis susceptibility loci on chromosome 6 for individuals suffering from hip and knee osteoarthritis has been suggested. We determined whether radiographic hand osteoarthritis in a demographically homogeneous population of European origin can be linked to loci on chromosome 6p12.3-p12.1. Nine single nucleotide polymorphisms (SNPs) were genotyped in 764 individuals (members of 189 nuclear and more complex two- or three-generation families). Radiographic hand osteoarthritis was characterized by two traits: (1) the total individual osteoarthritis score (PC1-OA) and (2) the osteophytes score (PC1-OS), obtained from the principal components analysis of sums of the Kellgren and Lawrence grade and of the osteophyte grades, respectively, for 14 joints on each hand. The contribution of genetic and environmental factors and of covariates such as age and body mass index to hand osteoarthritis was evaluated by variance components analysis. The association between the studied traits and selected DNA markers was evaluated by three types of transmission disequilibrium tests. The parent-offspring and sib-sib correlations were statistically significant for all studied traits. The additive genetic effects for PC1-OA and PC1-OS were estimated to be 43% and 37.9%, respectively. Transmission disequilibrium tests consistently revealed a statistically significant association (p values ranged from 0.017 to 0.030) between SNP rs1508632 and PC1-OS. In the tested cohort the putative genetic factors are influential enough to determine interindividual differences regarding the extent of hand osteoarthritis. SNP rs1508632 lies in immediate proximity to the TINAG gene, implicating it as a possible hand osteoarthritis susceptibility gene.
-
(2007) Chemical Senses. 32, 1, p. 21-30 Abstract
Anosmia affects the western world population, mostly the elderly, reaching 5% in subjects over the age of 45 years and strongly lowering their quality of life. A smaller minority (about 0.01%) is born without a sense of smell, afflicted with congenital general anosmia (CGA). No causative genes for human CGA have been identified yet, except for some syndromic cases such as Kallmann syndrome. In mice, however, deletion of any of the 3 main olfactory transduction components (guanosine triphosphate binding protein, adenylyl cyclase, and the cyclic adenosine monophosphate-gated channel) causes profound reduction of physiological responses to odorants. In an attempt to identify human CGA-related mutations, we performed whole-genome linkage analysis in affected families, but no significant linkage signals were observed, probably due to the small size of families analyzed. We further carried out direct mutation screening in the 3 main olfactory transduction genes in 64 unrelated anosmic individuals. No potentially causative mutations were identified, indicating that transduction gene variations underlie human CGA rarely and that mutations in other genes have to be identified. The screened genes were found to be under purifying selection, suggesting that they play a crucial functional role not only in olfaction but also potentially in additional pathways.
2006
-
(2006) Human Genetics. 120, 4, p. 447-459 Abstract
Monoamine oxidase A (MAOA) catalyses the oxidative deamination of biogenic amines including neurotransmitters, mainly norepinephrine and serotonin in the brain and peripheral tissues. A nonsense mutation in the gene was shown to be involved in a rare X-linked behavioural syndrome, which includes impaired impulse control, aggression and borderline mental retardation (Brunner syndrome). Several recent studies have shown the association of genetic variation of a VNTR in the gene promoter with various pathological behavioural traits. In the present study the association of MAOA genetic variation with a large set of quantitative behavioural traits in normal individuals has been examined. DNA samples from 421 unrelated males were genotyped for 14 SNPs and for the promoter VNTR at the MAOA locus. An additional 16 SNPs were genotyped at apparently neutral loci across the X chromosome to serve as a genomic control for possible false positive associations due to population structure. Behavioural traits were measured using the NEO psychometric questionnaire, which is based on a 5-axis model of personality, and consists of 30 different quantitative traits. There was a robust association of the A2 ("straightforwardness") facet with common allelic variants at the promoter VNTR. Most of the tested traits were not associated with the VNTR despite reasonable power, thus demonstrating that the VNTR influence on quantitative behavioural traits in normal males may be very specific. In contrast, several traits of the C ("conscientiousness") axis were associated with less common SNP-defined haplotypes. Hence, it appears that common genetic variation at the VNTR contributes to the behavioural attribute of "straightforwardness", while rare haplotypes defined by SNPs downstream of the transcription start site may contribute to "conscientiousness". This study is used to address the validation, interpretation and limitation of genetic association studies of quantitative behavioural traits.
-
(2006) European Journal of Human Genetics. 14, 10, p. 1111-1119 Abstract
Schizophrenia, a severe neuropsychiatric disorder, is believed to involve multiple genetic factors. A significant body of evidence supports a pivotal role for abnormalities of brain development in the disorder. Linkage signals for schizophrenia map to human chromosome 6q. To obtain a finer localization, we genotyped 180 single nucleotide polymorphisms (SNPs) in a young, inbred Arab-Israeli family sample with a limited number of founders. The SNPs were mostly within a ~7 Mb region around the strong linkage peak at 136.2 Mb that we had previously mapped. The most significant genetic association with schizophrenia for single SNPs and haplotypes was within a 500 kb genomic region of high linkage disequilibrium (LD) at 135.85 Mb. In a different, outbred, nuclear family sample that was not appropriate for linkage analysis, under-transmitted haplotypes incorporating the same SNPs (but not the individual SNPs) were significantly associated with schizophrenia. The implicated genomic region harbors the Abelson Helper Integration Site 1 (AHI1) gene, which showed the strongest association signal, and an adjacent, primate-specific gene, C6orf217. Mutations in human AHI1 underlie the autosomal recessive Joubert Syndrome with brain malformation and mental retardation. Previous comparative genomic analysis has suggested accelerated evolution of AHI1 in the human lineage. C6orf217 has multiple splice isoforms and is expressed in brain but does not seem to encode a functional protein. The two genes appear in opposite orientations and their regulatory upstream regions overlap, which might affect their expression. Both AHI1 and C6orf217 appear to be highly relevant candidate genes for schizophrenia.
-
(2006) BMC Genomics. 7, Abstract
Background: Quantitative variation in gene expression has been proposed to underlie phenotypic variation among human individuals. A facilitating step towards understanding the basis for gene expression variability is associating genome-wide transcription patterns with potential cis modifiers of gene expression. Description: EXPOLDB, a novel database, is a new effort addressing this need by providing information on the variability of gene expression levels across individuals, as well as the presence and features of potentially polymorphic (TG/CA)n repeats. EXPOLDB thus enables associating transcription levels with the presence and length of (TG/CA)n repeats. One of the unique features of this database is the display of expression data for 5 pairs of monozygotic twins, which allows identification of genes whose variability in expression is influenced by non-genetic factors including environment. In addition to queries by gene name, EXPOLDB allows for queries by a pathway name. Users can also upload their list of HGNC (HUGO (The Human Genome Organisation) Gene Nomenclature Committee) symbols for interrogating expression patterns. The online application 'SimRep' can be used to find simple repeats in a given nucleotide sequence. To help illustrate primary applications, case examples of housekeeping genes and the RUNX gene family, as well as one example of glycolytic pathway genes, are provided. Conclusion: The uniqueness of EXPOLDB is in facilitating the association of genome-wide transcription variations with the presence and type of polymorphic repeats while offering the feature for identifying genes whose expression variability is influenced by non-genetic factors including environment. In addition, the database allows comprehensive querying including functional information on biochemical pathways of the human genes.
-
(2006) BMC Bioinformatics. 7, Abstract
Background: Olfactory receptors (ORs), the largest mammalian gene superfamily (900-1400 genes), have > 50% pseudogenes in humans. While most of these inactive genes are identified via coding frame (nonsense) disruptions, seemingly intact genes may also be inactive due to other deleterious (missense) mutations. An ultimate assessment of the actual size of the functional human OR repertoire thus requires an accurate distinction between genes and pseudogenes. Results: To characterize inactive ORs with an intact open reading frame, we have developed a probabilistic Classifier for Olfactory Receptor Pseudogenes (CORP). This algorithm is based on deviations from a functionally crucial consensus, constituting sixty highly conserved positions identified by a comparison of two evolutionarily-constrained OR repertoires (mouse and dog) with a small pseudogene fraction. We used a logistic regression analysis to assign appropriate coefficients to the conserved positions, thus achieving maximal separation between active and inactive ORs. Consequently, the algorithm identified only 5% of the mouse functional ORs as pseudogenes, setting an upper limit of 0.05 to the false positive detection. Finally, we used this algorithm to classify the 384 purportedly intact human OR genes. Of these, 135 were predicted as likely encoding non-functional proteins, and 38 were segregating between active and inactive forms due to missense polymorphisms. Conclusion: We demonstrated that the CORP algorithm is capable of distinguishing between functional and non-functional OR genes with high precision even when the encoded proteins differ by a single amino acid. Using the CORP algorithm, we predict that ~70% of human OR genes are likely non-functional pseudogenes, a much higher number than hitherto suspected. The method we present may be employed for better annotation of inactive members in other gene families as well.
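The logistic-regression scoring described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the function name `pseudogene_probability`, the binary encoding of deviations from the consensus, and the coefficient and intercept values are hypothetical and are not taken from the CORP paper.

```python
import math


def pseudogene_probability(deviations, coefficients, intercept):
    """Logistic score for one OR sequence (illustrative, not the published CORP code).

    deviations   -- one 0/1 flag per conserved position (1 = residue deviates
                    from the consensus); this encoding is an assumption
    coefficients -- one hypothetical regression weight per conserved position
    intercept    -- hypothetical regression intercept
    """
    if len(deviations) != len(coefficients):
        raise ValueError("need one coefficient per conserved position")
    # Linear predictor: intercept plus the weights of all deviating positions.
    z = intercept + sum(c * d for c, d in zip(coefficients, deviations))
    # Logistic function maps the linear predictor to a value between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical example with sixty conserved positions and made-up weights.
weights = [2.0] * 60
intact = [0] * 60                # no deviations from the consensus
damaged = [1, 1, 1] + [0] * 57   # three deviating positions
p_intact = pseudogene_probability(intact, weights, intercept=-3.0)
p_damaged = pseudogene_probability(damaged, weights, intercept=-3.0)
```

With these made-up numbers, a sequence matching the consensus scores well below 0.5 and a sequence with three deviations scores well above it; real coefficients would have to be fitted to labelled active and inactive receptors.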
-
(2006) Cellular and Molecular Life Sciences. 63, 13, p. 1485-1493 Abstract
Of all five senses, olfaction is the most complex molecular mechanism, as it comprises hundreds of receptor proteins enabling it to detect and discriminate thousands of odorants. Until lately, the understanding of this highly sophisticated sensory neuronal pathway has been rather sketchy. The sequencing of the human genome and the consequent advent of new genomic tools have opened new opportunities to better understand this multifaceted biological system. Here, we present the relevant progress made in the last decade and highlight the possible genetic mechanisms of human olfactory variability.
-
(2006) BMC Genomics. 7, Abstract
Background: Olfactory receptors (ORs) are the largest gene family in the human genome. Although they are expected to be expressed specifically in olfactory tissues, some ectopic expression has been reported, with special emphasis on sperm and testis. The present study systematically explores the expression patterns of OR genes in a large number of tissues and assesses the potential functional implication of such ectopic expression. Results: We analyzed the expression of hundreds of human and mouse OR transcripts, via EST and microarray data, in several dozen human and mouse tissues. Different tissues had specific, relatively small OR gene subsets which had particularly high expression levels. In testis, average expression was not particularly high, and very few highly expressed genes were found, none corresponding to ORs previously implicated in sperm chemotaxis. Higher expression levels were more common for genes with a non-OR genomic neighbor. Importantly, no correlation in expression levels was detected for human-mouse orthologous pairs. Also, no significant difference in expression levels was seen between intact and pseudogenized ORs, except for the pseudogenes of subfamily 7E, which has undergone a human-specific expansion. Conclusion: The OR superfamily as a whole shows widespread, locus-dependent and heterogeneous expression, in agreement with a neutral or near-neutral evolutionary model for transcription control. These results cannot reject the possibility that small OR subsets might play functional roles in different tissues; however, considerable care should be exercised when offering a functional interpretation for ectopic OR expression based only on transcription information.
-
(2006) British Journal of Cancer. 94, 10, p. 1537-1543 Abstract
While genetic factors clearly play a role in conferring breast cancer risk, the contribution of ATM gene mutations to breast cancer is still unsettled. To shed light on this issue, ATM haplotypes were constructed using eight SNPs spanning the ATM gene region (142 kb) in ethnically diverse non-Ashkenazi Jewish controls (n = 118) and high-risk (n = 142) women. Of the 28 haplotypes noted, four were encountered in frequencies of 5% or more and accounted for 85% of all haplotypes. Subsequently, ATM haplotyping of high-risk, non-Ashkenazi Jews was performed on 66 women with breast cancer and 76 asymptomatic women. One SNP (rs228589) was significantly more prevalent among breast cancer cases compared with controls (P = 4 x 10^-9), and one discriminative ATM haplotype was significantly more prevalent among breast cancer cases (33.3%) compared with controls (3.8%), (P
-
(2006) BioEssays. 28, 4, p. 399-412 Abstract
We hypothesize that life began not with the first self-reproducing molecule or metabolic network, but as a prebiotic ecology of co-evolving populations of macromolecular aggregates (composomes). Each composome species had a particular molecular composition resulting from molecular complementarity among environmentally available prebiotic compounds. Natural selection acted on composomal species that varied in properties and functions such as stability, catalysis, fission, fusion and selective accumulation of molecules from solution. Fission permitted molecular replication based on composition rather than linear structure, while fusion created composomal variability. Catalytic functions provided additional chemical novelty resulting eventually in autocatalytic and mutually catalytic networks within composomal species. Composomal autocatalysis and interdependence allowed the Darwinian co-evolution of content and control (metabolism). The existence of chemical interfaces within complex composomes created linear templates upon which self-reproducing molecules (such as RNA) could be synthesized, permitting the evolution of informational replication by molecular templating. Mathematical and experimental tests are proposed.
-
(2006) Molecular Psychiatry. 11, 3, p. 312-322 Abstract
Despite the health hazards, cigarette smoking is disproportionately frequent among young women. A significant contribution of genetic factors to smoking phenotypes is well established. Efforts to identify susceptibility genes do not generally take into account possible interaction with environment, life experience and psychological characteristics. We recruited 501 female Israeli students aged 20-30 years, obtained comprehensive background data and details of cigarette smoking, and administered a battery of psychological instruments. Smoking initiators (n = 242) were divided into subgroups with high (n = 127) and low (n = 115) levels of nicotine dependence based on their scores on the Fagerstrom Tolerance Questionnaire and genotyped with noninitiators (n = 142) for single nucleotide polymorphisms (SNPs) in 11 nicotinic cholinergic receptor genes. We found nominally significant (P
-
(2006) Genome Biology. 7, 10, Abstract
Background: Mammalian olfactory receptor (OR) genes reside in numerous genomic clusters of up to several dozen genes. Whole-genome sequence alignment nets of five mammals allow their comprehensive comparison, aimed at reconstructing the ancestral olfactory subgenome. Results: We developed a new and general tool for genome-wide definition of genomic gene clusters conserved in multiple species. Syntenic orthologs, defined as gene pairs showing conservation of both genomic location and coding sequence, were subjected to a graph theory algorithm for discovering CLICs (clusters in conservation). When applied to ORs in five mammals, including the marsupial opossum, more than 90% of the OR genes were found within a framework of 48 multi-species CLICs, invoking a general conservation of gene order and composition. A detailed analysis of individual CLICs revealed multiple differences among species, interpretable through species-specific genomic rearrangements and reflecting complex mammalian evolutionary dynamics. One significant instance involves CLIC #1, which lacks a human member, implying the human-specific deletion of an OR cluster, whose mouse counterpart has been tentatively associated with isovaleric acid odorant detection. Conclusion: The identified multi-species CLICs demonstrate that most of the mammalian OR clusters have a common ancestry, preceding the split between marsupials and placental mammals. However, only two of these CLICs were capable of incorporating chicken OR genes, parsimoniously implying that all other CLICs emerged subsequent to the avian-mammalian divergence.
2005
-
(2005) Nature Genetics. 37, 6, p. 588-589 Abstract
Gene duplication and alternative splicing are distinct evolutionary mechanisms that provide the raw material for new biological functions. We explored their relationships in human and mouse and found an inverse correlation between the size of a gene's family and its use of alternatively spliced isoforms. A cross-organism analysis suggests that selection for genome-wide genic proliferation might be interchangeably met by either evolutionary mechanism.
-
(2005) Trends in Genetics. 21, 4, p. 210-213 Abstract
We have systematically examined the domain composition across a comprehensive set of tissue-specific, midrange and housekeeping genes as defined by their mode of expression in 52 normal mouse tissues. We show a definite correlation between the number of domains and the degree of tissue specificity. This trend is further supported by a novel analysis involving the time of origin of each domain. Genes containing metazoan-specific domains are more prevalent in signal transduction and cell-communication pathways, and are depleted in primary metabolism. Our analyses suggest that highly modular gene products have been recruited for tissue-specific functions that are required in complex organisms.
-
LEMD3: The gene responsible for bone density disorders (osteopoikilosis) (2005) Israel Medical Association Journal. 7, 4, p. 273-274 Abstract
-
(2005) Origins of Life and Evolution of Biospheres. 35, 2, p. 111-133 Abstract
The basic Graded Autocatalysis Replication Domain (GARD) model consists of a repertoire of small molecules, typically amphiphiles, which join and leave a non-covalent micelle-like assembly. Its replication behavior is due to occasional fission, followed by a homeostatic growth process governed by the assembly's composition. Limitations of the basic GARD model are its small finite molecular repertoire and the lack of a clear path from a 'monomer world' towards polymer-based living entities. We have now devised an extension of the model (polymer GARD or P-GARD), where a monomer-based GARD serves as a 'scaffold' for oligomer formation, as a result of internal chemical rules. We tested this concept with computer simulations of a simple case of monovalent monomers, whereby more complex molecules (dimers) are formed internally, in a manner resembling biosynthetic metabolism. We have observed events of dimer 'take-over': the formation of compositionally stable, replication-prone quasi-stationary states (composomes) that have appreciable dimer content. The appearance of novel metabolism-like networks obeys a time-dependent power law, reminiscent of evolution under punctuated equilibrium. A simulation under constant population conditions shows the dynamics of takeover and extinction of different composomes, leading to the generation of different population distributions. The P-GARD model offers a scenario whereby biopolymer formation may be a result of, rather than a prerequisite for, early life-like processes.
-
(2005) International Journal of Cancer. 114, 1, p. 58-73 Abstract
While genetic factors clearly play a key role in colorectal cancer (CRC) pathogenesis and in determining its phenotypic features, the precise genes that are involved are largely unknown. To gain insight into these genes, consecutive Israeli CRC patients were genotyped using SNPs from within candidate genes: APC, beta-catenin, K-RAS, DCC, P16, PTEN, RB1, P15, APOE, ERCC2, P53, MTHFR and hMSH2. Genotyping of consecutive, unselected colorectal cancer patients was done mostly by utilizing the MassARRAY technology (Sequenom) and to a lesser extent DGGE, ARMS and direct DNA sequencing. Correlation of genotypes with specific phenotypic features was carried out for all patients and separately for the Ashkenazim. Overall, 456 patients were analyzed, the majority (64.25%) being of Ashkenazi origin; mean age at diagnosis was 65.6 +/- 14 (range 25-90 years), and the mean follow-up was 4.7 +/- 0.28 (range 0-30 years). Statistically significant associations were noted between SNPs in beta-catenin and APOE and a positive family history of cancer (beta-catenin: p = 0.034, APOE: p = 0.033); tumor location and a DCC SNP (p = 0.038); and the P53 R72P mutation and survival (p = 0.0336). In Ashkenazi patients, ERCC2 and MTHFR gene SNPs were associated with age at diagnosis (ERCC2: p = 0.025, MTHFR: p = 0.0005); a P53 polymorphism, APOE and Rb SNPs with a family history of cancer (P53: p = 0.034, APOE: p = 0.04, Rb: p = 0.022); a DCC SNP with tumor location (p = 0.014); and a p15 SNP with tumor grade (p = 0.032). This preliminary study shows that genetic factors play a role in determining CRC phenotypic features and that a larger cohort with longer follow-up is clearly needed.
-
(2005) Physiological Genomics. 21, 1, p. 117-123 Abstract
Quantitative variation in gene expression in humans is the outcome of various factors, including differences in genetic background, gender, age, and environment. However, the extent of the influence of these factors on gene expression is not clear. We attempted to address this issue by carrying out gene expression profiling in blood leukocytes with 13 individuals (including 5 pairs of monozygotic twins) on 10,000 genes using HG-U95Av2 oligonucleotide microarrays. The proportion of differentially expressed genes between monozygotic twins was low (up to 1.76%). Most of the variations belonged to the least variable category. These genes, exhibiting "random variations," did not show clear preference to any functional class, although "signaling and communication" and "immune and related functions" generally topped the list. The extent of variation in gene expression increased in comparisons between unrelated individuals (up to 14.13%). Most of the genes (89%) exhibiting random variations in twins also varied in expression in unrelated individuals. As with twins, signaling and communication topped the list, and substantial variations were observed in all three categories: least variable, moderately variable, and most variable. An important outcome of this study was that the housekeeping genes were nearly insensitive to random variations but appeared to be more susceptible to genetic differences. However, the highly expressed housekeeping genes exhibited low variation and appeared to be insensitive to all known factors. Gene expression profiling in monozygotic twins can provide useful data for the assessment of natural variation in gene expression in humans.
-
(2005) Bioinformatics. 21, 5, p. 650-659 Abstract
Motivation: Genes are often characterized dichotomously as either housekeeping or single-tissue specific. We conjectured that crucial functional information resides in genes with midrange profiles of expression. Results: To obtain such novel information genome-wide, we have determined the mRNA expression levels for one of the largest hitherto analyzed sets of 62 839 probesets in 12 representative normal human tissues. Indeed, when using a newly defined graded tissue specificity index tau, valued between 0 for housekeeping genes and 1 for tissue-specific genes, genes with midrange profiles having 0.15
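The graded tissue specificity index tau mentioned in this abstract is not defined in the text shown here. The sketch below uses the form in which the index is commonly cited, tau = sum(1 - x_i / x_max) / (N - 1) over N tissues; treating that as the definition intended here, and the function name `tissue_specificity_tau`, are assumptions for illustration.

```python
def tissue_specificity_tau(expression):
    """Tissue specificity index for one gene (commonly cited form; assumed here).

    expression -- non-negative expression levels, one per tissue.
    Returns 0.0 for uniform (housekeeping-like) expression and 1.0 when the
    gene is expressed in a single tissue only.
    """
    values = [float(v) for v in expression]
    if len(values) < 2:
        raise ValueError("need expression values for at least two tissues")
    if any(v < 0 for v in values):
        raise ValueError("expression values must be non-negative")
    x_max = max(values)
    if x_max == 0:
        raise ValueError("gene is not expressed in any tissue")
    # Each tissue contributes its shortfall relative to the maximal tissue.
    return sum(1.0 - v / x_max for v in values) / (len(values) - 1)


# Twelve tissues, matching the number of tissues in the abstract (values are made up).
housekeeping = tissue_specificity_tau([5.0] * 12)          # uniform expression
specific = tissue_specificity_tau([0.0] * 11 + [10.0])     # one tissue only
```

A gene expressed equally in all twelve tissues yields 0, and a gene detected in only one tissue yields 1, which matches the range stated in the abstract.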
-
GeneTide - Terra Incognita Discovery Endeavor: a new transcriptome focused member of the GeneCards/GeneNote suite of databases (2005) Nucleic Acids Research. 33, p. D556-D561 Abstract
GeneCards® is an automatically mined database of human genes that strives to create, along with its auxiliary databases (GeneLoc, GeneNote and GeneAnnot), the most inclusive resource of gene-centered information of the human genome. GeneTide, the Gene Terra Incognita Discovery Endeavor (http://genecards.weizmann.ac.il/genetide/), the newest addition to this family, is a transcriptome-focused database which aims to enhance GeneCards with additional expressed sequence tag (EST)-based genes. This is achieved by comprehensively mapping >85% of the ~5.6 million human ESTs currently available at dbEST to known genes by means of data mining and integration of genomic resources including UniGene, DoTS, AceView and in-house resources. GeneTide thus creates comprehensive links between ESTs and GeneCards genes. Furthermore, groups of unassociated transcripts serve as a basis for defining novel EST-based GeneCards Candidates (EGCs). These EGCs, nearly 25000 of which were defined in version 0.3 of GeneTide, are further annotated with various parameters, including splicing evidence and expression data extracted from the GeneNote database, to determine their validity as possible de novo genes.
-
Early systems biology and prebiotic networks (2005) Transactions On Computational Systems Biology I. p. 14-27 Abstract
Systems Biology constitutes tools and approaches aimed at deciphering complex biological entities. It is assumed that such complexity arose gradually, beginning from a few relatively simple molecules at life's inception, and culminating with the emergence of composite multicellular organisms billions of years later. The main point of the present paper is that very early in the evolution of life, molecular ensembles with high complexity may have arisen, which are best described and analyzed by the tools of Systems Biology. We show that modeled prebiotic mutually catalytic pathways have network attributes similar to those of present-day living cells. This includes network motifs and robustness attributes. We point out that early networks are weighted (graded), but that using a cutoff formalism one may probe their degree distribution and show that it approximates that of a random network. A question is then posed regarding the potential evolutionary mechanisms that may have led to the emergence of scale-free networks in modern cells.
-
(2005) Genome Biology. 6, 7, Abstract
Genomic segments that do not code for proteins yet show high conservation among vertebrates have recently been identified by various computational methodologies. We refer to them as ANCORs (ancestral non-coding conserved regions). The frequency of individual ANCORs within the genome, along with their (correlated) inter-species identity scores, helps in assessing the probability that they function in transcription regulation or RNA coding.
2004
-
NIPBL gene responsible for Cornelia de Lange syndrome, a severe developmental disorder (2004) Israel Medical Association Journal. 6, 9, p. 571-572 Abstract
-
(2004) Genes and Immunity. 5, 6, p. 493-504 Abstract
Autoimmune diseases seem to have strong genetic attributes, and are affected to some extent by shared susceptibility loci. The latter potentially amount to hundreds of candidate genes (CG), creating the need for a prioritization strategy in genetic association studies. To form such a strategy, 26 autoimmune-related CG were genotyped for a total of 72 single nucleotide polymorphisms (SNPs) in three distinct Israeli ethnic populations: Ashkenazi Jews, Sephardic Jews and Arabs. Four quantitative criteria reflecting population stratification were analyzed: allele frequencies, haplotype frequencies, the F-st statistic for homozygotes distribution and linkage disequilibrium extents. According to the consequent interpopulation genomic diversity profiles, the genes were classified into conserved, intermediate and diversified gene groups. Our results demonstrate a correlation between the biological role of autoimmune-related CG and their interpopulation diversity profiles as classified by the different analyses. Annotation analysis suggests that genes more readily influenced by environmental conditions, such as immunological mediators, are 'population specific'. Conversely, genes showing genetic conservation across all populations are characterized by apoptotic and cleaving functions. We suggest a research strategy by which CG association studies should focus first on likely conserved gene categories, to increase the likelihood of attaining significant results and promote the development of gene-based therapies.
-
(2004) Biological Psychiatry. 56, 3, p. 169-176 Abstract
Background: The genes G72/G30 were recently implicated in schizophrenia in both Canadian and Russian populations. We hypothesized that 1) polymorphic changes in this gene region might be associated with schizophrenia in the Ashkenazi Jewish population and that 2) changes in G72/G30 gene expression might be expected in schizophrenic patients compared with control subjects. Methods: Eleven single nucleotide polymorphisms (SNPs) encompassing the G72/G30 genes were typed in the genomic deoxyribonucleic acid (DNA) from 60 schizophrenic patients and 130 matched control subjects of Ashkenazi ethnic origin. Case-control comparisons were based on linkage disequilibrium (LD) and haplotype frequency estimations. Gene expression analysis of G72 and G30 was performed on 88 postmortem dorsolateral prefrontal cortex samples. Results: Linkage disequilibrium analysis revealed two main SNP blocks. Haplotype analysis on block II, containing three SNPs external to the genes, demonstrated an association with schizophrenia. Gene expression analysis exhibited correlations between expression levels of the G72 and G30 genes, as well as a tendency toward overexpression of the G72 gene in brain samples of 44 schizophrenic patients compared with 44 control subjects. Conclusions: It is likely that the G72/G30 region is involved in susceptibility to schizophrenia in the Ashkenazi population. The elevation in expression of the G72 gene coincides with the glutamatergic theory of schizophrenia.
-
(2004) Chirality. 16, 6, p. 369-378 Abstract
Molecular chirality is of central interest in biological studies because enantiomeric compounds, while indistinguishable by most inanimate systems, show profoundly different properties in biochemical environments. Enantioselective separation methods, based on the differential recognition of two optical isomers by a chiral selector, have been amply documented. Also, great effort has been directed towards a theoretical understanding of the fundamental mechanisms underlying the chiral recognition process. Here we report a comprehensive data examination of enantioseparation measurements for over 72,000 chiral selector-selectand pairs from the chiral selection compendium CHIRBASE. The distribution of alpha = k'(D)/k'(L) values was found to follow a power law, equivalent to an exponential decay for chiral differential free energies. This observation is experimentally relevant in terms of the number of different individual or combinatorial selectors that need to be screened in order to observe alpha values higher than a preset minimum. A string model for enantiorecognition (SMED) formalism is proposed to account for this observation on the basis of an extended Ogston three-point interaction model. Partially overlapping molecular interaction domains are analyzed in terms of a string complementarity model for ligand-receptor complementarity. The results suggest that chiral selection statistics may be interpreted in terms of more general concepts related to biomolecular recognition. (C) 2004 Wiley-Liss, Inc.
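The stated equivalence between a power law in alpha and an exponential decay in the differential free energy can be made explicit with a short change of variables. This is a sketch under assumptions that are not in the abstract: the exponent symbol \gamma, the restriction to \alpha \ge 1, and the standard relation \Delta\Delta G = RT \ln \alpha between the separation factor and the magnitude of the differential free energy.

```latex
\begin{align}
\alpha &= \frac{k'_{D}}{k'_{L}}, \qquad
\Delta\Delta G = RT \ln \alpha \quad (\alpha \ge 1), \\
p(\alpha) &\propto \alpha^{-\gamma}, \\
p(\Delta\Delta G) &= p(\alpha)\,\frac{d\alpha}{d(\Delta\Delta G)}
 \propto \alpha^{-\gamma}\cdot\frac{\alpha}{RT}
 \propto \exp\!\left(-\frac{(\gamma-1)\,\Delta\Delta G}{RT}\right).
\end{align}
```

Under these assumptions, a power-law exponent \gamma > 1 for alpha corresponds to an exponential decay constant (\gamma - 1)/RT for the free energy differences.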
-
Recent innovations in the GeneCards suite (2004) Briefings in Bioinformatics. 5, 2, p. 204-205 Abstract
-
A new gene for the Charcot-Marie-Tooth disorder (2004) Israel Medical Association Journal. 6, 6, p. 376-377 Abstract
-
GeneAnnot: comprehensive two-way linking between oligonucleotide array probesets and GeneCards genes (2004) Bioinformatics. 20, 9, p. 1457-1458 Abstract
Motivation: High density oligonucleotide arrays are usually annotated in a one-to-one fashion, with each probeset assigned to one gene. However, in reality, subsets of oligonucleotides in a probeset may match sequences within more than one gene, potentially leading to misinterpretations. Moreover, a gene is often represented by more than one probeset, and analyzing probe matches at the mRNA level can help one deduce whether these probesets are derived from the same or different splice variants. Results: The GeneAnnot system comprehensively documents the many-to-many relationship between oligonucleotide array probesets and annotated genes in GeneCards(TM). It performs pairwise alignments between the probe sequences and gene transcripts, and assigns sensitivity and specificity scores to each probeset/gene pair.
-
5-lipoxygenase activating protein (ALOX5AP): Association with cardiovascular infarction and stroke (2004) Israel Medical Association Journal. 6, 5, p. 318 Abstract
-
(2004) Genomics. 83, 3, p. 361-372 Abstract
We identified 971 olfactory receptor (OR) genes in the dog genome, estimated to constitute ~80% of the canine OR repertoire. This was achieved by directed genomic DNA cloning of olfactory sequence tags as well as by mining the Celera canine genome sequences. The dog OR subgenome is estimated to have 12% pseudogenes, suggesting a functional repertoire similar to that of mouse and considerably larger than for humans. No novel OR families were discovered, but as many as 34 gene subfamilies were unique to the dog. "Fish-like" Class I ancient ORs constituted 18% of the repertoire, significantly more than in human and mouse. A set of 122 dog-human-mouse ortholog triplets was identified, with a relatively high fraction of Class I ORs. The elucidation of a large portion of the canine olfactory receptor gene superfamily, with some dog-specific attributes, may help us understand the unique chemosensory capacities of this species. (C) 2003 Elsevier Inc. All rights reserved.
-
(2004) Origins of Life and Evolution of Biospheres. 34, 1-2, p. 181-194 Abstract
While the last century brought an exquisite understanding of the molecular basis of life, very little is known about the detailed chemical mechanisms that afforded the emergence of life on early Earth. There is broad agreement that the problem lies in the realm of chemistry, and likely resides in the formation and mutual interactions of carbon-based molecules in aqueous medium. Yet, present-day experimental approaches can only capture the synthesis and behavior of a few molecule types at a time. On the other hand, experimental simulations of prebiotic syntheses, as well as chemical analyses of carbonaceous meteorites, suggest that the early prebiotic hydrosphere contained many thousands of different compounds. The present paper explores the idea that, given the limitations of test-tube approaches with regard to such a 'random chemistry' scenario, an alternative mode of analysis should be pursued. It is argued that as computational tools for the reconstruction of molecular interactions improve rapidly, it may soon become possible to perform adequate computer-based simulations of prebiotic evolution. We thus propose to launch a computational origin of life endeavor (http://ool.weizmann.ac.il/CORE), involving computer simulations of realistic complex prebiotic chemical networks. In the present paper we provide specific examples, based on a novel algorithmic approach, which constitutes a hybrid of molecular dynamics and stochastic chemistry. As one potential solution for the immense hardware requirements dictated by this approach, we have begun to implement an idle CPU harvesting scheme, under the title ool@home.
-
(2004) PLoS Biology. 2, 1, p. 120-125 Abstract
Olfactory receptor (OR) genes constitute the molecular basis for the sense of smell and are encoded by the largest gene family in mammalian genomes. Previous studies suggested that the proportion of pseudogenes in the OR gene family is significantly larger in humans than in other apes and significantly larger in apes than in the mouse. To investigate the process of degeneration of the olfactory repertoire in primates, we estimated the proportion of OR pseudogenes in 19 primate species by surveying randomly chosen subsets of 100 OR genes from each species. We find that apes, Old World monkeys and one New World monkey, the howler monkey, have a significantly higher proportion of OR pseudogenes than do other New World monkeys or the lemur (a prosimian). Strikingly, the howler monkey is also the only New World monkey to possess full trichromatic vision, along with Old World monkeys and apes. Our findings suggest that the deterioration of the olfactory repertoire occurred concomitant with the acquisition of full trichromatic color vision in primates.
-
(2004) Life In The Universe: From The Miller Experiment To The Search For Life On Other Worlds. Vol. 7. p. 111-114 Abstract
A widespread notion is that life arose from a single molecular replicator, probably a self-copying polynucleotide, in an RNA World (Joyce, 2002). We have proposed an alternative Lipid World scenario as an early evolutionary step in the emergence of cellular life on Earth (Segre et al., 2001). This concept combines the potential chemical activities of lipids and other amphiphiles with their capacity to undergo spontaneous self-organization into supramolecular structures, such as micelles and bilayers. In quantitative, chemically realistic computer simulations of our Graded Autocatalysis Replication Domain (GARD) model (Segre et al., 1998), we have shown that prebiotic molecular networks, potentially existing within assemblies of lipid-like molecules, manifest a behavior similar to self-reproduction or self-replication.
-
(2004) Artificial Life IX: Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems. p. 501-506 Abstract
The question of the origin of life is addressed by artificial life research, particularly in the realm of artificial chemistry. Such artificial chemistry is described by our Graded Autocatalysis Replication Domain (GARD) model. GARD depicts an unorthodox scenario suggested for the emergence of life: the 'lipid world'. The model concerns molecular assemblies with mutual catalysis in an environment containing a plethora of molecular species. Many aspects of GARD have been amply discussed. Here we concentrate on the importance of size constraints as depicted by the basic model and several of its variants. Occasional fission of a GARD assembly, which restricts the assembly size, is crucial for generating compositional quasi-stationary states ('composomes'). In a spatial version of GARD, bounded environments yield spontaneous emergence of different ecologies. Limiting the size of a population of GARD assemblies gives rise to complex population dynamics. The last example, with possible wider impact on chemistry and nanotechnology, suggests that a size limit can give rise to spontaneous symmetry breaking. This latter result is compared to the classic Frank model for homochirality, which requires explicit inhibition. We conclude that size restrictions are fundamental in the field of origin of life and artificial life, not only in order to facilitate evolutionary processes, as previously suggested, but also for augmenting the dynamics portrayed by different scenarios and models.
-
(2004) Protein Science. 13, 1, p. 240-254 Abstract
Olfactory receptors (ORs) are a large family of proteins involved in the recognition and discrimination of numerous odorants. These receptors belong to the G-protein coupled receptor (GPCR) hyperfamily, for which little structural data are available. In this study we predict the binding site residues of OR proteins by analyzing a set of 1441 OR protein sequences from mouse and human. The central insight utilized is that functional contact residues would be conserved among pairs of orthologous receptors, but considerably less conserved among paralogous pairs. Using judiciously selected subsets of 218 ortholog pairs and 518 paralog pairs, we have identified 22 sequence positions that are both highly conserved among the putative orthologs and variable among paralogs. These residues are disposed on transmembrane helices 2 to 7, and on the second extracellular loop of the receptor. Strikingly, although the prediction makes no assumption about the location of the binding site, these amino acid positions are clustered around a pocket in a structural homology model of ORs, mostly facing the inner lumen. We propose that the identified positions constitute the odorant binding site. This conclusion is supported by the observation that all but one of the predicted binding site residues correspond to ligand-contact positions in other rhodopsin-like GPCRs.
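The central insight of the abstract above (contact residues conserved between orthologs but variable between paralogs) can be illustrated with a minimal sketch. It assumes pre-aligned, equal-length sequences; the function names, the identity-based score, and the two thresholds are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def pair_identity_by_position(pairs: Sequence[Pair]) -> List[float]:
    """For each aligned position, return the fraction of pairs with identical residues."""
    length = len(pairs[0][0])
    matches = [0] * length
    for first, second in pairs:
        for i in range(length):
            if first[i] == second[i]:
                matches[i] += 1
    return [m / len(pairs) for m in matches]


def candidate_positions(
    ortholog_pairs: Sequence[Pair],
    paralog_pairs: Sequence[Pair],
    min_ortholog_identity: float = 0.9,  # hypothetical threshold
    max_paralog_identity: float = 0.5,   # hypothetical threshold
) -> List[int]:
    """Return positions conserved among ortholog pairs but variable among paralog pairs."""
    ortho = pair_identity_by_position(ortholog_pairs)
    para = pair_identity_by_position(paralog_pairs)
    return [
        i
        for i, (o, p) in enumerate(zip(ortho, para))
        if o >= min_ortholog_identity and p <= max_paralog_identity
    ]


# Toy aligned sequences (invented for the example, not real OR data).
orthologs = [("MKLA", "MKLG"), ("MRVA", "MRVS")]
paralogs = [("MKLA", "MRVA"), ("MKLG", "MRVS")]
print(candidate_positions(orthologs, paralogs))
```

In this toy input, position 0 is conserved everywhere and position 3 varies even between orthologs, so only the two middle positions satisfy both criteria.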
2003
-
(2003) Comptes Rendus Biologies. 326, 10-11, p. 1067-1072 Abstract
A novel data set, GeneNote (Gene Normal Tissue Expression), was produced to portray complete gene expression profiles in healthy human tissues using the Affymetrix GeneChip HG-U95 set, which includes 62 839 probe-sets. The hybridization intensities of two replicates were processed and analyzed to yield the complete transcriptome for twelve human tissues. Abundant novel information on tissue specificity provides a baseline for past and future expression studies related to diseases. The data is posted in GeneNote (http://genecards.weizmann.ac.il/genenote/), a widely used compendium of human genes (http://bioinfo.weizmann.ac.il/genecards). (C) 2003 Academie des sciences. Published by Elsevier SAS. All rights reserved.
-
(2003) American Journal of Human Genetics. 73, 3, p. 489-501 Abstract
The olfactory receptor (OR) genes constitute the largest gene family in mammalian genomes. Humans have >1,000 OR genes, of which only ~40% have an intact coding region and are therefore putatively functional. In contrast, the fraction of intact OR genes in the genomes of the great apes is significantly greater (68%-72%), suggesting that selective pressures on the OR repertoire vary among these species. We have examined the evolutionary forces that shaped the OR gene family in humans and chimpanzees by resequencing 20 OR genes in 16 humans, 16 chimpanzees, and one orangutan. We compared the variation at the OR genes with that at intergenic regions. In both humans and chimpanzees, OR pseudogenes seem to evolve neutrally. In chimpanzees, patterns of variability are consistent with purifying selection acting on intact OR genes, whereas, in humans, there is suggestive evidence for positive selection acting on intact OR genes. These observations are likely due to differences in lifestyle, between humans and great apes, that have led to distinct sensory needs.
-
(2003) Sensors And Actuators B-Chemical. 93, 1-3, p. 67-76 Abstract
We propose a new feature extraction method for use with chemical sensors. It is based on fitting a parametric analytic model of the sensor's response over time to the measured signal, and taking the set of best-fitting parameters as the features. The process of finding the features is fast and robust, and the resulting set of features is shown to significantly enhance the performance of subsequent classification algorithms. Moreover, the model that we have developed fits equally well to sensors of different technologies and embeddings, suggesting its applicability to a diverse repertoire of sensors and analytic devices. (C) 2003 Elsevier Science B.V. All rights reserved.
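The idea of using best-fitting model parameters as features can be sketched as follows. The abstract does not give the authors' analytic model, so this example assumes a simple saturating-exponential response, `amplitude * (1 - exp(-t / tau))`, and fits it by a grid search over `tau` with a closed-form least-squares amplitude. The model, the function names, and the grid are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def model(t: float, amplitude: float, tau: float) -> float:
    """Assumed sensor response: a saturating exponential (not the paper's model)."""
    return amplitude * (1.0 - math.exp(-t / tau))


def fit_features(
    times: Sequence[float], signal: Sequence[float], tau_grid: Sequence[float]
) -> Tuple[float, float]:
    """Return (amplitude, tau) minimising the squared error over the tau grid."""
    best = None
    for tau in tau_grid:
        basis: List[float] = [1.0 - math.exp(-t / tau) for t in times]
        denom = sum(b * b for b in basis)
        if denom == 0.0:
            continue
        # For a fixed tau the model is linear in amplitude, so solve it directly.
        amplitude = sum(b * y for b, y in zip(basis, signal)) / denom
        sse = sum((y - amplitude * b) ** 2 for b, y in zip(basis, signal))
        if best is None or sse < best[0]:
            best = (sse, amplitude, tau)
    if best is None:
        raise ValueError("no usable tau value in tau_grid")
    return best[1], best[2]


# Synthetic, noise-free response generated from the assumed model.
times = [0.5 * i for i in range(1, 21)]
signal = [model(t, 2.0, 3.0) for t in times]
features = fit_features(times, signal, [0.5 * k for k in range(1, 21)])
print(features)
```

The returned pair is the feature vector that would be passed to a downstream classifier; a real sensor model would likely need more parameters and a proper nonlinear optimiser.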
-
(2003) Sensors And Actuators B-Chemical. 93, 1-3, p. 77-83 Abstract
We propose an algorithm for use with multisensor systems that is capable of the following: (a) identify an analyte independently of its concentration; (b) estimate the concentration of the analyte, even if the system was not previously exposed to this concentration; (c) tell when an analyte is of a chemical type not previously presented to the system. The algorithm, based upon recent work of Hopfield, uses the multiplicity of sensors explicitly, and is intuitive and easy to implement. We have tested it against real data, and it exhibits high quality performance. (C) 2003 Elsevier Science B.V. All rights reserved.
-
(2003) Bioinformatics. 19, p. i222-i224 Abstract
Motivation: Despite the numerous available whole-genome mapping resources, no comprehensive, integrated map of the human genome yet exists. Results: GeneLoc, software adjunct to GeneCards and UDB, integrates gene lists by comparing genomic coordinates at the exon level and assigns unique and meaningful identifiers to each gene.
-
(2003) Nature Genetics. 34, 2, p. 143-144 Abstract
Of more than 1,000 human olfactory receptor genes, more than half seem to be pseudogenes. We investigated whether the most recent of these disruptions might still segregate with the intact form by genotyping 51 candidate genes in 189 ethnically diverse humans. The results show an unprecedented prevalence of segregating pseudogenes, identifying one of the most pronounced cases of functional population diversity in the human genome.
-
(2003) Current Opinion in Structural Biology. 13, 3, p. 353-358 Abstract
Groups of related genes abound in large eukaryotic genomes. In such 'subgenomes', homology modeling carried out for a few genes will probably have relevance to the entire group. Subgenomes also afford unique ways of determining protein structural information. In addition to analyses based on the quantification of residue variability in paralogs, two-way comparisons, both within and among species, help to disclose functional amino acids. Comparative studies of gene families throughout the mammalian genome will also help elucidate the functional significance of single nucleotide polymorphisms in coding regions.
-
(2003) Computational Biology and Chemistry. 27, 2, p. 121-133 Abstract
We propose a setup for an odor communication system. Its different parts are described, and ways to realize them are outlined. Our scheme enables an output device-the whiffer-to release an imitation of an odorant read in by an input device-the sniffer-upon command. The heart of the system is the novel algorithmic scheme that makes the scheme feasible. We are currently at work researching and developing some of the components that constitute the algorithm, and we hope that the description of the overall scheme in this paper will help to get other groups to join in this effort. (C) 2002 Elsevier Science Ltd. All rights reserved.
-
(2003) Advances in Complex Systems. 6, 1, p. 15-35 Abstract
In addition to the visible complexity expressed in the morphogenesis of multicellular organisms, two levels of microscopic complexity may be discerned within every living cell. The first level is related to covalently bonded structures, namely molecules. The second level has to do with the generation of non-covalent molecular assemblies. Origin of life research has largely focused on the first complexity level, i.e. the appearance of covalent biopolymers. We present a life emergence scenario based mainly on the second complexity level. We argue that homeostatic molecular ensembles, for which we have coined the term "mesobiotic," have assumed a half-way position between prebiotic organic synthesis and full-fledged cellular (biotic) life.
-
(2003) Proceedings of the National Academy of Sciences of the United States of America. 100, 6, p. 3324-3327 Abstract
Olfactory receptor (OR) genes constitute the basis for the sense of smell and are encoded by the largest mammalian gene superfamily of >1,000 genes. In humans, >60% of these are pseudogenes. In contrast, the mouse OR repertoire, although of roughly equal size, contains only ~20% pseudogenes. We asked whether the high fraction of nonfunctional OR genes is specific to humans or is a common feature of all primates. To this end, we have compared the sequences of 50 human OR coding regions, regardless of their functional annotations, to those of their putative orthologs in chimpanzees, gorillas, orangutans, and rhesus macaques. We found that humans have accumulated mutations that disrupt OR coding regions roughly 4-fold faster than any other species sampled. As a consequence, the fraction of OR pseudogenes in humans is almost twice as high as in the non-human primates, suggesting a human-specific process of OR gene disruption, likely due to a reduced chemosensory dependence relative to apes.
-
Computer simulation of protocells (2003) Computational Methods In Systems Biology, Proceedings. 2602, p. 194-197 Abstract
Keywords: Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods
-
Human Gene-Centric Databases at the Weizmann Institute of Science: GeneCards, UDB, CroW 21 and HORDE (2003) Nucleic Acids Research. 31, 1, p. 142-146 Abstract
Recent enhancements and current research in the GeneCards (GC) (http://bioinfo.weizmann.ac.il/cards/) project are described, including the addition of gene expression profiles and integrated gene locations. Also highlighted are the contributions of specialized associated human gene-centric databases developed at the Weizmann Institute. These include the Unified Database (UDB) (http://bioinfo.weizmann.ac.il/udb) for human genome mapping, the human Chromosome 21 database at the Weizmann Institute (CroW 21) (http://bioinfo.weizmann.ac.il/crow21), and the Human Olfactory Receptor Data Exploratorium (HORDE) (http://bioinfo.weizmann.ac.il/HORDE). The synergistic relationships amongst these efforts have positively impacted the quality, quantity and usefulness of the GeneCards gene compendium.
2002
-
(2002) Drug News & Perspectives. 15, 9, p. 558-567 Abstract
The goal of pharmacogenetics is to identify "genetic fingerprints" that may predict a patient's response to pharmaceutical treatment. The use of pharmacogenetics replaces the trial-and-error strategy, which governs much of our clinical decision-making regarding treatment allocation in current medical practice, with individually tailored therapy. We review a pharmacogenetic research model, which implements high-throughput single nucleotide polymorphism technology to establish the correlation between drug-responsiveness and genetic polymorphisms of Copaxone(R)-treated multiple sclerosis patients. Implementation of similar pharmacogenetic approaches may promote the development of personalized medicine in multiple sclerosis as well as in other diseases. (C) 2002 Prous Science. All rights reserved.
-
(2002) Bioinformatics. 18, 11, p. 1542-1543 Abstract
Motivation: In the post-genomic era, functional analysis of genes requires a sophisticated interdisciplinary arsenal. Comprehensive resources are challenged to provide consistently improving, state-of-the-art tools. Results: GeneCards (Rebhan et al., 1998) has made innovative strides: (a) regular updates and enhancements incorporating new genes enriched with sequences, genomic locations, cDNA assemblies, orthologies, medical information, 3D protein structures, gene expression, and focused SNP summaries; (b) restructured software using object-oriented Perl, migration to schema-driven XML, and (c) pilot studies, introducing methods to produce cards for novel and predicted genes.
-
(2002) Genomics. 80, 3, p. 295-302 Abstract
We developed a novel efficient scheme, DEFOG (for "deciphering families of genes"), for determining sequences of numerous genes from a family of interest. The scheme provides a powerful means to obtain a gene family composition in species for which high-throughput genomic sequencing data are not available. DEFOG uses two key procedures. The first is a novel algorithm for designing highly degenerate primers based on a set of known genes from the family of interest. These primers are used in PCR reactions to amplify the members of the gene family. The second combines oligofingerprinting of the cloned PCR products with clustering of the clones based on their fingerprints. By selecting members from each cluster, a low-redundancy clone subset is chosen for sequencing. We applied the scheme to the human olfactory receptor (OR) genes. OR genes constitute the largest gene superfamily in the human genome, as well as in the genomes of other vertebrate species. DEFOG almost tripled the size of the initial repertoire of human ORs in a single experiment, and only 7% of the PCR clones had to be sequenced. Extremely high degeneracies, reaching over a billion combinations of distinct PCR primer pairs, proved to be very effective and yielded only 0.4% nonspecific products.
-
(2002) Neural Computation. 14, 9, p. 2201-2220 Abstract
We introduce and study an artificial neural network inspired by the probabilistic receptor affinity distribution model of olfaction. Our system consists of N sensory neurons whose outputs converge on a single processing linear threshold element. The system's aim is to model discrimination of a single target odorant from a large number p of background odorants within a range of odorant concentrations. We show that this is possible provided p does not exceed a critical value p_c and calculate the critical capacity a_c = p_c/N. The critical capacity depends on the range of concentrations in which the discrimination is to be accomplished. If the olfactory bulb may be thought of as a collection of such processing elements, each responsible for the discrimination of a single odorant, our study provides a quantitative analysis of the potential computational properties of the olfactory bulb. The mathematical formulation of the problem we consider is one of determining the capacity for linear separability of continuous curves, embedded in a large-dimensional space. This is accomplished here by a numerical study, using a method that signals whether the discrimination task is realizable, together with a finite-size scaling analysis.
-
(2002) Journal of Molecular and Cellular Cardiology. 34, 6, p. A37-A37 Abstract
Keywords: Cardiac & Cardiovascular Systems; Cell Biology
-
(2002) Journal of Theoretical Biology. 216, 3, p. 327-336 Abstract
A chance encounter between members of a random repertoire and a molecular target is characteristic of different biological systems, including the immune and olfactory pathways as well as combinatorial libraries. In such systems, the affinity between the target and members of the repertoire is distributed with a probability function describing the propensity of obtaining a particular affinity value. We have previously proposed a phenomenological receptor affinity distribution (RAD) formalism, which describes this probability function based on simple statistical considerations. In the present analysis, we use published data from diverse experimental systems, including phage display libraries, immunoglobulins and enzymes, to test the RAD model and to compare it to other affinity distribution formalisms. The RAD model is found to provide the best description for binding data for over eight orders of magnitude on the affinity scale, and to account for a relationship between repertoire size and the maximal obtainable affinity within different repertoires. This approach points to a potential universality of the rules that govern affinity distributions in biology. (C) 2002 Elsevier Science Ltd. All rights reserved.
-
(2002) European Journal of Human Genetics. 10, 6, p. 339-350 Abstract
Usher syndrome type 3 (USH3) is an autosomal recessive disorder characterised by the association of post-lingual progressive hearing loss, progressive visual loss due to retinitis pigmentosa and variable presence of vestibular dysfunction. Because the previously defined transcripts do not account for all USH3 cases, we performed further analysis and revealed the presence of additional exons embedded in longer human and mouse USH3A transcripts and three novel USH3A mutations. Expression of Ush3a transcripts was localised by whole mount in situ hybridisation to cochlear hair cells and spiral ganglion cells. The full length USH3A transcript encodes clarin-1, a four-transmembrane-domain protein, which defines a novel vertebrate-specific family of three paralogues. Limited sequence homology to stargazin, a cerebellar synapse four-transmembrane-domain protein, suggests a role for clarin-1 in hair cell and photoreceptor cell synapses, as well as a common pathophysiological pathway for different Usher syndromes.
-
(2002) Human Molecular Genetics. 11, 12, p. 1381-1390 Abstract
We investigated the population differences in patterns of single nucleotide polymorphisms (SNPs) for a 400 kb olfactory receptor (OR) gene cluster on human chromosome 17p13.3. Samples were drawn from 35 individuals, of four different ethnogeographical origins: Pygmies, Bedouins, Yemenite Jews and Ashkenazi Jews. Of the 74 SNPs identified, two segregated between pseudogenized and intact ORs, while a third involved a change in a highly conserved motif proposed to mediate ligand-induced signal transduction. Linkage disequilibrium (LD) was computed based on phase inference across the cluster using Clark's haplotype subtraction algorithm. We also calculated LD directly from the genotypes using the expectation-maximization (EM) algorithm. Both methods yielded very similar results. Our analyses revealed substantial differences in nucleotide diversity, haplotype distribution and LD patterns among the different human populations. In particular, the two Jewish populations had low haplotype diversity and negligible decay of LD across the entire genomic region. Intriguingly, the three functional SNPs segregated at different frequencies in the different ethnogeographical groups, with the Pygmies having higher frequencies of the intact OR genes. Our data suggests that OR genes may have evolved to create different functional repertoires in distinct human populations.
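Estimating LD directly from unphased genotypes with the EM algorithm, as mentioned in the abstract above, can be sketched for the simplest case of two biallelic loci. This is a textbook-style illustration rather than the authors' implementation; the genotype encoding, the function names, and the toy counts are assumptions. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible haplotype configurations.

```python
from typing import Dict, Tuple

Genotype = Tuple[int, int]  # (copies of allele A at locus 1, copies of allele B at locus 2)


def em_haplotype_freqs(
    geno_counts: Dict[Genotype, int], n_iter: int = 100, tol: float = 1e-10
) -> Dict[str, float]:
    """Estimate AB, Ab, aB, ab haplotype frequencies from unphased genotype counts."""

    def g(i: int, j: int) -> int:
        return geno_counts.get((i, j), 0)

    total_haps = 2 * sum(geno_counts.values())
    # Haplotype counts contributed by genotypes whose phase is unambiguous.
    base = {
        "AB": 2 * g(2, 2) + g(2, 1) + g(1, 2),
        "Ab": 2 * g(2, 0) + g(2, 1) + g(1, 0),
        "aB": 2 * g(0, 2) + g(1, 2) + g(0, 1),
        "ab": 2 * g(0, 0) + g(1, 0) + g(0, 1),
    }
    double_het = g(1, 1)
    freqs = {h: 0.25 for h in base}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is AB/ab rather than Ab/aB.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from the expected haplotype counts.
        new = {
            "AB": (base["AB"] + double_het * w) / total_haps,
            "ab": (base["ab"] + double_het * w) / total_haps,
            "Ab": (base["Ab"] + double_het * (1 - w)) / total_haps,
            "aB": (base["aB"] + double_het * (1 - w)) / total_haps,
        }
        delta = max(abs(new[h] - freqs[h]) for h in new)
        freqs = new
        if delta < tol:
            break
    return freqs


def ld_stats(freqs: Dict[str, float]) -> Tuple[float, float]:
    """Return (D, r^2) for the estimated haplotype frequencies."""
    p_a = freqs["AB"] + freqs["Ab"]
    p_b = freqs["AB"] + freqs["aB"]
    d = freqs["AB"] - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d, (d * d / denom if denom > 0 else 0.0)


# Toy genotype counts (invented for the example).
toy = {(2, 2): 4, (0, 0): 4, (1, 1): 2}
print(ld_stats(em_haplotype_freqs(toy)))
```

Extending this to many SNPs or multi-locus haplotypes requires a more general formulation; the two-locus case is shown only because it keeps the E-step explicit.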
-
(2002) Proceedings of the National Academy of Sciences of the United States of America. 99, 2, p. 862-867 Abstract
We report the analysis of human nucleotide diversity at a genetic locus known to be involved in a behavioral phenotype, the monoamine oxidase A gene. Sequencing of five regions totaling 18.8 kb and spanning 90 kb of the monoamine oxidase A gene was carried out in 56 male individuals from seven different ethnogeographic groups. We uncovered 41 segregating sites, which formed 46 distinct haplotypes. A permutation test detected substantial population structure in these samples. Consistent with differentiation between populations, linkage disequilibrium is higher than expected under panmixia, with no evidence of a decay with distance. The extent of linkage disequilibrium is not typical of nuclear loci and suggests that the underlying population structure may have been accentuated by a selective sweep that fixed different haplotypes in different populations, or by local adaptation. In support of this suggestion, we find both a reduction in levels of diversity (as measured by a Hudson-Kreitman-Aguade test with the DMD44 locus) and an excess of high frequency-derived variants, as expected after a recent episode of positive selection.
2001
-
(2001) Journal of Theoretical Biology. 213, 3, p. 481-491 Abstract
Non-covalent compositional assemblies, made of monomeric mutually catalytic molecules, constitute an alternative to alphabet-based informational biopolymers as a mechanism of primordial inheritance. Such assemblies appear implicitly in many "Metabolism First" origin of life scenarios, and more explicitly in the Graded Autocatalysis Replication Domain (GARD) model [Segre et al. (2000). Proc. Natl Acad. Sci. U.S.A. 97, 4112-4117]. In the present work, we provide a detailed analysis of the quantitative molecular roots of such behavior. It is demonstrated that the fidelity of reproduction, provided by a newly defined heritability measure η_s*, strongly depends on the values of molecular recognition parameters and on assembly size. We find that if the catalytic rate acceleration coefficients are distributed normally, transfer of compositional information becomes impossible, due to frequent "compositional error catastrophes". In contrast, if the catalytic acceleration rates obey a lognormal distribution, as actually predicted by a statistical formalism for molecular repertoires, high reproduction fidelity is obtained. There is also a clear dependence on assembly size N, whereby maximal η is seen in a narrow range around N ~ 3.5 N_G/λ, where N_G is the size of the primordial molecular repertoire and λ is a molecular interaction statistical parameter. Such relationships help define the physicochemical conditions that could underlie the early steps in pre-biotic evolution. (C) 2001 Academic Press.
-
(2001) American Journal of Human Genetics. 69, 6, p. 1378-1384 Abstract
Catecholamine-induced polymorphic ventricular tachycardia (PVT) is characterized by episodes of syncope, seizures, or sudden death, in response to physical activity or emotional stress. Two modes of inheritance have been described: autosomal dominant and autosomal recessive. Mutations in the ryanodine receptor 2 gene (RYR2), which encodes a cardiac sarcoplasmic reticulum (SR) Ca2+-release channel, were recently shown to cause the autosomal dominant form of the disease. In the present report, we describe a missense mutation in a highly conserved region of the calsequestrin 2 gene (CASQ2) as the potential cause of the autosomal recessive form. The CASQ2 protein serves as the major Ca2+ reservoir within the SR of cardiac myocytes and is part of a protein complex that contains the ryanodine receptor. The mutation, which is in full segregation in seven Bedouin families affected by the disorder, converts a negatively charged aspartic acid into a positively charged histidine, in a highly negatively charged domain, and is likely to exert its deleterious effect by disrupting Ca2+ binding.
-
(2001) Gene. 279, 2, p. 221-232 Abstract
The RUNX3 gene belongs to the runt domain family of transcription factors that act as master regulators of gene expression in major developmental pathways. In mammals the family includes three genes, RUNX1, RUNX2 and RUNX3. Here, we describe a comparative analysis of the human chromosome 1p36.1 encoded RUNX3 and mouse chromosome 4 encoded Runx3 genomic regions. The analysis revealed high similarities between the two genes in the overall size and organization and showed that RUNX3/Runx3 is the smallest in the family, but nevertheless exhibits all the structural elements characterizing the RUNX family. It also revealed that RUNX3/Runx3 bears a high content of the ancient mammalian repeat MIR. Together, those data delineate RUNX3/Runx3 as the evolutionary founder of the mammalian RUNX family. Detailed sequence analysis placed the two genes at a GC-rich H3 isochore with a sharp transition of GC content between the gene sequence and the downstream intergenic region. Two large conserved CpG islands were found within both genes, one around exon 2 and the other at the beginning of exon 6. RUNX1, RUNX2 and RUNX3 gene products bind to the same DNA motif, hence their temporal and spatial expression during development should be tightly regulated. Structure/function analysis showed that two promoter regions, designated P1 and P2, regulate RUNX3 expression in a cell type-specific manner. Transfection experiments demonstrated that both promoters were highly active in the GM1500 B-cell line, which endogenously expresses RUNX3, but were inactive in the K562 myeloid cell line, which does not express RUNX3. (C) 2001 Elsevier Science B.V. All rights reserved.
-
(2001) Bulletin of Mathematical Biology. 63, 6, p. 1063-1078 Abstract
The concept of shape space, which has been successfully implemented in immunology, is used here to construct a model for the discrimination power of the olfactory system. Using reasonable assumptions on the behaviour of the biological system, we are able to estimate the number of distinct olfactory receptor types. Our estimated value of around 1000 receptor types is in good agreement with experimental data. (C) 2001 Society for Mathematical Biology.
-
(2001) Nature Genetics. 29, 1, p. 83-87 Abstract
Hereditary inclusion body myopathy (HIBM; OMIM 600737) is a unique group of neuromuscular disorders characterized by adult onset, slowly progressive distal and proximal weakness and a typical muscle pathology including rimmed vacuoles and filamentous inclusions. The autosomal recessive form described in Jews of Persian descent (ref. 2) is the HIBM prototype. This myopathy affects mainly leg muscles, but with an unusual distribution that spares the quadriceps (ref. 3). This particular pattern of weakness distribution, termed quadriceps-sparing myopathy (QSM), was later found in Jews originating from other Middle Eastern countries as well as in non-Jews (ref. 4). We previously localized the gene causing HIBM in Middle Eastern Jews on chromosome 9p12-13 (ref. 5) within a genomic interval of about 700 kb (ref. 6). Haplotype analysis around the HIBM gene region of 104 affected people from 47 Middle Eastern families indicates one unique ancestral founder chromosome in this community. By contrast, single non-Jewish families from India, Georgia (USA) and the Bahamas, with QSM and linkage to the same 9p12-13 region, show three distinct haplotypes. After excluding other potential candidate genes, we eventually identified mutations in the UDP-N-acetylglucosamine-2-epimerase/N-acetylmannosamine kinase (GNE) gene in the HIBM families: all patients from Middle Eastern descent shared a single homozygous missense mutation, whereas distinct compound heterozygotes were identified in affected individuals of families of other ethnic origins. Our findings indicate that GNE is the gene responsible for recessive HIBM.
-
(2001) Genome Research. 11, 5, p. 685-702 Abstract
Olfactory receptors likely constitute the largest gene superfamily in the vertebrate genome. Here we present the nearly complete human olfactory subgenome elucidated by mining the genome draft with gene discovery algorithms. Over 900 olfactory receptor genes and pseudogenes (ORs) were identified, two-thirds of which were not annotated previously. The number of extrapolated ORs is in good agreement with previous theoretical predictions. The sequence of at least 63% of the ORs is disrupted by what appears to be a random process of pseudogene formation. ORs constitute 17 gene families, 4 of which contain more than 100 members each. "Fish-like" Class I ORs, previously considered a relic in higher tetrapods, constitute as much as 10% of the human repertoire, all in one large cluster on chromosome 11. Their lower pseudogene fraction suggests a functional significance. ORs are disposed on all human chromosomes except 20 and Y, and nearly 80% are found in clusters of 6-138 genes. A novel comparative cluster analysis was used to trace the evolutionary path that may have led to OR proliferation and diversification throughout the genome. The results of this analysis suggest the following genome expansion history: first, the generation of a "tetrapod-specific" Class II OR cluster on chromosome 11 by local duplication, then a single-step duplication of this cluster to chromosome 1, and finally an avalanche of duplication events out of chromosome 1 to most other chromosomes. The results of the data mining and characterization of ORs can be accessed at the Human Olfactory Receptor Data Exploratorium Web site (http://bioinfo.weizmann.ac.il/HORDE).
-
(2001) Genomics. 71, 3, p. 296-306 Abstract
The olfactory receptor (OR) subgenome harbors the largest known gene family in mammals, disposed in clusters on numerous chromosomes. One of the best characterized OR clusters, located at human chromosome 17p13.3, has previously been studied by us in human and in other primates, revealing a conserved set of 17 OR genes. Here, we report the identification of a syntenic OR cluster in the mouse and the partial DNA sequence of many of its OR genes. A probe for the mouse M5 gene, orthologous to one of the OR genes in the human cluster (OR17-25), was used to isolate six PAC clones, all mapping by in situ hybridization to mouse chromosome 11B3-11B5, a region of shared synteny with human chromosome 17p13.3. Thirteen mouse OR sequences amplified and sequenced from these PACs allowed us to construct a putative physical map of the OR gene cluster at the mouse Olfr1 locus. Several points of evidence, including a strong similarity in subfamily composition and at least four cases of gene orthology, suggest that the mouse Olfr1 and the human 17p13.3 clusters are orthologous. A detailed comparison of the OR sequences within the two clusters helps trace their independent evolutionary history in the two species. Two types of evolutionary scenarios are discerned: cases of "true orthologous genes" in which high sequence similarity suggests a shared conserved function, as opposed to instances in which orthologous genes may have undergone independent diversification in the realm of "free reign" repertoire expansion. (C) 2001 Academic Press.
-
(2001) Origins of Life and Evolution of Biospheres. 31, 1-2, p. 119-145 Abstract
The continuity of abiotically formed bilayer membranes with similar structures in contemporary cellular life, and the requirement for microenvironments in which large and small molecules could be compartmentalized, support the idea that amphiphilic boundary structures contributed to the emergence of life. As an extension of this notion, we propose here a 'Lipid World' scenario as an early evolutionary step in the emergence of cellular life on Earth. This concept combines the potential chemical activities of lipids and other amphiphiles, with their capacity to undergo spontaneous self-organization into supramolecular structures such as micelles and bilayers. In particular, the documented chemical rate enhancements within lipid assemblies suggest that energy-dependent synthetic reactions could lead to the growth and increased abundance of certain amphiphilic assemblies. We further propose that selective processes might act on such assemblies, as suggested by our computer simulations of mutual catalysis among amphiphiles. As demonstrated also by other researchers, such mutual catalysis within random molecular assemblies could have led to a primordial homeostatic system displaying rudimentary life-like properties. Taken together, these concepts provide a theoretical framework, and suggest experimental tests for a Lipid World model for the origin of life.
-
(2001) Nature. 409, 6822, p. 860-921 Abstract
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
-
(2001) Human Mutation. 17, 5, p. 397-402 Abstract
The gene MCOLN1 is mutated in Mucolipidosis type IV (MLIV), a neurodegenerative, recessive, lysosomal storage disorder. The disease is found in relatively high frequency among Ashkenazi Jews due to two founder mutations that comprise 95% of the MLIV alleles in this population [Bargal et al., 2000]. In this report we complete the mutation analysis of Jewish and non-Jewish MLIV patients whose DNA was available to us. Four novel mutations were identified in the MCOLN1 gene of severely affected patients: two missense, T232P and F465L; a nonsense, R322X; and an 11-bp insertion in exon 12. The nonsense mutation (R322X) was identified in two unrelated patients with different haplotypes in the MCOLN1 chromosomal region, indicating a mutation hotspot in this CpG site. An in-frame deletion (F308del) was identified in a patient with unusually mild psychomotor retardation. The frequency of MLIV in the general Jewish Ashkenazi population was estimated in a sample of 2,000 anonymous, unrelated individuals assayed for the two founder mutations. This analysis indicated a heterozygote frequency of about 1/100. A preferred nucleotide numbering system for MCOLN1 mutations is presented and the issue of a screening program for the detection of high risk families in the Jewish Ashkenazi population is discussed. Hum Mutat 17:397-402, 2001. (C) 2001 Wiley-Liss, Inc.
-
(2001) Human Genetics. 108, 1, p. 1-13 Abstract
Olfactory receptors (ORs) constitute the largest multigene family in multicellular organisms. Their evolutionary proliferation has been driven by the need to provide recognition capacity for millions of potential odorants with arbitrary chemical configurations. Human genome sequencing has provided a highly informative picture of the "olfactory subgenome", the repertoire of OR genes. We describe here an analysis of 224 human OR genes, a much larger number than hitherto systematically analyzed. These are derived by literature survey, data mining at 14 genomic clusters, and by an OR-targeted experimental sequencing strategy. The presented set contains at least 53% pseudogenes and is minimally divided into 11 gene families. One of these (no. 7) has undergone a particularly extensive expansion in primates. The analysis of this collection leads to insight into the origin of OR genes, suggesting a graded expansion through mammalian evolution. It also allows us to delineate a structural map of the respective proteins. A sequence database and analysis package is provided (http://bioinformatics.weizmann.ac.il/HORDE), which will be useful for analyzing human OR sequences genome-wide.
-
(2001) Gene. 262, 1-2, p. 23-33 Abstract
The RUNX1 gene on human chromosome 21q22.12 belongs to the 'runt domain' gene family of transcription factors (also known as AML/CBFA/PEBP2 alpha). RUNX1 is a key regulator of hematopoiesis and a frequent target of leukemia associated chromosomal translocations. Here we present a detailed analysis of the RUNX1 locus based on its complete genomic sequence. RUNX1 spans 260 kb and its expression is regulated through two distinct promoter regions that are 160 kb apart. A very large CpG island complex marks the proximal promoter (promoter-2), and an additional CpG island is located at the 3' end of the gene. Hitherto, 12 different alternatively spliced RUNX1 cDNAs have been identified. Genomic sequence analysis of intron/exon boundaries of these cDNAs has shown that all consist of properly spliced authentic coding regions. This indicates that the large repertoire of RUNX1 proteins, ranging in size between 20-52 kDa, are generated through usage of alternatively spliced exons, some of which contain in-frame stop codons. The gene's introns are largely depleted of repetitive sequences, especially of the LINE1 family. The RUNX1 locus marks the transition from a ~1 Mb gene-poor region containing only pseudogenes, to a gene-rich region containing several functional genes. A search for RUNX1 sequences that may be involved in the high frequency of chromosomal translocations revealed that a 555 bp long segment originating in chromosome 11 FLI1 gene was transposed into RUNX1 intron 3.1. This intron harbors the t(8;21) and t(3;21) chromosomal breakpoints involved in acute myeloid leukemia. Interestingly, the FLI1 homologous sequence contains a breakpoint of the t(11;22) translocation associated with Ewing's tumors, and may have a similar function in RUNX1. (C) 2001 Elsevier Science B.V. All rights reserved.
2000
-
(2000) Gene. 260, 1-2, p. 87-94 Abstract
Single-nucleotide polymorphisms (SNPs) were studied in 15 olfactory receptor (OR) coding regions, one control region and two noncoding sequences all residing within a 412 kb OR gene cluster on human chromosome 17p13.3, as well as in other G-protein coupled receptors (GPCRs). A total of 26 SNPs were identified in ORs, 21 of which are coding SNPs (cSNPs). The mean nucleotide diversity of OR coding regions was 0.078% (ranging from 0 to 0.16%), which is about twice as high as that of other GPCRs, and similar to the nucleotide diversity levels of noncoding regions along the human genome. The high polymorphism level in the OR coding regions might be due to a weak positive selection pressure acting on the OR genes. In two cases, OR genes have been found to share the same cSNP. This could be explained by recent gene conversion events, which might be a part of a concerted evolution mechanism acting on the OR superfamily. Using the genotype data of 85 unrelated individuals in 15 SNPs, we found linkage disequilibrium (LD) between pairs of SNPs located on the centromeric part of the cluster. On the other hand, no LD was found between SNPs located on the telomeric part of the cluster, suggesting the presence of several hot-spots for recombination within this cluster. Thus, different regions of this gene cluster may have been subject to different recombination rates. (C) 2000 Elsevier Science B.V. All rights reserved.
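As an illustration of the quantity reported above, nucleotide diversity is commonly estimated as the average number of pairwise differences per site among aligned sequences. The sketch below assumes that standard estimator; the abstract does not state which estimator the authors used, and the function name and toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Average pairwise differences per site (assumed estimator, not confirmed by the abstract)."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    return total_diffs / (n_pairs * length)


# Toy alignment (hypothetical data): four sequences, ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)
print(f"{pi:.3f}")
```

Multiplying the result by 100 gives a percentage comparable in form to the 0.078% figure quoted in the abstract.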
-
(2000) Genomics. 70, 1, p. 49-61 Abstract
The genomic and cDNA structures were studied for eight human olfactory receptor (OR) genes within the chromosome 17p13.3 cluster. A common gene structure was revealed, which included an ~1-kb intronless coding region terminated by a signal for polyadenylation and a variable number of upstream noncoding exons. The latter were found to be alternatively spliced, giving rise to different isoforms of OR mRNA. While the initial exons mostly agreed with previous computer predictions and were conserved within OR subfamilies, other upstream exons were novel and idiosyncratic. In some cases, repetitive sequences were involved in the generation of splice sites and putative transcription control elements. Such gene structure is consistent with early repertoire enhancement by retrogene generation, which was likely followed by extensive genomic duplication. Each OR gene had a unique signature of transcription factor elements, consistent with a combinatorial expression control mechanism. (C) 2000 Academic Press.
-
(2000) Mammalian Genome. 11, 11, p. 1016-1023 Abstract
The vertebrate olfactory receptor (OR) subgenome harbors the largest known gene family, which has been expanded by the need to provide recognition capacity for millions of potential odorants. We implemented an automated procedure to identify all OR coding regions from published sequences. This led us to the identification of 831 OR coding regions (including pseudogenes) from 24 vertebrate species. The resulting dataset was subjected to neighbor-joining phylogenetic analysis and classified into 32 distinct families, 14 of which include only genes from tetrapodan species (Class II ORs). We also report here the first identification of OR sequences from a marsupial (koala) and a monotreme (platypus). Analysis of these OR sequences suggests that the ancestral mammal had a small OR repertoire, which expanded independently in all three mammalian subclasses. Classification of "fishlike" (Class I) ORs indicates that some of these ancient ORs were maintained and even expanded in mammals. A nomenclature system for the OR gene superfamily is proposed, based on a divergence evolutionary model. The nomenclature consists of the root symbol 'OR', followed by a family numeral, subfamily letter(s), and a numeral representing the individual gene within the subfamily. For example, OR3A1 is an OR gene of family 3, subfamily A, and OR7E12P is an OR pseudogene of family 7, subfamily E. The symbol is to be preceded by a species indicator. We have assigned the proposed nomenclature symbols for all 330 human OR genes in the database. A WWW tool for automated name assignment is provided.
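The symbol layout described above (root 'OR', family numeral, subfamily letter(s), gene numeral) lends itself to a small parser. The following is a minimal sketch, not the authors' WWW tool: the regular expression, the `ORSymbol` type, and the treatment of a trailing `P` as a pseudogene marker are assumptions inferred from the two examples in the abstract, and the species indicator prefix is not handled.

```python
import re
from typing import NamedTuple, Optional


class ORSymbol(NamedTuple):
    family: int        # family numeral, e.g. 7
    subfamily: str     # subfamily letter(s), e.g. "E"
    gene: int          # individual gene numeral within the subfamily
    pseudogene: bool   # True when the symbol ends in 'P' (assumed convention)


# Root 'OR', family numeral, subfamily letter(s), gene numeral, optional 'P'.
_OR_PATTERN = re.compile(r"^OR(\d+)([A-Z]+)(\d+)(P?)$")


def parse_or_symbol(symbol: str) -> Optional[ORSymbol]:
    """Return the parsed parts of an OR symbol, or None if it does not match."""
    match = _OR_PATTERN.match(symbol)
    if match is None:
        return None
    family, subfamily, gene, pseudo = match.groups()
    return ORSymbol(int(family), subfamily, int(gene), pseudo == "P")


print(parse_or_symbol("OR3A1"))
print(parse_or_symbol("OR7E12P"))
```

With the two symbols quoted in the abstract, `OR3A1` parses to family 3, subfamily A, gene 1, and `OR7E12P` to family 7, subfamily E, gene 12, flagged as a pseudogene.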
-
Dichotomy of single-nucleotide polymorphism haplotypes in olfactory receptor genes and pseudogenes (2000) Nature Genetics. 26, 2, p. 221-224 Abstract
Substantial efforts are focused on identifying single-nucleotide polymorphisms (SNPs) throughout the human genome, particularly in coding regions (cSNPs), for both linkage disequilibrium and association studies(1,2). Less attention, however, has been directed to the clarification of evolutionary processes that are responsible for the variability in nucleotide diversity among different regions of the genome(3). We report here the population sequence diversity of genomic segments within a 450-kb cluster(4,5) of olfactory receptor (OR) genes(6,7) on human chromosome 17. We found a dichotomy in the pattern of nucleotide diversity between OR pseudogenes and introns on the one hand and the closely interspersed intact genes on the other. We suggest that weak positive selection is responsible for the observed patterns of genetic variation. This is inferred from a lower ratio of polymorphism to divergence in genes compared with pseudogenes or introns, high non-synonymous substitution rates in OR genes, and a small but significant overall reduction in variability in the entire OR gene cluster compared with other genomic regions. The dichotomy among functionally different segments within a short genomic distance requires high recombination rates within this OR cluster. Our work demonstrates the impact of weak positive selection on human nucleotide diversity, and has implications for the evolution of the olfactory repertoire.
-
Identification of the gene causing mucolipidosis type IV (2000) Nature Genetics. 26, 1, p. 118-121 Abstract
Mucolipidosis type IV (MLIV) is an autosomal recessive, neurodegenerative, lysosomal storage disorder(1) characterized by psychomotor retardation and ophthalmological abnormalities including corneal opacities, retinal degeneration and strabismus. Most patients reach a maximal developmental level of 12-15 months(2). The disease was classified as a mucolipidosis following observations by electron microscopy indicating the lysosomal storage of lipids together with water-soluble, granulated substances(1,3-6). Over 80% of the MLIV patients diagnosed are Ashkenazi Jews, including severely affected and mildly affected patients(3,4). The gene causing MLIV was previously mapped to human chromosome 19p13.2-13.3 in a region of approximately 1 cM (ref. 7). Haplotype analysis in the MLIV gene region of over 70 MLIV Ashkenazi chromosomes indicated the existence of two founder chromosomes among 95% of the Ashkenazi MLIV families: a major haplotype in 72% and a minor haplotype in 23% of the MLIV chromosomes (ref. 7, and G.B., unpublished data). The remaining 5% are distinct haplotypes found only in single patients. The basic metabolic defect causing the lysosomal storage in MLIV has not yet been identified. Thus, positional cloning was an alternative to identify the MLIV gene. We report here the identification of a new gene in this human chromosomal region in which MLIV-specific mutations were identified.
-
Harvesting the human genome: the Israeli perspective (2000) Israel Medical Association Journal. 2, 9, p. 657-664 Abstract
-
(2000) EMBO Reports. 1, 3, p. 217-222 Abstract
Textbooks often assert that life began with specialized complex molecules, such as RNA, that are capable of making their own copies. This scenario has serious difficulties, but an alternative has remained elusive. Recent research and computer simulations have suggested that the first steps toward life may not have involved biopolymers. Rather, noncovalent protocellular assemblies, generated by catalyzed recruitment of diverse amphiphilic and hydrophobic compounds, could have constituted the first systems capable of information storage, inheritance and selection. A complex chain of evolutionary events, yet to be deciphered, could then have led to the common ancestors of today's free-living cells, and to the appearance of DNA, RNA and protein enzymes.
-
(2000) Bioinformatics. 16, 5, p. 482-483 Abstract
The GESALT Workbench is a WWW-based tool for genomic sequence analysis, comparison and annotation, with strong emphasis on visualization. GESALT graphically integrates the output of diverse sequence analysis algorithms, producing an information-rich, interactive genomic map. Availability: The GESALT Workbench, as well as a more detailed description, are available at http://bioinfo.weizmann.ac.il/GESALT/. Contact: Gustavo @ bioinfo.weizmann.ac.il, Doron.Lancet @ weizmann.ac.il.
-
(2000) Proceedings of the National Academy of Sciences of the United States of America. 97, 8, p. 4112-4117 Abstract
Mutually catalytic sets of simple organic molecules have been suggested to be capable of self-replication and rudimentary chemical evolution. Previous models for the behavior of such sets have analyzed the global properties of short biopolymer ensembles by using graph theory and a mean field approach. In parallel, experimental studies with the autocatalytic formation of amphiphilic assemblies (e.g., lipid vesicles or micelles) demonstrated self-replication properties resembling those of living cells. Combining these approaches, we analyze here the kinetic behavior of small heterogeneous assemblies of spontaneously aggregating molecules, of the type that could form readily under prebiotic conditions. A statistical formalism for mutual rate enhancement is used to numerically simulate the detailed chemical kinetics within such assemblies. We demonstrate that a straightforward set of assumptions about kinetically enhanced recruitment of simple amphiphilic molecules, as well as about the spontaneous growth and splitting of assemblies, results in a complex population behavior. The assemblies manifest a significant degree of homeostasis, resembling the previously predicted quasi-stationary states of biopolymer ensembles (Dyson, F. J. (1982) J. Mol. Evol. 18, 344-350). Such emergent catalysis-driven, compositionally biased entities may be viewed as having rudimentary "compositional genomes." Our analysis addresses the question of how mutually catalytic metabolic networks, devoid of sequence-based biopolymers, could exhibit transfer of chemical information and might undergo selection and evolution. This computed behavior may constitute a demonstration of natural selection in populations of molecules without genetic apparatus, suggesting a pathway from random molecular assemblies to a minimal protocell.
-
(2000) Genomics. 63, 2, p. 227-245 Abstract
The olfactory receptor (OR) gene cluster on human chromosome 17p13.3 was subjected to mixed shotgun automated DNA sequencing. The resulting 412 kb of genomic sequence include 17 OR coding regions, 6 of which are pseudogenes. Six of the coding regions were discovered only upon genomic sequencing, while the others were previously reported as partial sequences. A comparison of DNA sequences in the vicinity of the OR coding regions revealed a common gene structure with an intronless coding region and at least one upstream noncoding exon. Potential gene control regions including specific pyrimidine:purine tracts and Olf-1 sites have been identified. One of the pseudogenes apparently has evolved into a CpG island. Four extensive CpG islands can be discerned within the cluster, not coupled to specific OR genes. The cluster is flanked at its telomeric end by an unidentified open reading frame (C17orf2) with no significant similarity to any known protein. A high proportion of the cluster sequence (about 60%) belongs to various families of interspersed repetitive elements, with a clear predominance of LINE repeats. The OR genes in the cluster belong to two families and seven subfamilies, which show a relatively high degree of intermixing along the cluster, in seemingly random orientations. This genomic organization may be best accounted for by a complex series of evolutionary events. (C) 2000 Academic Press.
-
Prebiotic evolution of amphiphilic assemblies far from equilibrium: From compositional information to sequence-based biopolymers (2000) Bioastronomy'99, A New Era In Bioastronomy, Proceedings. 213, p. 373-+ Abstract
The primordial emergence of biopolymers, agents of the genetic machinery in modern cells, is not less enigmatic than the emergence of the genetic code itself. Here we discuss how potential early replicating protocellular systems based on a rudimentary form of inheritance, a "compositional genome", could evolve towards the emergence of "alphabetic" polymers, predating the genetic code. A computer simulated evolutionary process based on our previously proposed kinetic model may help understand the appearance of chemical combinatorics through early natural selection.
1999
-
(1999) Journal of Molecular Biology. 294, 4, p. 921-935 Abstract
Modeling of integral membrane proteins and the prediction of their functional sites requires the identification of transmembrane (TM) segments and the determination of their angular orientations. Hydrophobicity scales predict accurately the location of TM helices, but are less accurate in computing angular disposition. Estimating lipid-exposure propensities of the residues from statistics of solved membrane protein structures has the disadvantage of relying on relatively few proteins. As an alternative, we propose here a scale of knowledge-based Propensities for Residue Orientation in Transmembrane segments (kPROT), derived from the analysis of more than 5000 non-redundant protein sequences. We assume that residues that tend to be exposed to the membrane are more frequent in TM segments of single-span proteins, while residues that prefer to be buried in the transmembrane bundle interior are present mainly in multispan TMs. The kPROT value for each residue is thus defined as the logarithm of the ratio of its proportions in single and multiple TM spans. The scale is refined further by defining it for three discrete sections of the TM segment; namely, extracellular, central, and intracellular. The capacity of the kPROT scale to predict angular helical orientation was compared to that of alternative methods in a benchmark test, using a diversity of multi-span alpha-helical transmembrane proteins with a solved 3D structure. kPROT yielded an average angular error of 41 degrees, significantly lower than that of alternative scales (62 degrees-68 degrees). The new scale thus provides a useful general tool for modeling and prediction of functional residues in membrane proteins. A WWW server (http://bioinfo.weizmann.ac.il/kPROT) is available for automatic helix orientation prediction with kPROT. (C) 1999 Academic Press.
-
(1999) Genomics. 61, 1, p. 24-36 Abstract
The olfactory receptor (OR) subgenome harbors the largest known gene family in mammals, disposed in clusters on numerous chromosomes. We have carried out a comparative evolutionary analysis of the best characterized genomic OR gene cluster, on human chromosome 17p13. Fifteen orthologs from chimpanzee (localized to chromosome 19p15), as well as key OR counterparts from other primates, have been identified and sequenced. Comparison among orthologs and paralogs revealed a multiplicity of gene conversion events, which occurred exclusively within OR subfamilies. These appear to lead to segment shuffling in the odorant binding site, an evolutionary process reminiscent of somatic combinatorial diversification in the immune system. We also demonstrate that the functional mammalian OR repertoire has undergone a rapid decline in the past 10 million years: while for the common ancestor of all great apes an intact OR cluster is inferred, in present-day humans and great apes the cluster includes nearly 40% pseudogenes. (C) 1999 Academic Press.
-
The variable and conserved interfaces of modeled olfactory receptor proteins (1999) Protein Science. 8, 5, p. 969-977 Abstract
The accumulation of hundreds of olfactory receptor (OR) sequences, along with the recent availability of detailed models of other G-protein-coupled receptors, allows us to analyze the OR amino acid variability patterns in a structural context. A Fourier analysis of 197 multiply aligned olfactory receptor sequences showed an alpha-helical periodicity in the variability profile. This was particularly pronounced in the more variable transmembranal segments 3, 4, and 5. Rhodopsin-based homology modeling demonstrated that the inferred variable helical faces largely point to the interior of the receptor barrel. We propose that a set of 17 hypervariable residues, which point to the barrel interior and are more extracellularly disposed, constitute the odorant complementarity determining regions. While 12 of these residues coincide with established ligand-binding contact positions in other G-protein-coupled receptors, the rest are suggested to form an olfactory-unique aspect of the binding pocket. Highly conserved olfactory receptor-specific sequence motifs, found in the second and third intracellular loops, may comprise the G-protein recognition epitope. The prediction of olfactory receptor functional sites provides concrete suggestions of site-directed mutagenesis experiments for altering ligand and G-protein specificity.
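One common way to test a per-residue profile for alpha-helical periodicity is to evaluate its Fourier power at about 100 degrees per residue, which corresponds to 3.6 residues per helical turn. The sketch below illustrates that idea only; the function name, the mean-centering step, and the synthetic profile are assumptions, and the abstract does not specify how the authors computed their transform.

```python
import math
from typing import Sequence


def periodicity_power(profile: Sequence[float], angle_deg: float) -> float:
    """Fourier power of a mean-centered profile at a given rotation angle per residue."""
    if not profile:
        raise ValueError("profile must not be empty")
    mean = sum(profile) / len(profile)
    omega = math.radians(angle_deg)
    real = sum((v - mean) * math.cos(k * omega) for k, v in enumerate(profile))
    imag = sum((v - mean) * math.sin(k * omega) for k, v in enumerate(profile))
    return real * real + imag * imag


# Synthetic variability profile (hypothetical): 36 positions repeating every 3.6 residues.
synthetic = [1.0 + math.cos(math.radians(100.0 * k)) for k in range(36)]
helical = periodicity_power(synthetic, 100.0)   # alpha-helix angle
off_peak = periodicity_power(synthetic, 40.0)   # arbitrary comparison angle
print(helical > off_peak)
```

A profile drawn from real alignments would be scanned over a range of angles, with a peak near 100 degrees read as support for a helical face of variable residues.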
-
(1999) Instruments, Methods, And Missions For Astrobiology Ii. 3755, p. 144-162 Abstract
The Graded Autocatalysis Replication Domain (GARD) model described here depicts an early primordial scenario, prior to the emergence of biopolymers, such as RNA or proteins. The model describes, with the help of statistical chemistry computer simulations, a collection of organic molecular species capable of rudimentary selection and evolution. The GARD model provides a rigorous kinetic analysis of simple sets of chemicals that manifest mutual catalysis. It is shown that catalytic closure can sustain self replication up to a critical dilution rate, related to the extent of mutual catalysis. The capacity for self replication in a mutually catalytic set is shown to be a graded property, quantitated by a critical parameter lambda(ci). GARD could be a simple model for a primordial scenario, in which replication and catalysis are performed by the same set of molecules. GARDobes are proposed to be entities that embody a GARD system, endowed with a non-DNA "compositional genome", and are presumed to have replicated slowly and imperfectly through mutually catalytic networks. Therefore, they are not bound by the standard cellular size constraints: GARDobes may be as small as a few nanometers, with 20-50 nanometers being rather large and elaborate. Active GARDobes, if ever found on Earth or on other planets, would be distinguished by a highly biased organic chemistry, i.e. having only a small subset of the possible molecules of any given class. Their fossils might still bear the hallmarks of such a bias, with narrow spectra of molecules such as Polycyclic Aromatic Hydrocarbons or even with enantiomeric excesses.
-
(1999) Molecular Biology of the Brain. p. 93-104 Abstract
In order to elicit an olfactory response, a substance has to partition into the gas phase and diffuse into the nose. Such odorant molecules, usually low molecular-mass hydrophobic compounds, encounter the ciliated endings of sensory neuronal dendrites, which protrude into a mucus layer at the surface of the olfactory epithelium in the nasal cavity. Embedded in the membranes of such cilia are olfactory receptor (OR) proteins, which recognize odorants and elicit a transduction cascade that underlies the nerve cell response. The sensory axons project to the olfactory bulb in the brain, where they converge into synaptic structures called glomeruli. The specific convergence patterns of olfactory axons, which depend on OR expression, provide a model system for neuronal network development. Here, initial processing of odour information occurs, which is followed by additional analysis in higher olfactory brain centres.
1998
-
(1998) Origins of Life and Evolution of Biospheres. 28, 4-6, p. 501-514 Abstract
A Graded Autocatalysis Replication Domain (GARD) model is proposed, which provides a rigorous kinetic analysis of simple chemical sets that manifest mutual catalysis. It is shown that catalytic closure can sustain self-replication up to a critical dilution rate, lambda(c), related to the graded extent of mutual catalysis. We explore the behavior of vesicles containing GARD species whose mutual catalysis is governed by a previously published statistical distribution. In the population thus generated, some GARD vesicles display a significantly higher replication efficiency than most others. GARD thus represents a simple model for primordial chemical selection of mutually catalytic sets.
-
(1998) Genomics. 53, 1, p. 56-68 Abstract
Olfactory receptors (OR) are encoded by a large multigene family including hundreds of members dispersed throughout the human genome. Cloning and mapping studies have determined that a large proportion of the olfactory receptor genes are located on human chromosomes 6, 11, and 17, as well as distributed on other chromosomes. In this paper, we describe and characterize the organization of olfactory receptor genes on human chromosome 11 by using degenerate PCR-based probes to screen chromosome 11-specific and whole genome clone libraries for members of the OR gene family. OR genes were identified by DNA sequencing and then localized to regions of chromosome 11. Physical maps of several gene clusters were constructed to determine the chromosomal relationships between various members of the family. This work identified 25 new OR genes located on chromosome 11 in at least seven distinct regions. Three of these regions contain gene clusters that include additional members of this gene family not yet identified by sequencing. Phylogenetic analysis of the newly described OR genes suggests a mechanism for the generation of genetic diversity. (C) 1998 Academic Press.
-
(1998) Physica A. 249, 1-4, p. 558-564 Abstract
A thorough outlook on the origin of life needs to delineate a chemically rigorous, self-consistent path from highly heterogeneous, random ensembles of relatively simple organic molecules, to an entity that has rudimentary life-like characteristics. Such an entity should be endowed with a capacity to express variation, undergo mutation-like changes and manifest a simple evolutionary process. For simulating such a system we developed the Graded Autocatalysis Replication Domain (GARD) model for explicit kinetic analysis of mutual catalysis in sets of random oligomers derived from energized precursor monomers. The kinetic properties of the GARD model are based on vesicle enclosure and expansion. With the additional assumption of spontaneous vesicle splitting, a GARD evolution scenario is envisaged as a consequence of pure chemical kinetics. Here we show how the GARD model can serve as a platform for investigating the dynamics of self-organization mechanisms in molecular evolutionary processes. (C) 1998 Elsevier Science B.V. All rights reserved.
-
(1998) Olfaction And Taste Xii: An International Symposium. 855, p. 182-193 Abstract
The human olfactory subgenome represents several hundred olfactory receptor (OR) genes in a dozen or more clusters on several chromosomes. One OR gene cluster on human chromosome 17 has been characterized by us in detail. Based on a large-scale DNA sequence analysis, we have identified events of gene duplication and fusion as well as the generation of pseudogenes. The latter instances of 'gene death' could underlie the widespread phenomenon of human specific anosmias. Sixteen OR coding regions were found on this cluster, and six of them are pseudogenes. One of these pseudogenes, OR17-23, was found to be an intact open reading frame in an old world monkey. This may be a reflection of an OR repertoire diminution in man. A homology model of the OR protein was constructed by utilizing the rich information available on ~200 OR sequences. The putative odorant complementarity determining regions (CDR) were found to consist of 20 hypervariable residues facing an interior cavity defined by transmembrane helices 3, 4 and 5. Such a model could be useful in analyzing additional OR gene sequences in the human genome in terms of odorant binding.
-
Mutually catalytic amphiphiles: Simulated chemical evolution and implications to exobiology (1998) Exobiology: Matter, Energy, And Information In The Origin And Evolution Of Life In The Universe. p. 123-131 Abstract
A description of the emergence of life should delineate a chemically rigorous gradual transition from random collections of simple organic molecules to spatially confined assemblies displaying rudimentary self-reproduction capacity. It has been suggested that large sets of mutually catalytic molecules, and not self-replicating information-carrying biopolymers, could have been the precursors of life. We present here a stochastic model in which the mutually catalytic molecules are spontaneously aggregating amphiphiles. When such amphiphiles exert on each other random catalytic effects, biased molecular compositions emerge, that are endowed with replication-like properties. This approach may have important consequences to the understanding of very early chemical evolution. It could also guide a search for extraterrestrial forms of very primitive life.
-
(1998) Bioinformatics. 14, 8, p. 656-664 Abstract
Motivation: Modern biology is shifting from the 'one gene one postdoc' approach to genomic analyses that include the simultaneous monitoring of thousands of genes. The importance of efficient access to concise and integrated biomedical information to support data analysis and decision making is therefore increasing rapidly, in both academic and industrial research. However, knowledge discovery in the widely scattered resources relevant for biomedical research is often a cumbersome and non-trivial task, one that requires a significant amount of training and effort. Results: To develop a model for a new type of topic-specific overview resource that provides efficient access to distributed information, we designed a database called 'GeneCards'. It is a freely accessible Web resource that offers one hypertext 'card' for each of the more than 7000 human genes that currently have an approved gene symbol published by the HUGO/GDB nomenclature committee. The presented information aims at giving immediate insight into current knowledge about the respective gene, including a focus on its functions in health and disease. It is compiled by Perl scripts that automatically extract relevant information from several databases including SWISS-PROT, OMIM, Genatlas and GDB. Analyses of the interactions of users with the Web interface of GeneCards triggered development of easy-to-scan displays optimized for human browsing. Also, we developed algorithms that offer 'ready-to-click' query reformulation support to facilitate information retrieval and exploration. Many of the long-term users turn to GeneCards to quickly access information about the function of very large sets of genes, for example in the realm of large-scale expression studies using 'DNA chip' technology or two-dimensional protein electrophoresis.
1997
-
(1997) Pharmacogenetics. 7, 4, p. 255-269 Abstract
This review represents an update of the nomenclature system for the UDP glucuronosyltransferase gene superfamily, which is based on divergent evolution. Since the previous review in 1991, sequences of many related UDP glycosyltransferases from lower organisms have appeared in the database, which expand our database considerably. At latest count, in animals, yeast, plants and bacteria there are 110 distinct cDNAs/genes whose protein products all contain a characteristic 'signature sequence' and, thus, are regarded as members of the same superfamily. Comparison of a relatedness tree of proteins leads to the definition of 33 families. It should be emphasized that at least six cloned UDP-GlcNAc N-acetylglucosaminyltransferases are not sufficiently homologous to be included as members of this superfamily and may represent an example of convergent evolution. For naming each gene, it is recommended that the root symbol UGT for human (Ugt for mouse and Drosophila), denoting 'UDP glycosyltransferase,' be followed by an Arabic number representing the family, a letter designating the subfamily, and an Arabic numeral denoting the individual gene within the family or subfamily, e.g. 'human UGT2B4' and 'mouse Ugt2b5'. We recommend the name 'UDP glycosyltransferase' because many of the proteins do not preferentially use UDP glucuronic acid, or their nucleotide sugar preference is unknown. Whereas the gene is italicized, the corresponding cDNA, transcript, protein and enzyme activity should be written with upper-case letters and without italics, e.g. 'human or mouse UGT1A1'. The UGT1 gene (spanning > 500 kb) contains at least 12 promoters/first exons, which can be spliced and joined with common exons 2 through 5, leading to different N-terminal halves but identical C-terminal halves of the gene products; in this scheme each first exon is regarded as a distinct gene (e.g. UGT1A1, UGT1A2, ... UGT1A12). When an orthologous gene between species cannot be identified with certaint
-
1996
-
(1996) Genomics. 37, 2, p. 147-160 Abstract
A cosmid clone covering a region of high olfactory receptor (OR) gene density inside the OR gene cluster on human chromosome 17 (17p13.3) was subjected to shotgun automated DNA sequencing. The resulting 40-kb sequence revealed three known OR coding regions, as well as a new OR pseudogene (OR17-25), fused to one of the previously identified OR genes (OR17-24). The suggested mechanism for the generation of this doublet structure involves an initial duplication mediated by flanking repeats and a subsequent deletion via nonhomologous recombination. Sequence analysis further suggests that the two other OR genes present in the cosmid (OR17-40 and OR17-228) may have evolved by ancient tandem duplication of an 11-kb fragment, mediated by recombination between mammalian-wide interspersed repeats. The duplicated genes appear to be complete and potentially functional. Their conserved structure reveals a long upstream intron and a previously uncharacterized 5' noncoding exon. No additional genes could be discerned in the cosmid, suggesting that the cluster may be part of a dedicated OR subgenome. (C) 1996 Academic Press, Inc.
-
(1996) FEBS Journal. 238, 1, p. 28-37 Abstract
The superfamily of olfactory receptor genes, whose products are thought to be activated by odorant ligands, is critical for odor recognition. Two olfactory receptors, olp4 from rat and OR17-4 from human, were overexpressed in Sf9 insect cells. The presence of the proteins in cell membranes was monitored by immunoblotting with peptide-specific polyclonal antibodies directed against the C-terminal sequences of these receptors and with a mAb against an N-terminal octapeptide epitope tag. A DNA sequence that codes for a His(6) tag, which binds tightly to a Ni2+-chelate-affinity column, was incorporated into the N-termini of both genes. The expressed olfactory receptors were found mainly in the cell-membrane fraction. The proteins were difficult to solubilize by many detergents and only lysophosphatidylcholine was found to be both suitable for efficient solubilization of the overexpressed olfactory receptors and compatible with the purification system used. After solubilization, the olfactory receptors were purified to near homogeneity by affinity chromatography on nickel nitrilotriacetic acid resin and by cation-exchange chromatography. Electrophoresis of the purified proteins and visualization with Coomassie Blue staining or by immunoblotting with specific antibodies, revealed bands of 32, 69 and 94 kDa, which were identified as the monomeric, dimeric and trimeric forms of the receptor proteins. The oligomeric forms were resistant to reduction and alkylation, and are therefore thought to be held together by non-covalent hydrophobic interactions that are resistant to SDS. This finding is similar to previous observations for other guanine-nucleotide-binding-regulatory-protein-coupled receptors. Reconstitution in phospholipid vesicles showed that the purified olfactory receptors insert specifically into the lipid bilayer. This provides a means to study functional reconstitution with putative transduction components such as olfactory guanine-nucleotide-binding-regulatory
-
Positive selection moments identify potential functional residues in human olfactory receptors (1996) Receptors & Channels. 4, 3, p. 141-147 Abstract
Correlated mutation analysis and molecular models of olfactory receptors have provided evidence that residues in the transmembrane domains form a binding pocket for odor ligands. As an independent test of these results, we have calculated positive selection moments for the alpha-helical sixth transmembrane domain (TM6) of human olfactory receptors. The moments can be used to identify residues that have been preferentially affected by positive selection and are thus likely to interact with odor ligands. The results suggest that residue 622, which is commonly a serine or threonine, could form critical H-bonds. In some receptors a dual-serine subsite, formed by residues 622 and 625, could bind hydroxyl determinants on odor ligands. The potential importance of these residues is further supported by site-directed mutagenesis in the beta-adrenergic receptor. The findings should be of practical value for future physiological studies, binding assays, and site-directed mutagenesis.
1994
-
-
(1994) FEBS Journal. 225, 3, p. 1157-1168 Abstract
A rat olfactory epithelium cDNA library was screened for olfactory receptor clones. One of the positively hybridizing cDNA clones was sequenced and found to encode a new member of the olfactory receptor superfamily. This cDNA, termed olp4, was used as a model of olfactory receptor for expression, both in vitro and in vivo. Expression of olp4, as well as of another previously cloned olfactory receptor (F5), was monitored by immunoprecipitation with a monoclonal antibody directed against a Flag peptide epitope tag, inserted at the N-terminus of the open reading frame, and a specific polyclonal antibody against a C-terminal peptide of olp4. Translation in vitro, followed by immunoprecipitation, showed a major olp4-specific band of 27-29 kDa. The olp4 and F5 polypeptides were found to be inserted into microsomal membranes as expected for integral membrane proteins. Expression in vivo of Flag-olp4 in Sf9 insect cells, using the baculovirus expression system, showed a specific polypeptide of the same size as the in vitro species, with an additional band of 34 kDa, which is most likely a glycosylated form. Fluorescence cytometry and immunohistochemical assays demonstrated the localization of the Flag-olp4 product on the cell surface of the infected host Sf9 cells, with the N-terminus and C-terminus in the proper orientation. Affinity chromatography was used for the partial purification of the olp4 polypeptide from infected Sf9 cells. The identification and purification of this expressed olfactory receptor polypeptide could open the way for further characterization and functional studies of the olfactory receptor superfamily members.
-
EMERGENCE OF ORDER IN SMALL AUTOCATALYTIC SETS MAINTAINED FAR FROM EQUILIBRIUM - APPLICATION OF A PROBABILISTIC RECEPTOR AFFINITY DISTRIBUTION (RAD) MODEL (1994) Berichte Der Bunsen-Gesellschaft-Physical Chemistry Chemical Physics. 98, 9, p. 1166-1169 Abstract
We examined the behavior of auto-catalytic sets of polymers by a computer simulation. Polymers are allowed to interact with each other, whereby each polymer molecule may catalyze the formation and degradation of others. The system is subjected to a set of thermodynamic and kinetic constraints, including a constant influx of free energy, which keeps the system away from chemical equilibrium and thus enables the effect of catalysis. The system is found to continuously change and probe many possible values in the composition space. In this simulation we make use of a Receptor Affinity Distribution (RAD) model to predict the probabilities of interaction and catalysis. Our results indicate that initially random sets of polymers, under the assumptions of the model, might accumulate information (i.e., clustering in the composition space). Sets will occupy a limited region of composition space, and temporarily reproduce themselves or disperse and give rise to other sets.
-
(1994) Human Molecular Genetics. 3, 2, p. 229-235 Abstract
A gene superfamily of olfactory receptors (ORs) has recently been identified in a number of species. These receptors share a seven transmembrane domain structure with many neurotransmitter and hormone receptors, and are likely to underlie the recognition and G-protein-mediated transduction of odorant signals. Previously, OR genes cloned in different species were from random locations in the respective genomes. We report here the cloning of 16 human OR genes, all from chromosome 17 (17p13.3). The intronless coding regions are physically mapped (on 35 cosmids) in one 0.35 Mb long contiguous cluster, with an average intergenic separation of 15 kb. The human OR genes in the cluster belong to four different gene subfamilies, displaying as much sequence variability as any randomly selected group of ORs. This suggests that the cluster identified may be one of several copies of an ancestral OR gene repertoire whose existence may predate the divergence of mammals. The latter may have duplicated in some species to form the present mammalian OR gene repertoire, with several hundred genes. The human chromosome 17 OR gene cluster may thus be a good model for understanding human olfaction, as well as the ontogeny and phylogeny of the OR gene superfamily.
1993
-
-
GLUTATHIONE S-TRANSFERASES IN RAT OLFACTORY EPITHELIUM - PURIFICATION, MOLECULAR-PROPERTIES AND ODORANT BIOTRANSFORMATION (1993) Biochemical Journal. 292, p. 379-384 Abstract
The olfactory epithelium is exposed to a variety of xenobiotic chemicals, including odorants and airborne toxic compounds. Recently, two novel, highly abundant, olfactory-specific biotransformation enzymes have been identified: cytochrome P-450olf1 and olfactory UDP-glucuronosyltransferase (UGT(olf)). The latter is a phase II biotransformation enzyme which catalyses the glucuronidation of alcohols, thiols, amines and carboxylic acids. Such covalent modification, which markedly affects lipid solubility and agonist potency, may be particularly important in the rapid termination of odorant signals. We report here the identification and characterization of a second olfactory phase II biotransformation enzyme, a glutathione S-transferase (GST). The olfactory epithelial cytosol shows the highest GST activity among the extrahepatic tissues examined. Significantly, olfactory epithelium had an activity 4-7 times higher than in other airway tissues, suggesting a role for this enzyme in chemoreception. The olfactory GST has been affinity-purified to homogeneity, and shown by h.p.l.c. and N-terminal amino acid sequencing to constitute mainly the Yb1 and Yb2 subunits, different from most other tissues that have mixtures of more enzyme classes. The identity of the olfactory enzymes was confirmed by PCR cloning and restriction enzyme analysis. Most importantly, the olfactory GSTs were found to catalyse glutathione conjugation of several odorant classes, including many unsaturated aldehydes and ketones, as well as epoxides. Together with UGT(olf), olfactory GST provides the necessary broad coverage of covalent modification capacity, which may be crucial for the acuity of the olfactory process.
-
(1993) Developmental Brain Research. 73, 1, p. 7-16 Abstract
The molecular components of olfactory reception and regulation are expressed in a tissue-specific manner. The functional attributes mediated by some of these proteins have been previously shown to display a well-defined developmental emergence during the last week of rat gestation. To gain a better understanding of the relations between chemosensory function and neuronal development, we studied the ontogeny of 7 olfactory-specific genes by quantitative PCR. Relative levels of expression during rat development were determined for each gene, starting at embryonic day 15 (E15) and ending at postnatal day 35 (P35). In addition, the level of expression of the different genes was quantified in juvenile rats. The onset of expression for olfactory receptors and the olfactory cation channel at embryonic day 19 (E19) coincides with the functional maturation of the sensory neurons. Olfactory G-protein and adenylyl cyclase are expressed earlier (approximately E16) while olfactory biotransformation enzymes appear later (E20-E21), just before birth. The sequence of developmental expression of olfactory receptor genes has possible implications to the establishment of neuronal connectivity in this sensory pathway.
-
(1993) Proceedings of the National Academy of Sciences of the United States of America. 90, 8, p. 3715-3719 Abstract
A generalized phenomenological model is presented for stereospecific recognition between biological receptors and their ligands. We ask what is the distribution of binding constants PSI(K) between an arbitrary ligand and members of a large receptor repertoire, such as immunoglobulins or olfactory receptors. For binding surfaces with B potential subsites and S different types of subsite configurations, the number of successful elementary interactions obeys a binomial distribution. The discrete probability function PSI(K) is then derived with assumptions on alpha, the free energy contribution per elementary interaction. The functional form of PSI(K) may be universal, although the parameter values could vary for different ligand types. An estimate of the parameter values of PSI(K) for iodovanillin, an analog of odorants and immunological haptens, is obtained by equilibrium dialysis experiments with nonimmune antibodies. Based on a simple relationship, predicted by the model, between the size of a receptor repertoire and its average maximal affinity toward an arbitrary ligand, the size of the olfactory receptor repertoire (N(olf)) is calculated as 300-1000, in very good agreement with recent molecular biological studies. A very similar estimate, N(olf) = 500, is independently derived by relating a theoretical distribution of maxima for PSI(K) with published human olfactory threshold variations. The present model also has implications for the question of olfactory coding and for the analysis of specific anosmias, genetic deficits in perceiving particular odorants. More generally, the proposed model provides a better understanding of ligand specificity in biological receptors and could help in understanding their evolution.
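The binomial step described in this abstract can be illustrated with a short Python sketch. It is only a toy under stated assumptions: the per-subsite success probability p = 1/S, the mapping K = exp(alpha * n) from n successful interactions to a binding constant, and the parameter values are all chosen here for illustration and are not given in the abstract.

```python
from math import comb, exp


def binding_constant_distribution(B, S, alpha):
    """Return (K, probability) pairs for n = 0..B successful interactions.

    Assumptions (not from the abstract): each of the B subsites matches
    independently with probability p = 1/S, and n successes give a binding
    constant K = exp(alpha * n), with alpha expressed in units of kT.
    """
    p = 1.0 / S
    dist = []
    for n in range(B + 1):
        # Binomial probability of exactly n successful elementary interactions.
        prob = comb(B, n) * p**n * (1 - p) ** (B - n)
        dist.append((exp(alpha * n), prob))
    return dist


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for K, prob in binding_constant_distribution(B=8, S=4, alpha=1.5):
        print(f"K = {K:12.2f}  probability = {prob:.4f}")
```

Under these assumptions the distribution is discrete, with one possible K value per number of successful interactions, and the probabilities sum to one.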
-
(1993) Chemical Senses. 18, 2, p. 217-225 Abstract
Keywords: MOLECULAR-BASIS; RECOGNITION; NEURONS; GENES; CDNA
-
(1993) Molecular Basis Of Smell And Taste Transduction. p. 131-146 Abstract
The emerging understanding of the molecular basis of olfactory mechanisms allows one to answer some long-standing questions regarding the complex recognition machinery involved. The ability of the olfactory system to detect chemicals at sub-nanomolar concentrations is explained by a plethora of amplification devices, including the coupling of receptors to second messenger generation through GTP-binding proteins. Specificity and selectivity may be understood in terms of a diverse repertoire of olfactory receptors of the seven-transmembrane-domain receptor superfamily, which are probably disposed on olfactory sensory neurons according to a clonal exclusion rule. Signal termination may be related to sets of biotransformation enzymes that process odorant molecules, as well as to receptor desensitization. Many of the underlying molecular components show specific expression in olfactory epithelium, with a well-orchestrated developmental sequence of emergence, possibly related to sensory neuronal function and connectivity requirements. A general model for molecular recognition in biological receptor repertoires allows a prediction of the number of olfactory receptors necessary to achieve efficient detection and sheds light on the analogy between the immune and olfactory systems. The molecular cloning and mapping of a human genomic olfactory receptor cluster on chromosome 17 provides insight into olfactory receptor diversity, polymorphism and evolution. Combined with future genotype-phenotype correlation, with particular reference to specific anosmia, as well as with computer-based molecular modelling, these studies may provide insight into the odorant specificity of olfactory receptors.
1992
-
(1992) Neuroscience Letters. 141, 1, p. 115-118 Abstract
Olfactory thresholds for four odorants were determined in groups of monozygotic and dizygotic human twins. Odorants were presented in an ascending dilution series in odorless solvent, using a three-way forced choice method. For two of the tested odorants, 5-alpha-androst-16-en-3-one and isoamyl acetate, the thresholds showed a strong genetic component. This was demonstrated by respective values of 0.78 and 0.73 for the intraclass correlation difference, and of z = 3.69 and z = 2.71 in a within-pair difference analysis. The results for isoamyl acetate are novel, and suggest that genetic polymorphism in the affinity of odorant receptor proteins contributes to the (nearly normal) threshold distribution for this odorant.
1991
-
-
(1991) DNA and Cell Biology. 10, 7, p. 487-494 Abstract
A nomenclature system for the UDP glucuronosyltransferase superfamily is proposed, based on divergent evolution of the genes. A total of 26 distinct cDNAs in five mammalian species have been sequenced to date. Comparison of the deduced amino acid sequences leads to the definition of two families and a total of three subfamilies. For naming each gene, we propose that the root symbol UGT for human (Ugt for mouse), representing "UDP glucuronosyltransferase," be followed by an Arabic number denoting the family, a letter designating the subfamily, and an Arabic numeral representing the individual gene within the family or subfamily (hyphen before the Arabic number for mouse), e.g., human UGT2B1 and murine Ugt2b-1. Whereas the gene and cDNA should be italicized, the corresponding transcript, protein, and enzyme activity should not be written with lowercase letters or in italics, e.g., human or murine UGT2B1. Recent experimental evidence suggests that several exons of the UGT1 gene might be shared, indicating that distinct UGT1 transcripts and proteins may arise via alternative splicing; the gene and gene product of alternative splicing will be designated with an asterisk, e.g., UGT1*6 and UGT1*6, respectively. When an orthologous gene between species cannot be identified with certainty, as occurs in the UGT2B subfamily, we recommend sequential naming of the genes chronologically as they become characterized. We suggest that the human nomenclature system be used for species other than the mouse. We anticipate that this UGT gene nomenclature system will require updating on a regular basis.
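The naming rule proposed in this abstract (root symbol, family number, subfamily letter, gene number, with a hyphen before the gene number for mouse) can be sketched as a small Python parser. The function and class names are hypothetical, and the requirement that mouse symbols use a lowercase subfamily letter is an assumption inferred from the example Ugt2b-1 rather than stated in the abstract.

```python
import re
from typing import NamedTuple, Optional


class UgtSymbol(NamedTuple):
    root: str       # "UGT" (human) or "Ugt" (mouse)
    family: int     # Arabic number denoting the family
    subfamily: str  # letter designating the subfamily
    gene: int       # Arabic numeral for the individual gene


# Root, family number, subfamily letter, optional hyphen, gene number.
_PATTERN = re.compile(r"^(UGT|Ugt)(\d+)([A-Za-z])(-?)(\d+)$")


def parse_ugt_symbol(symbol: str) -> Optional[UgtSymbol]:
    """Parse a symbol such as 'UGT2B1' or 'Ugt2b-1'; return None if invalid."""
    m = _PATTERN.match(symbol)
    if m is None:
        return None
    root, family, subfamily, hyphen, gene = m.groups()
    mouse = root == "Ugt"
    # Mouse symbols carry a hyphen before the gene number; human ones do not.
    if mouse != (hyphen == "-"):
        return None
    # Assumption: lowercase subfamily letter for mouse, uppercase for human.
    if mouse != subfamily.islower():
        return None
    return UgtSymbol(root, int(family), subfamily, int(gene))


if __name__ == "__main__":
    print(parse_ugt_symbol("UGT2B1"))
    print(parse_ugt_symbol("Ugt2b-1"))
```

The asterisk notation for alternatively spliced UGT1 products (e.g., UGT1*6) is deliberately left out of this sketch, so such symbols are rejected.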
-
-
(1991) Nature. 349, 6312, p. 790-793 Abstract
The onset of olfactory transduction has been extensively studied 1-7, but considerably less is known about the molecular basis of olfactory signal termination 6,8,9. It has been suggested that the highly active cytochrome P450 monooxygenases of olfactory neuroepithelium 10-12 are termination enzymes 5,8,11,12, a notion supported by the identification and molecular cloning of olfactory-specific cytochrome P450s (refs. 13-16). But as reactions catalysed by cytochrome P450 (refs 17, 18) often do not significantly alter volatility, lipophilicity or odour properties 9,11, cytochrome P450 may not be solely responsible for olfactory signal termination. In liver and other tissues, drug hydroxylation by cytochrome P450 is frequently followed by phase II biotransformation, for example by UDP glucuronosyl transferase (UGT), resulting in a major change of solubility and chemical properties 19. We report here the molecular cloning and expression of an olfactory-specific UGT. The olfactory enzyme, but not the one in liver microsomes, shows preference for odorants over standard UGT substrates. Furthermore, glucuronic acid conjugation abolishes the ability of odorants 1,20 to stimulate olfactory adenylyl cyclase. This, together with the known broad spectrum of drug-detoxification enzymes 17,19, supports a role for olfactory UGT in terminating diverse odorant signals.
-
(1991) FEBS Journal. 196, 1, p. 51-58 Abstract
Previously, we described two olfactory-specific cytochromes P-450: rat cytochrome P-450olf1 (IIG1), identified by cDNA cloning, and bovine cytochrome P-450olf2 (IIA), identified by peptide microsequencing of a transmembranal polypeptide (p52). Here we describe the preparation of polyclonal antisera against peptide sequences of these proteins and their use in the immunolocalization of cytochromes P-450olf1 and P-450olf2 in rat olfactory mucosa. Immunoreactivities related to both enzymes are found in the subepithelial Bowman's glands of olfactory mucosa. Practically no immunoreactivity was found in other rat tissues, including liver, lung, kidney and respiratory mucosa. In addition, double-labeling experiments demonstrated that cytochromes P-450olf1 and P-450olf2 are present in the same population of Bowman's glands. The olfactory-specific localization of cytochromes P-450olf1 and P-450olf2 is consistent with a role for these enzymes in the modification or clearance of odorants from the chemosensory tissue.
-
(1991) Sweeteners. Vol. 450. p. 226-236 (ACS Symposium Series). Abstract
While the chemistry of sweet tasting compounds has been extensively studied (1-5), precious little has been known until recently on the cellular mechanisms of sweet taste transduction. Work in the authors' laboratory, as well as in several others, has begun to shed light on this problem. Specifically, evidence has accumulated in the last three years, suggesting that sweet taste receptor proteins (as yet unidentified) activate a membrane transduction cascade. This molecular chain of events appears to be very similar to that which is associated with receptors for hormones and neurotransmitters, as well as visual photoreceptors and olfactory receptors (6-8). The proposed transduction cascade includes (see Figure 1): (1) A transmembrane protein receptor that binds sweet compounds stereospecifically and subsequently undergoes a conformational transition. (2) A membrane amplifier GTP-binding protein (G-protein) of the stimulatory type (Gs). (3) The membrane enzyme adenylyl cyclase, that produces an intracellular second messenger cyclic AMP (cAMP).
1990
1989
-
OLFACTORY FUNCTION FOLLOWING LATE REPAIR OF CHOANAL ATRESIA (1989) Laryngoscope. 99, 11, p. 1165-1166 Abstract
-
OLFACTORY ADENYLYL CYCLASE - IDENTIFICATION AND PURIFICATION OF A NOVEL ENZYME FORM (1989) Journal of Biological Chemistry. 264, 31, p. 18803-18807 Abstract
-
-
SWEET TASTANTS STIMULATE ADENYLATE-CYCLASE COUPLED TO GTP-BINDING PROTEIN IN RAT TONGUE MEMBRANES (1989) Biochemical Journal. 260, 1, p. 121-126 Abstract
-
OLFACTORY-SPECIFIC CYTOCHROME-P-450 - CDNA CLONING OF A NOVEL NEUROEPITHELIAL ENZYME POSSIBLY INVOLVED IN CHEMORECEPTION (1989) Journal of Biological Chemistry. 264, 12, p. 6780-6785 Abstract
-
OLFACTION IN PROLONGED ADMINISTRATION OF PYRIDOSTYGMINE (1989) Journal of Clinical Pharmacology. 29, 4, p. 370-372 Abstract
1988
1987
1986
-
-
ISOLATED FROG OLFACTORY CILIA - A PREPARATION OF DENDRITIC MEMBRANES FROM CHEMOSENSORY NEURONS (1986) Journal of Neuroscience. 6, 8, p. 2146-2154 Abstract
-
(1986) Proceedings of the National Academy of Sciences of the United States of America. 83, 13, p. 4947-4951 Abstract
-
-
CHANGES IN OLFACTORY ACUITY INDUCED BY TOTAL INFERIOR TURBINECTOMY (1986) Archives Of Otolaryngology-Head & Neck Surgery. 112, 2, p. 195-197 Abstract
-
POLYPEPTIDE GP95 - A UNIQUE GLYCOPROTEIN OF OLFACTORY CILIA WITH TRANSMEMBRANE RECEPTOR PROPERTIES (1986) Journal of Biological Chemistry. 261, 3, p. 1299-1305 Abstract
-
1985
1984
-
-
(1984) Proceedings Of The National Academy Of Sciences Of The United States Of America-Biological Sciences. 81, 6, p. 1859-1863 Abstract
1980
1979
1978
-
-
-
HAPTEN-LINKED CONFORMATIONAL EQUILIBRIA IN IMMUNOGLOBULINS XRPC-24 AND J-539 OBSERVED BY CHEMICAL RELAXATION (1978) Biophysical Journal. 24, 1, p. 161-174 Abstract
-
ALLOSTERY IN AN IMMUNOGLOBULIN LIGHT-CHAIN DIMER - CHEMICAL RELAXATION STUDY (1978) Biophysical Journal. 24, 1, p. 247-249 Abstract
Keywords: Biophysics
1977
1976
-
(1976) Proceedings of the National Academy of Sciences of the United States of America. 73, 10, p. 3549-3553 Abstract
-
OXIDATIVE-TITRATIONS OF RHUS-VERNICIFERA LACCASE AND ITS SPECIFIC INTERACTION WITH HYDROGEN-PEROXIDE (1976) Biochemical and Biophysical Research Communications. 73, 2, p. 494-500 Abstract