BIOINFORMATICS<-->STRUCTURE
Jerusalem, Israel, November 17-21, 1996

Abstract


Three Dimensional Database of biomacromolecular structures (3DBase): the logic of an object/relational design

Enrique E. Abola (1), Joel Sussman (1,2) and Jaime Prilusky (3)

(1) Protein Data Bank, Biology Department, Brookhaven National Laboratory, Upton, NY 11973 USA
(2) Department of Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel
(3) Bioinformatics Unit, The Weizmann Institute of Science, Rehovot 76100, Israel

abola1@bnl.gov

The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of proteins, nucleic acids, and other biological macromolecules. PDB has a 25-year history of service to a global community of researchers, educators, and students in a variety of scientific disciplines. This community shares a common interest in accessing information that can relate the biological functions of macromolecules to their three-dimensional structures. We now report the construction of a new relational database, 3DBase, that provides access to knowledge and information on macromolecules using a high-level query language.

The complexity of PDB entries and their use by a multi-disciplinary community required the construction of a database that represents structural, biological, chemical, and bibliographic information. In addition to all coordinate entries found in PDB, the database contains semantic links to entries found in other databases. For example, 3DBase represents the relationships between sequences found in PDB and those in SWISSPROT, GSDB, or GenBank.
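As an illustration only, a cross-database link of this kind could be held in a relational table and queried as sketched below. The sketch uses Python's built-in sqlite3 module as a stand-in for SYBASE. The table names, column names, and identifiers (pdb_chain, sequence_link, the entry '0XXX', the accessions 'X00000' and 'Y00000') are hypothetical placeholders and are not taken from the 3DBase schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables:
    one for PDB chains and one linking chains to external sequence entries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE pdb_chain (
            chain_pk INTEGER PRIMARY KEY,
            pdb_id   TEXT NOT NULL,
            chain_id TEXT NOT NULL
        );
        CREATE TABLE sequence_link (
            chain_pk    INTEGER NOT NULL REFERENCES pdb_chain(chain_pk),
            external_db TEXT NOT NULL,   -- e.g. SWISSPROT, GSDB, GenBank
            accession   TEXT NOT NULL
        );
        """
    )
    # Placeholder rows; these are not real PDB identifiers or accessions.
    conn.execute("INSERT INTO pdb_chain VALUES (1, '0XXX', 'A')")
    conn.executemany(
        "INSERT INTO sequence_link VALUES (?, ?, ?)",
        [(1, "SWISSPROT", "X00000"), (1, "GenBank", "Y00000")],
    )
    return conn


def linked_sequences(conn, pdb_id):
    """Return (chain_id, external_db, accession) rows for one PDB entry."""
    cursor = conn.execute(
        """
        SELECT c.chain_id, l.external_db, l.accession
        FROM pdb_chain AS c
        JOIN sequence_link AS l ON l.chain_pk = c.chain_pk
        WHERE c.pdb_id = ?
        ORDER BY l.external_db
        """,
        (pdb_id,),
    )
    return cursor.fetchall()
```

Keeping the links in their own table, rather than as columns on the chain record, lets one chain point to any number of external databases without a schema change.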

3DBase is constructed with the SYBASE DBMS, the Object Protocol Model (OPM), and the OPM data management tools developed by Victor Markowitz's group at Lawrence Berkeley National Laboratory. SYBASE provides a powerful and robust environment for data management; the OPM tools allow rapid development of SYBASE databases; and OPM's object-oriented view provides a scientifically intuitive representation of data. Two primary objects in 3DBase are oExperiment and oMacroMolecule. These objects describe the experiment and the biologically active molecule, respectively, extending the current view found in PDB entries.
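To make the object-oriented view concrete, the following is a minimal sketch of how two such objects might relate to one another. Only the object names oExperiment and oMacroMolecule come from the abstract; every attribute (method, name, molecule_type, molecules) and the helper molecule_names are assumptions introduced for illustration, and the sketch uses plain Python dataclasses rather than OPM itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OMacroMolecule:
    """Hypothetical stand-in for the oMacroMolecule object."""
    name: str            # assumed attribute
    molecule_type: str   # assumed attribute, e.g. "protein" or "nucleic acid"


@dataclass
class OExperiment:
    """Hypothetical stand-in for the oExperiment object."""
    method: str          # assumed attribute
    molecules: List[OMacroMolecule] = field(default_factory=list)

    def molecule_names(self) -> List[str]:
        """List the names of the molecules studied in this experiment."""
        return [m.name for m in self.molecules]
```

In an object/relational design, a view like this would be mapped onto underlying relational tables by the data management tools, so that users work with the objects while the DBMS stores rows.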

Database interoperability is addressed through schema sharing and support for a variety of data interchange formats. 3DBase uses the CitDB schema developed by GDB to store literature references. In the near future, the CitDB at PDB will be merged with GDB's data, making available a single CitDB database containing all the references of interest to the genomic community. In addition, 3DBase has adopted GDB's approach to supporting community annotation of individual objects in the database.

The PDB is supported by a combination of Federal Government Agency funds and user fees. Support is provided by the U.S. National Science Foundation, the U.S. Public Health Service, National Institutes of Health, National Center for Research Resources, National Institute of General Medical Sciences, National Library of Medicine, and the U.S. Department of Energy under contract DE-AC02-76CH00016.

