BIOINFORMATICS<-->STRUCTURE
Jerusalem, Israel, November 17-21, 1996

Abstract


A gap penalty scheme for increased selectivity in profile searching

Terry Farrah

ZymoGenetics, 1201 Eastlake Av East, Seattle, Washington, USA

farrah@zgi.com


Profiles - position-specific scoring matrices - were introduced nine years ago by Gribskov as a method for searching databases using multiple related sequences at once. Gribskov's conception of profiles allows gaps, and gap penalties are also position specific. Although search techniques allowing gaps lack rigor due to the absence of mathematical theory regarding gaps, they are valuable because they can be more sensitive than non-gapping techniques. This sensitivity is bought at the cost of increased output noise, however.

I demonstrate that some of the noise in profile search results is due to the uniform application of a single set of gap initiation/extension penalties at each position in the profile. I argue that at any given position, a sequence insertion has a different significance from a sequence deletion. Using an extended profile syntax, I introduce a system of gap penalties that distinguishes between sequence insertions and deletions. The resulting profiles show increased performance on a small set of test protein families.
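The distinction between insertions and deletions can be illustrated with a minimal sketch. It is not the extended profile syntax described above, and it makes several simplifying assumptions: the field names `scores`, `ins` and `dele` are hypothetical, each gap residue is charged a single linear penalty rather than separate initiation/extension penalties, the alignment is global, and the toy scores are invented for illustration only.

```python
NEG_INF = float("-inf")
MISMATCH = -1  # assumed score for a residue not listed at a profile position


def make_position(residue, ins, dele, match=5):
    """Build one toy profile position (hypothetical layout, not a real profile format)."""
    return {"scores": {residue: match}, "ins": ins, "dele": dele}


def profile_score(profile, seq):
    """Global alignment score of seq against profile.

    'ins' is charged for each sequence residue inserted after a position;
    'dele' is charged when the sequence skips (deletes) that position.
    """
    n, m = len(profile), len(seq)
    # dp[i][j]: best score for the first i profile positions vs. first j residues
    dp = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = NEG_INF
            if i > 0 and j > 0:  # align residue j with profile position i
                s = profile[i - 1]["scores"].get(seq[j - 1], MISMATCH)
                best = max(best, dp[i - 1][j - 1] + s)
            if i > 0:  # sequence deletion: profile position i is skipped
                best = max(best, dp[i - 1][j] - profile[i - 1]["dele"])
            if j > 0:  # sequence insertion after position i (position 1 if i == 0)
                best = max(best, dp[i][j - 1] - profile[max(i - 1, 0)]["ins"])
            dp[i][j] = best
    return dp[n][m]


# Uniform penalties: insertions and deletions cost the same everywhere.
uniform = [make_position(r, ins=4, dele=4) for r in "ACD"]

# Split penalties: the middle position tolerates deletion but not insertion.
split = [
    make_position("A", ins=4, dele=4),
    make_position("C", ins=4, dele=1),
    make_position("D", ins=4, dele=4),
]

for name, prof in (("uniform", uniform), ("split", split)):
    print(name, profile_score(prof, "AD"), profile_score(prof, "ACXD"))
```

In this toy case, a sequence lacking the middle position ("AD") scores higher against the split profile than against the uniform one, while a sequence with an extra residue ("ACXD") is penalized equally by both. A single shared penalty cannot express that asymmetry: lowering it to admit the deletion would also admit the insertion.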

