The phrap module contains three pieces of related software: phrap, cross_match, and swat.
phrap is a program for assembling shotgun DNA sequence data. Among other features, it uses the entire read rather than only the trimmed high-quality part; combines user-supplied and internally computed data quality information to improve assembly accuracy in the presence of repeats; constructs the contig sequence as a mosaic of the highest-quality read segments rather than as a consensus; provides extensive assembly information to assist in troubleshooting assembly problems; and handles large datasets.
cross_match is a general-purpose utility for comparing any two DNA sequence sets using a 'banded' version of swat. For example, it can be used to compare a set of reads to a set of vector sequences and produce vector-masked versions of the reads; a set of cDNA sequences to a set of cosmids; contig sequences found by two alternative assembly procedures (for example, phrap and xbap) to each other; or phrap contigs to the final edited cosmid sequence. It is slower but more sensitive than BLAST.
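To make "vector-masked" concrete, here is a minimal Python sketch of the masking step alone. It is not cross_match code: the function name mask_intervals, the 1-based inclusive coordinates, and the use of "X" as the mask character are assumptions chosen for illustration, and the match intervals are supplied by hand rather than computed by an alignment.

```python
def mask_intervals(read, intervals, mask_char="X"):
    """Return `read` with each 1-based, inclusive (start, end) interval masked.

    Hypothetical helper for illustration; the intervals stand in for
    regions where a read matched a vector sequence.
    """
    bases = list(read)
    for start, end in intervals:
        # Clamp to the read so an out-of-range interval cannot raise.
        lo = max(start, 1)
        hi = min(end, len(bases))
        for i in range(lo - 1, hi):
            bases[i] = mask_char
    return "".join(bases)


# Mask the first four bases of a toy read.
print(mask_intervals("ACGTACGTAC", [(1, 4)]))
```

Masking rather than trimming keeps the read length and coordinates unchanged, so downstream positions still refer to the original read.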
swat is a program for searching one or more DNA or protein query sequences, or a query profile, against a sequence database, using an efficient implementation of the Smith-Waterman or Needleman-Wunsch algorithms with linear (affine) gap penalties. For each match, swat computes an empirical measure of statistical significance derived from the observed score distribution.
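As a rough illustration of Smith-Waterman local alignment with affine gap penalties, the following Python sketch computes only the best local alignment score, using the standard three-matrix recurrence. It is a textbook formulation, not swat's implementation; the function name and the scoring values (match, mismatch, gap_open, gap_extend) are assumptions picked for the example and are not swat defaults.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of `a` vs `b` with affine gap penalties.

    A gap of length k scores gap_open + (k - 1) * gap_extend.
    All scoring values here are illustrative assumptions.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j), floored at 0 (local).
    # E: best score ending at (i, j) with a gap that consumes b[j-1].
    # F: best score ending at (i, j) with a gap that consumes a[i-1].
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap or extend an existing one.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# Eight matching bases separated by a two-base gap in the first sequence.
print(smith_waterman_affine("ACGTACGT", "ACGTTTACGT"))
```

The sketch fills the full n-by-m matrix. A banded variant, such as the one cross_match is described as using, would restrict the computation to cells near a chosen diagonal to save time.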