Local Alignment 2 - Similarly Sized Proteins


While very short sequences can be used for local alignments, particularly when searching databases, local comparisons of proteins having similar lengths can produce results that must be interpreted carefully. Here we will once again use the TPP enzymes we have been working with.

  1. Go to the Main Protein Tools page, select ILV1_TOBAC and ILVB_ARATH, and open the LALIGN tool.

  2. On the LALIGN setup page, choose the Blosum62 scoring matrix and set the gap penalties to -12/-2. Click "Submit".
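
To make the -12/-2 gap setting concrete, here is a minimal sketch of how an affine gap penalty scores a single gap. The function name `gap_cost` and the `open_includes_first` switch are hypothetical, introduced only for illustration. Programs differ on whether the first value is the full cost of the first gapped residue or is charged in addition to the extension value, so both conventions are shown; which one this LALIGN server uses is an assumption to verify against its documentation.

```python
def gap_cost(length, gap_open=-12, gap_extend=-2, open_includes_first=True):
    """Score contribution of one gap of `length` residues (affine penalty).

    Illustrative only: the convention used by a given program is an assumption.
    """
    if length <= 0:
        return 0
    if open_includes_first:
        # Convention A: the first gapped residue costs gap_open,
        # and each additional residue costs gap_extend.
        return gap_open + (length - 1) * gap_extend
    # Convention B: gap_open is charged once, and gap_extend is
    # charged for every gapped residue, including the first.
    return gap_open + length * gap_extend


for n in (1, 3, 10):
    print(n, gap_cost(n), gap_cost(n, open_includes_first=False))
```

Under either convention, opening a new gap is much more expensive than lengthening an existing one, which favors a few long gaps over many short ones.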

The first local alignment returned is *almost* a global alignment, missing only a few residues at the N-terminus of each sequence. This is expected because we know a priori that these two sequences are very similar and the proteins are definitely homologous.

The remaining local alignments, however, point up a few issues to keep in mind when interpreting local alignments.

  1. First, consider the lengths of the local alignments. In the 6th alignment (similarity score of 34 and 75.0% identity), even though the sequence identity is very high, it is derived from only an 8-residue overlap. Small stretches of sequence don't usually define significant structural features in proteins, nor do they provide much evidence for homology.

    Local alignment number 8 exhibits only 33% identity and has a slightly lower similarity score, but the alignment contains a much longer overlap (27 aa). So, even though the identity and similarity scores might suggest differently, local alignment 8 may actually be a better alignment than #6.
  2. Second, consider the relative locations of the matches. Again looking at the 6th local alignment, the end of ILV1 (residues 629 - 636) is aligned with part of the first half of ILVB (residues 247 - 254), which doesn't seem likely for members of the same protein subfamily.

    This type of alignment may, however, suggest the presence of a conserved structural motif or domain, even though it argues against overall protein homology. To resolve this possibility, it would be prudent to do a global alignment of the identified sequence with the query sequence, or, if several more likely homologs have been found, a multiple sequence alignment of these sequences with the suspicious hit would be in order.

    This particular alignment is due to the presence of an incomplete sequence repeat in the proteins, which is found by local alignment (cf. local alignment 5). So, even though we gain some information about locally similar regions using local alignments, we can also lose information about overall relationships. Obviously, both global and local alignments are important in defining protein relationships.
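
The two checks above (overlap length and relative location) can be sketched with a few lines of arithmetic. This is only an illustration using the numbers quoted in this exercise; the helper names `identical_residues` and `diagonal_offset` are hypothetical and are not part of LALIGN.

```python
def identical_residues(percent_identity, overlap):
    """Approximate number of identical positions in an alignment."""
    return round(percent_identity / 100 * overlap)


def diagonal_offset(start_a, start_b):
    """Difference between the start positions of the aligned segments.

    For two proteins of similar length, a segment of the overall
    (near-global) alignment should have an offset close to zero.
    """
    return start_a - start_b


# Alignment 6: 75.0% identity over an 8-residue overlap
print(identical_residues(75.0, 8))    # 6
# Alignment 8: 33% identity over a 27-residue overlap
print(identical_residues(33.0, 27))   # 9
# Alignment 6: ILV1 residues 629-636 against ILVB residues 247-254
print(diagonal_offset(629, 247))      # 382
```

Seen this way, alignment 8 rests on more matching positions than alignment 6 despite its lower percent identity, and the offset of 382 residues shows how far alignment 6 lies from the main diagonal of the comparison.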

In the next exercises, we will begin to look at database searches using local similarity.




Developed and Maintained by Mark S. Whitsitt
Last Updated: Saturday, June 06, 1998 12:29 PM