Local Alignment 2 - Similarly Sized Proteins
While very short sequences can be used for local alignments, particularly when
searching databases, local comparisons of proteins having similar lengths can produce
results that must be considered closely. Here we will once again use the TPP enzymes we
have been working with.
- Go to the Main Protein Tools page, select ILV1_TOBAC and ILVB_ARATH, and open the LALIGN
tool.
- On the LALIGN setup page, choose the Blosum62 scoring matrix and set the gap penalties
to -12/-2. Click "Submit".
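To make the scoring settings above more concrete, here is a minimal Python sketch of Smith-Waterman local scoring with affine gap penalties. It is not LALIGN's implementation: the simple match/mismatch scores are a stand-in for the Blosum62 matrix, the sketch assumes a gap of length k costs open + k x extend, and the function name and toy sequences are hypothetical.

```python
def local_align_score(a, b, match=5, mismatch=-4, gap_open=-12, gap_extend=-2):
    """Return (best local score, (end_in_a, end_in_b)) using affine gaps.

    Assumption: a gap of length k costs gap_open + k * gap_extend.
    match/mismatch are a toy substitute for a real matrix such as Blosum62.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of a local alignment ending at (i, j)
    # E: best score ending with a gap in sequence a (extra residue from b)
    # F: best score ending with a gap in sequence b (extra residue from a)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best, best_end = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] + gap_open + gap_extend,
                          E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open + gap_extend,
                          F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The floor of 0 is what makes the alignment local:
            # a poorly scoring prefix is simply dropped.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_end = H[i][j], (i, j)
    return best, best_end


# Toy example: the second sequence has one extra residue (W) in the middle.
print(local_align_score("ACDEFGHIKL", "ACDEFWGHIKL"))
```

With these toy settings, bridging the single inserted residue costs 14 points, so the gapped alignment is preferred only because the matches on both sides of the gap outweigh that penalty.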
The first local alignment returned is *almost* a global alignment, missing only a short
N-terminal portion of each sequence. This is expected because we know a priori that these
two sequences are very similar and the proteins are definitely homologous.
The remaining local alignments, however, point up a few issues to keep in mind when
interpreting local alignments.
- First consider the lengths of the local alignments. In the 6th alignment (similarity
score of 34 and 75.0% identity), even though the sequence identity is very high, it is
derived from only an 8-residue overlap. Small stretches of sequence don't usually define
significant structural features in proteins nor do they provide much evidence for
homology.
Local alignment number 8 exhibits only 33% identity and has a slightly lower similarity
score, but the alignment contains a much longer overlap (27 aa). So, even though the
identity and similarity scores might suggest differently, local alignment 8 may actually
be a better alignment than #6.
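The comparison between alignments 6 and 8 can be checked with a little arithmetic. The sketch below converts the reported percent identity and overlap length into an approximate count of identical residues. The function name is hypothetical, and because the 33% figure is rounded, the count for alignment 8 is approximate.

```python
def identical_residues(percent_identity, overlap_length):
    """Approximate number of identical positions in an alignment."""
    return round(percent_identity / 100 * overlap_length)


# (percent identity, overlap length) as reported in the text above
alignments = {6: (75.0, 8), 8: (33.0, 27)}

for number, (pid, overlap) in alignments.items():
    count = identical_residues(pid, overlap)
    print(f"alignment {number}: about {count} identical residues over {overlap} aa")
```

Alignment 6 rests on about 6 identical residues, while alignment 8 has about 9 spread over a much longer stretch, which is why percent identity alone can be misleading for short overlaps.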
- Second, consider the relative locations of the matches. Again looking at the 6th local
alignment, the end of ILV1 (residues 629 - 636) is aligned with part of the first half of
ILVB (residues 247 - 254), which doesn't seem likely for members of the same protein
subfamily.
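One simple way to quantify "relative location" is the diagonal offset: the difference between the start positions of the match in the two sequences. The sketch below is illustrative only; the function names and the `max_shift` threshold of 50 residues are assumptions, not values used by LALIGN.

```python
def diagonal_offset(start_a, start_b):
    """Difference between the match start positions in the two sequences."""
    return start_a - start_b


def is_off_diagonal(start_a, start_b, max_shift=50):
    """Flag a match whose positions differ by more than max_shift residues.

    max_shift is an arbitrary, hypothetical threshold chosen for illustration.
    """
    return abs(diagonal_offset(start_a, start_b)) > max_shift


# 6th local alignment: ILV1 residues 629-636 against ILVB residues 247-254
print(diagonal_offset(629, 247))   # 382
print(is_off_diagonal(629, 247))   # True
```

For two proteins of similar length that are homologous along their whole length, most matching segments would be expected to have small offsets, so an offset of 382 residues stands out.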
This type of alignment may, however, suggest the presence of a conserved structural motif
or domain, even though it argues against overall protein homology. To resolve this
question, it would be prudent to do a global alignment of the identified sequence with the
query sequence or, if several more likely homologs have been found, a multiple sequence
alignment of those sequences with the suspicious hit.
In this case, the alignment is due to an incomplete sequence repeat in the proteins, which
the local alignment detects (cf. local alignment 5). So, even though we gain some
information about locally similar regions using local alignments, we can also lose
information about overall relationships. Both global and local alignments are important
in defining protein relationships.
In the next exercises, we will begin to look at database searches using local
similarity.