Saturday, September 10, 2011

Basic Local Alignment Search Tool (BLAST)


The comparison of nucleotide or protein sequences from the same or different organisms is a very powerful tool in molecular biology. By finding similarities between sequences, scientists can infer the function of newly sequenced genes, predict new members of gene families, and explore evolutionary relationships. Now that whole genomes are being sequenced, sequence similarity searching can be used to predict the location and function of protein-coding and transcription-regulation regions in genomic DNA.

BLAST is the tool most frequently used for calculating sequence similarity. It comes in several variants for use with different query sequences against different databases.
BLAST uses heuristics (experience-based shortcuts rather than an exhaustive comparison) to align a query sequence with all sequences in a database. The objective is to find high-scoring ungapped segments among related sequences. The existence of such segments above a given threshold indicates pairwise similarity beyond random chance, which helps to discriminate related sequences from unrelated sequences in a database.
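
To make the idea of a high-scoring ungapped segment concrete, here is a minimal Python sketch. It assumes a toy scoring scheme (+5 for a match, -4 for a mismatch) and two equal-length sequences compared position by position; the function name `best_ungapped_segment` and the example sequences are made up for illustration. The real program works from substitution matrices and short word hits rather than scanning whole sequences like this.

```python
def best_ungapped_segment(query, subject, match=5, mismatch=-4):
    """Return (score, start, end) of the highest-scoring ungapped segment.

    The sequences are compared position by position (no gaps), and the best
    contiguous run is found with a maximum-subarray scan.
    The match/mismatch values are illustrative assumptions.
    """
    if len(query) != len(subject):
        raise ValueError("sequences must have equal length")
    best_score, best_start, best_end = 0, 0, 0
    run_score, run_start = 0, 0
    for i, (q, s) in enumerate(zip(query, subject)):
        if run_score <= 0:
            # A non-positive running score cannot help; start a new segment here.
            run_score, run_start = 0, i
        run_score += match if q == s else mismatch
        if run_score > best_score:
            best_score, best_start, best_end = run_score, run_start, i + 1
    return best_score, best_start, best_end


score, start, end = best_ungapped_segment("ACGTTGCA", "TCGTTGAT")
print(score, "ACGTTGCA"[start:end])
```

A segment would then be reported only if its score is above the chosen threshold.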

Variants of BLAST:
Blastx: Search a protein database using a translated nucleotide query.
Tblastn: Search a translated nucleotide database using a protein query.
Tblastx: Search a translated nucleotide database using a translated nucleotide query.
Protein BLAST (blastp): Search a protein database using a protein query.
Nucleotide BLAST (blastn): Search a nucleotide database using a nucleotide query.
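
The choice between these variants depends only on the type of the query and the type of the database, so it can be sketched as a small lookup. The function name `choose_blast_program` and the `translate_both` flag are hypothetical names used for illustration; only the program names themselves (blastn, blastp, blastx, tblastn, tblastx) are real.

```python
def choose_blast_program(query_type, database_type, translate_both=False):
    """Pick a BLAST variant from the query and database sequence types.

    query_type and database_type are "nucleotide" or "protein".
    translate_both only matters when both are nucleotide: it selects tblastx
    (both sides translated) instead of plain blastn.
    """
    if query_type == "protein" and database_type == "protein":
        return "blastp"
    if query_type == "nucleotide" and database_type == "protein":
        return "blastx"   # the nucleotide query is translated
    if query_type == "protein" and database_type == "nucleotide":
        return "tblastn"  # the nucleotide database is translated
    if query_type == "nucleotide" and database_type == "nucleotide":
        return "tblastx" if translate_both else "blastn"
    raise ValueError(f"unknown sequence types: {query_type!r}, {database_type!r}")


print(choose_blast_program("nucleotide", "protein"))
```
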
Explanation of the BLAST algorithm with an example:
1. Take a query protein sequence: VRDMKLTYS
2. Break the query into overlapping three-residue words (VRD, RDM, DMK, MKL, KLT, LTY, TYS); BLAST uses these words to search the database.
3. Suppose one of these three-residue words, DMK, finds matches in the database.
   Query              ............DMK     DMK       DMK      DMK............
   Database           ............DMK     DTK       DHK      DML............
4. Calculate the sum of the match scores for each word pair using the BLOSUM62 matrix.

    Query              ............DMK     DMK       DMK      DMK............

    Database           ............DMK     DTK       DHK      DML............

    Sum of scores                   16      10         9        9

5. Find the database sequence corresponding to the highest-scoring word match and extend the alignment in both directions.


    Query              ............VR     DMK     LTYS............

    Database           ............VK     DMK     LTRS............


6. Determine the high-scoring segment whose score is above a threshold (minimum required) score.
    Query              ............V  R    D M K    L  T  Y  S............

    Database           ............V  K    D M K    L  T  R  S............

    Score                          4  2     16      4  5 -2  4

Total score: 33
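
The six steps can be sketched in Python as follows. This is a simplified illustration, not the actual BLAST implementation: it uses the query as it appears in the alignment of steps 5 and 6 (VRDMKLTYS), hard-codes only the published BLOSUM62 entries for the residue pairs that occur in this example, and stops the extension with a simple drop-off rule whose value (`x_drop=5`) is an arbitrary assumption. All function names are hypothetical.

```python
# Only the BLOSUM62 entries needed for this example (the matrix is symmetric).
BLOSUM62_SUBSET = {
    ("D", "D"): 6, ("M", "M"): 5, ("K", "K"): 5,
    ("M", "T"): -1, ("M", "H"): -2, ("K", "L"): -2,
    ("V", "V"): 4, ("R", "K"): 2, ("L", "L"): 4,
    ("T", "T"): 5, ("Y", "R"): -2, ("S", "S"): 4,
}


def pair_score(a, b):
    """Look up a residue pair in either order; raises KeyError if not in the subset."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def words(seq, size=3):
    """Step 2: all overlapping words of the given size."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def ungapped_score(a, b):
    """Step 4: sum of pair scores for two equal-length, ungapped strings."""
    return sum(pair_score(x, y) for x, y in zip(a, b))


def extend_seed(query, subject, q_pos, s_pos, size=3, x_drop=5):
    """Steps 5-6: extend a word hit in both directions without gaps.

    Extension in a direction stops when the running score falls more than
    x_drop below the best score seen so far (a simplified drop-off rule).
    Returns (best_score, query_start, query_end).
    """
    best = run = ungapped_score(query[q_pos:q_pos + size],
                                subject[s_pos:s_pos + size])
    q_start, q_end = q_pos, q_pos + size

    # Extend to the right of the seed.
    qi, si = q_pos + size, s_pos + size
    while qi < len(query) and si < len(subject):
        run += pair_score(query[qi], subject[si])
        if run > best:
            best, q_end = run, qi + 1
        elif best - run > x_drop:
            break
        qi, si = qi + 1, si + 1

    # Extend to the left, starting from the best score reached so far.
    run = best
    qi, si = q_pos - 1, s_pos - 1
    while qi >= 0 and si >= 0:
        run += pair_score(query[qi], subject[si])
        if run > best:
            best, q_start = run, qi
        elif best - run > x_drop:
            break
        qi, si = qi - 1, si - 1

    return best, q_start, q_end


query, subject = "VRDMKLTYS", "VKDMKLTRS"
print(words(query))
for hit in ["DMK", "DTK", "DHK", "DML"]:
    print("DMK", hit, ungapped_score("DMK", hit))
print(extend_seed(query, subject, q_pos=2, s_pos=2))
```

The final segment would be kept only if its score is above the chosen threshold.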



Web Address for blast tool:
