EZRetrieve Banner



  1. Current tools ...
  2. Details about sources of information ...
  3. A detailed explanation of the TFSearch score and threshold ...
  4. IUPAC codes for nucleotides ...
Current tools include:
Sources include (use human as an example):
TFSearch Scoring Scheme:
TFSEARCH uses the following scheme to calculate the score. The example below illustrates how the calculation works and how changing the threshold value affects the results of TFSEARCH.
scoring scheme: score = 100.0 * ('weighted sum' - min) / (max - min)
Here "max" and "min" are the sums of the maximum and minimum possible values, respectively, at each position of the weighted matrix. The "weighted sum" is obtained by comparing the sequence being evaluated to the weighted matrix: for each position, take the matrix value of the nucleotide found there, then add these values up.
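The formula can be sketched in Python as follows. This is a minimal illustration of the scoring scheme, not TFSEARCH's own source code: the function name `matrix_score`, the one-dict-per-position layout, and the two-position toy matrix are assumptions made for the example.

```python
def matrix_score(matrix, site):
    """Score `site` against `matrix` using
    score = 100.0 * (weighted_sum - min) / (max - min).

    `matrix` is a list of dicts, one per position, mapping each
    nucleotide (A, C, G, T) to its value at that position.
    """
    if len(site) != len(matrix):
        raise ValueError("site and matrix must have the same length")
    max_sum = sum(max(row.values()) for row in matrix)
    min_sum = sum(min(row.values()) for row in matrix)
    weighted_sum = sum(row[base] for row, base in zip(matrix, site))
    return 100.0 * (weighted_sum - min_sum) / (max_sum - min_sum)


# Hypothetical two-position matrix, for illustration only.
toy_matrix = [
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 2, "G": 8, "T": 0},
]
print(matrix_score(toy_matrix, "AG"))  # best base at every position: 100.0
print(matrix_score(toy_matrix, "CA"))  # worst base at every position: 0.0
```

By construction the score always falls between 0.0 (the lowest value at every position) and 100.0 (the highest value at every position).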
Example:
Below is part of a TFSEARCH output file. Let's see how the score 86.1 is calculated.
... ...

51 CTGGTCGTCC TCTGTGAGGG GGCCCCAGTC CCCCTGCAGG CAGCAGGACT entry        score
                                 <--------                M00083 MZF1   86.1
... ...
The matrix for M00083 (ID V$MZF1_01):
 A     C     G     T
 8     3     6     3     N
 4     2    11     3     G
 3     3     5     9     N
 0     0    20     0     G
 1     0    18     1     G
 0     0    19     1     G
 0     0    20     0     G
18     1     1     0     A
max:

max = 8 + 11 + 9 + 20 + 18 + 19 + 20 + 18 = 123

(the sum of the maximum possible value at each position of the weighted matrix)
min:

min = 3 + 2 + 3 + 0 + 0 + 0 + 0 + 0 = 8

(the sum of the minimum possible value at each position of the weighted matrix)
weighted sum: (the arrow in the output indicates that the opposite strand should be used in this case)

'weighted sum' = 3 + 4 + 5 + 20 + 18 + 19 + 20 + 18 = 107
 A     C     G     T           opposite strand
 8     3     6     3     N     C<--G
 4     2    11     3     G     A<--T
 3     3     5     9     N     G<--C
 0     0    20     0     G     G<--C
 1     0    18     1     G     G<--C
 0     0    19     1     G     G<--C
 0     0    20     0     G     G<--C
18     1     1     0     A     A<--T
score: score = 100.0 * ('weighted sum' - min) / (max - min) = 100.0 * (107 - 8)/(123 - 8) = 86.1
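The worked example can be reproduced with a short Python sketch. The matrix values are copied from the table above. The string `forward_site` is our reading of the eight bases under the arrow in the output excerpt, and the names `MZF1`, `BASES`, and `reverse_complement` are illustrative rather than part of TFSEARCH.

```python
# Matrix for M00083 (V$MZF1_01); column order is A, C, G, T.
BASES = "ACGT"
MZF1 = [
    (8, 3, 6, 3),
    (4, 2, 11, 3),
    (3, 3, 5, 9),
    (0, 0, 20, 0),
    (1, 0, 18, 1),
    (0, 0, 19, 1),
    (0, 0, 20, 0),
    (18, 1, 1, 0),
]


def reverse_complement(seq):
    """Return the opposite strand of `seq`, read in the reverse direction."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Bases under the arrow on the strand shown in the output (assumed reading).
forward_site = "TCCCCCTG"
site = reverse_complement(forward_site)  # opposite strand: CAGGGGGA

max_sum = sum(max(row) for row in MZF1)
min_sum = sum(min(row) for row in MZF1)
weighted_sum = sum(row[BASES.index(b)] for row, b in zip(MZF1, site))
score = 100.0 * (weighted_sum - min_sum) / (max_sum - min_sum)

print(site, max_sum, min_sum, weighted_sum, round(score, 1))
# CAGGGGGA 123 8 107 86.1
```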
The default threshold score for TFSEARCH is 85.0. As the example above shows, raising the threshold restricts the results to binding sites that carry a high-scoring nucleotide at each position of the matrix. In this case, if the threshold is set to 100.0, only sites that exactly match "AGTGGGGA" (or its counterpart on the opposite strand) will be found. This is the sequence formed by the maximum value in each row of the matrix.
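The effect of the threshold can be sketched with a simple sliding-window scan in Python. This is an illustration under stated assumptions, not TFSEARCH's actual implementation: the helper names (`score`, `scan`), the `>=` comparison against the threshold, the 0-based start positions, and the restriction to the letters A, C, G, T are all choices made for the example.

```python
BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Matrix for M00083 (V$MZF1_01); column order is A, C, G, T.
MZF1 = [
    (8, 3, 6, 3), (4, 2, 11, 3), (3, 3, 5, 9), (0, 0, 20, 0),
    (1, 0, 18, 1), (0, 0, 19, 1), (0, 0, 20, 0), (18, 1, 1, 0),
]


def score(matrix, site):
    """score = 100.0 * (weighted_sum - min) / (max - min)."""
    max_sum = sum(max(row) for row in matrix)
    min_sum = sum(min(row) for row in matrix)
    weighted_sum = sum(row[BASES.index(b)] for row, b in zip(matrix, site))
    return 100.0 * (weighted_sum - min_sum) / (max_sum - min_sum)


def scan(matrix, seq, threshold=85.0):
    """Yield (start, strand, score) for each window scoring >= threshold.

    `start` is the 0-based offset of the window in `seq`; strand "-"
    means the window was scored as its reverse complement.
    """
    width = len(matrix)
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        opposite = window.translate(COMPLEMENT)[::-1]
        for strand, site in (("+", window), ("-", opposite)):
            s = score(matrix, site)
            if s >= threshold:
                yield start, strand, round(s, 1)


# The 50 bases from the output excerpt, with the spaces removed.
seq = "CTGGTCGTCCTCTGTGAGGGGGCCCCAGTCCCCCTGCAGGCAGCAGGACT"
hits_85 = list(scan(MZF1, seq, 85.0))
hits_100 = list(scan(MZF1, seq, 100.0))

print((28, "-", 86.1) in hits_85)  # True: the site from the example
print(hits_100)                    # []: no exact match in this excerpt
print(list(scan(MZF1, "TTAGTGGGGATT", 100.0)))  # [(2, '+', 100.0)]
```

The last line uses a made-up test sequence that contains "AGTGGGGA", to show that an exact match still passes at a threshold of 100.0.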

The scoring scheme is a gauge of how well a string matches the pattern specified by the weighted matrix. Because no probability is involved, the score does not reflect statistical significance.




Haibo Zhang
Copyright © Center for Applied Genomics & NJIT 2001 - All rights reserved. EZ-Retrieve version 2.0.
Last modified: Date: 06/29/2008 14:48:25