The following is a typical output, with explanations:
First, the output includes the user-specified job information as a reminder. It also shows the job ID, the time submitted, the parameters used in the job, and instructions for generating a sequence logo from the reported motifs.
Job information: FirstRun W16 a1 yeast_int
JobID: 0; Submit date: Thu Apr 22 11:24:59 PDT 2004
Job parameters: Blk1 width 16; Motif occurs in each seq; Report 3 motifs; Background yeast_int.bg;
To generate sequence logo of motifs,
paste the aligned sequences below each motif to the following website:
http://weblogo.berkeley.edu/logo.cgi
****************************************
*                                      *
*     BioProspector Search Result      *
*                                      *
****************************************
The program searches the sequences for motifs a number of times and reports as many top-scoring motifs as you specified. Each motif looks like the following:
The highest scoring 3 motifs are:
Motif width (blk 1, blk 2); Gap [min gap, max gap]; Motif score (= log(number of sites) * (relative entropy of the motif) / motif width); Number of motif sites.
Motif #1:
******************************
Width (15, 0); Gap [0, 0]; MotifScore 1.826; Sites 16
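The MotifScore in this header line follows the formula quoted in the output above: log(number of sites) * (relative entropy of the motif) / motif width. A minimal Python sketch of that arithmetic is given here. The function names are hypothetical, and the natural logarithm, the per-column sum of p * ln(p / q), and the uniform background are all assumptions; BioProspector itself uses the chosen background model (yeast_int.bg in this job), so this sketch is not expected to reproduce the reported 1.826 exactly.

```python
import math

def relative_entropy(columns, background):
    """Sum over motif columns of sum_b p_b * ln(p_b / q_b).

    Assumption: natural log and an independent per-base background;
    bases with p_b == 0 contribute nothing.
    """
    total = 0.0
    for col in columns:
        for base, p in col.items():
            if p > 0:
                total += p * math.log(p / background[base])
    return total

def motif_score(columns, n_sites, background):
    """log(number of sites) * relative entropy / motif width."""
    width = len(columns)
    return math.log(n_sites) * relative_entropy(columns, background) / width

# Hypothetical uniform background (NOT the yeast_int.bg model used in the job).
uniform = {base: 0.25 for base in "ACGT"}

# Toy motif: two perfectly conserved columns, 16 sites.
toy_columns = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
]
toy_score = motif_score(toy_columns, 16, uniform)
```

To apply this to a reported matrix, divide each percentage by 100 before passing the columns in.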
The motif matrix is given in TRANSFAC format; each line represents one motif column. The fields are: the frequency of each base at that column in percent (A, C, G, T), Consensus (the most abundant base), Reverse Consensus (the complement of the consensus), Degenerate (all bases with >= 25% abundance, written as an IUPAC code), and Reverse degenerate (the complement of the degenerate code).
Blk1      A      C      G      T  Con  rCon  Deg  rDeg
1      0.43   0.37   0.37  98.82  T    A     T    A
2      6.22   0.37  87.19   6.22  G    C     G    C
3     12.00   0.37   0.37  87.25  T    A     T    A
4      0.43   0.37  92.98   6.22  G    C     G    C
5     75.67   6.16  17.74   0.43  A    T     A    T
6     46.73  35.10  17.74   0.43  A    T     M    K
7     40.94  23.53  23.53  12.00  A    T     A    T
8     12.00  29.31   0.37  58.31  T    A     Y    R
9     29.37  35.10   6.16  29.37  C    G     H    D
10    17.79  17.74  64.04   0.43  G    C     G    C
11    40.94  23.53  11.95  23.58  A    T     A    T
12     0.43   6.16  29.31  64.09  T    A     K    M
13     6.22  87.19   0.37   6.22  C    G     C    G
14    64.09   0.37  35.10   0.43  A    T     R    Y
15    29.37  58.25  11.95   0.43  C    G     M    K
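The Con, rCon, Deg and rDeg fields can be derived from the four percentages of a column. The following Python sketch illustrates the rule described above (most abundant base for the consensus, all bases with >= 25% for the degenerate code). The function name and the tie-breaking order (A, C, G, T) are assumptions made for illustration, not taken from the BioProspector source.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

# Complement of every IUPAC code (e.g. M = A/C pairs with K = T/G).
COMPLEMENT = dict(zip("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN"))

def summarize_column(percentages, threshold=25.0):
    """Return (Con, rCon, Deg, rDeg) for one column given (A, C, G, T) in percent."""
    bases = "ACGT"
    # Most abundant base; ties go to the earlier base in A, C, G, T order (assumption).
    con = bases[max(range(4), key=lambda i: percentages[i])]
    kept = frozenset(b for b, p in zip(bases, percentages) if p >= threshold)
    # Fall back to the consensus if rounding leaves no base at or above the threshold.
    deg = IUPAC[kept] if kept else con
    return con, COMPLEMENT[con], deg, COMPLEMENT[deg]

col6 = summarize_column((46.73, 35.10, 17.74, 0.43))   # column 6 of the matrix above
col9 = summarize_column((29.37, 35.10, 6.16, 29.37))   # column 9 of the matrix above
```

For the two columns used here, the results agree with the values printed in the matrix: A T M K for column 6 and C G H D for column 9.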
Sequences that contributed to this alignment. Each line gives the sequence name, the sequence length, the site number within that sequence (e.g. sequence 1 contributed 2 sites to the motif, one at r 79 and the other at f 20), the orientation and location of the site (f 20 means 20 bp from the beginning of the sequence on the forward strand; r 79 means 79 bp from the beginning of the sequence, but on the reverse strand), and the sequence of the aligned site.
     XXX          This site is f 3.
5' ---------- 3'
3' ---------- 5'
    XXX           This site is r 4.
>1 len 105 site #1 r 79 TGTGAAAACGATCAA
>1 len 105 site #2 f 20 TGTGGCATCGGGCGA
>2 len 105 site #1 r 73 TGTGACGCCGTGCAA
>3 len 105 site #1 r 94 TATGACCATGCTCAC
>4 len 105 site #1 r 81 TGTGACGTCCTTTGC
>5 len 105 site #1 f 53 TGTTAAATTGATCAC
>6 len 105 site #1 f 10 TTTGAACCAGATCGC
>8 len 105 site #1 r 57 TGAGGGGTTGATCAC
>9 len 105 site #1 f 12 TGTGAGTTAGCTCAC
>9 len 105 site #2 f 76 TGTGGAATTGTGAGC
>11 len 105 site #1 f 32 TGTGAACTAAACCGA
>12 len 105 site #1 f 44 TGTGACACAGTGCAA
>13 len 105 site #1 r 66 TGTGAACTCCGTCAG
>14 len 105 site #1 r 89 TGTGAATCGAATCAC
>16 len 105 site #1 f 56 TGTGAAATACCGCAC
>17 len 105 site #1 f 87 TGAGACGTTGATCGG
>18 len 105 site #1 r 96 TGTGCGACCACTCAC
******************************
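For downstream processing, each site line can be split into its fields and mapped to forward-strand coordinates. The sketch below is one possible way to do this in Python; the function names are hypothetical. It assumes 1-based positions and assumes that "r N" gives the forward-strand coordinate of the site's 5' end, so the site extends from N back toward the start of the sequence. That reading is consistent with the sample listings in this document but is not stated explicitly by the program.

```python
import re

# One single-block site line, e.g. ">1 len 105 site #1 r 79 TGTGAAAACGATCAA"
SITE_RE = re.compile(r"^>(\S+) len (\d+) site #(\d+) ([fr]) (\d+) ([ACGT]+)$")

def parse_site(line):
    """Split a single-block site line into named fields."""
    match = SITE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a single-block site line: {line!r}")
    name, length, number, strand, pos, seq = match.groups()
    return {
        "name": name,
        "length": int(length),
        "site": int(number),
        "strand": strand,
        "pos": int(pos),
        "seq": seq,
    }

def forward_interval(site):
    """Return (start, end) on the forward strand, 1-based inclusive.

    Assumption: for 'r' sites the reported position is the rightmost
    forward-strand base covered by the site.
    """
    width = len(site["seq"])
    if site["strand"] == "f":
        return site["pos"], site["pos"] + width - 1
    return site["pos"] - width + 1, site["pos"]

rev_site = parse_site(">1 len 105 site #1 r 79 TGTGAAAACGATCAA")
fwd_site = parse_site(">1 len 105 site #2 f 20 TGTGGCATCGGGCGA")
rev_span = forward_interval(rev_site)   # (65, 79) under the stated assumption
fwd_span = forward_interval(fwd_site)   # (20, 34)
```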
For two-block non-palindrome motifs, the motif has two blocks, and each aligned segment contributes one sub-site to block 1 and one sub-site to block 2.
Motif #1: (TGTGATCG/CGATCACA, TCACACTT/AAGTGTGA)
******************************
Width (8, 8); Gap [0, 3]; MotifScore 1.803; Sites 21
Blk1      A      C      G      T  Con  rCon  Deg  rDeg
1      0.39  14.34   0.27  85.00  T    A     T    A
2      0.39   4.94  66.07  28.60  G    C     K    M
3      0.39   9.64   0.27  89.69  T    A     T    A
4     28.59   0.24  70.77   0.40  G    C     R    Y
5     84.99   4.94   0.27   9.80  A    T     A    T
6     28.59  19.04  19.07  33.30  T    A     W    W
7     14.49  33.14  23.77  28.60  C    G     Y    R
8      5.09  23.74  51.97  19.20  G    C     G    C
Blk2      A      C      G      T  Con  rCon  Deg  rDeg
1      0.39   0.24  19.07  80.30  T    A     T    A
2     23.89  56.64   0.27  19.20  C    G     C    G
3     66.19   4.94  23.77   5.10  A    T     A    T
4      0.39  94.23   0.27   5.10  C    G     C    G
5     80.29   4.94   9.67   5.10  A    T     A    T
6     28.59  47.24   4.97  19.20  C    G     M    K
7     47.39   0.24   0.27  52.10  T    A     W    W
8     42.69   0.24   0.27  56.80  T    A     W    W
>1 len 105 site #1 f 64 f 75 TTTGATCG TCACAAAA
>2 len 105 site #1 f 58 f 69 TTTGCACG TCACACTT
>3 len 105 site #1 f 79 f 90 TGTGAGCA TCATATTT
>4 len 105 site #1 f 66 f 77 TGCAAAGG TCACATTA
>5 len 105 site #1 f 12 f 20 CTTGTATG TAGCGCAT
>6 len 105 site #1 f 10 f 21 TTTGAACC TCGCATTA
>7 len 105 site #1 f 45 f 56 TTTATTCC TCACACTT
>7 len 105 site #2 f 27 f 36 TGTAAACG TTCCACTA
>8 len 105 site #1 f 25 f 34 TGCAATTC GTACAAAA
>9 len 105 site #1 f 84 f 94 TGTGAGCG TAACAATT
>9 len 105 site #2 f 12 f 23 TGTGAGTT TCACTCAT
>10 len 105 site #1 f 17 f 28 TGTAACAG TCACACAA
>11 len 105 site #1 f 64 f 75 CGTGATGT TTGCAAAA
>12 len 105 site #1 f 44 f 53 TGTGACAC GTGCAAAT
>13 len 105 site #1 f 51 f 62 CCTGACGG TCACACTT
>14 len 105 site #1 f 74 f 85 TGTGATTC TCACATTT
>15 len 105 site #1 f 20 f 31 TGTGATGT TAACCCAA
>16 len 105 site #1 f 56 f 67 TGTGAAAT GCACAGAT
>17 len 105 site #1 f 2 f 12 TGTGACGG GATCACTT
>18 len 105 site #1 f 81 f 90 TGTGAGTG TCGCACAT
>18 len 105 site #2 f 37 f 48 TTTAATTG TAACGATA
******************************
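In a two-block listing, the spacing between the two sub-sites of a segment can be checked against the reported Gap [min gap, max gap] range. The Python sketch below shows one way to compute it, assuming both sub-sites lie on the forward strand (as in the sample above) and that each reported position is the first base of its sub-site. The function name is hypothetical.

```python
def two_block_gap(line):
    """Return the number of bases between block 1 and block 2 of one site line.

    Expected layout (whitespace separated):
      >name len L site #k f P1 f P2 SEQ1 SEQ2
    Assumption: both sub-sites are on the forward strand and P1, P2 mark
    the first base of each sub-site.
    """
    fields = line.split()
    strand1, pos1 = fields[5], int(fields[6])
    strand2, pos2 = fields[7], int(fields[8])
    seq1 = fields[9]
    if strand1 != "f" or strand2 != "f":
        raise ValueError("this sketch only handles forward/forward sub-sites")
    # Block 1 covers pos1 .. pos1 + len(seq1) - 1, so block 2 starts gap bases later.
    return pos2 - (pos1 + len(seq1))

gap_a = two_block_gap(">1 len 105 site #1 f 64 f 75 TTTGATCG TCACAAAA")
gap_b = two_block_gap(">5 len 105 site #1 f 12 f 20 CTTGTATG TAGCGCAT")
```

For these two lines the gaps are 3 and 0, both inside the reported Gap [0, 3].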
For palindrome motifs, there is only one motif matrix, but each aligned site actually contributes two sub-sites, one from the forward strand and the other from the reverse strand.
Motif #1: (TGTGAACT/AGTTCACA)
******************************
Width (8, 8); Gap [0, 3]; MotifScore 1.877; Sites 38
Blk1      A      C      G      T  Con  rCon  Deg  rDeg
1      8.06  10.59   2.75  78.60  T    A     T    A
2      5.45   2.75  73.29  18.51  G    C     G    C
3      0.22  10.59   0.14  89.05  T    A     T    A
4      5.45   0.14  86.35   8.06  G    C     G    C
5     86.43  13.20   0.14   0.22  A    T     A    T
6     31.57  18.43  18.43  31.57  A    T     W    W
7     23.73  36.72  28.88  10.67  C    G     S    S
8     13.28  21.04  26.27  39.41  T    A     K    M
>1 len 105 site #1 f 64 r 79 TTTGATCG TGTGAAAA
>2 len 105 site #1 f 58 r 73 TTTGCACG TGTGACGC
>3 len 105 site #1 f 79 r 94 TGTGAGCA TATGACCA
>4 len 105 site #1 f 66 r 81 TGCAAAGG TGTGACGT
>5 len 105 site #1 f 53 r 68 TGTTAAAT CGTGATCA
>6 len 105 site #1 f 10 r 25 TTTGAACC TGCGATCT
>6 len 105 site #2 f 63 r 78 TGTGATGT TTCGATAC
>7 len 105 site #1 f 79 r 96 TGCTATGG TATGAAAT
>8 len 105 site #1 f 42 r 59 CGTGATCA ATTGAGGG
>9 len 105 site #1 f 12 r 27 TGTGAGTT AGTGAGCT
>10 len 105 site #1 f 17 r 32 TGTAACAG TGTGATCT
>11 len 105 site #1 f 64 r 81 CGTGATGT TTTGCAAG
>12 len 105 site #1 f 44 r 59 TGTGACAC TTTGCACT
>13 len 105 site #1 f 51 r 66 CCTGACGG TGTGAACT
>14 len 105 site #1 f 74 r 89 TGTGATTC TGTGAATC
>15 len 105 site #1 f 20 r 35 TGTGATGT GGTTAACC
>16 len 105 site #1 f 56 r 71 TGTGAAAT TGTGCGGT
>17 len 105 site #1 f 2 r 18 TGTGACGG AGTGATCT
>18 len 105 site #1 f 81 r 96 TGTGAGTG TGTGCGAC
******************************
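In a palindrome listing, the second sub-site of each line is read from the reverse strand, so it has to be reverse-complemented to see what the forward strand contains at that location. The Python sketch below shows this, together with the gap between the two sub-sites. The function names are hypothetical, and the gap calculation assumes that an "r" position is the rightmost forward-strand base covered by the reverse sub-site, a reading that is consistent with the sample above but not stated by the program.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def palindrome_gap(f_pos, r_pos, width):
    """Bases between the forward sub-site and the reverse sub-site.

    Assumption: the forward sub-site covers f_pos .. f_pos + width - 1 and
    the reverse sub-site covers r_pos - width + 1 .. r_pos in forward-strand
    coordinates.
    """
    return (r_pos - width + 1) - (f_pos + width)

# Consensus pair from the motif title: (TGTGAACT/AGTTCACA)
consensus_rc = reverse_complement("TGTGAACT")

# First site line above: f 64 r 79 TTTGATCG TGTGAAAA
forward_view = reverse_complement("TGTGAAAA")   # forward-strand text of the r sub-site
first_gap = palindrome_gap(64, 79, 8)
wide_gap = palindrome_gap(79, 96, 8)            # site from sequence 7
```

Under these assumptions the two example sites have gaps of 0 and 2, which fall inside the reported Gap [0, 3].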
Thanks for using BioProspector.
For questions, please contact X. Shirley Liu at xsliu@jimmy.harvard.edu.