BioProspector Output

The following is a typical output, with explanations:

First, the output repeats the user-specified job information as a reminder. It also shows the job ID, the submission time, the parameters used in the job, and instructions for generating a sequence logo from the reported motifs.

Job information: FirstRun W16 a1 yeast_int
JobID: 0; Submit date: Thu Apr 22 11:24:59 PDT 2004
Job parameters: Blk1 width 16; Motif occurs in each seq; Report 3 motifs; Background yeast_int.bg;
To generate sequence logo of motifs, paste the aligned sequences below each motif to the following website:
http://weblogo.berkeley.edu/logo.cgi

****************************************
*                                      *
*      BioProspector Search Result     *
*                                      *
****************************************

The program runs the motif search on the sequences a number of times and reports the number of top-scoring motifs you specified. Each motif looks like the following:

The highest scoring 3 motifs are:

Motif width (blk 1, blk 2); Gap [min gap, max gap]; Motif score (= log(number of sites) * (relative entropy of the motif) / motif width); Number of motif sites.

Motif #1:
******************************
Width (15, 0); Gap [0, 0]; MotifScore 1.826; Sites 16
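
When post-processing many results, it can help to read the summary line into named fields. The following is a minimal sketch, assuming the line has exactly the layout shown above; parse_summary and SUMMARY_RE are hypothetical names, not part of BioProspector.

```python
import re

# Matches a summary line such as:
#   Width (15, 0); Gap [0, 0]; MotifScore 1.826; Sites 16
SUMMARY_RE = re.compile(
    r"Width \((\d+), (\d+)\); Gap \[(\d+), (\d+)\]; "
    r"MotifScore ([\d.]+); Sites (\d+)"
)

def parse_summary(line):
    """Return the fields of a motif summary line as a dict (hypothetical helper)."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a motif summary line: {line!r}")
    w1, w2, gap_min, gap_max, score, sites = m.groups()
    return {
        "widths": (int(w1), int(w2)),        # (blk 1, blk 2)
        "gap": (int(gap_min), int(gap_max)), # [min gap, max gap]
        "score": float(score),
        "sites": int(sites),
    }

summary = parse_summary("Width (15, 0); Gap [0, 0]; MotifScore 1.826; Sites 16")
print(summary["widths"], summary["score"], summary["sites"])
```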

The motif matrix is in TRANSFAC format; each line represents one motif column. The fields are: base probability at that column (as a percentage), Consensus (the most abundant base), Reverse (complement) Consensus, Degenerate (all bases with >= 25% abundance, represented as an IUPAC code), and Reverse Degenerate.

Blk1    A      C      G      T         Con  rCon Deg  rDeg
1      0.43   0.37   0.37  98.82        T    A    T    A
2      6.22   0.37  87.19   6.22        G    C    G    C
3     12.00   0.37   0.37  87.25        T    A    T    A
4      0.43   0.37  92.98   6.22        G    C    G    C
5     75.67   6.16  17.74   0.43        A    T    A    T
6     46.73  35.10  17.74   0.43        A    T    M    K
7     40.94  23.53  23.53  12.00        A    T    A    T
8     12.00  29.31   0.37  58.31        T    A    Y    R
9     29.37  35.10   6.16  29.37        C    G    H    D
10    17.79  17.74  64.04   0.43        G    C    G    C
11    40.94  23.53  11.95  23.58        A    T    A    T
12     0.43   6.16  29.31  64.09        T    A    K    M
13     6.22  87.19   0.37   6.22        C    G    C    G
14    64.09   0.37  35.10   0.43        A    T    R    Y
15    29.37  58.25  11.95   0.43        C    G    M    K
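
The four letter columns can be derived from the percentages of a row. The sketch below follows the rules stated above (most abundant base, and all bases at >= 25%); the function name column_codes is hypothetical, and resolving ties in A, C, G, T order is an assumption that happens to agree with the sample rows.

```python
# IUPAC codes keyed by the alphabetically sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AC": "M", "AG": "R", "AT": "W", "CG": "S", "CT": "Y", "GT": "K",
    "ACG": "V", "ACT": "H", "AGT": "D", "CGT": "B", "ACGT": "N",
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def column_codes(percentages, threshold=25.0):
    """Return (Con, rCon, Deg, rDeg) for one matrix row.

    percentages: the A, C, G, T values of the row, in that order.
    """
    by_base = dict(zip("ACGT", percentages))
    # max() keeps the first maximum, so ties fall back to A, C, G, T order
    # (an assumption, not documented behaviour).
    con = max("ACGT", key=by_base.get)
    # Fall back to the consensus base if rounding leaves nothing at the threshold.
    kept = [b for b in "ACGT" if by_base[b] >= threshold] or [con]
    deg = IUPAC["".join(kept)]
    rdeg = IUPAC["".join(sorted(COMPLEMENT[b] for b in kept))]
    return con, COMPLEMENT[con], deg, rdeg

# Row 9 of the matrix above: 29.37 35.10 6.16 29.37
print(column_codes([29.37, 35.10, 6.16, 29.37]))  # ('C', 'G', 'H', 'D')
```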

Sequences that contributed to this alignment: sequence name, sequence length, site number within that sequence (e.g. sequence 1 contributed 2 sites to the motif, one at r 79 and the other at f 20), orientation and location of each site (f 20 means 20 bp from the beginning of the sequence on the forward strand; r 79 means 79 bp from the beginning of the sequence, but on the reverse strand), and the sequence of the aligned site.

     XXX        This site is f 3.
5' ---------- 3'
3' ---------- 5'
      XXX       This site is r 4.
>1	len 105	site #1	r 79
TGTGAAAACGATCAA
>1	len 105	site #2	f 20
TGTGGCATCGGGCGA
>2	len 105	site #1	r 73
TGTGACGCCGTGCAA
>3	len 105	site #1	r 94
TATGACCATGCTCAC
>4	len 105	site #1	r 81
TGTGACGTCCTTTGC
>5	len 105	site #1	f 53
TGTTAAATTGATCAC
>6	len 105	site #1	f 10
TTTGAACCAGATCGC
>8	len 105	site #1	r 57
TGAGGGGTTGATCAC
>9	len 105	site #1	f 12
TGTGAGTTAGCTCAC
>9	len 105	site #2	f 76
TGTGGAATTGTGAGC
>11	len 105	site #1	f 32
TGTGAACTAAACCGA
>12	len 105	site #1	f 44
TGTGACACAGTGCAA
>13	len 105	site #1	r 66
TGTGAACTCCGTCAG
>14	len 105	site #1	r 89
TGTGAATCGAATCAC
>16	len 105	site #1	f 56
TGTGAAATACCGCAC
>17	len 105	site #1	f 87
TGAGACGTTGATCGG
>18	len 105	site #1	r 96
TGTGCGACCACTCAC
******************************
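
The site listing is regular enough to read with a small script, for example to count how many sites each sequence contributed. This is a minimal sketch, assuming each header line is followed by exactly one line of aligned sequence as in the sample; parse_sites and HEADER_RE are hypothetical names.

```python
import re
from collections import Counter

# Header such as ">1<tab>len 105<tab>site #1<tab>r 79"; two-block motifs
# carry two orientation/position pairs, so one or more pairs are accepted.
HEADER_RE = re.compile(
    r"^>(\S+)\s+len\s+(\d+)\s+site\s+#(\d+)((?:\s+[fr]\s+\d+)+)\s*$"
)

def parse_sites(lines):
    """Yield (name, length, site_no, positions, blocks) for each site record."""
    pending = None
    for raw in lines:
        line = raw.strip()
        m = HEADER_RE.match(line)
        if m:
            positions = [(strand, int(pos))
                         for strand, pos in re.findall(r"([fr])\s+(\d+)", m.group(4))]
            pending = (m.group(1), int(m.group(2)), int(m.group(3)), positions)
        elif pending is not None and line:
            # The line after a header holds the aligned site (one string per block).
            yield pending + (line.split(),)
            pending = None

example = [
    ">1\tlen 105\tsite #1\tr 79",
    "TGTGAAAACGATCAA",
    ">1\tlen 105\tsite #2\tf 20",
    "TGTGGCATCGGGCGA",
    ">2\tlen 105\tsite #1\tr 73",
    "TGTGACGCCGTGCAA",
]
sites = list(parse_sites(example))
per_sequence = Counter(name for name, *_ in sites)
print(per_sequence)
```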

For two-block non-palindromic motifs, the motif has two blocks, and each aligned segment contributes one sub-site to block 1 and one sub-site to block 2.

Motif #1: (TGTGATCG/CGATCACA, TCACACTT/AAGTGTGA)
******************************
Width (8, 8); Gap [0, 3]; MotifScore 1.803; Sites 21

Blk1    A      C      G      T         Con  rCon Deg  rDeg 
1      0.39  14.34   0.27  85.00        T    A    T    A
2      0.39   4.94  66.07  28.60        G    C    K    M
3      0.39   9.64   0.27  89.69        T    A    T    A
4     28.59   0.24  70.77   0.40        G    C    R    Y
5     84.99   4.94   0.27   9.80        A    T    A    T
6     28.59  19.04  19.07  33.30        T    A    W    W
7     14.49  33.14  23.77  28.60        C    G    Y    R
8      5.09  23.74  51.97  19.20        G    C    G    C

Blk2    A      C      G      T         Con  rCon Deg  rDeg 
1      0.39   0.24  19.07  80.30        T    A    T    A
2     23.89  56.64   0.27  19.20        C    G    C    G
3     66.19   4.94  23.77   5.10        A    T    A    T
4      0.39  94.23   0.27   5.10        C    G    C    G
5     80.29   4.94   9.67   5.10        A    T    A    T
6     28.59  47.24   4.97  19.20        C    G    M    K
7     47.39   0.24   0.27  52.10        T    A    W    W
8     42.69   0.24   0.27  56.80        T    A    W    W

>1	len 105	site #1	f 64	f 75
TTTGATCG TCACAAAA
>2	len 105	site #1	f 58	f 69
TTTGCACG TCACACTT
>3	len 105	site #1	f 79	f 90
TGTGAGCA TCATATTT
>4	len 105	site #1	f 66	f 77
TGCAAAGG TCACATTA
>5	len 105	site #1	f 12	f 20
CTTGTATG TAGCGCAT
>6	len 105	site #1	f 10	f 21
TTTGAACC TCGCATTA
>7	len 105	site #1	f 45	f 56
TTTATTCC TCACACTT
>7	len 105	site #2	f 27	f 36
TGTAAACG TTCCACTA
>8	len 105	site #1	f 25	f 34
TGCAATTC GTACAAAA
>9	len 105	site #1	f 84	f 94
TGTGAGCG TAACAATT
>9	len 105	site #2	f 12	f 23
TGTGAGTT TCACTCAT
>10	len 105	site #1	f 17	f 28
TGTAACAG TCACACAA
>11	len 105	site #1	f 64	f 75
CGTGATGT TTGCAAAA
>12	len 105	site #1	f 44	f 53
TGTGACAC GTGCAAAT
>13	len 105	site #1	f 51	f 62
CCTGACGG TCACACTT
>14	len 105	site #1	f 74	f 85
TGTGATTC TCACATTT
>15	len 105	site #1	f 20	f 31
TGTGATGT TAACCCAA
>16	len 105	site #1	f 56	f 67
TGTGAAAT GCACAGAT
>17	len 105	site #1	f 2	f 12
TGTGACGG GATCACTT
>18	len 105	site #1	f 81	f 90
TGTGAGTG TCGCACAT
>18	len 105	site #2	f 37	f 48
TTTAATTG TAACGATA
******************************
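
The two positions on each header line can be related to the Gap [min gap, max gap] range. The sketch below assumes that each f value is the start of its block on the forward strand, so the gap is the distance from the end of block 1 to the start of block 2; this reading is inferred from the sample rather than stated by the program, and block_gap is a hypothetical helper.

```python
def block_gap(start1, start2, width1):
    """Bases between the end of block 1 and the start of block 2."""
    return start2 - (start1 + width1)

# (block 1 start, block 2 start) pairs copied from the listing above;
# block 1 is 8 bp wide in this motif.
pairs = [(64, 75), (12, 20), (27, 36), (84, 94)]
gaps = [block_gap(a, b, 8) for a, b in pairs]
print(gaps)  # [3, 0, 1, 2]

# Every gap falls inside the reported range Gap [0, 3].
assert all(0 <= g <= 3 for g in gaps)
```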

For palindromic motifs, there is only one motif matrix, but each aligned site actually contributes two sub-sites, one from the forward strand and the other from the reverse strand.

Motif #1: (TGTGAACT/AGTTCACA)
******************************
Width (8, 8); Gap [0, 3]; MotifScore 1.877; Sites 38

Blk1    A      C      G      T         Con  rCon Deg  rDeg 
1      8.06  10.59   2.75  78.60        T    A    T    A
2      5.45   2.75  73.29  18.51        G    C    G    C
3      0.22  10.59   0.14  89.05        T    A    T    A
4      5.45   0.14  86.35   8.06        G    C    G    C
5     86.43  13.20   0.14   0.22        A    T    A    T
6     31.57  18.43  18.43  31.57        A    T    W    W
7     23.73  36.72  28.88  10.67        C    G    S    S
8     13.28  21.04  26.27  39.41        T    A    K    M

>1	len 105	site #1	f 64	r 79
TTTGATCG TGTGAAAA
>2	len 105	site #1	f 58	r 73
TTTGCACG TGTGACGC
>3	len 105	site #1	f 79	r 94
TGTGAGCA TATGACCA
>4	len 105	site #1	f 66	r 81
TGCAAAGG TGTGACGT
>5	len 105	site #1	f 53	r 68
TGTTAAAT CGTGATCA
>6	len 105	site #1	f 10	r 25
TTTGAACC TGCGATCT
>6	len 105	site #2	f 63	r 78
TGTGATGT TTCGATAC
>7	len 105	site #1	f 79	r 96
TGCTATGG TATGAAAT
>8	len 105	site #1	f 42	r 59
CGTGATCA ATTGAGGG
>9	len 105	site #1	f 12	r 27
TGTGAGTT AGTGAGCT
>10	len 105	site #1	f 17	r 32
TGTAACAG TGTGATCT
>11	len 105	site #1	f 64	r 81
CGTGATGT TTTGCAAG
>12	len 105	site #1	f 44	r 59
TGTGACAC TTTGCACT
>13	len 105	site #1	f 51	r 66
CCTGACGG TGTGAACT
>14	len 105	site #1	f 74	r 89
TGTGATTC TGTGAATC
>15	len 105	site #1	f 20	r 35
TGTGATGT GGTTAACC
>16	len 105	site #1	f 56	r 71
TGTGAAAT TGTGCGGT
>17	len 105	site #1	f 2	r 18
TGTGACGG AGTGATCT
>18	len 105	site #1	f 81	r 96
TGTGAGTG TGTGCGAC
******************************
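
The pair of strings in a motif title, such as (TGTGAACT/AGTTCACA), appears to be the block consensus followed by its reverse complement. A minimal sketch of that relationship, where reverse_complement is a hypothetical helper and only the four unambiguous bases are handled:

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the sequence read 5' to 3' on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

# Consensus strings taken from the motif titles above.
print(reverse_complement("TGTGAACT"))  # AGTTCACA
print(reverse_complement("TGTGATCG"))  # CGATCACA
print(reverse_complement("TCACACTT"))  # AAGTGTGA
```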

Thanks for using BioProspector.
For questions, please contact X. Shirley Liu at xsliu@jimmy.harvard.edu.