1. General introduction

Our web server is an integrated DNA motif analysis suite whose framework is shown on the front page (see below). Users can start with motif finding by clicking the rectangle in the upper-left corner, which leads to the submission page. Users working on any sequenced prokaryotic species can click the "Select from DOOR" title and then select their query genome and operons; our server will automatically prepare the promoter sequences for the follow-up motif analyses. The largest dark-green rectangle indicates our de-novo motif finding function, which is also the most important and most commonly used function in motif analysis. Its results are the basis of the advanced motif analyses: motif refinement, motif comparison and clustering, and motif occurrence. Each advanced function is shown in the rightmost part, and users can access the corresponding analysis page by clicking its logo. The motif database cylinder links to a collection of annotated motifs for both prokaryotic and eukaryotic species, in case users want to start the advanced analyses from documented motifs.
Alternatively, users can click the submit button on the navigation bar to access the job submission page, where they can upload their data to our server for a specific function.
Our server also provides a search function: users who return with a specific job ID can use it to go directly to the results of that job.

2. How to submit a job for De-novo motif finding

The default page for submitting a job is designed for de-novo motif finding (see Fig. S2a). In total, there are up to four steps to complete a job submission.
Step 1: Input query sequences. The sequence format requirements can be found in FAQ 4. The users have three ways to provide the sequences:
(i) paste the sequences into the corresponding box (select "sample" to see an example);
(ii) let DOOR2 prepare the promoter sequences if they work on bacteria (see Tutorial S3 for details on submitting sequences through the DOOR2 database);
(iii) upload a local file containing the query sequences.
Step 2: Include control sequences. This step is optional and allows the users to include a set of background sequences as a control (see details in FAQ 12). With control sequences, the predicted motifs can be further evaluated, in addition to their P-values, using formula (1) in FAQ 10. The format and submission requirements for the background sequences are the same as for the query sequences above.
Step 3: Algorithm parameters. This step is optional.
Step 4: Submit job. Before clicking the "Submit" button, the users can leave an email address, to which a notification will be sent when the job is done. This is optional.
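
Before pasting or uploading sequences in Step 1, it can help to check them locally. The sketch below is only an illustration: it assumes the query sequences are FASTA text (the format named for the sequence inputs in Tutorials 5 and 7) and that the accepted letters are A, C, G, T and N. The authoritative requirements are those in FAQ 4, and the function names `parse_fasta` and `invalid_records` are hypothetical helpers, not part of the server.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_records(records, alphabet="ACGTN"):
    """Return headers of records that are empty or use letters outside `alphabet`.

    The default alphabet is an assumption; check FAQ 4 for the real rules.
    """
    allowed = set(alphabet)
    return [h for h, s in records if not s or set(s) - allowed]


# Made-up example input; promoter_2 contains an unexpected 'X'.
example = """>promoter_1
ACGTTGCA
acgtnnac
>promoter_2
ACGTXGCA
"""
records = parse_fasta(example)
print(records)
print(invalid_records(records))
```

The same check can be applied to the optional control sequences in Step 2, since they follow the same format as the query sequences.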

3. How to submit sequence using DOOR2 database

Whenever the users select a DOOR2 database logo, they will see the following page.

Take E. coli K12 as an example. First, type "NC_000913" in the search bar in the upper-right corner to get the following page:

Select "NC_000913(C)" and then select the operons you are interested in, say the first 15 operons on the following page, and then select "Get promoters".
The users will then be taken to the default submission page with their promoters pasted in the corresponding box, as follows. The users can then go ahead and submit the job (see details in Tutorial 2).


4. How to interpret the results page of De-novo motif finding

The following result page is generated using the sample sequences provided by our server:

There are six columns for each predicted motif shown in the result table:
(i) Summary, in which a motif web logo is shown by default; more details can be found by selecting "motif logo". On the details page, the predicted motifs are first mapped to the query sequences, so that the users can easily see the positions of the motifs relative to the corresponding downstream genes.

Some detailed information about the motif is also provided, e.g. PWM, PSSM and consensus (see the relevant explanations in FAQ 6-8).

(ii) The motif length; this column can be sorted.
(iii) The P-value of the motif (see details in FAQ 9 and formula 6 in FAQ 11).
(iv) The Z-score of the motif, which only appears when background sequences or the comparative genomics strategy are used (see details in FAQ 10).
(v) The number of motif instances for the motif; this column can be sorted.
(vi) Details of each motif instance, including its start and end positions in the corresponding query sequence. Specifically, the motif instances in orange are the most conserved ones, and the yellow ones are identified by our P-value framework (see details in FAQ 9 and 11).
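
As a rough illustration of the PWM, PSSM and consensus mentioned under item (i), the sketch below derives all three from a set of aligned motif instances. It uses common textbook conventions that are assumptions here: the PWM is taken as a per-position frequency matrix, and the PSSM as a log2-odds matrix with a pseudocount of 1 and a uniform background. The server's own definitions are the ones given in FAQ 6-8 and may differ; the function names and the example instances are hypothetical.

```python
import math

BASES = "ACGT"


def count_matrix(instances):
    """Count each base at each position of equal-length aligned instances."""
    width = len(instances[0])
    if any(len(s) != width for s in instances):
        raise ValueError("all motif instances must have the same length")
    counts = [{b: 0 for b in BASES} for _ in range(width)]
    for s in instances:
        for i, b in enumerate(s.upper()):
            counts[i][b] += 1
    return counts


def frequency_matrix(counts, pseudocount=0.0):
    """Per-position base frequencies (each column sums to 1)."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + 4 * pseudocount
        pwm.append({b: (col[b] + pseudocount) / total for b in BASES})
    return pwm


def log_odds_matrix(counts, background=None, pseudocount=1.0):
    """Log2-odds scores against a background (uniform by assumption)."""
    background = background or {b: 0.25 for b in BASES}
    freqs = frequency_matrix(counts, pseudocount)
    return [{b: math.log2(col[b] / background[b]) for b in BASES} for col in freqs]


def consensus(counts):
    """Most frequent base per position; ties go to the first of A, C, G, T."""
    return "".join(max(BASES, key=lambda b: col[b]) for col in counts)


# Made-up aligned instances, for illustration only.
instances = ["TTGACA", "TTGACT", "TTTACA", "TAGACA"]
counts = count_matrix(instances)
pwm = frequency_matrix(counts)
pssm = log_odds_matrix(counts)
print(consensus(counts))  # TTGACA
```

The pseudocount keeps the log-odds finite for bases that never occur at a position; without it, a zero frequency would make the logarithm undefined.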
Besides the above six kinds of information, follow-up analyses can be run on any selected motifs on this results page. For example, after selecting the first five motifs on this page, the following analyses are available:

(i) Motif scan, which automatically submits the selected motifs to the submission page for scanning in the to-be-scanned sequences. See details in Tutorial 5.
(ii) Motif compare, which automatically submits the selected motifs to the submission page for motif comparison. See details in Tutorial 7.
(iii) Motif co-occurrence, which maps these five motifs to the query sequences, so that the users can easily observe which motifs tend to occur together. See details in Tutorial 8.
(iv) Trans-format, which converts the current motif format to the MEME and Uniprobe formats, so that the users can easily do further motif analysis on other servers.
The input sequences and all the results are also available for download, as indicated in the upper-right corner.
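
For orientation, the sketch below writes aligned motif instances in the minimal MEME motif format, which is the kind of file the Trans-format option is meant to produce for use on other servers. It is not the server's converter: the function name `to_meme`, the uniform background frequencies and the placeholder "E= 0" are assumptions made for this example, and the Uniprobe format is not covered.

```python
def to_meme(motifs, background=(0.25, 0.25, 0.25, 0.25)):
    """Render motifs as minimal MEME-format text.

    motifs: dict mapping a motif name to a list of equal-length aligned
    instances over A, C, G, T. background: frequencies for A, C, G, T
    (uniform here by assumption).
    """
    lines = [
        "MEME version 4",
        "",
        "ALPHABET= ACGT",
        "",
        "strands: + -",
        "",
        "Background letter frequencies",
        "A %.3f C %.3f G %.3f T %.3f" % tuple(background),
        "",
    ]
    for name, instances in motifs.items():
        width = len(instances[0])
        if any(len(s) != width for s in instances):
            raise ValueError("instances of %s differ in length" % name)
        n = len(instances)
        lines.append("MOTIF %s" % name)
        lines.append(
            "letter-probability matrix: alength= 4 w= %d nsites= %d E= 0" % (width, n)
        )
        for i in range(width):
            column = [s[i].upper() for s in instances]
            # One row per motif position, columns in A, C, G, T order.
            probs = [column.count(b) / n for b in "ACGT"]
            lines.append(" ".join("%.6f" % p for p in probs))
        lines.append("")
    return "\n".join(lines)


# Made-up instances, for illustration only.
meme_text = to_meme({"motif_1": ["TTGACA", "TTGACT"]})
print(meme_text)
```

Each row of the letter-probability matrix corresponds to one motif position and sums to 1, so a file written this way can be compared directly with the one downloaded from the results page.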

5. How to submit a job for motif scanning

Similar to the de-novo motif finding submission page, there are four steps for submitting a motif scanning job:

Step 1: Select motif format. Three formats are accepted by our server (see details in FAQ 5). The users can also submit background sequences, if they have any (see details in FAQ 11-12).
Step 2: Enter query sequence. In this step, the users are required to submit two inputs: (i) the motifs in the format selected in Step 1; and (ii) the to-be-scanned DNA sequences in FASTA format. In addition, the background sequences should be submitted if the users selected "Yes" in Step 1.
Step 3: same as Step 3 in Tutorial 2.
Step 4: same as Step 4 in Tutorial 2.

6. How to interpret the results page of motif scanning

The results page of motif scanning is essentially the same as that of de-novo motif finding (see below); please see the relevant explanation in Tutorial 4.


7. How to submit a job for motif comparison and clustering

Similar to Tutorials 2 and 5, there are four steps to submit a job for motif comparison and clustering:

Step 1: Motif compare function. If the users can provide the original sequences of the query motifs, our method can improve the motif comparison performance by considering the weakly conserved signals in the motifs' flanking regions (see details in FAQ 13). This is optional, and the default option is "no".
Step 2: Enter query sequence. In this step, the users submit only the motif alignments; the original sequences of the submitted motifs (in FASTA format) should also be submitted if the users selected "Yes" in Step 1.
Step 3: same as Step 3 in Tutorial 2.
Step 4: same as Step 4 in Tutorial 2.

8. How to do motif occurrence analysis

This function is only available to the users as a follow-up analysis on the results pages of motif finding and motif scanning. See details in Tutorial 4 and 6. The results page is as follows:

In the above table, the co-occurrence P-value of each pair of motifs is shown in the third column; it is calculated based on the hypergeometric distribution (see details in FAQ 15). The other numbers used in the hypergeometric distribution are shown in the last four columns. In this example results page, which uses the sample data on our server, motif-2 and motif-5 tend to occur together and hence are likely to regulate downstream genes together (see the mapping picture).
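
To make the hypergeometric calculation more concrete, the sketch below computes an upper-tail co-occurrence P-value for one pair of motifs. The parameterization is an assumption, not taken from FAQ 15: it uses the total number of sequences, the number containing the first motif, the number containing the second motif, and the number containing both, and asks how likely it is to see at least that many shared sequences by chance. The function name and the example counts are hypothetical.

```python
from math import comb  # available in Python 3.8+


def cooccurrence_pvalue(total, with_a, with_b, with_both):
    """P(X >= with_both) for X ~ Hypergeometric(total, with_a, with_b).

    total:     number of sequences examined
    with_a:    sequences containing motif A
    with_b:    sequences containing motif B
    with_both: sequences containing both motifs
    """
    if not (0 <= with_both <= min(with_a, with_b) <= max(with_a, with_b) <= total):
        raise ValueError("inconsistent counts")
    upper = min(with_a, with_b)
    # Number of ways to place motif B's sequences so that at least
    # `with_both` of them fall among motif A's sequences.
    tail = sum(
        comb(with_a, i) * comb(total - with_a, with_b - i)
        for i in range(with_both, upper + 1)
    )
    return tail / comb(total, with_b)


# Made-up counts: 20 sequences, 8 with motif A, 7 with motif B, 6 with both.
p = cooccurrence_pvalue(20, 8, 7, 6)
print(p)
```

A small P-value means the two motifs share more sequences than expected if they occurred independently, which is the sense in which a pair of motifs is said to prefer to occur together.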


9. How to interpret the results page of motif comparison and clustering

(i) The paired similarity, which is a table containing the similarity score between every pair of submitted motifs. All three columns can be sorted, so the users can easily answer questions such as "which pair of motifs has the highest similarity score?" and "for a given motif, which is the most similar one?". In addition, the two motif logos are shown when the users click the similarity score.

(ii) The similarity matrix, which gives the users an overall view of the similarities between all pairs of motifs. Each element is colored according to the corresponding similarity score: the darker the color, the more similar the two motifs are. The server also supports sorting on each column, to find the motifs most similar to a target motif.
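
The relation between the two views can be sketched in a few lines: the paired similarity table is a list of (motif, motif, score) rows, and the similarity matrix is the same data arranged symmetrically. The example below assumes that a higher score means more similar, as stated above; the scores, motif names and function names are made up for illustration and do not come from the server.

```python
def most_similar(pairs, target):
    """Return (other_motif, score) with the highest score for `target`, or None."""
    best = None
    for a, b, score in pairs:
        if target in (a, b) and a != b:
            other = b if a == target else a
            if best is None or score > best[1]:
                best = (other, score)
    return best


def similarity_matrix(pairs):
    """Arrange pairwise scores as a symmetric matrix.

    Returns (names, matrix); entries with no score (including the
    diagonal) are left as None.
    """
    names = sorted({n for a, b, _ in pairs for n in (a, b)})
    index = {n: i for i, n in enumerate(names)}
    matrix = [[None] * len(names) for _ in names]
    for a, b, score in pairs:
        matrix[index[a]][index[b]] = score
        matrix[index[b]][index[a]] = score
    return names, matrix


# Hypothetical rows of a paired similarity table.
pairs = [
    ("motif-1", "motif-2", 0.82),
    ("motif-1", "motif-3", 0.35),
    ("motif-2", "motif-3", 0.41),
]
top_pair = max(pairs, key=lambda row: row[2])   # highest-scoring pair
names, matrix = similarity_matrix(pairs)
print(top_pair)
print(most_similar(pairs, "motif-3"))
```

Sorting the table by score answers the first question in (i), while `most_similar` corresponds to sorting one column of the matrix for a target motif.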