The Fraenkel Lab Biological Engineering MIT
 
Input
WebMOTIFS takes two types of input, described below:

Names of coregulated genes or bound probes

WebMOTIFS searches for motifs (repeated elements) in DNA sequences, specified by a list of genes or probes provided by the user. WebMOTIFS retrieves the corresponding sequences, which are automatically passed to the requested motif discovery programs.

WebMOTIFS currently supports motif discovery for DNA from S. cerevisiae, M. musculus, and H. sapiens.

WebMOTIFS can analyze either promoter regions or ChIP-chip data.

  • Promoter regions: search a specified region of DNA from a list of promoters. For example, for yeast, WebMOTIFS gives the option of searching sequences from 500 bases upstream to 200 bases downstream of the transcription start site, or searching the full intergenic region.
    The promoters can be specified using gene names or by entering probe names from a transcriptional profiling array. For a list of supported transcriptional profiling arrays, see here. If you would like us to add an array not already listed, please contact tamo@mit.edu.
  • ChIP-chip data: search the DNA surrounding a set of bound probes from a ChIP-chip experiment. The data should be entered as a list of probe IDs from one of the array platforms listed here. These ChIP-chip array platforms include probes that cover promoters from yeast, mouse, and human.
    If you would like us to add an array not already listed, please contact tamo@mit.edu.

How sequences are retrieved: For analysis of promoter regions, WebMOTIFS retrieves a user-specified region (for example, from -500 to +200 bp) around the transcription start site for every gene specified.
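As a rough illustration of the promoter case, the window around a transcription start site could be computed as in the sketch below. This is not WebMOTIFS code: the function name `promoter_window`, the 1-based inclusive coordinates, and the strand handling are assumptions made for the example, and offsets are applied directly to the start-site coordinate.

```python
def promoter_window(tss, strand, upstream=500, downstream=200):
    """Return (start, end) around a transcription start site.

    Hypothetical helper: coordinates are assumed to be 1-based and
    inclusive, and `strand` is "+" or "-". On the minus strand,
    "upstream" lies at higher genomic coordinates, so the window flips.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clamp so the window never runs off the start of the chromosome.
    return max(1, start), end


print(promoter_window(1000, "+"))  # (500, 1200)
print(promoter_window(1000, "-"))  # (800, 1500)
```

The defaults mirror the -500 to +200 example above; any other user-specified region would be passed through `upstream` and `downstream`.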
For analysis of ChIP-chip data, WebMOTIFS retrieves a sequence window centered on the specified probe. For PCR-based arrays, the sequence windows consist of the entire probe sequence (and often some additional base pairs upstream and downstream of the probe). For oligonucleotide arrays, the sequence windows are centered on the probe sequence, with several hundred base pairs retrieved upstream and downstream of the probe. The sizes of the sequence windows retrieved for each array are listed here, along with other details about ChIP-chip array platforms.
WebMOTIFS avoids retrieving overlapping sequences. Thus, if the user enters a series of closely spaced probes (such that the sequence windows corresponding to the probes overlap), WebMOTIFS will retrieve a sequence corresponding to the entire bound region.
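The overlap handling described above amounts to merging intervals. The sketch below shows one common way to do this, assuming each window is a `(chromosome, start, end)` tuple with inclusive coordinates. The function name `merge_windows` and the data layout are hypothetical, and the actual WebMOTIFS implementation may differ.

```python
def merge_windows(windows):
    """Merge overlapping (chrom, start, end) windows into bound regions.

    Hypothetical helper: coordinates are assumed inclusive, so two
    windows on the same chromosome overlap when the next start is less
    than or equal to the previous end.
    """
    merged = []
    # Sorting groups windows by chromosome, then by start position.
    for chrom, start, end in sorted(windows):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            # Extend the previous region instead of adding a new one.
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


probes = [
    ("chr1", 400, 900),
    ("chr1", 100, 600),    # overlaps the window above
    ("chr1", 2000, 2500),  # isolated probe
]
print(merge_windows(probes))
# [('chr1', 100, 900), ('chr1', 2000, 2500)]
```

Each merged tuple then corresponds to a single sequence to retrieve, so closely spaced probes contribute one sequence for the whole bound region rather than several overlapping ones.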

WebMOTIFS submits these sequences to individual motif discovery programs without further processing. In particular, repeats and coding exons are not masked. Individual motif discovery programs may therefore report simple repeats in their results, but these results are largely eliminated during postprocessing.

Format for input: Users can either paste a list of gene/probe names into a text box or upload a text file with a list of names. Names should be entered one per line. If you upload a file of names, it must be saved as plain text. (Other formats, such as *.rtf and *.doc, will make WebMOTIFS unable to recognize your gene/probe names.)
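If you want to check a name list before submitting it, a small script along the following lines may help. It is only a sketch: the function name `parse_names` is hypothetical, and dropping blank lines and duplicate names is an assumption about what is convenient, not a description of how WebMOTIFS itself processes input. The RTF check relies on the fact that RTF files begin with the characters `{\rtf`.

```python
def parse_names(text):
    """Return gene/probe names from plain text, one name per line.

    Hypothetical helper: blank lines are skipped and duplicates are
    dropped (keeping first occurrences) as a convenience; this is an
    assumption, not documented WebMOTIFS behavior.
    """
    if text.lstrip().startswith("{\\rtf"):
        raise ValueError("input looks like RTF; save the list as plain text")
    names = []
    seen = set()
    for line in text.splitlines():
        name = line.strip()
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    return names


print(parse_names("YAL001C\n  YBR123W \n\nYAL001C\n"))
# ['YAL001C', 'YBR123W']
```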

Seeds: Initial hypotheses for motif discovery

Bayesian motif discovery with WebMOTIFS requires the user to select one or more initial hypotheses (seeds) for the motif; WebMOTIFS then refines these hypotheses.

Currently, the hypotheses offered by WebMOTIFS are the Family Binding Profiles (FBPs) compiled by MacIsaac et al. Family Binding Profiles summarize the binding specificities of the most common DNA-binding domains. A list of the family binding profiles is available here.

If you know the DNA binding domain of your transcription factor of interest, you may select the corresponding FBP as an initial hypothesis. If you know nothing about the structure of your transcription factor of interest, you can run Bayesian motif discovery using all of the FBPs as possible initial hypotheses.


Questions or comments? Please email tamo@mit.edu.

Website created by Katherine Romer, MIT class of 2008
Last updated 1/15/2007