Motif analysis application for 3T3-L1 DNase hypersensitivity data.
The site (http://fraenkel.mit.edu/adipo_sight/) accepts as input a list of mouse genes in the form of MGI official gene symbols and calculates DNA sequence motif enrichment in five sets of DNase hypersensitive regions: four condition-specific sets (dexamethasone-induced, high insulin-induced, hypoxia-induced, and TNFα-induced) in 3T3-L1 adipocytes, and one non-condition-specific set from untreated mature 3T3-L1 adipocytes (general adipocyte DHS). The condition-specific DNase hypersensitive regions are called using the MACS software (Zhang, Liu et al. 2008) with the treated sample as the foreground and the untreated 3T3-L1 adipocytes as the background. The general adipocyte DNase hypersensitive regions are called with the untreated 3T3-L1 adipocytes as the foreground and total genomic DNA treated with the same amount of DNase I as the background (Sabo, Kuehn et al. 2006). A hypersensitive peak is mapped to a gene if any portion of the peak falls within 10 kb of the gene's transcription start site. Motifs with high information content from the TRANSFAC database (Matys, Fricke et al. 2003) are tested for enrichment. Enrichment is calculated using the hypergeometric distribution as the probability that a query in which k of n sites contain a motif is drawn from the full set of N sites, of which m contain the motif. A site is considered to contain a motif if its maximum motif score is above a precomputed threshold. To calculate these thresholds, an empirical probability mass function (epmf) of all scores in the general adipocyte DHS set is built for each motif. The thresholds are the motif scores above which the top fractions of the probability mass lie, where the fractions are equal to the mass above 1, 2, or 3 standard deviations of a normal distribution. For each motif, a p-value is calculated from the hypergeometric distribution and corrected for multiple hypothesis testing with the Benjamini-Hochberg procedure.
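The threshold step described above can be illustrated with a minimal Python sketch. It is not the site's implementation. The function names (upper_tail_mass, score_threshold, contains_motif) are hypothetical, the normal tail is assumed to be one-sided (the upper tail only), and the handling of tied scores is an arbitrary choice made for the example.

```python
import math
from statistics import NormalDist


def upper_tail_mass(n_sd):
    """Mass of a standard normal distribution above n_sd standard deviations.

    Assumption: a one-sided (upper) tail is intended.
    """
    return 1.0 - NormalDist().cdf(n_sd)


def score_threshold(scores, n_sd):
    """Pick a cutoff from the empirical distribution of motif scores.

    The returned score is the lowest score that still belongs to the
    top `upper_tail_mass(n_sd)` fraction of the sorted scores.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    tail = upper_tail_mass(n_sd)
    # Index of the first score inside the top `tail` fraction.
    cut = math.ceil((1.0 - tail) * len(ordered))
    cut = min(cut, len(ordered) - 1)
    return ordered[cut]


def contains_motif(max_score, threshold):
    """A site contains the motif if its best score is above the threshold."""
    return max_score > threshold
```

With 1000 evenly spaced scores, for example, the 1 SD cutoff leaves roughly the top 16% of scores above it, and the 3 SD cutoff leaves only the very highest score.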
Motif results are reported only if the input genes map to 10 or more hypersensitive sites. Only p-values less than 0.05 are displayed, and the output for each condition can be sorted by p-value by clicking on the condition name.
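The enrichment test, multiple-testing correction, and reporting rules described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code behind the site. The function names and the layout of motif_counts are hypothetical, the p-value is assumed to be the upper tail of the hypergeometric distribution (k or more motif-containing sites), and the 0.05 cutoff is assumed to apply to the Benjamini-Hochberg corrected values.

```python
from math import comb


def hypergeom_pvalue(k, n, m, N):
    """P(X >= k) when n sites are drawn from N sites, m of which contain the motif.

    Assumption: enrichment is the upper tail of the distribution.
    """
    upper = min(n, m)
    tail = sum(comb(m, i) * comb(N - m, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    total = len(pvalues)
    order = sorted(range(total), key=lambda i: pvalues[i])
    adjusted = [0.0] * total
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(total, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * total / rank)
        adjusted[idx] = running_min
    return adjusted


def report(motif_counts, n, N, alpha=0.05, min_sites=10):
    """List (motif, corrected p-value) pairs that pass the reporting rules.

    motif_counts maps a motif name to (k, m): motif-containing sites in
    the query and in the full set (hypothetical layout for this sketch).
    """
    if n < min_sites:
        return []  # too few hypersensitive sites for the input genes
    names = list(motif_counts)
    raw = [hypergeom_pvalue(motif_counts[x][0], n, motif_counts[x][1], N)
           for x in names]
    corrected = benjamini_hochberg(raw)
    hits = [(name, p) for name, p in zip(names, corrected) if p < alpha]
    return sorted(hits, key=lambda hit: hit[1])
```

For a query of 20 sites drawn from 1000, a motif found in 15 query sites and 100 sites overall would be reported, while one found in 2 query sites and 100 sites overall would not.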
MacIsaac, K. D., K. A. Lo, et al. (2010). "A quantitative model of transcriptional regulation reveals the influence of binding location on expression." PLoS Comput Biol 6(4): e1000773.
Sabo, P. J., M. S. Kuehn, et al. (2006). "Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA microarrays." Nat Methods 3(7): 511-518.
Zhang, Y., T. Liu, et al. (2008). "Model-based analysis of ChIP-Seq (MACS)." Genome Biol 9(9): R137.
Matys, V., E. Fricke, et al. (2003). "TRANSFAC: transcriptional regulation, from patterns to profiles." Nucleic Acids Res 31(1): 374-378.