TF-8mer glossary and GENRE background construction method

Online manuals, source code, and executables



Abstract

Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs.

Paper
Mariani L, Weinand K, Vedenko A, Barrera LA, Bulyk ML. Identification of human lineage-specific transcriptional co-regulators enabled by a glossary of binding modules and tunable genomic backgrounds. Cell Systems. 2017 Sep 27;5(3):187-201
Documentation
This document acts as a tutorial for using GENRE and the glossary.
Software
GENRE software is copyright (c) 2017 The Brigham and Women's Hospital, Inc.
Glossary software is copyright (c) 2017 The Brigham and Women's Hospital, Inc.
Licensing
The software provided herein is free for academic instruction and research use only. Commercial licenses are available to legal entities, including companies and organizations (both for-profit and non-profit), requiring the software for general commercial use. To obtain a commercial license, please contact us via e-mail.
Disclaimer
This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. The code should not be modified and/or redistributed without the permission of the authors.

By downloading this software, you are agreeing to the above licensing information and disclaimer.

For questions, comments, and concerns, please contact Dr. Martha Bulyk at: mlbulyk [at] genetics [dot] med [dot] harvard [dot] edu


NEW! This software has been incorporated into MEDEA, a new motif enrichment tool for chromatin accessibility data. For more information, see MEDEA.









This page was last updated May 4, 2020