CRACR_v2.m:

Usage: CRACR_v2('TFname', 'TFscoredfile', 'exprfile', 'geneidfile', 'finalout')

For example (using sample files online):
CRACR_v2('YBL054W', 'Pbf1-avgv9v11-0.35bgdsub_lteq600.txt', '1693cond_alllog2.txt', 'gene_IDs_1693.txt', 'Pbf1CRACR_results.txt')

Command Line Inputs:

1) TFname: The name of the TF whose binding specificities you are using to score genes.
   Use the same nomenclature as the gene expression data file (for yeast = systematic name, e.g. YBL054W) so that the expression value of the TF itself at each condition can be found.

2) TFscoredfile: A file containing gene names and corresponding scores for TF binding according to TF specificity data (from PBM or other source).
   CRACR uses ranks, so only the ranking, not the actual score values, will be taken into account.
   (NOTE: genes are sorted from high to low scores, assuming higher is better than lower.)

3) exprfile: Expression data in tab-delimited format with conditions in columns and genes in rows.
   Condition names should be in the first row and gene names in the first column.
   (Values are assumed to be log2 for relative colors in graph outputs, but any measure of expression will work because CRACR considers only relative expression levels.)

4) geneidfile: A file labeling each gene with an ID number (used for keeping track of genes in 'textless' MATLAB matrices).

5) finalout: An output file for reporting the significant expression conditions for each TF.

Other variables that can be changed within the program:

1) window: The window size of genes to be considered at a time.
   See the original MSB manuscript for discussion of the effect of window size. 200 has been most commonly used for yeast.

2) areathresh: The threshold for the area statistic.
   The default threshold of 0.095 corresponds to p<0.001 when using the 1693 yeast gene expression conditions. The threshold corresponding to a given p-value may vary for different datasets.

3) req_length: Number of windows that must cross the significance threshold before a condition is reported as significant. (Default = 1).

4) plot_num: The program will plot basic result graphs, but because there may be many significant conditions, resulting in an overwhelming number of graphs, this variable limits the number that will be plotted automatically.
   Others can be plotted later using graphconds.m (this will allow a colored background as well).

Outputs:

1) The output file you named in the input will contain, for each condition crossing the selected area statistic threshold, the area statistic, whether the enrichment occurred for induced or repressed genes, and the condition index number.

2) The full calculated area statistic data for all conditions and all windows will be stored in a .mat file named after the input TF scored genes filename. This can be used for later graphing and re-analysis.

------------------------------

graphconds.m

Usage: graphconds(TFname, TFfile, matfile, exprfile, geneidfile, condinds)

This program takes a .mat output file, loads it, and graphs the chosen conditions for the TF. Save the chosen conditions in the variable condinds before running the program.

condinds should be a matrix of indices of the conditions to graph, like [2 4 6]. Indices follow the column numbers in the expression file, so the first condition = 2, the second = 3, etc.
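
The following is a minimal sketch, not part of the CRACR distribution, showing one way to build condinds from condition names instead of counting columns by hand. It assumes the expression file layout described above (tab-delimited, condition names in the first row, gene names in the first column). The stand-in file, the condition names condA, condB, condC, and the values are hypothetical and exist only to make the sketch self-contained.

```matlab
% Write a tiny stand-in expression file (hypothetical names and values).
tmpfile = [tempname '.txt'];
fid = fopen(tmpfile, 'w');
fprintf(fid, 'gene\tcondA\tcondB\tcondC\n');
fprintf(fid, 'YBL054W\t0.1\t-0.4\t1.2\n');
fclose(fid);

% Read only the header row and split it on tabs.
% CollapseDelimiters is turned off so that an empty header cell
% does not shift the column numbering.
fid = fopen(tmpfile, 'r');
header = fgetl(fid);
fclose(fid);
colnames = strsplit(header, sprintf('\t'), 'CollapseDelimiters', false);

% The column number in the file is the index used in condinds,
% so the first condition (second column) has index 2.
wanted = {'condA', 'condC'};   % hypothetical condition names
[found, condinds] = ismember(wanted, colnames);
assert(all(found), 'Some condition names were not found in the header row.');

disp(condinds)   % displays 2 4

delete(tmpfile);
```

With a real expression file, replace tmpfile with the path to that file and wanted with condition names taken from its first row. The resulting condinds can then be passed as the last argument to graphconds.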