LRpath - Pathway Analysis using Logistic Regression

Pathway Analysis using Logistic Regression

NEW: RNA-Enrich option for RNA-seq data

Why use our RNA-Enrich version? In tests for differential expression (DE) in RNA-seq data, there is often a relationship between gene read count and the statistical power to detect DE. This relationship has been shown to bias gene set enrichment testing. RNA-Enrich accounts for this bias empirically.

Like our standard LRpath test, RNA-Enrich does not require a p-value cut-off to define differentially expressed genes, and it works well even for experiments with small sample sizes.
Adjusting for read counts per gene improves the type 1 error rate and power of the test.
RNA-Enrich runs significantly faster than the standard LRpath test.
When no relationship between gene read count and power exists, the results approximate the standard LRpath results.

Overview

LRpath performs gene set enrichment testing, an approach that tests whether predefined, biologically relevant gene sets contain more significant genes from an experimental dataset than expected by chance. Given a high-throughput dataset with continuous significance values (i.e. p-values), LRpath tests for gene sets (termed concepts) whose genes are more significant (e.g. for differential expression) than expected at random. LRpath can identify both concepts that have a few genes with very significant differential expression and concepts containing many genes with only moderate differential expression.

This user interface provides a user-friendly implementation of LRpath and greatly expands the set of concepts available to test beyond those in the original publication [1]. Genes are mapped to concepts using their Entrez Gene IDs. The pre-defined gene sets (concept databases) available to test depend on the species, but for human, mouse, and rat they include all those used in ConceptGen.

The use of logistic regression allows the data to remain on a continuous scale while maintaining the interpretation of results in terms of an odds ratio, as is used with the standard Fisher's Exact test. Detailed methods are provided here. When LRpath is run for multiple comparisons in an experiment, it can be useful to visualize the results in a clustering heatmap (see example). To cluster your own LRpath results, scroll down to the Clustering section at the bottom of the page.

You can watch a tutorial on LRpath here.

Input

LRpath Clustering Analysis

LRpath Cluster Analysis allows you to integrate your LRpath results from multiple experiments in order to interactively view and explore the enrichment profiles of a set of concepts across experiments. It provides a user-friendly method for filtering, merging, and clustering LRpath results using several options. The output of this section is a set of files required to view the hierarchical clustering with each row corresponding to a concept, and each column corresponding to an experiment. In order to view and interact with the results of the cluster analysis you can use the freely available TreeView software. Simply save the output files from the cluster analysis in one folder, and then once TreeView is installed, start the program, and open the saved .cdt file. For more help, see the Java TreeView Documentation. An example of the resulting clustering is provided here.

Analysis Form

Select value to cluster by:

Select method for distance matrix:

Select link for clustering:

Cluster concepts with < in at least LRpath comparisons.
cannot exceed the number of URLs provided

URL	Comparison Name

Enter two or more URLs for LRpath text results to cluster, and a name for each comparison/LRpath result (names must be in the same order as the URLs). Example URL: http://lrpath.ncibi.org/result/download999999999.txt
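The filtering step of the clustering form (keeping only concepts that pass a significance threshold in a minimum number of comparisons) can be sketched as follows. This is an illustration under assumptions, not the server's code: the function name, the in-memory dictionary layout, and the example concept names and p-values are all hypothetical, and the sketch does not parse the actual LRpath result files.

```python
def select_concepts(p_values_by_concept, cutoff, min_comparisons):
    """Keep concepts with a p-value below `cutoff` in at least
    `min_comparisons` comparisons.

    p_values_by_concept maps a concept name to a list with one
    p-value per comparison (same order for every concept).
    """
    lengths = {len(values) for values in p_values_by_concept.values()}
    if len(lengths) > 1:
        raise ValueError("every concept needs one value per comparison")
    n_comparisons = lengths.pop() if lengths else 0
    # Mirrors the form's rule: the count cannot exceed the number of inputs
    if min_comparisons > n_comparisons:
        raise ValueError("min_comparisons cannot exceed the number of comparisons")
    return {
        concept: values
        for concept, values in p_values_by_concept.items()
        if sum(p < cutoff for p in values) >= min_comparisons
    }

# Hypothetical p-values for three concepts across three comparisons
results = {
    "concept_A": [0.001, 0.02, 0.3],
    "concept_B": [0.4, 0.6, 0.03],
    "concept_C": [0.0005, 0.004, 0.01],
}
kept = select_concepts(results, cutoff=0.05, min_comparisons=2)
print(sorted(kept))
```

With these example values, concept_B is dropped because it passes the cutoff in only one comparison. The retained concepts would then form the rows of the matrix that is clustered, with one column per comparison.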

Reference

Please reference the following publications when citing LRpath:

Kim JH, Karnovsky A, Mahavisno V, Weymouth T, Pande M, Dolinoy DC, Rozek LS, Sartor MA. (2012) LRpath analysis reveals common pathways dysregulated via DNA methylation across cancer types, BMC Genomics, 13, 526.

Lee C, Patil S, Sartor MA. RNA-Enrich: A cut-off free functional enrichment testing method for RNA-seq with improved power. In progress.

Newton MA, Quintana FA, Boon JA, Sengupta S, Ahlquist P. (2007) Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis, Ann. Appl. Stat., 1(1), 85-106.

For support and questions email: snehal@med.umich.edu

Copyright 2010 The University of Michigan

Grant # R01 LM008106 ("Representing and Acquiring Knowledge of Genome
Regulation") and the National Center for Integrative Biomedical
Informatics (NCIBI), NIH Grant # U54 DA021519 01A1

Species
Database	Functional Annotations: Biocarta Pathway, EHMN metabolic pathways, GO, GO Biological Process, GO Cellular Component, GO Molecular Function, KEGG Pathway, Panther Pathway, pFAM, Reactome
	Literature Derived: MeSH
	Targets: Drug Bank, miRBase, Transcription Factors
	Interaction: Protein Interaction (MiMI)
	Other: Metabolite, Cytoband
	Custom (new): Custom
	To test custom gene sets, please provide a text file with 2 columns in the following order: (1) gene set ID or name. An example is provided here.
	Selecting multiple concept databases, or a single large one, may require several minutes of computation time. For approximate LRpath running times against different databases, view this table.
Methods	LRpath: Best for microarray data, or for RNA-seq data known not to exhibit any relationship between gene read counts and differential gene expression p-values.
	RNA-Enrich: Faster than LRpath and best for RNA-seq data that exhibit a relationship between gene read counts and significance of differential gene expression (e.g. genes with higher read counts may have more significant p-values, and vice versa). Unless a thorough analysis has shown that no such relationship exists, we recommend RNA-Enrich for RNA-seq data.
	Random Sets: Provides an approximation of LRpath with a faster calculation time.
Directional test?	Yes / No
	Yes: A test will be performed that allows the user to distinguish between 'Up' and 'Down' regulated concepts. A directional test requires the user to specify a direction for each gene in the input file.
	No: A test will be performed that allows the user to distinguish between 'Enriched' and 'Depleted' concepts, but not between concepts enriched with 'Up' versus 'Down' regulated genes.
Select input file	LRpath or Random Sets: Tab-delimited text with columns: (1) Entrez Gene ID or official gene symbol (Entrez Gene ID is recommended), (2) p-value, and (3) if a directional test is selected, a column indicating up-regulation (any positive value) or down-regulation (any negative value).
	RNA-Enrich: Same as above, but also include (4) either a column with the average read count for each gene, or multiple columns with read counts for each gene and each subject/sample.
	See the top of an example input file here.
	Drosophila with KEGG: FlyBase IDs are used instead of Entrez Gene IDs (e.g. FBgn0036605).
	Yeast: Use SGD IDs (e.g. YBL091C) for all databases.
	Entrez Gene ID, Official Gene Symbol, Other Identifier
Significance cut-off for reporting the driving genes
Analysis Name	Please provide a meaningful name for this analysis.
Email	Please provide your email address if you wish to be notified when the analysis has been completed.

Options to filter which concepts are tested:
Maximum number of genes in concept		Minimum number of genes in concept
Statistical Options:
Low value used to calculate odds ratio		High value used to calculate odds ratio
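
Before uploading, it can help to check an input file against the column layout described under "Select input file". The sketch below is one possible local check, not part of LRpath. It assumes that the file has no header row, that the direction column is present only when a directional test is selected, and that read-count columns follow immediately after the preceding columns; the function name and the example rows are hypothetical.

```python
import csv
import io

def parse_input(text, directional=False, rna_enrich=False):
    """Parse tab-delimited rows: gene ID, p-value, [direction], [read counts...]."""
    records = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_no, fields in enumerate(reader, start=1):
        if not any(field.strip() for field in fields):
            continue  # skip blank lines
        required = 2 + int(directional) + int(rna_enrich)
        if len(fields) < required:
            raise ValueError(f"line {line_no}: expected at least {required} columns")
        p_value = float(fields[1])
        if not 0.0 <= p_value <= 1.0:
            raise ValueError(f"line {line_no}: p-value {p_value} is outside [0, 1]")
        record = {"gene": fields[0].strip(), "p_value": p_value}
        next_col = 2
        if directional:
            sign = float(fields[next_col])
            if sign == 0:
                raise ValueError(f"line {line_no}: direction must be positive or negative")
            record["direction"] = "up" if sign > 0 else "down"
            next_col += 1
        if rna_enrich:
            # One average column or several per-sample columns; store the mean
            counts = [float(value) for value in fields[next_col:]]
            record["mean_read_count"] = sum(counts) / len(counts)
        records.append(record)
    return records

# Hypothetical example: gene ID, p-value, direction, two per-sample read counts
sample = "1017\t0.0004\t1.8\t250\t310\n5594\t0.2\t-0.4\t12\t9\n"
rows = parse_input(sample, directional=True, rna_enrich=True)
print(rows)
```

Reducing per-sample counts to their mean is a simplification made here so that both accepted layouts (a single average column or multiple per-sample columns) yield the same record structure.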