https://github.com/greenelab/gea_community_detection

Overrepresentation analysis for KEGG and PID pathways using community detection
Host: GitHub
URL: https://github.com/greenelab/gea_community_detection
Owner: greenelab
License: bsd-3-clause
Created: 2016-06-04T17:16:19.000Z (about 10 years ago)
Default Branch: master
Last Pushed: 2018-01-07T00:47:34.000Z (over 8 years ago)
Last Synced: 2025-04-09T04:51:23.059Z (over 1 year ago)
Topics: analysis, detection-network, enrichment-analysis, gene-expression, kegg-pathway, networks, pathway, pathway-analysis, pid
Language: Python
Homepage:
Size: 5 MB
Stars: 7
Watchers: 4
Forks: 6
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

# GEA_Community_Detection

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.830568.svg)](https://doi.org/10.5281/zenodo.830568)

## Summary

This repository performs gene enrichment analysis using either the KEGG or the
PID database. The experiment has a control arm and an experimental arm. In the
control arm, a gene list is built from *m* pathways using only *p*\% of the genes
in each pathway, with *a*\% additional random genes from the ontology. This gene
list is then subjected to enrichment analysis to determine which pathways are
enriched. The experimental arm is identical to the control arm except that
community detection is performed before enrichment analysis, using one of
Fastgreedy, Walktrap, Infomap, or Multilevel as the grouping method. For all
methods, the F1-score, false positive ratio, and false negative ratio are returned.
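
The control arm described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in `Scripts/`: the toy pathways, the function names, the one-sided hypergeometric test with Bonferroni correction, the reading of *a*\% as a fraction of the sampled pathway genes, and the exact definitions of the false positive and false negative ratios are all assumptions made for the example. The community detection step of the experimental arm is not shown.

```python
import random
from math import comb


def simulate_gene_list(pathways, chosen, p, a, rng):
    """Take p% of the genes of each chosen pathway, then add a% random genes."""
    universe = sorted(set().union(*pathways.values()))
    genes = set()
    for name in chosen:
        members = sorted(pathways[name])
        k = max(1, round(len(members) * p / 100))
        genes.update(rng.sample(members, k))
    # Assumption: a% is measured relative to the number of sampled pathway genes.
    n_noise = round(len(genes) * a / 100)
    background = [g for g in universe if g not in genes]
    genes.update(rng.sample(background, min(n_noise, len(background))))
    return genes, universe


def hypergeom_pvalue(overlap, pathway_size, list_size, universe_size):
    """One-sided overrepresentation p-value, P(X >= overlap)."""
    total = comb(universe_size, list_size)
    upper = min(pathway_size, list_size)
    tail = sum(
        comb(pathway_size, i) * comb(universe_size - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def enriched_pathways(genes, pathways, universe, alpha=0.05):
    """Pathways whose Bonferroni-adjusted p-value falls below alpha."""
    hits = set()
    for name, members in pathways.items():
        overlap = len(genes & members)
        pval = hypergeom_pvalue(overlap, len(members), len(genes), len(universe))
        if min(1.0, pval * len(pathways)) < alpha:
            hits.add(name)
    return hits


def score(predicted, truth, all_pathways):
    """F1-score plus false positive and false negative ratios (assumed definitions)."""
    tp = len(predicted & truth)
    fp = len(predicted - truth)
    fn = len(truth - predicted)
    tn = len(all_pathways) - tp - fp - fn
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    fp_ratio = fp / (fp + tn) if (fp + tn) else 0.0
    fn_ratio = fn / (fn + tp) if (fn + tp) else 0.0
    return f1, fp_ratio, fn_ratio


# Toy ontology: 20 non-overlapping pathways of 30 hypothetical genes each.
toy_pathways = {
    f"pathway_{i}": {f"gene_{i}_{j}" for j in range(30)} for i in range(20)
}
truth = {"pathway_0", "pathway_1"}  # m = 2 selected pathways

rng = random.Random(0)
gene_list, universe = simulate_gene_list(toy_pathways, sorted(truth), p=50, a=10, rng=rng)
predicted = enriched_pathways(gene_list, toy_pathways, universe)
print(sorted(predicted), score(predicted, truth, toy_pathways))
```

In the experimental arm, the gene list would first be split into communities, and each community would be passed to `enriched_pathways` separately before the results are scored against the same set of selected pathways.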

All simulation figures are included in the `Paper_Figs` folder, and the simulation
results are included in the `Data` folder as `all_iterations_data.csv`.

![GEA Flowchart](Paper_Figs/flow_chart.png?raw=true)

## Reproducibility

To reproduce all analyses including simulations and HGSC applications:

```bash
# Create and activate reproducible conda environment
conda env create --force --file environment.yml
source activate gea_community_detection

# Data for this project can be downloaded using the script and URL text file
# located in the Data folder. This is required before running the pipeline.
bash Data/data_files.sh

# Reproduce all results
bash Scripts/gea_pipeline.sh
```

## Contact

* About the code: Lia Harrington (lia.x.harrington.gr@dartmouth.edu)

* About the project or collaboration: Jennifer Doherty
(jennifer.a.doherty@dartmouth.edu) or
Casey Greene (csgreene@mail.med.upenn.edu).
