https://github.com/jcaperella29/atac_app
Interactive Shiny app for annotating MACS2 ATAC-seq peaks and visualizing functional enrichment (GO, KEGG, Reactome), fully containerized for Docker & HPC.
atac-seq bioinformatics chipseeker functional-enrichment genomics-tools r shiny singularity
- Host: GitHub
- URL: https://github.com/jcaperella29/atac_app
- Owner: jcaperella29
- Created: 2025-03-20T02:13:42.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-30T18:58:42.000Z (12 months ago)
- Last Synced: 2025-06-30T19:28:28.886Z (12 months ago)
- Topics: atac-seq, bioinformatics, chipseeker, functional-enrichment, genomics-tools, r, shiny, singularity
- Language: R
- Homepage:
- Size: 2.06 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# JCAP_ATAC_SEQ APP
An interactive R Shiny app for analyzing and visualizing ATAC-seq peak data.
Upload BED files, generate consensus peaks, annotate regions, run motif enrichment, perform machine learning, simulate power, and more, with **no coding required**.
---
## Features
- **Upload** ZIPs of BED files to generate consensus peaks
- **Peak annotation** (nearest gene, distance to TSS, etc.)
- **Motif enrichment** (using motifmatchr + JASPAR2020)
- **Differential accessibility analysis** (DAA) with DESeq2
- **PCA, UMAP, heatmaps** for exploratory analysis
- **Random Forest** classifier (AUC, feature importance)
- **Power analysis** for experimental design
- **Downloadable** results for all major outputs
- **Anime-inspired UI** (with fairy_tail.css)
- Runs in RStudio, Shiny Server, or on HPC (Singularity)
---
## Sample Data
To test the app's full workflow, use the provided files:

| File                         | Purpose                          |
|------------------------------|----------------------------------|
| `focused_promoter_peaks.zip` | Consensus peaks (ZIP of BED)     |
| `simulated_counts (1).csv`   | Simulated peak × sample counts   |
| `simulated_metadata (1).csv` | Sample annotations (incl. group) |

**How to use:**
Upload each file via the corresponding input in the sidebar.
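
Before uploading your own counts and metadata, it can help to confirm that both files describe the same samples. The sketch below is illustrative only: the column names `peak_id`, `sample`, and `group`, the helper `check_inputs()`, and the toy values are assumptions, not the app's documented schema.

```r
# Toy stand-ins for a peak x sample count table and a metadata table.
# Column names (peak_id, sample, group) are assumed for illustration.
counts <- data.frame(
  peak_id = c("peak1", "peak2", "peak3"),
  S1 = c(10, 0, 25), S2 = c(12, 3, 30), S3 = c(40, 1, 8), S4 = c(38, 0, 11)
)
metadata <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  group  = c("Control", "Control", "Treated", "Treated")
)

# Hypothetical helper: report mismatches between the two inputs.
check_inputs <- function(counts, metadata, id_col = "peak_id") {
  sample_cols <- setdiff(names(counts), id_col)
  is_num <- vapply(counts[sample_cols], is.numeric, logical(1))
  list(
    missing_in_metadata = setdiff(sample_cols, metadata$sample),
    missing_in_counts   = setdiff(metadata$sample, sample_cols),
    non_numeric_columns = sample_cols[!is_num],
    groups              = table(metadata$group)
  )
}

report <- check_inputs(counts, metadata)
print(report)
```

Empty `missing_*` and `non_numeric_columns` entries suggest the two files line up; the `groups` table shows how many samples each condition has.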
---
## Requirements
You need R (≥ 4.3.x) and the following R packages:
```r
# CRAN packages
install.packages(c(
  "shiny", "shinyjs", "plotly", "DT", "markdown",
  "randomForest", "pROC", "enrichR", "uwot", "umap"
))

# Bioconductor packages (installed through BiocManager)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c(
  "GenomicRanges", "GenomicFeatures", "org.Hs.eg.db",
  "TxDb.Hsapiens.UCSC.hg38.knownGene", "TFBSTools",
  "JASPAR2020", "BSgenome.Hsapiens.UCSC.hg38", "motifmatchr",
  "DESeq2"  # used by the differential accessibility (DAA) step
))
```
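
To see which of the packages listed above are still missing before launching the app, a check along these lines may help. It is a minimal sketch: the `required` vector simply repeats the names from the install commands, and `missing_pkgs` is a name chosen here for illustration.

```r
# Package names copied from the install commands in this README.
required <- c(
  "shiny", "shinyjs", "plotly", "DT", "markdown",
  "randomForest", "pROC", "enrichR", "uwot", "umap",
  "GenomicRanges", "GenomicFeatures", "org.Hs.eg.db",
  "TxDb.Hsapiens.UCSC.hg38.knownGene", "TFBSTools",
  "JASPAR2020", "BSgenome.Hsapiens.UCSC.hg38", "motifmatchr"
)

# Compare against the installed library without loading anything.
missing_pkgs <- setdiff(required, rownames(installed.packages()))

if (length(missing_pkgs) == 0) {
  message("All listed packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```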
---
## How to Run the App
### Option 1: Local RStudio
1. Download or clone this repository.
2. Open RStudio and run the following in the console, replacing the path with your project folder:

       setwd("/path/to/ATAC_APP")
       library(shiny)
       runApp(".")

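The app will open in your browser. `runApp(".")` prints the address it is listening on in the R console; the address is http://localhost:8787 only when the app or server is configured to use port 8787.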
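
If you want the local address to be predictable, the port can be pinned when launching. This is a minimal sketch, assuming the app lives in the current folder; `launch_app()` is a hypothetical wrapper, while `appDir`, `port`, `host`, and `launch.browser` are standard `shiny::runApp()` arguments.

```r
# Hypothetical wrapper that starts the app on a fixed local port.
launch_app <- function(app_dir = ".", port = 8787L) {
  shiny::runApp(
    appDir = app_dir,
    port = port,              # fixed port instead of a randomly chosen one
    host = "127.0.0.1",       # listen on the local machine only
    launch.browser = FALSE    # open the address manually
  )
}

# Example (blocks the R session while the app is running):
# launch_app("/path/to/ATAC_APP")   # then browse to http://localhost:8787
```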
### Option 2: HPC (with SLURM & Singularity)
1. Build the container image from the bash command line: `singularity build atac-shiny.sif Singularity.def`
2. Submit your job with SLURM: `sbatch run_atac_app.sh`. Monitor the output and error logs as specified in the batch script.
3. Forward port 8787 to your local machine for browser access: `ssh -L 8787:localhost:8787 user@cluster`
4. Open the app in your browser at http://localhost:8787

**Tip:** Edit the example SLURM script included in the repo (`run_atac_app.sh`) to adjust resource requirements or output locations for your cluster.
---
## How to Use the App
### Consensus Peaks
Upload a ZIP of BED files and click **Make Consensus Peaks**.
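
To make the idea of consensus peaks concrete, here is a base-R sketch that merges overlapping intervals across samples and records how many samples support each merged region. The app's actual consensus rule is not described in this README, so the union-merge approach, the `min_samples` threshold, and the function name `merge_peaks()` are assumptions for illustration.

```r
# Toy peak sets, one data frame per BED file (chrom, start, end).
peak_sets <- list(
  s1 = data.frame(chrom = c("chr1", "chr1"), start = c(100, 500), end = c(200, 600)),
  s2 = data.frame(chrom = c("chr1", "chr2"), start = c(150, 10),  end = c(250, 50))
)

# Hypothetical helper: merge overlapping (or touching) peaks across samples.
merge_peaks <- function(peak_sets, min_samples = 1) {
  all_peaks <- do.call(rbind, Map(function(df, id) cbind(df, sample = id),
                                  peak_sets, names(peak_sets)))
  all_peaks <- all_peaks[order(all_peaks$chrom, all_peaks$start), ]
  merged <- list()
  for (chrom in unique(all_peaks$chrom)) {
    x <- all_peaks[all_peaks$chrom == chrom, ]
    # A new cluster starts when a peak begins past the furthest end seen so far.
    running_end <- cummax(x$end)
    new_cluster <- c(TRUE, x$start[-1] > running_end[-nrow(x)])
    cluster <- cumsum(new_cluster)
    merged[[chrom]] <- data.frame(
      chrom = chrom,
      start = as.numeric(tapply(x$start, cluster, min)),
      end = as.numeric(tapply(x$end, cluster, max)),
      n_samples = as.numeric(tapply(x$sample, cluster,
                                    function(s) length(unique(s))))
    )
  }
  out <- do.call(rbind, merged)
  rownames(out) <- NULL
  out[out$n_samples >= min_samples, ]
}

consensus <- merge_peaks(peak_sets)
shared <- merge_peaks(peak_sets, min_samples = 2)
print(consensus)
```

Raising `min_samples` keeps only regions seen in several samples, which trades sensitivity for reproducibility.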
### Peak Annotation
Click **Run Peak Annotation**.
Results: Peak Annotation Table, Pie Chart.
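
The nearest-gene and distance-to-TSS columns can be pictured with a small base-R sketch. It assumes a toy TSS table with made-up gene names and coordinates and measures distance from the peak midpoint; the app presumably draws on the hg38 annotation packages listed under Requirements, and its exact distance definition is not stated here.

```r
# Toy inputs: gene names and coordinates are invented for illustration.
peaks <- data.frame(chrom = c("chr1", "chr1"),
                    start = c(900, 5000), end = c(1100, 5200))
tss <- data.frame(gene = c("GENE_A", "GENE_B"),
                  chrom = c("chr1", "chr1"), tss = c(1000, 4000))

# Hypothetical helper: nearest TSS on the same chromosome per peak midpoint.
annotate_nearest <- function(peaks, tss) {
  mid <- (peaks$start + peaks$end) / 2
  res <- lapply(seq_len(nrow(peaks)), function(i) {
    cand <- tss[tss$chrom == peaks$chrom[i], ]
    if (nrow(cand) == 0) {
      return(data.frame(nearest_gene = NA_character_, distance_to_tss = NA_real_))
    }
    d <- mid[i] - cand$tss          # signed distance in bp (strand ignored)
    j <- which.min(abs(d))
    data.frame(nearest_gene = cand$gene[j], distance_to_tss = d[j])
  })
  cbind(peaks, do.call(rbind, res))
}

annotated <- annotate_nearest(peaks, tss)
print(annotated)
```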
### Enrichr Pathway Analysis
To run Enrichr:
1. Run DAA first.
2. Select a contrast (e.g. `Treated_vs_Control`).
3. Choose a gene set (`DAA_ALL`, `DAA_UP`, or `DAA_DOWN`).
4. Click **Run Enrichr**.

The app will:
- Map ATAC peaks → genes
- Send the gene set to Enrichr
- Retrieve enriched GO, Reactome, KEGG, and regulatory pathways
- Rank results by Combined Score (or Adjusted p-value when needed)
- Filter out weak single-gene overlaps

Results appear in:
- **Enrichr Table** (full pathway list)
- **Enrichr Barplot** (top-ranked biological programs)

This converts chromatin accessibility changes into interpretable biological pathways for downstream analysis and LLM triage.
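
The ranking and filtering steps above can be sketched on a toy table shaped like Enrichr output. The column names follow what the enrichR package typically returns (`Term`, `Overlap`, `Adjusted.P.value`, `Combined.Score`); the pathway names and numbers are invented, and `rank_pathways()` with its `min_overlap` threshold is an assumption about how "weak single-gene overlaps" might be removed.

```r
# Toy Enrichr-style results; values are made up for illustration.
enrichr_like <- data.frame(
  Term = c("pathway A", "pathway B", "pathway C"),
  Overlap = c("5/120", "1/40", "3/75"),        # genes hit / genes in term
  Adjusted.P.value = c(0.001, 0.0005, 0.02),
  Combined.Score = c(85.2, 140.0, 32.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop low-overlap terms, then rank by Combined Score.
rank_pathways <- function(tbl, min_overlap = 2) {
  n_hit <- as.integer(sub("/.*$", "", tbl$Overlap))
  kept <- tbl[n_hit >= min_overlap, ]
  kept[order(-kept$Combined.Score, kept$Adjusted.P.value), ]
}

ranked <- rank_pathways(enrichr_like)
print(ranked)
```

Here "pathway B" has the highest Combined Score but rests on a single gene, so it is removed before ranking.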
### Motif Enrichment
Select a JASPAR family and click **Run Motif Enrichment**.
Results: Motif Table, Motif Barplot.
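
One common way to score motif enrichment is to compare how often a motif occurs in the peaks of interest versus a background set. The sketch below uses a one-sided Fisher's exact test on invented counts; it is not taken from the app, whose scoring method is not documented in this README, and `motif_enrichment_test()` is a hypothetical helper.

```r
# Hypothetical helper: is a motif over-represented in target peaks?
motif_enrichment_test <- function(hits_target, n_target,
                                  hits_background, n_background) {
  tab <- matrix(
    c(hits_target, n_target - hits_target,
      hits_background, n_background - hits_background),
    nrow = 2,
    dimnames = list(motif = c("present", "absent"),
                    set = c("target", "background"))
  )
  ft <- fisher.test(tab, alternative = "greater")
  list(odds_ratio = unname(ft$estimate), p_value = ft$p.value)
}

# Invented counts: motif found in 30 of 100 target peaks
# and in 10 of 100 background peaks.
res <- motif_enrichment_test(30, 100, 10, 100)
print(res)
```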
### Counts & Metadata
Upload a count matrix (CSV) and metadata (CSV).
Click **Run DAA on Uploaded Counts** to see DESeq2 results.
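
For readers who want to reproduce the differential accessibility step outside the app, a typical DESeq2 call sequence looks roughly like this. It is a sketch under assumptions: the metadata columns `sample` and `group`, the level names `Treated` and `Control`, and the wrapper `run_daa()` are illustrative, and the app's own settings may differ.

```r
# Hypothetical wrapper around the standard DESeq2 workflow.
# `counts`: numeric matrix, peaks in rows, samples in columns.
# `metadata`: data frame with assumed columns `sample` and `group`.
run_daa <- function(counts, metadata, treated = "Treated", control = "Control") {
  # Samples must appear in the same order in both inputs.
  stopifnot(identical(colnames(counts), as.character(metadata$sample)))
  metadata$group <- factor(metadata$group)

  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = round(as.matrix(counts)),  # DESeq2 expects integer counts
    colData = metadata,
    design = ~ group
  )
  dds <- DESeq2::DESeq(dds)
  res <- DESeq2::results(dds, contrast = c("group", treated, control))

  res <- as.data.frame(res)
  res[order(res$padj), ]                   # most significant peaks first
}

# Example, once `counts` and `metadata` are loaded:
# daa <- run_daa(counts, metadata)
# head(daa[, c("log2FoldChange", "padj")])
```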
### Exploratory Visualizations
- PCA: click **Run PCA**
- UMAP: click **Run UMAP**
- Heatmap: click **Plot Heatmap**
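
The exploratory plots rest on a few standard steps: log-scale the counts, pick the most variable peaks, and project the samples. The base-R sketch below runs them on simulated counts; the choice of 50 peaks and the `log2(x + 1)` transform are assumptions, not the app's documented defaults.

```r
set.seed(1)

# Simulated peak x sample counts (200 peaks, 6 samples).
counts <- matrix(rpois(200 * 6, lambda = 50), nrow = 200,
                 dimnames = list(paste0("peak", 1:200), paste0("S", 1:6)))

# Log-scale, then keep the 50 most variable peaks (as a heatmap would show).
log_counts <- log2(counts + 1)
peak_var <- apply(log_counts, 1, var)
top <- log_counts[order(peak_var, decreasing = TRUE)[1:50], ]

# PCA with samples as rows.
pca <- prcomp(t(top), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

print(round(var_explained[1:2], 3))   # share of variance on PC1 and PC2
print(head(pca$x[, 1:2]))             # sample coordinates for a PCA plot
```

The same `top` matrix could be passed to `heatmap()` for a clustered view.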
### Random Forest Classifier
Click **Run Random Forest Classifier**.
Results: AUC, Accuracy, Feature Importance.
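
The three reported outputs (AUC, accuracy, feature importance) can be obtained from the randomForest and pROC packages roughly as follows. This is a sketch for a two-group comparison using out-of-bag predictions; `fit_rf()` is a hypothetical helper, and the app may split or tune the data differently.

```r
# Hypothetical helper: two-class random forest on log-scaled peak counts.
# `counts`: numeric matrix, peaks in rows, samples in columns.
# `group`: vector of class labels, one per sample.
fit_rf <- function(counts, group, ntree = 500) {
  x <- t(log2(counts + 1))        # samples in rows, peaks in columns
  y <- factor(group)
  stopifnot(nlevels(y) == 2, nrow(x) == length(y))

  rf <- randomForest::randomForest(x = x, y = y, ntree = ntree,
                                   importance = TRUE)

  # Out-of-bag vote fraction for the second class serves as the score.
  oob_prob <- rf$votes[, 2]
  roc_obj <- pROC::roc(response = y, predictor = oob_prob, quiet = TRUE)

  # type = 2 selects mean decrease in Gini impurity.
  gini <- randomForest::importance(rf, type = 2)[, 1]

  list(
    auc = as.numeric(pROC::auc(roc_obj)),
    accuracy = mean(rf$predicted == y),   # out-of-bag accuracy
    importance = sort(gini, decreasing = TRUE)
  )
}

# Example, once `counts` and `metadata` are loaded:
# fit <- fit_rf(counts, metadata$group)
# fit$auc; head(fit$importance)
```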
### Power Analysis
Set effect size, FDR, and replicates, then click **Run Power Analysis**.
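
To get a feel for how power grows with replicates, a simple two-sample t-test approximation can be tabulated in base R. This is not the app's method, which is not described in this README: the effect size `delta`, the standard deviation `sd`, and the use of a plain significance level in place of an FDR threshold are all assumptions.

```r
# Replicates per group to evaluate.
replicates <- 3:10

# Assumed effect: a difference of 1 (e.g. on a log2 scale) with sd 0.5.
power_by_n <- vapply(replicates, function(n) {
  power.t.test(n = n, delta = 1, sd = 0.5, sig.level = 0.05)$power
}, numeric(1))

power_table <- data.frame(replicates = replicates,
                          power = round(power_by_n, 3))
print(power_table)
```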
### Downloadable Results
All major tables have Download buttons.

The app presents its results in the following tabs:

| Tab Name | Description |
| ----------------------------------- | -------------------------------------------------------------------------------------- |
| **README** | Usage guide, pipeline overview, and interpretation notes |
| **Peak Annotation Table** | Annotated consensus peaks with nearest gene, TSS distance, and peak IDs (downloadable) |
| **Annotation Pie Chart** | Visual distribution of peaks by nearest gene (top hits) |
| **Motif Enrichment Table** | Transcription factor motifs matched to accessible regions |
| **Motif Enrichment Plot** | Barplot of top enriched transcription factor motifs |
| **DAA Results (per condition)** | Differential accessibility results from DESeq2 for each contrast |
| **DAA Gene Sets (ALL / UP / DOWN)** | Enrichr-ready gene sets built from significant ATAC peaks |
| **Enrichr Pathways** | Pathway enrichment of ATAC-derived gene sets (GO, Reactome, etc.) |
| **Enrichr Barplot** | Ranked pathway barplot (Combined Score / Adjusted p-value) |
| **PCA Plot** | Principal component analysis of ATAC peak counts |
| **UMAP Plot** | UMAP projection of samples based on chromatin accessibility |
| **Heatmap** | Clustered heatmap of top variable peaks (log-scaled counts) |
| **RF Metrics** | Random forest classifier performance (AUC, accuracy) |
| **Feature Importance** | Peaks ranked by Gini importance from the RF model |
| **ROC Curve** | Receiver-operating characteristic curve for classification |
| **Power Plot** | Statistical power vs. number of replicates |
| **Power Table** | Power estimates with confidence intervals (downloadable) |

---
## Developer Notes
- **Error logging:** All errors are appended to `error_log.txt`.
- **Logs:** Use `email_log.R` to manage or email logs if desired.
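
Appending errors to a log file can be done with `tryCatch()`. The sketch below shows one way to do it; `log_error()` is a hypothetical helper, not a function from `app.R`, and the example writes to a temporary file rather than the real `error_log.txt`.

```r
# Hypothetical helper: evaluate an expression and append any error to a log.
log_error <- function(expr, log_file = "error_log.txt") {
  tryCatch(expr, error = function(e) {
    line <- sprintf("[%s] %s",
                    format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
                    conditionMessage(e))
    cat(line, "\n", file = log_file, append = TRUE, sep = "")
    NULL   # returned to the caller in place of a result
  })
}

# Demonstration with a temporary log file.
tmp_log <- tempfile(fileext = ".txt")
result <- log_error(stop("counts file has no numeric columns"),
                    log_file = tmp_log)
print(readLines(tmp_log))
```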

Repository layout (`ATAC_APP/`):
- `app.R`: Main Shiny app
- `run.sh`: Launch script (local)
- `run_atac_app.sh`: SLURM batch script for HPC
- `email_log.R`: Log emailer
- `error_log.txt`: Error logs
- `www/`
  - `fairy_tail.css`: Themed UI
- `sample_data/`
  - `focused_promoter_peaks.zip`
  - `simulated_counts (1).csv`
  - `simulated_metadata (1).csv`
- `logs/`: Archived logs
- `Singularity.def`: Singularity definition
- `README.md`

---
## FAQ
**Can I use BAM files?**
No, only BED/peak-level data are supported. Use MACS2 or a similar tool to call peaks first.

**Is this cloud deployable?**
Not trivially: Bioconductor packages and large genome data require persistent local storage. HPC, local, or Docker/Singularity deployments are recommended.

**How big can my datasets be?**
The app is robust for datasets up to ~10k peaks × 100 samples. Larger sets may need more RAM.