{"id":17174924,"url":"https://github.com/tobiasrausch/atacseq","last_synced_at":"2026-03-18T01:35:18.206Z","repository":{"id":67180245,"uuid":"67402562","full_name":"tobiasrausch/ATACseq","owner":"tobiasrausch","description":"Analysis Workflow for Assay for Transposase-Accessible Chromatin using sequencing (ATAC-Seq)","archived":false,"fork":false,"pushed_at":"2023-08-03T08:44:45.000Z","size":338,"stargazers_count":75,"open_issues_count":1,"forks_count":36,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-05-30T00:08:15.940Z","etag":null,"topics":["atac-seq","atac-seq-pipeline","chromatin","next-generation-sequencing","nucleosome-positioning","peak-detection","sequencing"],"latest_commit_sha":null,"homepage":"https://tobiasrausch.com/courses/atac/","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tobiasrausch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-09-05T08:29:44.000Z","updated_at":"2025-01-11T14:38:34.000Z","dependencies_parsed_at":null,"dependency_job_id":"f9d1afb6-c48b-45be-a218-418f15bf5fe5","html_url":"https://github.com/tobiasrausch/ATACseq","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/tobiasrausch/ATACseq","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasrausch%2FATACseq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasrausch%2FATACseq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasrausch%2FATACseq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/tobiasrausch%2FATACseq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tobiasrausch","download_url":"https://codeload.github.com/tobiasrausch/ATACseq/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobiasrausch%2FATACseq/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30640254,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-18T00:09:27.587Z","status":"ssl_error","status_checked_at":"2026-03-18T00:09:26.123Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atac-seq","atac-seq-pipeline","chromatin","next-generation-sequencing","nucleosome-positioning","peak-detection","sequencing"],"created_at":"2024-10-14T23:55:16.155Z","updated_at":"2026-03-18T01:35:18.164Z","avatar_url":"https://github.com/tobiasrausch.png","language":"Shell","readme":"ATAC-Seq Pipeline Installation\n------------------------------\n\n`git clone https://github.com/tobiasrausch/ATACseq.git`\n\n`cd ATACseq`\n\n`make all`\n\nIf one of the above commands fails, your operating system probably lacks some build essentials. These are usually pre-installed, but if they are missing you need to install them. 
For instance, on Ubuntu this requires:\n\n`apt-get install build-essential g++ git wget unzip`\n\n\nBuilding promoter regions for QC and downloading motifs\n-------------------------------------------------------\n\nTo annotate motifs and estimate TSS enrichment, this repository includes simple scripts that build the promoter regions and download the motif databases.\n\n`cd bed/ \u0026\u0026 Rscript promoter.R \u0026\u0026 cd ..`\n\n`cd motif/ \u0026\u0026 ./downloadMotifs.sh \u0026\u0026 cd ..`\n\n\nRunning the ATAC-Seq analysis pipeline for a single sample\n----------------------------------------------------------\n\n`./src/atac.sh \u003chg38|hg19|mm10\u003e \u003cread1.fq.gz\u003e \u003cread2.fq.gz\u003e \u003cgenome.fa\u003e \u003coutput prefix\u003e`\n\n\nPlotting the key ATAC-Seq Quality Control metrics\n-------------------------------------------------\n\nAt various steps, the pipeline produces JSON QC files (`*.json.gz`). You can upload and interactively browse these files at [https://gear-genomics.embl.de/alfred/](https://gear-genomics.embl.de/alfred/). In addition, the pipeline produces a succinct QC file for each sample. 
If you have multiple output folders (one for each ATAC-Seq sample), you can simply concatenate the QC metrics of each sample.\n\n`head -n 1 ./*/*.key.metrics | grep \"TssEnrichment\" | uniq \u003e summary.tsv`\n\n`cat ./*/*.key.metrics | grep -v \"TssEnrichment\" \u003e\u003e summary.tsv`\n\nTo plot the distribution of all QC parameters:\n\n`Rscript R/metrics.R summary.tsv`\n\n\nATAC-Seq pipeline output files\n------------------------------\n\nThe ATAC-Seq pipeline produces various output files.\n\n* [Bowtie](https://github.com/BenLangmead/bowtie) BAM alignment files filtered for duplicates and mitochondrial reads.\n* Quality control output files from [alfred](https://github.com/tobiasrausch/alfred), [samtools](http://www.htslib.org/), [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and cutadapt adapter filter metrics.\n* [Macs](https://github.com/taoliu/MACS) peak calling files and [IDR](https://www.encodeproject.org/software/idr/) filtered peak lists.\n* Succinct browser tracks in bedGraph format and IGV's tdf format.\n* Footprint track of nucleosome positions and/or transcription-factor-bound DNA.\n* [Homer](http://homer.ucsd.edu/homer/motif/) motif finding results.\n\n\nDifferential peak calling\n-------------------------\n\nMerge peaks across samples and create a raw count matrix.\n\n`ls ./Sample1/Sample1.peaks ./Sample2/Sample2.peaks ./SampleN/SampleN.peaks \u003e peaks.lst`\n\n`ls ./Sample1/Sample1.bam ./Sample2/Sample2.bam ./SampleN/SampleN.bam \u003e bams.lst`\n\n`./src/count.sh hg19 peaks.lst bams.lst \u003coutput prefix\u003e`\n\nTo call differential peaks with DESeq2 on a count matrix for TSS peaks (counts.tss.gz), first create a file with sample-level information (sample.info). 
For instance, if you have 2 replicates per condition:\n\n`echo -e \"name\\tcondition\" \u003e sample.info`\n\n`zcat counts.tss.gz | head -n 1 | cut -f 5- | tr '\\t' '\\n' | sed 's/.final$//' | awk '{print $0\"\\t\"int((NR-1)/2);}' \u003e\u003e sample.info`\n\n`Rscript R/dpeaks.R counts.tss.gz sample.info`\n\n\nIntersecting peaks with annotation tracks\n-----------------------------------------\n\nPeaks can be intersected with annotation tracks such as enhancers or conserved elements, e.g.:\n\n`cd tracks/ \u0026\u0026 ./downloadTracks.sh \u0026\u0026 cd ..`\n\n`bedtools intersect -a ./Sample2/Sample2.peaks -b tracks/conserved.bed`\n\n\nPlotting peak density along all chromosomes\n-------------------------------------------\n\nA basic R script is available for plotting peak densities.\n\n`Rscript R/karyoplot.R input.peaks`\n\n\nCitation\n--------\n\nTobias Rausch, Markus Hsi-Yang Fritz, Jan O Korbel, Vladimir Benes.       \n[Alfred: Interactive multi-sample BAM alignment statistics, feature counting and feature annotation for long- and short-read sequencing.](https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/bty1007/5232224)      \nBioinformatics. 2018 Dec 6.\n\nB Erarslan, JB Kunz, T Rausch, P Richter-Pechanska et al.             \n[Chromatin accessibility landscape of pediatric T‐lymphoblastic leukemia and human T‐cell precursors](https://doi.org/10.15252/emmm.202012104)            \nEMBO Mol Med (2020)          \n\n\nLicense\n-------\nThis ATAC-Seq pipeline is distributed under the [BSD 3-Clause license](https://github.com/tobiasrausch/ATACseq/blob/main/LICENSE).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobiasrausch%2Fatacseq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftobiasrausch%2Fatacseq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobiasrausch%2Fatacseq/lists"}