{"id":33187104,"url":"https://showteeth.github.io/ggcoverage/","last_synced_at":"2025-11-25T18:00:39.783Z","repository":{"id":63600049,"uuid":"515089359","full_name":"showteeth/ggcoverage","owner":"showteeth","description":"Visualize and annotate genomic coverage with ggplot2","archived":false,"fork":false,"pushed_at":"2025-01-17T06:14:54.000Z","size":35438,"stargazers_count":244,"open_issues_count":2,"forks_count":20,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-01-17T07:21:02.654Z","etag":null,"topics":["base-frequency","coverage","gc-content","gene","genome-coverage","ggplot2","ideogram","peak","track","transcripts","visualization"],"latest_commit_sha":null,"homepage":"https://showteeth.github.io/ggcoverage","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/showteeth.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-07-18T08:03:59.000Z","updated_at":"2025-01-17T06:14:56.000Z","dependencies_parsed_at":"2024-01-18T04:08:05.289Z","dependency_job_id":"594ae8bd-23b3-4dba-8bd9-50ad52418812","html_url":"https://github.com/showteeth/ggcoverage","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/showteeth/ggcoverage","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/showteeth%2Fggcoverage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/showteeth%2Fggcoverage/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/showteeth%2Fggcoverage/
releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/showteeth%2Fggcoverage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/showteeth","download_url":"https://codeload.github.com/showteeth/ggcoverage/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/showteeth%2Fggcoverage/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286079811,"owners_count":27282121,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-25T02:00:05.816Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["base-frequency","coverage","gc-content","gene","genome-coverage","ggplot2","ideogram","peak","track","transcripts","visualization"],"created_at":"2025-11-16T05:00:30.371Z","updated_at":"2025-11-25T18:00:39.776Z","avatar_url":"https://github.com/showteeth.png","language":"R","funding_links":[],"categories":["Data and models"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\",\n  dpi = 60,\n  crop = NULL\n)\n```\n\n# ggcoverage - Visualize and annotate omics coverage with ggplot2\n\n\u003cimg src = \"man/figures/ggcoverage.png\" align = \"right\" width = \"200\"/\u003e\n\n[![CRAN](https://www.r-pkg.org/badges/version/ggcoverage?color=orange)](https://cran.r-project.org/package=ggcoverage)\n[![R-CMD-check](https://github.com/showteeth/ggcoverage/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/showteeth/ggcoverage/actions/workflows/R-CMD-check.yaml)\n![GitHub issues](https://img.shields.io/github/issues/showteeth/ggcoverage)\n![GitHub last commit](https://img.shields.io/github/last-commit/showteeth/ggcoverage)\n![License](https://img.shields.io/badge/license-MIT-green)\n[![CODE_SIZE](https://img.shields.io/github/languages/code-size/showteeth/ggcoverage.svg)](https://github.com/showteeth/ggcoverage)\n\n## Introduction\n\nThe goal of `ggcoverage` is to visualize coverage tracks from genomics, transcriptomics or proteomics data. It contains functions to load data from BAM, BigWig, BedGraph, txt, or xlsx files, create genome/protein coverage plots, and add various annotations including base and amino acid composition, GC content, copy number variation (CNV), genes, transcripts, ideograms, peak highlights, HiC contact maps, contact links and protein features. 
It is based on and integrates well with `ggplot2`.\n\nIt contains three main parts:\n\n* **Load the data**: `ggcoverage` can load BAM, BigWig (.bw), BedGraph, txt/xlsx files from various omics data, including WGS, RNA-seq, ChIP-seq, ATAC-seq, proteomics, etc.\n* **Create omics coverage plot**\n* **Add annotations**: `ggcoverage` supports ten different annotations:\n  * **base and amino acid annotation**: Visualize genome coverage at single-nucleotide level with bases and amino acids.\n  * **GC annotation**: Visualize genome coverage with GC content\n  * **CNV annotation**: Visualize genome coverage with copy number variation (CNV)\n  * **gene annotation**: Visualize genome coverage across genes\n  * **transcript annotation**: Visualize genome coverage across different transcripts\n  * **ideogram annotation**: Visualize the position of the region on the whole chromosome\n  * **peak annotation**: Visualize genome coverage with identified peaks\n  * **contact map annotation**: Visualize genome coverage with Hi-C contact map\n  * **link annotation**: Visualize genome coverage with contacts\n  * **protein feature annotation**: Visualize protein coverage with features\n\n## Installation\n\n`ggcoverage` is an R package distributed as part of the [CRAN repository](https://cran.r-project.org/).\nTo install the package, start R and enter one of the following commands:\n  \n```{r install, eval = FALSE}\n# install via CRAN (not yet available)\ninstall.packages(\"ggcoverage\")\n\n# OR install via GitHub\ninstall.packages(\"remotes\")\nremotes::install_github(\"showteeth/ggcoverage\")\n```\n\nIn general, it is **recommended** to install from the [GitHub repository](https://github.com/showteeth/ggcoverage) (updated more regularly).\n\nOnce `ggcoverage` is installed, it can be loaded like every other package:\n\n```{r library, message = FALSE, warning = FALSE}\nlibrary(\"ggcoverage\")\n```\n\n## Manual\n\n`ggcoverage` provides two [vignettes](https://showteeth.github.io/ggcoverage/): \n\n* 
**detailed manual**: step-by-step usage\n* **customize the plot**: customize the plot and add additional layers\n\n\n## RNA-seq data\n\n### Load the data\n\nThe RNA-seq data used here is from [Transcription profiling by high throughput sequencing of HNRNPC knockdown and control HeLa cells](https://bioconductor.org/packages/release/data/experiment/html/RNAseqData.HNRNPC.bam.chr14.html). We select four samples to use as example: `ERR127307_chr14`, `ERR127306_chr14`, `ERR127303_chr14`, `ERR127302_chr14`, and all bam files were converted to bigwig files with [deeptools](https://deeptools.readthedocs.io/en/develop/).\n\nLoad metadata:\n\n```{r load_metadata}\n# load metadata\nmeta_file \u003c-\n  system.file(\"extdata\", \"RNA-seq\", \"meta_info.csv\", package = \"ggcoverage\")\nsample_meta \u003c- read.csv(meta_file)\nsample_meta\n```\n\nLoad track files:\n\n```{r load_track}\n# track folder\ntrack_folder \u003c- system.file(\"extdata\", \"RNA-seq\", package = \"ggcoverage\")\n# load bigwig file\ntrack_df \u003c- LoadTrackFile(\n  track.folder = track_folder,\n  format = \"bw\",\n  region = \"chr14:21,677,306-21,737,601\",\n  extend = 2000,\n  meta.info = sample_meta\n)\n# check data\nhead(track_df)\n```\n\nPrepare mark region:\n\n```{r prepare_mark}\n# create mark region\nmark_region \u003c- data.frame(\n  start = c(21678900, 21732001, 21737590),\n  end = c(21679900, 21732400, 21737650),\n  label = c(\"M1\", \"M2\", \"M3\")\n)\n# check data\nmark_region\n```\n\n### Load GTF\n\nTo add **gene annotation**, the gtf file should contain **gene_type** and **gene_name** attributes in **column 9**; to add **transcript annotation**, the gtf file should contain a **transcript_name** attribute in **column 9**.\n\n```{r load_gtf}\ngtf_file \u003c-\n  system.file(\"extdata\", \"used_hg19.gtf\", package = \"ggcoverage\")\ngtf_gr \u003c- rtracklayer::import.gff(con = gtf_file, format = \"gtf\")\n```\n\n### Basic coverage\n\nThe basic coverage plot has **two types**: \n\n* **facet**: 
Create a subplot for every track (specified by `facet.key`). This is the default.\n* **joint**: Visualize all tracks in a single plot.\n\n#### Joint view\n\nCreate a line plot for **every sample** (`facet.key = \"Type\"`) and color by **every sample** (`group.key = \"Type\"`):\n\n```{r basic_coverage_joint, warning = FALSE, fig.height = 4, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  plot.type = \"joint\",\n  facet.key = \"Type\",\n  group.key = \"Type\",\n  mark.region = mark_region,\n  range.position = \"out\"\n)\n\nbasic_coverage\n```\n\nCreate a **group average line plot** (sample is indicated by `facet.key = \"Type\"`, group is indicated by `group.key = \"Group\"`):\n\n```{r basic_coverage_joint_avg, warning = FALSE, fig.height = 4, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  plot.type = \"joint\",\n  facet.key = \"Type\",\n  group.key = \"Group\",\n  joint.avg = TRUE,\n  mark.region = mark_region,\n  range.position = \"out\"\n)\n\nbasic_coverage\n```\n\n#### Facet view\n\n```{r basic_coverage, warning = FALSE, fig.height = 6, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  plot.type = \"facet\",\n  mark.region = mark_region,\n  range.position = \"out\"\n)\n\nbasic_coverage\n```\n\n#### Custom Y-axis style\n\n**Place the Y-axis scale label inside/outside the plot region with `range.position`**:\n\n```{r basic_coverage_2, warning = FALSE, fig.height = 6, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  plot.type = \"facet\",\n  mark.region = mark_region,\n  range.position = \"in\"\n)\n\nbasic_coverage\n```\n\n**Shared/Free Y-axis scale with `facet.y.scale`**:\n\n```{r basic_coverage_3, warning = FALSE, fig.height = 6, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  plot.type = \"facet\",\n  mark.region = mark_region,\n  
range.position = \"in\",\n  facet.y.scale = \"fixed\"\n)\n\nbasic_coverage\n```\n\n### Add gene annotation\n\n- default behavior is to draw genes (transcripts), exons and UTRs with different line widths\n- line widths can be adjusted using the `gene.size`, `exon.size` and `utr.size` parameters\n- frequency of intermittent arrows (light color) can be adjusted using the `arrow.num` and `arrow.gap` parameters\n- genomic features are colored by `strand` by default, which can be changed using the `color.by` parameter\n\n```{r gene_coverage, warning = FALSE, fig.height = 8, fig.width = 12, fig.align = \"center\"}\nbasic_coverage +\n  geom_gene(gtf.gr = gtf_gr)\n```\n\n\n### Add transcript annotation\n\n**In \"loose\" style (default style; each transcript occupies one line)**:\n\n```{r transcript_coverage, warning = FALSE, fig.height = 12, fig.width = 12, fig.align = \"center\"}\nbasic_coverage +\n  geom_transcript(gtf.gr = gtf_gr, label.vjust = 1.5)\n```\n\n**In \"tight\" style (attempts to place non-overlapping transcripts in one line)**:\n\n```{r transcript_coverage_tight, warning = FALSE, fig.height = 12, fig.width = 12, fig.align = \"center\"}\nbasic_coverage +\n  geom_transcript(\n    gtf.gr = gtf_gr,\n    overlap.style = \"tight\",\n    label.vjust = 1.5\n  )\n```\n\n### Add ideogram\n\nThe ideogram is an overview plot showing the position of the plotted region on the chromosome.\nThe plotting of the ideogram is implemented by the `ggbio` package.\nThis package needs to be installed separately (it is only 'Suggested' by `ggcoverage`).\n\n```{r ideogram_coverage_1, warning = FALSE, fig.height = 10, fig.width = 12, fig.align = \"center\"}\nlibrary(ggbio)\n\nbasic_coverage +\n  geom_gene(gtf.gr = gtf_gr) +\n  geom_ideogram(genome = \"hg19\", plot.space = 0)\n```\n\n```{r ideogram_coverage_2, warning = FALSE, fig.height = 14, fig.width = 12, fig.align = \"center\"}\nbasic_coverage +\n  geom_transcript(gtf.gr = gtf_gr, label.vjust = 1.5) +\n  geom_ideogram(genome = \"hg19\", plot.space = 
0)\n```\n\n## DNA-seq data\n\n### CNV\n\n#### Example 1\n\n##### Load the data\n\nThe DNA-seq data used here are from [Copy number work flow](https://bioconductor.org/help/course-materials/2014/SeattleOct2014/B02.2.3_CopyNumber.html). We select the tumor sample and get bin counts with `cn.mops::getReadCountsFromBAM` with `WL` set to 1000.\n\n```{r load_bin_counts}\n# prepare metafile\ncnv_meta_info \u003c- data.frame(\n  SampleName = c(\"CNV_example\"),\n  Type = c(\"tumor\"),\n  Group = c(\"tumor\")\n)\n\n# track file\ntrack_file \u003c- system.file(\"extdata\",\n  \"DNA-seq\", \"CNV_example.txt\",\n  package = \"ggcoverage\"\n)\n\n# load txt file\ntrack_df \u003c- LoadTrackFile(\n  track.file = track_file,\n  format = \"txt\",\n  region = \"chr4:61750000-62,700,000\",\n  meta.info = cnv_meta_info\n)\n\n# check data\nhead(track_df)\n```\n\n##### Basic coverage\n\n```{r basic_coverage_dna, warning = FALSE, fig.height = 6, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  color = \"grey\",\n  mark.region = NULL,\n  range.position = \"out\"\n)\nbasic_coverage\n```\n\n##### Add GC annotations\n\nAdd **GC**, **ideogram** and **gene** annotations.\nThe plotting of the GC content requires the genome annotation package `BSgenome.Hsapiens.UCSC.hg19`.\nThis package needs to be installed separately (it is only 'Suggested' by `ggcoverage`).\n\n```{r gc_coverage, warning = FALSE, fig.height = 10, fig.width = 12, fig.align = \"center\"}\n# load genome data\nlibrary(\"BSgenome.Hsapiens.UCSC.hg19\")\n\n# create plot\nbasic_coverage +\n  geom_gc(bs.fa.seq = BSgenome.Hsapiens.UCSC.hg19) +\n  geom_gene(gtf.gr = gtf_gr) +\n  geom_ideogram(genome = \"hg19\")\n```\n\n#### Example 2\n\n##### Load the data\n\nThe DNA-seq data used here are from [Genome-wide copy number analysis of single cells](https://www.nature.com/articles/nprot.2012.039), and the accession number is 
[SRR054616](https://trace.ncbi.nlm.nih.gov/Traces/index.html?run=SRR054616).\n\n```{r cnv_load_track_file}\n# track file\ntrack_file \u003c-\n  system.file(\"extdata\", \"DNA-seq\", \"SRR054616.bw\", package = \"ggcoverage\")\n\n# load track\ntrack_df \u003c- LoadTrackFile(\n  track.file = track_file,\n  format = \"bw\",\n  region = \"4:1-160000000\"\n)\n\n# add chr prefix\ntrack_df$seqnames \u003c- paste0(\"chr\", track_df$seqnames)\n\n# check data\nhead(track_df)\n```\n\n##### Basic coverage\n\n```{r cnv_basic_coverage_dna}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  color = \"grey\",\n  mark.region = NULL,\n  range.position = \"out\"\n)\n\nbasic_coverage\n```\n\n##### Load CNV file\n\n```{r cnv_load_cnv}\n# prepare files\ncnv_file \u003c-\n  system.file(\"extdata\", \"DNA-seq\", \"SRR054616_copynumber.txt\",\n    package = \"ggcoverage\"\n  )\n\n# read CNV\ncnv_df \u003c- read.table(file = cnv_file, sep = \"\\t\", header = TRUE)\n\n# check data\nhead(cnv_df)\n```\n\n##### Add annotations\n\nAdd **GC**, **ideogram** and **CNV** annotations.\n\n```{r cnv_gc_coverage}\n# create plot\nbasic_coverage +\n  geom_gc(bs.fa.seq = BSgenome.Hsapiens.UCSC.hg19) +\n  geom_cnv(\n    cnv.df = cnv_df,\n    bin.col = 3,\n    cn.col = 4\n  ) +\n  geom_ideogram(\n    genome = \"hg19\",\n    plot.space = 0,\n    highlight.centromere = TRUE\n  )\n```\n\n\n### Single-nucleotide level\n\n#### Load the data\n\n```{r load_single_nuc}\n# prepare sample metadata\nsample_meta \u003c- data.frame(\n  SampleName = c(\"tumorA.chr4.selected\"),\n  Type = c(\"tumorA\"),\n  Group = c(\"tumorA\")\n)\n\n# load bam file\nbam_file \u003c- system.file(\"extdata\",\n  \"DNA-seq\", \"tumorA.chr4.selected.bam\",\n  package = \"ggcoverage\"\n)\n\ntrack_df \u003c- LoadTrackFile(\n  track.file = bam_file,\n  meta.info = sample_meta,\n  single.nuc = TRUE,\n  single.nuc.region = \"chr4:62474235-62474295\"\n)\n\nhead(track_df)\n```\n\n#### Default color scheme\n\nFor base and amino acid 
annotation, the package comes with the following default color schemes. Color schemes can be changed with the `nuc.color` and `aa.color` parameters.\n\nThe default color scheme for base annotation is `Clustal-style`; more popular color schemes are available [here](https://www.biostars.org/p/171056/).\n\n```{r base_color_scheme, warning = FALSE, fig.height = 2, fig.width = 6, fig.align = \"center\"}\n# color scheme\nnuc_color \u003c- c(\n  \"A\" = \"#ff2b08\", \"C\" = \"#009aff\", \"G\" = \"#ffb507\", \"T\" = \"#00bc0d\"\n)\n\n# create plot\ngraphics::image(\n  seq_along(nuc_color),\n  1,\n  as.matrix(seq_along(nuc_color)),\n  col = nuc_color,\n  xlab = \"\",\n  ylab = \"\",\n  xaxt = \"n\",\n  yaxt = \"n\",\n  bty = \"n\"\n)\ngraphics::text(seq_along(nuc_color), 1, names(nuc_color))\ngraphics::mtext(\n  text = \"Base\",\n  adj = 1,\n  las = 1,\n  side = 2\n)\n```\n\nThe default color scheme for amino acid annotation is from [Residual colours: a proposal for aminochromography](https://pubmed.ncbi.nlm.nih.gov/9342138/):\n\n```{r aa_color_scheme, warning = FALSE, fig.height = 9, fig.width = 10, fig.align = \"center\"}\naa_color \u003c- c(\n  \"D\" = \"#FF0000\", \"S\" = \"#FF2400\", \"T\" = \"#E34234\", \"G\" = \"#FF8000\",\n  \"P\" = \"#F28500\", \"C\" = \"#FFFF00\", \"A\" = \"#FDFF00\", \"V\" = \"#E3FF00\",\n  \"I\" = \"#C0FF00\", \"L\" = \"#89318C\", \"M\" = \"#00FF00\", \"F\" = \"#50C878\",\n  \"Y\" = \"#30D5C8\", \"W\" = \"#00FFFF\", \"H\" = \"#0F2CB3\", \"R\" = \"#0000FF\",\n  \"K\" = \"#4b0082\", \"N\" = \"#800080\", \"Q\" = \"#FF00FF\", \"E\" = \"#8F00FF\",\n  \"*\" = \"#FFC0CB\", \" \" = \"#FFFFFF\", \" \" = \"#FFFFFF\", \" \" = \"#FFFFFF\",\n  \" \" = \"#FFFFFF\"\n)\n\ngraphics::image(\n  1:5,\n  1:5,\n  matrix(seq_along(aa_color), nrow = 5),\n  col = rev(aa_color),\n  xlab = \"\",\n  ylab = \"\",\n  xaxt = \"n\",\n  yaxt = \"n\",\n  bty = \"n\"\n)\n\ngraphics::text(expand.grid(1:5, 1:5), names(rev(aa_color)))\ngraphics::mtext(\n  text = \"Amino acids\",\n  adj = 
1,\n  las = 1,\n  side = 2\n)\n```\n\n#### Add base and amino acid annotation\n\n**Use twill to mark position with SNV**:\n\n```{r, echo = FALSE}\n# wait some time to avoid 'Too Many Requests' error\nSys.sleep(60)\n```\n\n\n```{r base_aa_coverage, warning = FALSE, fig.height = 10, fig.width = 12, fig.align = \"center\"}\n# create plot with twill mark\nggcoverage(\n  data = track_df,\n  color = \"grey\",\n  range.position = \"out\",\n  single.nuc = TRUE,\n  rect.color = \"white\"\n) +\n  geom_base(\n    bam.file = bam_file,\n    bs.fa.seq = BSgenome.Hsapiens.UCSC.hg19,\n    mark.type = \"twill\"\n  ) +\n  geom_ideogram(genome = \"hg19\", plot.space = 0)\n```\n\n**Use star to mark position with SNV**:\n\n```{r, echo = FALSE}\n# wait some time to avoid 'Too Many Requests' error\nSys.sleep(60)\n```\n\n```{r base_aa_coverage_star, warning = FALSE, fig.height = 10, fig.width = 12, fig.align = \"center\"}\n# create plot with star mark\nggcoverage(\n  data = track_df,\n  color = \"grey\",\n  range.position = \"out\",\n  single.nuc = TRUE,\n  rect.color = \"white\"\n) +\n  geom_base(\n    bam.file = bam_file,\n    bs.fa.seq = BSgenome.Hsapiens.UCSC.hg19,\n    mark.type = \"star\"\n  ) +\n  geom_ideogram(genome = \"hg19\", plot.space = 0)\n```\n\n**Highlight position with SNV**:\n\n```{r, echo = FALSE}\n# wait some time to avoid 'Too Many Requests' error\nSys.sleep(60)\n```\n\n```{r base_aa_coverage_highlight, warning = FALSE, fig.height = 10, fig.width = 12, fig.align = \"center\"}\n# highlight one base\nggcoverage(\n  data = track_df,\n  color = \"grey\",\n  range.position = \"out\",\n  single.nuc = TRUE,\n  rect.color = \"white\"\n) +\n  geom_base(\n    bam.file = bam_file,\n    bs.fa.seq = BSgenome.Hsapiens.UCSC.hg19,\n    mark.type = \"highlight\"\n  ) +\n  geom_ideogram(genome = \"hg19\", plot.space = 0)\n```\n\n## ChIP-seq data\n\nThe ChIP-seq data used here is from [DiffBind](https://bioconductor.org/packages/release/bioc/html/DiffBind.html). 
Four samples are selected as examples: `Chr18_MCF7_input`, `Chr18_MCF7_ER_1`, `Chr18_MCF7_ER_3`, `Chr18_MCF7_ER_2`, and all bam files were converted to bigwig files with [deeptools](https://deeptools.readthedocs.io/en/develop/).\n\nCreate metadata:\n\n```{r load_metadata_chip}\n# load metadata\nsample_meta \u003c- data.frame(\n  SampleName = c(\n    \"Chr18_MCF7_ER_1\",\n    \"Chr18_MCF7_ER_2\",\n    \"Chr18_MCF7_ER_3\",\n    \"Chr18_MCF7_input\"\n  ),\n  Type = c(\"MCF7_ER_1\", \"MCF7_ER_2\", \"MCF7_ER_3\", \"MCF7_input\"),\n  Group = c(\"IP\", \"IP\", \"IP\", \"Input\")\n)\n\nsample_meta\n```\n\nLoad track files:\n\n```{r load_track_chip}\n# track folder\ntrack_folder \u003c- system.file(\"extdata\", \"ChIP-seq\", package = \"ggcoverage\")\n\n# load bigwig file\ntrack_df \u003c- LoadTrackFile(\n  track.folder = track_folder,\n  format = \"bw\",\n  region = \"chr18:76822285-76900000\",\n  meta.info = sample_meta\n)\n\n# check data\nhead(track_df)\n```\n\nPrepare mark region:\n\n```{r prepare_mark_chip}\n# create mark region\nmark_region \u003c- data.frame(\n  start = c(76822533),\n  end = c(76823743),\n  label = c(\"Promoter\")\n)\n\n# check data\nmark_region\n```\n\n### Basic coverage\n\n```{r basic_coverage_chip, warning = FALSE, fig.height = 6, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c- ggcoverage(\n  data = track_df,\n  mark.region = mark_region,\n  show.mark.label = FALSE\n)\nbasic_coverage\n```\n\n### Add annotations\n\nAdd **gene**, **ideogram** and **peak** annotations. 
To create peak annotation, we first **get consensus peaks** with [MSPC](https://github.com/Genometric/MSPC).\n\n```{r, echo = FALSE}\n# wait some time to avoid 'Too Many Requests' error\nSys.sleep(60)\n```\n\n```{r peak_coverage, warning = FALSE, fig.height = 10, fig.width = 12, fig.align = \"center\"}\n# get consensus peak file\npeak_file \u003c- system.file(\"extdata\",\n  \"ChIP-seq\",\n  \"consensus.peak\",\n  package = \"ggcoverage\"\n)\n\nbasic_coverage +\n  geom_gene(gtf.gr = gtf_gr) +\n  geom_peak(bed.file = peak_file) +\n  geom_ideogram(genome = \"hg19\", plot.space = 0)\n```\n\n## Hi-C data\n\nThe Hi-C method maps chromosome contacts in eukaryotic cells.\nFor this purpose, DNA and protein complexes are cross-linked and DNA fragments then purified.\nAs a result, even distant chromatin fragments can be found to interact due to the spatial organization of the DNA and histones in the cell. Hi-C data shows these interactions for example as a contact map.\n\nThe Hi-C data is taken from [pyGenomeTracks: reproducible plots for multivariate genomic datasets](https://pubmed.ncbi.nlm.nih.gov/32745185/).\n\nThe Hi-C matrix visualization is implemented by [`HiCBricks`](https://github.com/koustav-pal/HiCBricks).\nThis package needs to be installed separately (it is only 'Suggested' by `ggcoverage`).\n\n### Load track data\n\n```{r hic_track}\n# prepare track dataframe\ntrack_file \u003c-\n  system.file(\"extdata\", \"HiC\", \"H3K36me3.bw\", package = \"ggcoverage\")\n\ntrack_df \u003c- LoadTrackFile(\n  track.file = track_file,\n  format = \"bw\",\n  region = \"chr2L:8050000-8300000\",\n  extend = 0\n)\n\ntrack_df$score \u003c- ifelse(track_df$score \u003c 0, 0, track_df$score)\n\n# check the data\nhead(track_df)\n```\n\n### Load Hi-C data\n\nMatrix:\n\n```{r hic_load_hic_matrix}\n## matrix\nhic_mat_file \u003c- system.file(\"extdata\",\n  \"HiC\", \"HiC_mat.txt\",\n  package = \"ggcoverage\"\n)\nhic_mat \u003c- read.table(file = hic_mat_file, sep = \"\\t\")\nhic_mat 
\u003c- as.matrix(hic_mat)\n```\n\nBin table:\n\n```{r hic_load_hic_bin}\n## bin\nhic_bin_file \u003c-\n  system.file(\"extdata\", \"HiC\", \"HiC_bin.txt\", package = \"ggcoverage\")\nhic_bin \u003c- read.table(file = hic_bin_file, sep = \"\\t\")\ncolnames(hic_bin) \u003c- c(\"chr\", \"start\", \"end\")\nhic_bin_gr \u003c- GenomicRanges::makeGRangesFromDataFrame(df = hic_bin)\n\n## transform function: set NA/NaN/Inf to 0, then apply log10(x + 1)\nfailsafe_log10 \u003c- function(x) {\n  x[is.na(x) | is.nan(x) | is.infinite(x)] \u003c- 0\n  return(log10(x + 1))\n}\n```\n\nThe data transformation function (`failsafe_log10`) is defined in the chunk above.\n\n### Load link\n\n```{r hic_load_link}\n# prepare arcs\nlink_file \u003c-\n  system.file(\"extdata\", \"HiC\", \"HiC_link.bedpe\", package = \"ggcoverage\")\n```\n\n### Basic coverage\n\n```{r basic_coverage_hic, warning = FALSE, fig.height = 6, fig.width = 12, fig.align = \"center\"}\nbasic_coverage \u003c-\n  ggcoverage(\n    data = track_df,\n    color = \"grey\",\n    mark.region = NULL,\n    range.position = \"out\"\n  )\n\nbasic_coverage\n```\n\n### Add annotations\n\nAdd **link** and **contact map** annotations:\n\n```{r hic_coverage, warning = FALSE, fig.height = 10, fig.width = 12, fig.align = \"center\"}\nlibrary(HiCBricks)\n\nbasic_coverage +\n  geom_tad(\n    matrix = hic_mat,\n    granges = hic_bin_gr,\n    value.cut = 0.99,\n    color.palette = \"viridis\",\n    transform.fun = failsafe_log10,\n    top = FALSE,\n    show.rect = TRUE\n  ) +\n  geom_link(\n    link.file = link_file,\n    file.type = \"bedpe\",\n    show.rect = TRUE\n  )\n```\n\n## Mass spectrometry protein coverage\n\n[Mass spectrometry](https://en.wikipedia.org/wiki/Protein_mass_spectrometry) (MS) is an important method for the accurate mass determination and characterization of proteins, and a variety of methods and instruments have been developed for its many uses. 
With `ggcoverage`, we can easily inspect the peptide coverage of a protein in order to learn about the quality of the data.\n\n### Load coverage\n\nThe exported coverage from [Proteome Discoverer](https://doi.org/10.3390/proteomes9010015):\n\n```{r ms_coverage_data}\nlibrary(openxlsx)\n# prepare coverage dataframe\ncoverage_file \u003c-\n  system.file(\"extdata\",\n    \"Proteomics\", \"MS_BSA_coverage.xlsx\",\n    package = \"ggcoverage\"\n  )\ncoverage_df \u003c- openxlsx::read.xlsx(coverage_file, sheet = \"Sheet1\")\n# check the data\nhead(coverage_df)\n```\n\nThe input protein fasta:\n\n```{r ms_coverage_fasta}\nfasta_file \u003c-\n  system.file(\"extdata\",\n    \"Proteomics\", \"MS_BSA_coverage.fasta\",\n    package = \"ggcoverage\"\n  )\n\n# prepare track dataframe\nprotein_set \u003c- Biostrings::readAAStringSet(fasta_file)\n\n# check the data\nprotein_set\n```\n\n### Protein coverage\n\n```{r basic_coverage_protein, warning = FALSE, fig.height = 6, fig.width = 10, fig.align = \"center\"}\nprotein_coverage \u003c- ggprotein(\n  coverage.df = coverage_df,\n  fasta.file = fasta_file,\n  protein.id = \"sp|P02769|ALBU_BOVIN\",\n  range.position = \"out\"\n)\n\nprotein_coverage\n```\n\n### Add annotation\n\nWe can obtain features of the protein from [UniProt](https://www.uniprot.org/). For example, the above protein coverage plot shows an empty region at positions 1-24, which [UniProt](https://www.uniprot.org/uniprotkb/P02769/entry) annotates as signal peptide and propeptide. When the protein matures and is released extracellularly, these peptides are cleaved. 
This is why the region 1-24 is empty.\n\n```{r basic_coverage_protein_feature, warning = FALSE, fig.height = 6, fig.width = 10, fig.align = \"center\"}\n# protein feature obtained from UniProt\nprotein_feature_df \u003c- data.frame(\n  ProteinID = \"sp|P02769|ALBU_BOVIN\",\n  start = c(1, 19, 25),\n  end = c(18, 24, 607),\n  Type = c(\"Signal\", \"Propeptide\", \"Chain\")\n)\n\n# add annotation\nprotein_coverage +\n  geom_feature(\n    feature.df = protein_feature_df,\n    feature.color = c(\"#4d81be\", \"#173b5e\", \"#6a521d\")\n  )\n```\n\n## Code of Conduct\n\nPlease note that the `ggcoverage` project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/showteeth.github.io%2Fggcoverage%2F","html_url":"https://awesome.ecosyste.ms/projects/showteeth.github.io%2Fggcoverage%2F","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/showteeth.github.io%2Fggcoverage%2F/lists"}