{"id":49783172,"url":"https://github.com/qgenlab/diffmethyltools","last_synced_at":"2026-05-11T23:01:57.999Z","repository":{"id":306305615,"uuid":"942226024","full_name":"qgenlab/DiffMethylTools","owner":"qgenlab","description":"An automated, end-to-end Python pipeline for differential DNA methylation analysis, region annotation, and visualization.","archived":false,"fork":false,"pushed_at":"2026-05-05T23:38:37.000Z","size":28248,"stargazers_count":5,"open_issues_count":5,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-05-06T01:31:00.778Z","etag":null,"topics":["bioinformatics","data-visualization","dna-methylation","epigenetics","python3"],"latest_commit_sha":null,"homepage":"https://qgenlab.github.io/DiffMethylTools/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qgenlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-03-03T19:21:08.000Z","updated_at":"2026-05-06T00:12:31.000Z","dependencies_parsed_at":"2025-10-21T19:33:51.262Z","dependency_job_id":null,"html_url":"https://github.com/qgenlab/DiffMethylTools","commit_stats":null,"previous_names":["qgenlab/diffmethyltools"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/qgenlab/DiffMethylTools","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qgenlab%2FDiffMethylTools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qgenlab%2FDiffMethylT
ools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qgenlab%2FDiffMethylTools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qgenlab%2FDiffMethylTools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qgenlab","download_url":"https://codeload.github.com/qgenlab/DiffMethylTools/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qgenlab%2FDiffMethylTools/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32916333,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-11T17:09:15.040Z","status":"ssl_error","status_checked_at":"2026-05-11T17:08:45.420Z","response_time":120,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","data-visualization","dna-methylation","epigenetics","python3"],"created_at":"2026-05-11T23:01:56.825Z","updated_at":"2026-05-11T23:01:57.994Z","avatar_url":"https://github.com/qgenlab.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DiffMethylTools\nDiffMethylTools is a Python-based toolkit for the comprehensive analysis of DNA methylation differences between two groups of samples. 
Designed for both short-read (e.g., WGBS, RRBS) and long-read (e.g., Nanopore) methylome data, DiffMethylTools enables accurate and streamlined detection of differentially methylated loci (DMLs) and regions (DMRs). The tool accepts flexible input formats including Bismark reports and generic BED-style methylation calls, making it compatible with most upstream methylation profiling workflows.\n\nThe package integrates statistical testing, biological annotation, and high-quality visualization into a single-command pipeline. Users can merge and filter candidate regions, map methylation changes to gene features and cis-regulatory elements (CCREs), and generate summary plots such as volcano plots, Manhattan plots, heatmaps, and gene-region profiles. DiffMethylTools also includes a module for generating annotation pie charts that quantify overlap of DMRs with genomic and functional features.\n\nBy combining flexibility with usability, DiffMethylTools provides researchers with a practical and efficient platform for epigenomic analysis. It supports high-throughput and reproducible workflows and is especially well-suited for studies investigating the role of DNA methylation in development, differentiation, and disease progression.\n\n*This tutorial is still a work in progress.*\n\n## Installation\n### Install Python dependencies\nTo download the package, run:\n```\ngit clone https://github.com/qgenlab/DiffMethylTools.git\n```\n\nTo create the DiffMethylTools environment, run:\n```\nconda env create -f environment.yml\n```\nThen activate the environment with:\n```\nconda activate DiffMethylTools\n```\n\n### Download annotation databases\n#### For hg19\n```\nchmod +x get_files_hg19.sh\n./get_files_hg19.sh\n```\n#### For hg38\n```\nchmod +x get_files_hg38.sh\n./get_files_hg38.sh\n```\n\n## Usage\n\n\n### Generate DML/DMR and map positions to genes\n`DiffMethylTools` supports both **default methylation input formats** and **fully customizable formats**. 
Users can either specify a standard format via `--input_format` or manually define column indices.\n\n### Default Input Formats\n#### Input Format 1: BED Format with Methylation Percentage\n\nSupported via: ```--input_format BED```\n\nExpected BED-like format (example):\n```\nchr1    10468   10469   5mC  743   +   10468  10469  0,0,0   52   95.21\nchr1    10470   10471   5mC  850   +   10470  10471  0,0,0   58   94.26\n```\n\nRun the full analysis with:\n```\npython ../DiffMethylTools/DiffMethylTools.py all_analysis \\\n  --case_data_file case1.bed case2.bed \\\n  --ctr_data_file ctr1.bed ctr2.bed \\\n  --input_format BED \\\n  --ref_folder hg38  # or hg19\n```\nFor BED input, DiffMethylTools automatically interprets:\n- Chromosome\n- Position\n- Coverage\n- Methylation percentage\n\nNo manual column specification is required.\n\n#### Input Format 2: Bismark CpG Report Format (CR)\n\nSupported via: ```--input_format CR```\n\nExpected Bismark CpG report format:\n\n```\nchr1    10468   +    12   4   CG   CGC\nchr1    10470   +     8   4   CG   CGC\nchr1    10483   +     7   2   CG   CGC\n```\n\n\nRun the full analysis with:\n\n```\npython ../DiffMethylTools/DiffMethylTools.py all_analysis \\\n  --case_data_file case1_CpG_report.txt case2_CpG_report.txt case3_CpG_report.txt \\\n  --ctr_data_file ctr1_CpG_report.txt ctr2_CpG_report.txt ctr3_CpG_report.txt \\\n  --input_format CR \\\n  --ref_folder hg38  # or hg19\n```\nFor CR input, DiffMethylTools automatically detects:\n- Chromosome\n- CpG position\n- Methylated counts\n- Unmethylated counts\n\n\n#### Flexible / Custom Input Format\nIf the input files do not conform to standard BED or Bismark CpG report formats, users can manually specify column indices. 
For both case and control files, the user must define:\n- Field separator\n- Chromosome column index (0-based)\n- Start position column index\n- Either:\n  - Methylation percentage + coverage column indices, or\n  - Methylated + unmethylated count column indices\n\n\n\n##### Examples:\nCustom format with a CpG report file as input:\n```\npython ../DiffMethylTools.py all_analysis \\\n  --case_data_file space_separated_case_file_paths \\\n  --ctr_data_file space_separated_ctr_file_paths \\\n  --case_data_chromosome_column_index 0 \\\n  --ctr_data_chromosome_column_index 0 \\\n  --case_data_position_start_column_index 1 \\\n  --ctr_data_position_start_column_index 1 \\\n  --case_data_positive_methylation_count_column_index 3 \\\n  --case_data_negative_methylation_count_column_index 4 \\\n  --ctr_data_positive_methylation_count_column_index 3 \\\n  --ctr_data_negative_methylation_count_column_index 4 \\\n  --case_data_separator 't' \\\n  --ctr_data_separator 't' \\\n  --ref_folder hg38  # or hg19\n```\n\nCustom format with a BED file as input:\n\n```\npython ../DiffMethylTools.py all_analysis \\\n  --case_data_file space_separated_case_file_paths \\\n  --ctr_data_file space_separated_ctr_file_paths \\\n  --case_data_separator 't' \\\n  --ctr_data_separator 't' \\\n  --case_data_chromosome_column_index 0 \\\n  --ctr_data_chromosome_column_index 0 \\\n  --case_data_position_start_column_index 1 \\\n  --ctr_data_position_start_column_index 1 \\\n  --case_data_methylation_percentage_column_index 10 \\\n  --case_data_coverage_column_index 9 \\\n  --ctr_data_methylation_percentage_column_index 10 \\\n  --ctr_data_coverage_column_index 9 \\\n  --ref_folder hg38  # or hg19\n```\n\nRunning *all_analysis* will generate two folders:\n\n##### `plot/`\n\n- This folder is initially empty.\n- It will be populated by output figures and plots generated by `DiffMethylTools` (e.g., volcano plots, methylation curves, pie charts).\n\n##### `data/`\n\nThis folder contains the required input and 
intermediate files used during the analysis. Below is a description of each file:\n\n- `merge_tables.csv`: Merged and coverage-filtered methylation data for both case and control samples.\n- `position_based.csv`: Methylation data summarized at the individual CpG or position level, including q-value and difference information.\n- `filters.csv`: Filtered data based on methylation difference and statistical significance (q-value).\n- `generate_DMR_0.csv`: Differentially methylated regions (DMRs), aggregated from position-level data.\n- `generate_DMR_2.csv`: Differentially methylated loci (DMLs) located within identified DMRs.\n- `generate_DMR_1.csv`: DMLs that do not fall within any DMR (isolated differential sites).\n- `map_positions_to_genes_genes.csv`: Mapping of methylation regions to annotated gene features.\n- `map_positions_to_genes_CCRE.csv`: Mapping of methylation regions to candidate cis-regulatory elements (CCREs).\n- `map_win_2_pos.csv`: Maps DMR regions to all underlying CpG positions (includes both DML and non-DML positions, unlike `generate_DMR_1.csv`).\n- `state.yaml`: YAML configuration file that tracks tool state, parameters, and progress.\n\n### Plot Generation\nGenerate **volcano plot**, **Manhattan plot**, **upstream clustering**, and **region-based gene plots**:\n```\npython ../DiffMethylTools.py all_plots \\\n  --data_file data/position_based.csv \\\n  --data_has_header \\\n  --window_data_file data/generate_DMR_0.csv \\\n  --window_data_has_header \\\n  --data_separator ',' \\\n  --window_data_separator ',' \\\n  --gene_file data/map_positions_to_genes_genes.csv \\\n  --gene_has_header \\\n  --ccre_file data/map_positions_to_genes_CCRE.csv \\\n  --ccre_has_header \\\n  --ref_folder hg38  # or hg19\n```\nTo plot all DMR regions on chr1 between positions 3,664,000 and 3,668,000:\n```\npython ../DiffMethylTools.py plot_methylation_curve \\\n  --region_data_file data/generate_DMR_0.csv \\\n  --region_data_has_header \\\n  --position_data_file 
data/position_based.csv \\\n  --position_data_has_header \\\n  --chr_filter chr1 \\\n  --start_filter 3664000 \\\n  --end_filter 3668000 \\\n  --ref_folder hg38  # or hg19\n```\nOmitting the `--chr_filter`, `--start_filter`, and `--end_filter` options will generate plots for all DMRs.\n\nRegion-based DMR annotation pie chart:\n```\npython ../DiffMethylTools.py match_region_annotation \\\n  --regions_df_file generate_DMR_0.csv \\\n  --regions_df_has_header \\\n  --ref_folder hg38  # or hg19\n```\n\nAnnotation-based DMR annotation pie chart:\n\n```\npython ../DiffMethylTools.py match_region_annotation \\\n  --regions_df_file generate_DMR_0.csv \\\n  --regions_df_has_header \\\n  --annotation_or_region annotation \\\n  --ref_folder hg38  # or hg19\n```\n\n## Citing DiffMethylTools\nIf you use DiffMethylTools, please cite:\n\nDerbel, Houssemeddine, Evan Kinnear, Justin J-L. Wong, and Qian Liu. \"DiffMethylTools: a toolbox of the detection, annotation and visualization of differential DNA methylation.\" bioRxiv (2025): 2025-07.\n\n*(Manuscript currently under peer review)*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqgenlab%2Fdiffmethyltools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqgenlab%2Fdiffmethyltools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqgenlab%2Fdiffmethyltools/lists"}