{"id":18284965,"url":"https://github.com/seandavi/ngcgh","last_synced_at":"2025-10-09T22:13:09.161Z","repository":{"id":57446023,"uuid":"1412749","full_name":"seandavi/ngCGH","owner":"seandavi","description":"Tools for producing pseudo-cgh of next-generation sequencing data","archived":false,"fork":false,"pushed_at":"2016-09-05T13:37:21.000Z","size":56,"stargazers_count":18,"open_issues_count":5,"forks_count":9,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-09-25T10:58:25.389Z","etag":null,"topics":["bioinformatics","cancer-genomics","genomics","python","sequencing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/seandavi.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2011-02-25T22:11:35.000Z","updated_at":"2025-04-08T16:11:27.000Z","dependencies_parsed_at":"2022-09-26T17:30:30.864Z","dependency_job_id":null,"html_url":"https://github.com/seandavi/ngCGH","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/seandavi/ngCGH","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seandavi%2FngCGH","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seandavi%2FngCGH/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seandavi%2FngCGH/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seandavi%2FngCGH/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/seandavi","download_url":"https://codeload.github.com/seandavi/ngCGH/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seandavi%2FngCGH/sbom","scorecard":{"id":807792,"data":{"date":"2025-08-11","repo":{"name":"github.com/seandavi/ngCGH","commit":"fe75aa548649c53b5066d7791668c95bd8d87681"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":2.7,"checks":[{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Code-Review","score":1,"reason":"Found 3/27 approved changesets -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"License","score":0,"reason":"license file not detected","details":["Warn: project does not have a license file"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 6 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-23T12:16:02.697Z","repository_id":57446023,"created_at":"2025-08-23T12:16:02.697Z","updated_at":"2025-08-23T12:16:02.697Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279002129,"owners_count":26083307,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","cancer-genomics","genomics","python","sequencing"],"created_at":"2024-11-05T13:15:04.267Z","updated_at":"2025-10-09T22:13:09.145Z","avatar_url":"https://github.com/seandavi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":".. image:: https://zenodo.org/badge/5710/seandavi/ngCGH.png\n   :target: http://dx.doi.org/10.5281/zenodo.11391\n\nOverview\n============\nNext-generation sequencing of tumor/normal pairs provides a good opportunity to examine large-scale copy number variation in the tumor relative to the normal sample.  In practice, this concept seems to extend even to exome-capture sequencing of pairs of tumor and normal.  This library consists of a single script, ngCGH, that computes a pseudo-CGH using simple coverage counting on the tumor relative to the normal.\n\nI have chosen to use a fixed number of reads in the normal sample as the \"windowing\" approach.  This has the advantage of producing copy number estimates that should have similar variance at each location.  The algorithm will adaptively deal with inhomogeneities across the genome such as those associated with exome-capture technologies (to the extent that the capture was similar in both tumor and normal).  The disadvantage is that the pseudo-probes will be at different locations for every \"normal control\" sample. \n \n\nInstallation\n=============\nThere are several possible ways to install ngCGH.  \n\ngithub\n-------\nIf you are a git user, then simply cloning the repository will get you the latest code.\n\n::\n\n  git clone git://github.com/seandavi/ngCGH.git\n\nAlternatively, click the `Download \u003chttps://github.com/seandavi/ngCGH/archives/master\u003e`_ button and get the tarball or zip file.\n\nIn either case, change into the resulting directory and::\n\n  cd ngCGH\n  python setup.py install\n\nFrom PyPi\n-------------------\nIf you have easy_install in place, this should suffice for installation:\n\n::\n\n  easy_install ngCGH\n\n\n\n\nUsage\n=====\nUsage is very simple:\n\n::\n\n    usage: ngCGH [-h] [-w WINDOWSIZE] [-o OUTFILE] [-l LOGLEVEL] [-r REGIONS]\n\t\t [-t PROCESSES]\n\t\t normalbam tumorbam\n\n    positional arguments:\n      normalbam             The name of the bamfile for the normal comparison\n      tumorbam              The name of the tumor sample bamfile\n\n    optional arguments:\n      -h, --help            show this help message and exit\n      -w WINDOWSIZE, --windowsize WINDOWSIZE\n\t\t\t    The number of reads captured from the normal sample\n\t\t\t    for calculation of copy number (default: 1000)\n      -o OUTFILE, --outfile OUTFILE\n\t\t\t    Output filename, default \u003cstdout\u003e (default: None)\n      -l LOGLEVEL, --loglevel LOGLEVEL\n\t\t\t    Logging Level, 1-30 with 1 being maximal logging and\n\t\t\t    30 being errors only [20] (default: 20)\n      -r REGIONS, --regions REGIONS\n\t\t\t    regions to which analysis should be restricted, either\n\t\t\t    a bed file name or a single region in format chrN:XXX-\n\t\t\t    YYY (default: None)\n      -t PROCESSES, --threads PROCESSES\n\t\t\t    parallelize over regions (or chromosomes) (default: 1)\n\n\nOutput\n======\nThe output format is also very simple:\n\n::\n\n  chr1    4851    52735   1000    854     -0.025120\n  chr1    52736   59251   1000    812     -0.097876\n  chr1    59251   119119  1000    876     0.011575\n  chr1    119120  707038  1000    1087    0.322924\n  chr1    707040  711128  1000    1016    0.225472\n  chr1    711128  711375  1000    1059    0.285275\n  chr1    711375  735366  1000    919     0.080709\n  chr1    735368  798455  1000    972     0.161600\n\nColumns 1-3 describe the chromosome, start, and end for each pseudo-probe.  The fourth column is the number of reads in the normal sample in the window while the fifth column represents the reads *in the same genomic window* from the tumor.  The last column contains the median-centered log2 ratio between tumor and normal.\n\n\nConvert from ngCGH to BioDiscovery Nexus\n----------------------------------------\nIncluded in the release is a script, convert2nexus, that takes as input the filename of a file created by ngCGH and converts it into a file that the Nexus CGH software from BioDiscovery can load for further analysis.  The format looks like this:\n\n::\n\n  Name    Chromosome      Start   End     PALZGU.cgh\n  chr1_10004      chr1    10004   15735   -2.087921\n  chr1_15736      chr1    15736   69385   -2.670936\n  chr1_69386      chr1    69386   521687  -0.428244\n  chr1_523537     chr1    523537  726959  0.080269\n  chr1_726959     chr1    726959  808542  0.223047\n  chr1_808546     chr1    808546  809138  -1.186761\n\nI presented a webinar on using ngCGH with `BioDiscovery Nexus \u003chttp://www.biodiscovery.com/software/nexus-copy-number/\u003e`_ that you can `view here \u003chttp://www.biodiscovery.com/2012/05/16/copy-number-estimation-from-exome-and-genome-sequencing-data/\u003e`_.\n\n.. note::\n\n   The file format generated above can be loaded into Biodiscovery Nexus using the \"Multi1\" data type.\n\n\nConvert from Complete Genomics to BioDiscovery Nexus\n----------------------------------------------------\nThere is now plenty of Complete Genomics data floating around.  We are often interested in visualizing the somatic CNV data in Biodiscovery nexus.  There is a script, cgi2nexus that takes a file typically named as \"SomaticCnvDetailsDiploidBeta*\" and converts to the file format noted above.  Bzip2 (typical from CGI) are uncompressed on-the-fly.\n\nSegmenting output\n-------------------------\nThe cgh2seg script uses some sane defaults (at least for exomes) to the Circular Binary Segmentation algorithm as implemented in the DNAcopy Bioconductor package.  The segmented results are centered around the mode of the density of the segmented values on a per-probe basis.  The script will write the \"Centrality parameter\" to stderr when it completes.\n\nThe file format is:\n\n:: \n\n  ID      chrom   loc.start       loc.end num.mark        seg.mean\n  09      chr1    367695  82438842        2279    0.546541374526925\n  09      chr1    82778033        93082545        206     0.077841374526925\n  09      chr1    93205647        103965955       188     -0.913458625473075\n  09      chr1    104000621       104166584       4       -0.216558625473075\n  09      chr1    104342470       110014374       109     -0.948958625473075\n  09      chr1    110024223       110058480       4       -1.38295862547308\n\n\nMethods\n============\nThe pseudo-cgh algorithm employed by ngCGH takes as input two appropriately matched BAM files, typically from a tumor and a matched normal.  Genomic windows are defined by reading blocks of a fixed number of reads (default 1000 reads) in the normal sample.  Within each defined genomic window, the number of reads in the tumor is quantified.  For each genomic window, a ratio is made between the number of reads in the tumor and the number of reads in the normal.  Finally, a log2 transformation is applied to each ratio and the entire vector of the results is then centered by subtracting the median.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseandavi%2Fngcgh","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseandavi%2Fngcgh","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseandavi%2Fngcgh/lists"}