{"id":38514716,"url":"https://github.com/ulelab/cv_coverage","last_synced_at":"2026-01-17T06:27:10.643Z","repository":{"id":89256090,"uuid":"381951979","full_name":"ulelab/cv_coverage","owner":"ulelab","description":null,"archived":false,"fork":false,"pushed_at":"2023-09-28T12:09:21.000Z","size":86,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-05T04:11:42.410Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ulelab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-07-01T07:47:33.000Z","updated_at":"2023-11-02T16:41:14.000Z","dependencies_parsed_at":null,"dependency_job_id":"51e4e412-84b5-43b2-8d0d-d6d8f2b37283","html_url":"https://github.com/ulelab/cv_coverage","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/ulelab/cv_coverage","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulelab%2Fcv_coverage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulelab%2Fcv_coverage/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulelab%2Fcv_coverage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulelab%2Fcv_coverage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ulelab","download_url":"https://codeload.github.com/ulelab/cv_coverage/t
ar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulelab%2Fcv_coverage/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28502264,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T04:31:57.058Z","status":"ssl_error","status_checked_at":"2026-01-17T04:31:45.816Z","response_time":85,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-17T06:27:09.900Z","updated_at":"2026-01-17T06:27:10.626Z","avatar_url":"https://github.com/ulelab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## cv_coverage\n[![DOI](https://zenodo.org/badge/381951979.svg)](https://zenodo.org/badge/latestdoi/381951979)\n\nThis code compares the coverages of a k-mer group around crosslink events (landmarks) across multiple CLIP datasets. The comparison is conducted within user-specified genomic regions. 
\n\nAuthor: aram.amalietti@gmail.com\n\n\n**Dependencies** (these are the versions the script was developed with; pandas \u003e= 1 introduced breaking changes, so please use these versions):\n```\npython=3.7  \npandas=0.24.2  \nnumpy=1.19.2  \npybedtools=0.8.1  \nmatplotlib=3.3.2\nseaborn=0.11.0\n```\n**Usage**:  \n  ```\n  python3 \u003cpath_to_script\u003e \u003cxl_in\u003e \u003cmotifs\u003e \u003cregions\u003e \u003ckmer_len\u003e \u003cfasta\u003e \u003cfai\u003e \u003cregions_file\u003e \u003csmoothing\u003e \u003cpercentile\u003e \u003cwindow\u003e \u003cuse_scores\u003e \u003cn_cores\u003e \u003cchunk_size\u003e \u003ccap\u003e\n  ```\n\n  - **`xl_in`** *list of BED files containing landmarks around which to display the motifs, given as* `path_to_file1,path_to_file2,path_to_file3 ...`. *It can be a single file or a list of files, and the plots will be adjusted accordingly;*  \n  - **`motifs`** *group of motifs to be displayed around the input landmarks, for example `AAAA,CCCC,GGGG ...`;*  \n  - **`regions`** *group of regions to be considered for analysis. Available regions are `genome` (intron, CDS, UTR3, UTR5, ncRNA, intergenic), `whole_gene` (intron, CDS, UTR3, UTR5), `intergenic`, `intron`, `ncRNA`, `other_exon` (UTR5, CDS), `UTR3` and `UTR5`. Single or multiple regions can be analysed and the plots will adjust accordingly. 
For example, `intron,ncRNA`;*  \n  - **`kmer_len`** *the length of analysed motifs (in bases);*  \n  - **`fasta`** *is the path to the genome in fasta format;*  \n  - **`fai`** *is the path to the genome index file;*  \n  - **`regions_file`** *is a segmentation file in GTF format, usually obtained with iCount segment but can also come from other sources;*  \n  - **`smoothing`** *is the size of the smoothing window, usually 12;*  \n  - **`percentile`** *defines the percentile for thresholding genomic landmarks by score; if percentile is set to None, no thresholding is used and all landmarks are used for the analysis;*  \n  - **`window`** *flanking distance around the landmarks that is analysed and displayed;*  \n  - **`use_scores`** *if True, each landmark is weighted by its score; otherwise all landmarks have equal weights of 1;*  \n  - **`n_cores`** *is the number of threads used in the process (4 is the usual value);*   \n  - **`chunk_size`** *is the number of rows per thread (10000 is the usual value);*  \n  - **`cap`** *is the maximum value for any landmark's score. It is only used if use_scores is set to True (20 is the recommended value);*\n\n **Common issues**\n\n The script needs write permission in the staging directory to create the `results` directory, and it needs the environment variable `TMPDIR` to be set for temporary files. 
If you get `KeyError: 'TMPDIR'`, a solution is to run `export TMPDIR=\u003cpath_to_folder\u003e` in the terminal where you want to run the script.\n \n **Outputs**\n - A PDF file with graphs showing % k-mer group coverage around landmarks for each analyzed CLIP dataset;\n - A TSV file with % coverage values (raw values used for plotting);\n - A text file that saves the number of analyzed landmarks in each sample;\n - A TSV file with saved run parameters.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fulelab%2Fcv_coverage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fulelab%2Fcv_coverage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fulelab%2Fcv_coverage/lists"}