{"id":32200562,"url":"https://github.com/mt1022/cubar","last_synced_at":"2025-10-22T03:52:15.572Z","repository":{"id":175727262,"uuid":"359824009","full_name":"mt1022/cubar","owner":"mt1022","description":"R Package for Codon Usage Bias Analysis. Comprehensive documentation and tutorials are available at:","archived":false,"fork":false,"pushed_at":"2025-10-04T03:06:11.000Z","size":24430,"stargazers_count":10,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-22T03:51:44.065Z","etag":null,"topics":["bioinformatics","codon-usage","machine-learning","r-package","sequence-analysis"],"latest_commit_sha":null,"homepage":"https://mt1022.github.io/cubar/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mt1022.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-04-20T13:22:41.000Z","updated_at":"2025-10-16T12:32:07.000Z","dependencies_parsed_at":"2024-04-24T14:45:04.821Z","dependency_job_id":"327a3e2a-65fc-43c5-a2bc-741a6c6cb746","html_url":"https://github.com/mt1022/cubar","commit_stats":{"total_commits":121,"total_committers":2,"mean_commits":60.5,"dds":"0.016528925619834656","last_synced_commit":"9e9ab649a0cc6ec6c36038a3f59adec96aa0b677"},"previous_names":["mt1022/cubar"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/mt1022/cubar","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mt1022%2Fcubar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mt1022%2Fcubar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mt1022%2Fcubar/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mt1022%2Fcubar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mt1022","download_url":"https://codeload.github.com/mt1022/cubar/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mt1022%2Fcubar/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280376534,"owners_count":26320276,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-22T02:00:06.515Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","codon-usage","machine-learning","r-package","sequence-analysis"],"created_at":"2025-10-22T03:52:10.409Z","updated_at":"2025-10-22T03:52:15.554Z","avatar_url":"https://github.com/mt1022.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# cubar\n\n\u003e **Comprehensive Codon Usage Bias Analysis in R**\n\n\u003c!-- badges: start --\u003e\n[![CRAN status](https://www.r-pkg.org/badges/version/cubar)](https://CRAN.R-project.org/package=cubar)\n[![](https://cranlogs.r-pkg.org/badges/cubar)](https://cran.r-project.org/package=cubar)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10155990.svg)](https://doi.org/10.5281/zenodo.10155990)\n[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)\n\n\u003c!-- badges: end --\u003e\n\n## Table of Contents\n\n- [Overview](#overview)\n- [Features](#features)\n  - [🧬 Codon-Level Analysis](#-codon-level-analysis)\n  - [📊 Gene-Level Metrics](#-gene-level-metrics)\n  - [🛠️ Utilities \\\u0026 Tools](#️-utilities--tools)\n- [Why Choose cubar?](#why-choose-cubar)\n- [Installation](#installation)\n  - [Stable Release (Recommended)](#stable-release-recommended)\n  - [Development Version](#development-version)\n  - [Dependencies](#dependencies)\n- [Documentation \\\u0026 Tutorials](#documentation--tutorials)\n  - [🎯 Getting Started](#-getting-started)\n  - [📚 Advanced Topics](#-advanced-topics)\n- [Example Workflow](#example-workflow)\n- [🆘 Getting Help](#-getting-help)\n- [Related Packages](#related-packages)\n- [License](#license)\n- [Acknowledgments](#acknowledgments)\n- [Citation](#citation)\n\n## Overview\n\nCodon usage bias refers to the non-uniform usage of synonymous codons (codons that encode the same amino acid) across different organisms, genes, and functional categories. **cubar** is a comprehensive R package for analyzing codon usage bias in coding sequences. It provides a unified framework for calculating established codon usage metrics, conducting sliding-window analyses or differential usage analyses, and optimizing sequences for heterologous expression.\n\n\n## Features\n\n### 🧬 Codon-Level Analysis\n- **RSCU calculation**: Relative synonymous codon usage analysis\n- **Amino acid usage**: Frequency of each amino acid in sequences\n- **Codon weights**: Calculate weights based on gene expression, tRNA availability, and mRNA stability\n- **Optimal codon inference**: Machine learning-based identification of optimal codons\n- **Codon-anticodon visualization**: Visualization of codon-tRNA pairing relationships\n\n### 📊 Gene-Level Metrics  \n- **Codon frequency tabulation**: Count codon occurrences across sequences\n- **CAI (Codon Adaptation Index)**: Measure similarity to highly expressed genes \n- **ENC (Effective Number of Codons)**: Assess codon usage bias strength\n- **Fop (Fraction of Optimal codons)**: Calculate proportion of optimal codons\n- **tAI (tRNA Adaptation Index)**: Match codon usage to tRNA availability\n- **CSCg (Codon Stabilization Coefficients)**: Quantify mRNA stability effects \n- **Dp (Deviation from Proportionality)**: Analyze virus-host codon usage relationships\n- **GC content metrics**: Overall GC, GC3s (3rd codon positions), GC4d (4-fold degenerate sites)\n\n### 🛠️ Utilities \u0026 Tools\n- **Sliding window analysis**: Positional codon usage patterns within genes\n- **Sequence optimization**: Redesign sequences for optimal expression\n- **Differential codon usage**: Statistical comparison between sequence sets\n- **Quality control**: Comprehensive CDS validation and preprocessing\n\n\n## Why Choose cubar?\n\n- **🚀 High Performance**: Process large datasets (\u003e100,000 sequences) efficiently using optimized `Biostrings` and `data.table` backends\n- **🧬 Flexible Genetic Codes**: Support for all NCBI genetic codes plus custom genetic code tables\n- **🔗 R Ecosystem Integration**: Seamlessly integrate with other bioinformatics and data analysis packages\n- **📚 Comprehensive Documentation**: Extensive tutorials, examples, and theoretical background\n- **🔬 Research Ready**: Implements established metrics with proper citations and validation\n\n\n## Installation\n\n### Stable Release (Recommended)\n\nInstall the latest stable version from CRAN:\n\n```r\ninstall.packages(\"cubar\")\n```\n\n### Development Version\n\nInstall the latest development version from GitHub:\n\n```r\n# Install devtools if not already installed\nif (!requireNamespace(\"devtools\", quietly = TRUE)) {\n    install.packages(\"devtools\")\n}\n\n# Install cubar from GitHub\ndevtools::install_github(\"mt1022/cubar\", dependencies = TRUE)\n```\n\n### Dependencies\n\n**System Requirements:**\n- R (≥ 4.1.0)\n\n**Required Packages:**\n- `Biostrings` (≥ 2.60.0) - Bioconductor package for sequence manipulation\n- `IRanges` (≥ 2.34.0) - Bioconductor infrastructure for range operations  \n- `data.table` (≥ 1.14.0) - High-performance data manipulation\n- `ggplot2` (≥ 3.3.5) - Data visualization\n- `rlang` (≥ 0.4.11) - Language tools\n\n**Note:** Bioconductor packages will be installed automatically, but you may need to update your R installation if you encounter compatibility issues.\n\n## Documentation \u0026 Tutorials\n📖 **Complete documentation** is available within R (`?function_name`) and on our [**package website**](https://mt1022.github.io/cubar/).\n\n### 🎯 Getting Started\n- [**Introduction to cubar**](https://mt1022.github.io/cubar/articles/cubar.html) - Basic usage and core functionality\n- [**Non-standard Genetic Codes**](https://mt1022.github.io/cubar/articles/non_standard_genetic_code.html) - Working with alternative genetic codes\n- [**Codon Optimization**](https://mt1022.github.io/cubar/articles/codon_optimization.html) - Sequence optimization strategies\n\n### 📚 Advanced Topics  \n- [**Mathematical Foundations**](https://mt1022.github.io/cubar/articles/theory.html) - Detailed theory behind the metrics\n- [**Function Reference**](https://mt1022.github.io/cubar/reference/) - Complete function documentation\n\n## Example Workflow\n\nHere's a toy example demonstrating key functionality:\n\n```r\nlibrary(cubar)\nlibrary(ggplot2)\n\n# 1. Load and quality-check sequences\ndata(yeast_cds)\nclean_cds \u003c- check_cds(yeast_cds)\n\n# 2. Calculate codon frequencies\ncodon_freq \u003c- count_codons(clean_cds)\n\n# 3. Calculate multiple metrics\nenc \u003c- get_enc(codon_freq)           # Effective number of codons\ngc3s \u003c- get_gc3s(codon_freq)         # GC content at 3rd positions\n\n# 4. Calculate CAI with RSCU of highly expressed genes\ndata(yeast_exp)\nyeast_exp \u003c- yeast_exp[yeast_exp$gene_id %in% rownames(codon_freq), ]\nhigh_expr \u003c- head(yeast_exp[order(-yeast_exp$fpkm), ], 500)\nrscu_high \u003c- est_rscu(codon_freq[high_expr$gene_id, ])\ncai \u003c- get_cai(codon_freq, rscu_high)\n\n# 5. Visualize results\ndf \u003c- data.frame(ENC = enc, CAI = cai, GC3s = gc3s)\nggplot(df, aes(color = GC3s, x = ENC, y = CAI)) + \n  geom_point(alpha = 0.6) + \n  scale_color_viridis_c() +\n  labs(title = \"Codon Usage Bias Relationships\",\n       x = \"Effective Number of Codons\", y = \"Codon Adaptation Index\")\n```\n\n## 🆘 Getting Help\n\n- **📋 GitHub Issues**: [Report bugs, request features, or ask questions](https://github.com/mt1022/cubar/issues)\n- **📖 Documentation**: Check function help (`?function_name`) and [online docs](https://mt1022.github.io/cubar/)\n\n\n## Related Packages\nFor complementary analysis, consider these R packages:\n\n- **[Biostrings](https://bioconductor.org/packages/release/bioc/html/Biostrings.html)** - Sequence input/output and manipulation\n- **[Peptides](https://github.com/dosorio/Peptides)** - Peptide and protein property calculations  \n\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Acknowledgments\n\n- The R and Bioconductor communities for excellent foundational packages\n- Contributors and users who have provided feedback and improvements\n- **[GitHub Education](https://education.github.com/)** for providing free access to development tools\n- **GitHub Copilot** was used to suggest code snippets during development\n\n\n## Citation\nIf you use cubar in your research, please cite:\n\n\u003e Mengyue Liu, Bu Zi, Hebin Zhang, Hong Zhang, cubar: a versatile package for codon usage bias analysis in R, Genetics, 2025, iyaf191, https://doi.org/10.1093/genetics/iyaf191\n\nPlease also cite the original studies associated with any codon usage metrics or third-party software you use. You can find the relevant references in the documentation of the corresponding functions (for example, type `?cubar::get_enc` in the R console and check the \"References\" section in the help page).\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**[📚 Documentation](https://mt1022.github.io/cubar/) • [🐛 Report Bug](https://github.com/mt1022/cubar/issues) • [💡 Request Feature](https://github.com/mt1022/cubar/issues)**\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmt1022%2Fcubar","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmt1022%2Fcubar","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmt1022%2Fcubar/lists"}