{"id":20693436,"url":"https://github.com/nanxstats/rcpi","last_synced_at":"2025-08-22T05:31:05.401Z","repository":{"id":19752401,"uuid":"23009539","full_name":"nanxstats/Rcpi","owner":"nanxstats","description":"💊 Molecular informatics toolkit with integration of bioinformatics and cheminformatics tools for drug discovery","archived":false,"fork":false,"pushed_at":"2024-09-18T13:18:26.000Z","size":11499,"stargazers_count":37,"open_issues_count":4,"forks_count":12,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-12-16T21:57:49.093Z","etag":null,"topics":["bioconductor","bioinformatics","cheminformatics","drug-discovery","feature-extraction","fingerprint","molecular-descriptors","protein-sequences"],"latest_commit_sha":null,"homepage":"https://nanx.me/Rcpi/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"artistic-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nanxstats.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-08-16T03:12:24.000Z","updated_at":"2024-11-13T02:55:50.000Z","dependencies_parsed_at":"2022-08-05T05:15:18.375Z","dependency_job_id":"d0550087-5b61-4c1b-b09f-d1d00da629d5","html_url":"https://github.com/nanxstats/Rcpi","commit_stats":null,"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nanxstats%2FRcpi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nanxstats%2FRcpi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nanxstats%2FRcpi/rele
ases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nanxstats%2FRcpi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nanxstats","download_url":"https://codeload.github.com/nanxstats/Rcpi/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230561013,"owners_count":18245324,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioconductor","bioinformatics","cheminformatics","drug-discovery","feature-extraction","fingerprint","molecular-descriptors","protein-sequences"],"created_at":"2024-11-16T23:26:39.568Z","updated_at":"2024-12-20T09:06:24.704Z","avatar_url":"https://github.com/nanxstats.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Rcpi \u003cimg src=\"man/figures/logo.png\" align=\"right\" width=\"120\" /\u003e\n\n\u003c!-- badges: start --\u003e\n[![bioc](https://www.bioconductor.org/shields/years-in-bioc/Rcpi.svg)](https://bioconductor.org/packages/release/bioc/html/Rcpi.html#since)\n[![downloads](https://bioconductor.org/shields/downloads/release/Rcpi.svg)](https://bioconductor.org/packages/stats/bioc/Rcpi/)\n[![R-CMD-check](https://github.com/nanxstats/Rcpi/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/nanxstats/Rcpi/actions/workflows/R-CMD-check.yaml)\n\u003c!-- badges: end --\u003e\n\n## Overview\n\nRcpi offers a molecular informatics toolkit with a comprehensive integration of bioinformatics and cheminformatics tools for drug discovery. 
For more information, please see our paper \u003c[DOI:10.1093/bioinformatics/btu624](https://doi.org/10.1093/bioinformatics/btu624)\u003e ([PDF](https://nanx.me/papers/Rcpi.pdf)).\n\n## Paper Citation\n\nFormatted citation:\n\nDong-Sheng Cao, Nan Xiao, Qing-Song Xu, and Alex F. Chen. (2015). Rcpi: R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions. _Bioinformatics_ 31 (2), 279-281.\n\nBibTeX entry:\n\n```bibtex\n@article{Rcpi2015,\n  author  = {Cao, Dong-Sheng and Xiao, Nan and Xu, Qing-Song and Chen, Alex F.},\n  title   = {{Rcpi: R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions}},\n  journal = {Bioinformatics},\n  year    = {2015},\n  volume  = {31},\n  number  = {2},\n  pages   = {279--281},\n  doi     = {10.1093/bioinformatics/btu624}\n}\n```\n\nBrowse the [workflow](https://nanx.me/Rcpi/articles/Rcpi.html) and\n[cheatsheet](https://nanx.me/Rcpi/articles/Rcpi-quickref.html)\nvignettes to get started.\n\n## Installation\n\n### Install Rcpi\n\nInstall the Rcpi package via BiocManager. If BiocManager is not already installed:\n\n```r\ninstall.packages(\"BiocManager\")\n```\n\nThen install Rcpi:\n\n```r\nBiocManager::install(\"Rcpi\")\n```\n\n### Manage dependencies\n\nSome features in the Rcpi package rely on certain R packages which may\nrequire specific system configurations to install from source.\nTo make the build process robust, these dependencies have been configured\nas runtime dependencies. 
Here are some instructions for installing such\ndependencies to enable the features in Rcpi.\n\n#### Install rcdk\n\nrcdk can be installed from either CRAN or GitHub:\n\n```r\n# Option 1: install from CRAN\ninstall.packages(\"rcdk\", type = \"source\")\n\n# Option 2: install from GitHub (requires the remotes package)\nremotes::install_github(\"CDK-R/cdkr\", subdir = \"rcdk\")\n```\n\nrcdk requires JDK and rJava to be installed and configured on your system.\nCheck out the [rJava readme](https://github.com/s-u/rJava) for installation\nand troubleshooting instructions.\n\n#### Install cheminformatics packages\n\nAdditional packages for cheminformatics capabilities are available\nfrom Bioconductor:\n\n```r\nBiocManager::install(c(\"fmcsR\", \"ChemmineR\", \"ChemmineOB\"))\n```\n\nChemmineOB requires Open Babel to compile from source.\nEnsure Open Babel is properly installed on your system.\n\n## Features\n\nRcpi implements and integrates state-of-the-art protein sequence descriptors and molecular descriptors/fingerprints in R. For protein sequences, the Rcpi package can\n\n- Calculate six protein descriptor groups composed of fourteen types of commonly used structural and physicochemical descriptors that include 9920 descriptors.\n\n- Calculate six types of generalized scales-based descriptors derived by various dimensionality reduction methods for proteochemometric (PCM) modeling.\n\n- Compute pairwise similarities in parallel within a list of proteins, derived from protein sequence alignment and Gene Ontology (GO) semantic similarity measures.\n\nFor small molecules, the Rcpi package can\n\n- Calculate 307 molecular descriptors (2D/3D), including constitutional, topological, geometrical, and electronic descriptors.\n\n- Calculate more than ten types of molecular fingerprints, including FP4 keys, E-state fingerprints, and MACCS keys, and run parallelized chemical similarity searches.\n\n- Compute pairwise similarities in parallel within a list of small molecules, derived from fingerprints and maximum common substructure search.\n\nBy combining 
various types of descriptors for drugs and proteins in different ways, Rcpi can conveniently generate interaction descriptors representing protein-protein or compound-protein interactions, including:\n\n- Two types of compound-protein interaction (CPI) descriptors\n\n- Three types of protein-protein interaction (PPI) descriptors\n\nSeveral useful auxiliary utilities are also shipped with Rcpi:\n\n- Parallelized molecule and protein sequence retrieval from several online databases, such as PubChem, ChEMBL, KEGG, DrugBank, UniProt, and RCSB PDB\n\n- Loading molecules stored in SMILES/SDF files and loading protein sequences from FASTA/PDB files\n\n- Molecular file format conversion\n\nThe computed protein sequence descriptors, molecular descriptors/fingerprints, interaction descriptors, and pairwise similarities are widely used in research fields relevant to drug discovery, primarily bioinformatics, cheminformatics, proteochemometrics, and chemogenomics.\n\n## Contribute\n\nTo contribute to this project, please take a look at the\n[Contributing Guidelines](https://nanx.me/Rcpi/CONTRIBUTING.html) first.\nPlease note that the Rcpi project is released with a\n[Contributor Code of Conduct](https://nanx.me/Rcpi/CODE_OF_CONDUCT.html).\nBy contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnanxstats%2Frcpi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnanxstats%2Frcpi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnanxstats%2Frcpi/lists"}