{"id":19140799,"url":"https://github.com/codait/r4ml","last_synced_at":"2025-07-02T11:34:51.164Z","repository":{"id":71919248,"uuid":"85637349","full_name":"CODAIT/r4ml","owner":"CODAIT","description":"Scalable R for Machine Learning","archived":false,"fork":false,"pushed_at":"2018-09-11T17:31:12.000Z","size":36788,"stargazers_count":43,"open_issues_count":2,"forks_count":13,"subscribers_count":37,"default_branch":"master","last_synced_at":"2025-05-06T23:17:04.852Z","etag":null,"topics":["bigdata","distributed","machine-learning","r","scalable"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CODAIT.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-03-20T23:25:29.000Z","updated_at":"2025-04-08T02:16:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"fab3eefe-39bf-43fd-9cd7-df23ca105f7a","html_url":"https://github.com/CODAIT/r4ml","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CODAIT%2Fr4ml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CODAIT%2Fr4ml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CODAIT%2Fr4ml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CODAIT%2Fr4ml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CODAIT","download_url":"https://codeload.github.com/CODAIT/r4ml/tar.gz/refs/heads
/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252782835,"owners_count":21803410,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigdata","distributed","machine-learning","r","scalable"],"created_at":"2024-11-09T07:18:52.237Z","updated_at":"2025-05-06T23:17:12.444Z","avatar_url":"https://github.com/CODAIT.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# \u003cimg src=\"R4ML/inst/images/r4ml-logo.png\" alt=\"R4ML Logo\"/\u003e\n\n# __**What is R4ML?**__\n\nR4ML is a scalable, hybrid approach to ML/Stats using R, Apache SystemML, and Apache Spark.\n\n## __**R4ML Key Features**__\n\n - R4ML is an open source R package from IBM, downloadable via git\n - Created on top of SparkR and Apache SystemML (so it supports features from both)\n - Acts as an R bridge between SparkR and Apache SystemML\n - Provides a collection of canned algorithms\n - Provides the ability to create custom ML algorithms\n - Provides both SparkR and Apache SystemML functionality\n - APIs are friendlier to the R user\n\n## __**R4ML Architecture**__\n\n\u003cimg src=\"R4ML/inst/images/r4ml_architecture_simplified.png\" alt=\"R4ML Simple Architecture\" width=\"340\" height=\"250\"/\u003e\n\n## __**How to install**__\n  \n  Quick install (run from R console):\n    \n    # Download Apache Spark 2.1.0 (Note: Java must be installed)\n    download.file(\"http://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz\", \"~/spark-2.1.0-bin-hadoop2.7.tgz\")\n    # Extract Spark into the current working directory\n    system(\"tar -xvf ~/spark-2.1.0-bin-hadoop2.7.tgz\")\n    Sys.setenv(\"SPARK_HOME\" = file.path(getwd(), \"spark-2.1.0-bin-hadoop2.7\"))\n  \n    # Add the library path for SparkR (relative to SPARK_HOME, where Spark was extracted)\n    .libPaths(c(.libPaths(), file.path(Sys.getenv(\"SPARK_HOME\"), \"R\", \"lib\")))\n\n    # Install R4ML dependencies\n    install.packages(c(\"uuid\", \"R6\"), repos = \"http://cloud.r-project.org\")\n\n    # Download and install R4ML\n    download.file(\"http://codait-r4ml.s3-api.us-geo.objectstorage.softlayer.net/R4ML_0.8.0.tar.gz\", \"~/R4ML_0.8.0.tar.gz\")\n    install.packages(\"~/R4ML_0.8.0.tar.gz\", repos = NULL, type = \"source\")\n\n    # Load dependencies and use R4ML\n    library(\"SparkR\", lib.loc = file.path(Sys.getenv(\"SPARK_HOME\"), \"R\", \"lib\"))\n    library(\"R4ML\")\n    r4ml.session(sparkHome = file.path(getwd(), \"spark-2.1.0-bin-hadoop2.7\"))\n  \n  More detailed instructions can be found [here](./docs/r4ml-install.md).\n\n## __**How to Use R4ML**__\n\n  Once you have installed R4ML, you can use it for scalable machine learning and \n  data analysis. See the [R4ML Examples](./docs/r4ml-examples.md).\n\n## __**R4ML Documentation**__\n\n After following the instructions in 'How to install', point your browser to \n ```\n $R4ML_INSTALLED_LOCATION/R4ML/html/00Index.html\n ```\n\n For example, if you installed R4ML in `/home/data-scientist/codait`, open a \n web browser and enter the following URL:\n\n ```\n file:///home/data-scientist/codait/R4ML/html/00Index.html\n ```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodait%2Fr4ml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodait%2Fr4ml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodait%2Fr4ml/lists"}