{"id":14066742,"url":"https://github.com/DyfanJones/s3fs","last_synced_at":"2025-07-29T23:32:07.325Z","repository":{"id":38828678,"uuid":"506333811","full_name":"DyfanJones/s3fs","owner":"DyfanJones","description":"Access Amazon Web Service 'S3' as if it were a file system. File system 'API' design around R package 'fs'","archived":false,"fork":false,"pushed_at":"2025-04-01T16:11:32.000Z","size":1489,"stargazers_count":44,"open_issues_count":2,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-24T14:54:43.868Z","etag":null,"topics":["aws","aws-s3","fs","minio","r","r-package"],"latest_commit_sha":null,"homepage":"https://dyfanjones.github.io/s3fs/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DyfanJones.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-06-22T16:56:22.000Z","updated_at":"2025-04-11T13:31:38.000Z","dependencies_parsed_at":"2024-07-25T14:26:48.349Z","dependency_job_id":"6340a8d4-6ba6-4f96-a08f-3cc7b203806d","html_url":"https://github.com/DyfanJones/s3fs","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/DyfanJones/s3fs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DyfanJones%2Fs3fs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DyfanJones%2Fs3fs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DyfanJones%2Fs3fs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/DyfanJones%2Fs3fs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DyfanJones","download_url":"https://codeload.github.com/DyfanJones/s3fs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DyfanJones%2Fs3fs/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267780023,"owners_count":24143201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","aws-s3","fs","minio","r","r-package"],"created_at":"2024-08-13T07:05:14.390Z","updated_at":"2025-07-29T23:32:06.987Z","avatar_url":"https://github.com/DyfanJones.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n# s3fs\n\n\u003c!-- badges: start --\u003e\n[![s3fs status badge](https://dyfanjones.r-universe.dev/badges/s3fs)](https://dyfanjones.r-universe.dev/s3fs)\n[![R-CMD-check](https://github.com/DyfanJones/s3fs/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/DyfanJones/s3fs/actions/workflows/R-CMD-check.yaml)\n[![Codecov test coverage](https://codecov.io/gh/DyfanJones/s3fs/branch/main/graph/badge.svg)](https://app.codecov.io/gh/DyfanJones/s3fs?branch=main)\n[![CRAN status](https://www.r-pkg.org/badges/version/s3fs)](https://CRAN.R-project.org/package=s3fs)\n\u003c!-- badges: end --\u003e\n\n\n`s3fs` provides a file-system-like interface to Amazon Web Services\nfor `R`. It uses the [`paws`](https://github.com/paws-r/paws) `SDK` and\n[`R6`](https://github.com/r-lib/R6) for its core design. This repo was inspired by\nPython’s [`s3fs`](https://github.com/fsspec/s3fs); however, its API and\nimplementation have been developed to follow `R`’s\n[`fs`](https://github.com/r-lib/fs).\n\n## Installation\n\nYou can install the released version of s3fs from [CRAN](https://cran.r-project.org/) with:\n```r\ninstall.packages('s3fs')\n```\n\nr-universe installation:\n```r\n# Enable repository from dyfanjones\noptions(repos = c(\n  dyfanjones = 'https://dyfanjones.r-universe.dev',\n  CRAN = 'https://cloud.r-project.org')\n)\n\n# Download and install s3fs in R\ninstall.packages('s3fs')\n```\n\nGitHub installation:\n\n```r\nremotes::install_github(\"dyfanjones/s3fs\")\n```\n\n### Dependencies\n\n* [`paws`](https://github.com/paws-r/paws): connection with AWS S3\n* [`R6`](https://github.com/r-lib/R6): set up core class\n* [`data.table`](https://github.com/Rdatatable/data.table): wrangle lists into data.frames\n* [`fs`](https://github.com/r-lib/fs): file system on local files\n* [`lgr`](https://github.com/s-fleck/lgr): set up logging\n* [`future`](https://github.com/HenrikBengtsson/future): set up async functionality\n* 
[`future.apply`](https://github.com/HenrikBengtsson/future.apply): set up parallel looping\n\n# Comparison with `fs`\n\n`s3fs` attempts to give the same interface as `fs` when handling files on AWS S3 from `R`.\n\n- **Vectorization**. All `s3fs` functions are vectorized, accepting multiple path inputs, similar to `fs`.\n- **Predictable**.\n  - Non-async functions return values that convey a path.\n  - Async functions return a `future` object of their non-async counterpart.\n  - The only exception is `s3_stream_in`, which returns a list of raw objects.\n- **Naming conventions**. `s3fs` functions follow the `fs` naming conventions (`dir_*`, `file_*` and `path_*`) with an `s3_` prefix, i.e. `s3_dir_*`, `s3_file_*` and `s3_path_*`.\n- **Explicit failure**. Similar to `fs`, if a failure happens, it is raised as an error and not masked with a warning.\n\n# Extra features:\n\n- **Scalable**. All `s3fs` functions are designed to have the option to run in parallel through the use of `future` and `future.apply`.\n\nFor example: copy a large file from one location to another.\n```r\nlibrary(s3fs)\nlibrary(future)\n\nplan(\"multisession\")\n\ns3_file_copy(\"s3://mybucket/multipart/large_file.csv\", \"s3://mybucket/new_location/large_file.csv\")\n```\n\n`s3fs` copies a large file (\u003e 5GB) in multiple parts; `future` allows each part to run in parallel to speed up the process.\n\n- **Async**. `s3fs` uses `future` to create a few key async functions. 
The focus is on functions that move large files between `R` and `AWS S3`.\n\nFor example: copying a large file from `AWS S3` to `R`.\n```r\nlibrary(s3fs)\nlibrary(future)\n\nplan(\"multisession\")\n\ns3_file_copy_async(\"s3://mybucket/multipart/large_file.csv\", \"large_file.csv\")\n```\n\n## Usage\n\n`fs` has a straightforward API with 4 core themes:\n\n- `path_` for manipulating and constructing paths\n- `file_` for files\n- `dir_` for directories\n- `link_` for links\n\n`s3fs` follows these themes with the following:\n\n- `s3_path_` for manipulating and constructing s3 uri paths\n- `s3_file_` for s3 files\n- `s3_dir_` for s3 directories\n\n**NOTE:** `link_` is currently not supported.\n\n``` r\nlibrary(s3fs)\n\n# Construct a path to a file with `s3_path()`\ns3_path(\"foo\", \"bar\", letters[1:3], ext = \"txt\")\n#\u003e [1] \"s3://foo/bar/a.txt\" \"s3://foo/bar/b.txt\" \"s3://foo/bar/c.txt\"\n\n# list buckets\ns3_dir_ls()\n#\u003e [1] \"s3://MyBucket1\"\n#\u003e [2] \"s3://MyBucket2\"                                        \n#\u003e [3] \"s3://MyBucket3\"               \n#\u003e [4] \"s3://MyBucket4\"                            \n#\u003e [5] \"s3://MyBucket5\"\n\n# list files in bucket\ns3_dir_ls(\"s3://MyBucket5\")\n#\u003e [1] \"s3://MyBucket5/iris.json\"     \"s3://MyBucket5/athena-query/\"\n#\u003e [3] \"s3://MyBucket5/data/\"         \"s3://MyBucket5/default/\"     \n#\u003e [5] \"s3://MyBucket5/iris/\"         \"s3://MyBucket5/made-up/\"     \n#\u003e [7] \"s3://MyBucket5/test_df/\"\n\n# create a new directory\ntmp \u003c- s3_dir_create(s3_file_temp(tmp_dir = \"MyBucket5\"))\ntmp\n#\u003e [1] \"s3://MyBucket5/filezwkcxx9q5562\"\n\n# create new files in that directory\ns3_file_create(s3_path(tmp, \"my-file.txt\"))\n#\u003e [1] \"s3://MyBucket5/filezwkcxx9q5562/my-file.txt\"\ns3_dir_ls(tmp)\n#\u003e [1] \"s3://MyBucket5/filezwkcxx9q5562/my-file.txt\"\n\n# remove files from the directory\ns3_file_delete(s3_path(tmp, 
\"my-file.txt\"))\ns3_dir_ls(tmp)\n#\u003e character(0)\n\n# remove the directory\ns3_dir_delete(tmp)\n```\n\n\u003csup\u003eCreated on 2022-06-21 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)\u003c/sup\u003e\n\nSimilar to `fs`, `s3fs` is designed to work well with the pipe.\n\n``` r\nlibrary(s3fs)\npaths \u003c- s3_file_temp(tmp_dir = \"MyBucket\") |\u003e\n s3_dir_create() |\u003e\n s3_path(letters[1:5]) |\u003e\n s3_file_create()\npaths\n#\u003e [1] \"s3://MyBucket/fileazqpwujaydqg/a\"\n#\u003e [2] \"s3://MyBucket/fileazqpwujaydqg/b\"\n#\u003e [3] \"s3://MyBucket/fileazqpwujaydqg/c\"\n#\u003e [4] \"s3://MyBucket/fileazqpwujaydqg/d\"\n#\u003e [5] \"s3://MyBucket/fileazqpwujaydqg/e\"\n\npaths |\u003e s3_file_delete()\n#\u003e [1] \"s3://MyBucket/fileazqpwujaydqg/a\"\n#\u003e [2] \"s3://MyBucket/fileazqpwujaydqg/b\"\n#\u003e [3] \"s3://MyBucket/fileazqpwujaydqg/c\"\n#\u003e [4] \"s3://MyBucket/fileazqpwujaydqg/d\"\n#\u003e [5] \"s3://MyBucket/fileazqpwujaydqg/e\"\n```\n\n\u003csup\u003eCreated on 2022-06-22 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)\u003c/sup\u003e\n\n**NOTE:** all examples have been developed from `fs`.\n\n### File systems that emulate S3\n\n`s3fs` allows you to connect to file systems that provide an S3-compatible interface. For example, [MinIO](https://min.io/) offers high-performance, S3-compatible object storage. 
\nYou can connect to your `MinIO` server using `s3fs::s3_file_system`:\n\n``` r\nlibrary(s3fs)\n\ns3_file_system(\n  aws_access_key_id = \"minioadmin\",\n  aws_secret_access_key = \"minioadmin\",\n  endpoint = \"http://localhost:9000\"\n)\n\ns3_dir_ls()\n#\u003e [1] \"\"\n\ns3_bucket_create(\"s3://testbucket\")\n#\u003e [1] \"s3://testbucket\"\n\n# refresh cache\ns3_dir_ls(refresh = TRUE)\n#\u003e [1] \"s3://testbucket\"\n\ns3_bucket_delete(\"s3://testbucket\")\n#\u003e [1] \"s3://testbucket\"\n\n# refresh cache\ns3_dir_ls(refresh = TRUE)\n#\u003e [1] \"\"\n```\n\n\u003csup\u003eCreated on 2022-12-14 with [reprex v2.0.2](https://reprex.tidyverse.org)\u003c/sup\u003e\n\n**NOTE:** if you want to change from AWS S3 to MinIO in the same R session, you will need to set the parameter `refresh = TRUE` when calling `s3_file_system` again.\nYou can use multiple sessions by using the R6 class `S3FileSystem` directly.\n\n# Feedback wanted\n\nPlease open a GitHub issue to report any problems or request features.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDyfanJones%2Fs3fs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDyfanJones%2Fs3fs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDyfanJones%2Fs3fs/lists"}