{"id":19434503,"url":"https://github.com/mlin/sqlite_web_vfs","last_synced_at":"2025-04-24T20:32:10.140Z","repository":{"id":49830471,"uuid":"332367521","full_name":"mlin/sqlite_web_vfs","owner":"mlin","description":"SQLite3 extension for read-only HTTP(S) database access","archived":false,"fork":false,"pushed_at":"2023-11-19T08:56:43.000Z","size":127,"stargazers_count":52,"open_issues_count":2,"forks_count":4,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-03T10:37:47.755Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-24T04:34:27.000Z","updated_at":"2025-03-22T10:34:21.000Z","dependencies_parsed_at":"2024-11-10T14:46:59.131Z","dependency_job_id":null,"html_url":"https://github.com/mlin/sqlite_web_vfs","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlin%2Fsqlite_web_vfs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlin%2Fsqlite_web_vfs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlin%2Fsqlite_web_vfs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlin%2Fsqlite_web_vfs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlin","download_url":"https://codeload.github.com/mlin/sqlite_web_vfs/tar.gz/refs/heads/ma
in","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250704843,"owners_count":21473771,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T14:46:38.022Z","updated_at":"2025-04-24T20:32:09.870Z","avatar_url":"https://github.com/mlin.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sqlite_web_vfs\n\nThis [SQLite3 virtual filesystem extension](https://www.sqlite.org/vfs.html) provides read-only access to database files over HTTP(S), including S3 and the like, without involving a [FUSE mount](https://en.wikipedia.org/wiki/Filesystem_in_Userspace) (a fine alternative when available). **See also** the companion projects [sqlite_zstd_vfs](https://github.com/mlin/sqlite_zstd_vfs/) and [Genomics Extension for SQLite](https://github.com/mlin/GenomicSQLite), which include sqlite_web_vfs along with other features, most notably compression of the database file.\n\nWith the [extension loaded](https://sqlite.org/loadext.html), use the normal SQLite3 API to open the special URI: \n\n```\nfile:/__web__?mode=ro\u0026immutable=1\u0026vfs=web\u0026web_url={{PERCENT_ENCODED_URL}}\n```\n\nwhere `{{PERCENT_ENCODED_URL}}` is the database file's complete http(s) URL passed through [percent-encoding](https://en.wikipedia.org/wiki/Percent-encoding) (doubly encoding its own query string, if any). 
The URL server must support [GET range](https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests) requests, and the content must be immutable for the session.\n\n**USE AT YOUR OWN RISK:** This project is not associated with the SQLite developers.\n\n### Quick example\n\nA Python program to access the [Chinook sample database](https://github.com/lerocha/chinook-database) on GitHub directly:\n\n```python3\nimport sqlite3\nimport urllib.parse\n\nCHINOOK_URL = \"https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite\"\n\ncon = sqlite3.connect(\":memory:\")  # just to load_extension\ncon.enable_load_extension(True)\ncon.load_extension(\"web_vfs\")      # web_vfs.{so,dylib} in current directory\ncon = sqlite3.connect(\n    f\"file:/__web__?vfs=web\u0026mode=ro\u0026immutable=1\u0026web_url={urllib.parse.quote(CHINOOK_URL)}\",\n    uri=True,\n)\nschema = list(con.execute(\"select type, name from sqlite_master\"))\nprint(schema)\n```\n\n### Build from source\n\n![CI](https://github.com/mlin/sqlite_web_vfs/workflows/CI/badge.svg?branch=main)\n\nRequirements:\n\n* Linux or macOS\n* C++11 build system with CMake\n* SQLite3 and libcurl dev packages\n* (Tests only) python3, pytest, aria2, libmicrohttpd-dev\n\n```\ncmake -DCMAKE_BUILD_TYPE=Release -B build . \u0026\u0026 cmake --build build -j8\nenv -C build ctest -V\n```\n\nThe extension library is `build/web_vfs.so` or `build/web_vfs.dylib`.\n\n\n### Configuration\n\nThe VFS logs a message to standard error upon any fatally failed HTTP request, and also for requests that succeed only after being retried. The latter messages can be suppressed by setting `\u0026vfs_log=1` in the open URI, or by setting `SQLITE_VFS_LOG=1` in the environment. 
The log level can be set to 0 to suppress all standard error logging, or increased up to 5 for verbose request/response debug logging.\n\nTo disable TLS certificate and hostname verification, set `\u0026web_insecure=1` or `SQLITE_WEB_INSECURE=1`.\n\n### Optimization\n\nSQLite reads one small page at a time (default 4 KiB), which would be inefficient to serve with HTTP requests one-to-one. Instead, the VFS adaptively consolidates page fetching into larger HTTP requests, and concurrently reads ahead on background threads. This works well for point lookups and queries that scan largely-contiguous slices of tables and indexes (and a modest number thereof). It's less suitable for big multi-way joins and other aggressively random access patterns; in those cases, it's better to download the database file upfront to open locally.\n\nReaders should enlarge their [page cache](https://www.sqlite.org/pragma.html#pragma_cache_size) capacity as much as feasible, while budgeting an additional ~640 MiB RAM for the VFS prefetch buffers. (That ought to be enough for anybody.)\n\nTo optimize a database file to be served over the web, write it with the largest possible [page size](https://www.sqlite.org/pragma.html#pragma_page_size) of 64 KiB, and [VACUUM](https://sqlite.org/lang_vacuum.html) it once the contents are finalized. These steps minimize the random accesses needed for queries.\n\n### Advanced: helper .dbi files\n\nOptionally, the access pattern can be further streamlined by a small .dbi helper file served alongside the main database file. The VFS automatically probes for this by appending `.dbi` to the `web_url` (unless there's a query string). If that's not usable for any reason, the VFS falls back to direct access. Increase the log level to 3 or higher to see which mode is used.\n\nThe included [`sqlite_web_dbi.py`](sqlite_web_dbi.py) utility generates the .dbi helper for an immutable SQLite database file. 
Download and `chmod +x` this script, then run `./sqlite_web_dbi.py my.db` to generate `my.db.dbi`, and publish the database and .dbi alongside each other. The .dbi must be regenerated if the database subsequently changes. (The VFS makes a reasonable effort to detect \u0026 ignore an out-of-date .dbi, but this cannot be guaranteed.)\n\nThe automatic probe can be overridden by setting `\u0026web_dbi_url=` to a different percent-encoded URL for the .dbi file, or to a percent-encoded `file:/path/to.dbi` downloaded beforehand. Use the latter feature to save multiple connections from each having to fetch the .dbi separately. Lastly, set `\u0026web_nodbi=1` or `SQLITE_WEB_NODBI=1` to disable dbi mode entirely.\n\nThe .dbi helper is optional, but often beneficial for big databases accessed with high-latency requests. It collects bits of the main file that are key for navigating it, but typically scattered throughout (even after vacuum). These include interior nodes of SQLite's B-trees, and various metadata tables. Prefetching them in the compact .dbi saves the VFS from having to pluck them from all over the main file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlin%2Fsqlite_web_vfs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlin%2Fsqlite_web_vfs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlin%2Fsqlite_web_vfs/lists"}