{"id":16545052,"url":"https://github.com/kenyony/flaxkv","last_synced_at":"2025-07-24T13:10:43.910Z","repository":{"id":203992424,"uuid":"710867500","full_name":"KenyonY/flaxkv","owner":"KenyonY","description":"🗲 A high-performance on-disk dictionary.","archived":false,"fork":false,"pushed_at":"2025-02-02T07:50:41.000Z","size":256,"stargazers_count":28,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-27T01:11:09.896Z","etag":null,"topics":["database","deep-learning","dictionary","leveldb","llm","lmdb","machine-learning","persistent-storage","vector-store"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KenyonY.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-27T15:53:02.000Z","updated_at":"2025-01-07T07:53:16.000Z","dependencies_parsed_at":"2024-05-19T08:27:40.693Z","dependency_job_id":"56759068-a349-4438-89d6-7bba5c61cbd8","html_url":"https://github.com/KenyonY/flaxkv","commit_stats":{"total_commits":77,"total_committers":1,"mean_commits":77.0,"dds":0.0,"last_synced_commit":"815e293eee6c1b6daee327c47a565807ae8fb518"},"previous_names":["kenyony/flaxkv"],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KenyonY%2Fflaxkv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KenyonY%2Fflaxkv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KenyonY%2Fflaxkv
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KenyonY%2Fflaxkv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KenyonY","download_url":"https://codeload.github.com/KenyonY/flaxkv/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243725077,"owners_count":20337660,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","deep-learning","dictionary","leveldb","llm","lmdb","machine-learning","persistent-storage","vector-store"],"created_at":"2024-10-11T19:05:39.126Z","updated_at":"2025-03-15T11:33:24.185Z","avatar_url":"https://github.com/KenyonY.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\r\n\u003ch1 align=\"center\"\u003e\r\n    \u003cbr\u003e\r\n    🗲  FlaxKV\r\n\u003c/h1\u003e\r\n\r\n\r\n\u003cp align=\"center\"\u003e\r\nA high-performance dictionary database.\r\n\u003c/p\u003e\r\n\u003cp align=\"center\"\u003e\r\n    \u003ca href=\"https://pypi.org/project/flaxkv/\"\u003e\r\n        \u003cimg src=\"https://img.shields.io/pypi/v/flaxkv?color=brightgreen\u0026style=flat-square\" alt=\"PyPI version\" \u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://github.com/KenyonY/flaxkv/blob/main/LICENSE\"\u003e\r\n        \u003cimg alt=\"License\" src=\"https://img.shields.io/github/license/KenyonY/flaxkv.svg?color=blue\u0026style=flat-square\"\u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://github.com/KenyonY/flaxkv/releases\"\u003e\r\n        \u003cimg alt=\"Release (latest by date)\" 
src=\"https://img.shields.io/github/v/release/KenyonY/flaxkv?\u0026style=flat-square\"\u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://github.com/KenyonY/flaxkv/actions/workflows/ci.yml\"\u003e\r\n        \u003cimg alt=\"tests\" src=\"https://img.shields.io/github/actions/workflow/status/KenyonY/flaxkv/ci.yml?style=flat-square\u0026label=tests\"\u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://pypistats.org/packages/flaxkv\"\u003e\r\n        \u003cimg alt=\"pypi downloads\" src=\"https://img.shields.io/pypi/dm/flaxkv?style=flat-square\"\u003e\r\n    \u003c/a\u003e\r\n\u003c/p\u003e\r\n\r\n\u003ch4 align=\"center\"\u003e\r\n    \u003cp\u003e\r\n        \u003cb\u003eEnglish\u003c/b\u003e |\r\n        \u003ca href=\"https://github.com/KenyonY/flaxkv/blob/main/README_ZH.md\"\u003e简体中文\u003c/a\u003e \r\n    \u003c/p\u003e\r\n\u003c/h4\u003e\r\n\r\n\u003cp \u003e\r\n\u003cbr\u003e\r\n\u003c/p\u003e\r\n\r\n\r\n`flaxkv` provides a dictionary-like interface for interacting with high-performance key-value databases. More importantly, although it is a persistent database, its performance is close to that of a native (in-memory) dictionary.  \r\nYou can use it just like a Python dictionary, without worrying that database operations will block your user process.\r\n\r\n---\r\n\r\n## Key Features\r\n\r\n- **Always Up-to-date, Never Blocking**: Designed from the ground up so that write operations never block the user process, while reads always return the most recently written data.\r\n\r\n- **Ease of Use**: Interacting with the database feels just like using a Python dictionary! 
You don't even have to worry about releasing resources.\r\n\r\n- **Buffered Writing**: Data is buffered and scheduled for writing to the database, reducing the overhead of frequent database writes.\r\n\r\n- **High-Performance Database Backend**: Uses the high-performance key-value database LevelDB as its default backend.\r\n\r\n- **Atomic Operations**: Ensures that write operations are atomic, safeguarding data integrity.\r\n\r\n- **Thread-Safety**: Employs only the locks that are necessary, ensuring safe concurrent access while preserving performance.\r\n\r\n---\r\n\r\n## Quick Start\r\n\r\n### Installation\r\n\r\n```bash\r\npip install flaxkv\r\n# To include the server: pip install flaxkv[server]\r\n```\r\n### Usage\r\n\r\n```python\r\nfrom flaxkv import FlaxKV\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\ndb = FlaxKV('test_db')\r\n\"\"\"\r\nOr start as a server\r\n\u003e\u003e\u003e flaxkv run --port 8000\r\n\r\nClient call:\r\ndb = FlaxKV('test_db', root_path_or_url='http://localhost:8000')\r\n\"\"\"\r\n\r\n# Keys of several types are shown here: int, float, str, and tuple\r\ndb[1] = 1\r\ndb[1.1] = 1 / 3\r\ndb['key'] = 'value'\r\ndb['a dict'] = {'a': 1, 'b': [1, 2, 3]}\r\ndb['a list'] = [1, 2, 3, {'a': 1}]\r\ndb[(1, 2, 3)] = [1, 2, 3]\r\ndb['numpy array'] = np.random.randn(100, 100)\r\ndb['df'] = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})\r\n\r\n# setdefault() leaves an existing key untouched\r\ndb.setdefault('key', 'value_2')\r\nassert db['key'] == 'value'\r\n\r\ndb.update({\"key1\": \"value1\", \"key2\": \"value2\"})\r\n\r\nassert 'key2' in db\r\n\r\ndb.pop(\"key1\")\r\nassert 'key1' not in db\r\n\r\nfor key, value in db.items():\r\n    print(key, value)\r\n\r\nprint(len(db))\r\n```\r\n\r\n### Tips\r\n- `flaxkv` is a persistent database, yet its performance is close to that of native (in-memory) dictionary access! (See the benchmark below.)\r\n- You may have noticed that the example above never calls `db.close()` to release resources. That is because `flaxkv` handles this automatically. 
Of course, you can also call `db.close()` manually to release resources immediately.\r\n\r\n### Benchmark\r\n![benchmark](.github/img/benchmark.png)\r\n\r\nTest content: write N NumPy array vectors (each vector is 1000-dimensional), then read them back by traversal.\r\n\r\nRun the test:\r\n```bash\r\ncd benchmark/\r\npytest -s -v run.py\r\n```\r\n\r\n\r\n### Use Cases\r\n- **Key-Value Structure:**\r\nStoring simple key-value data.\r\n- **High-Frequency Writing:**\r\nWell suited to scenarios that require high-frequency insertion or update of data.\r\n- **Machine Learning:**\r\n`flaxkv` is well suited to storing large machine-learning datasets of embeddings, images, texts, and other key-value structures.\r\n\r\n### Limitations\r\n* In the current version, because writes are delayed, one process in a multi-process environment \r\ncannot read data written by another process in real time (the delay is usually a few seconds). \r\nTo write immediately, call the `.write_immediately()` method. \r\nThis limitation does not exist in a single-process environment.\r\n* By default, values do not support the `Tuple` and `Set` types. 
If values of these types are stored anyway, they will be deserialized into a `List`.\r\n \r\n## Citation\r\nIf `FlaxKV` has been helpful to your research, please cite:\r\n```bibtex\r\n@misc{flaxkv,\r\n    title={FlaxKV: An Easy-to-use and High Performance Key-Value Database Solution},\r\n    author={K.Y},\r\n    howpublished = {\\url{https://github.com/KenyonY/flaxkv}},\r\n    year={2023}\r\n}\r\n```\r\n\r\n## Contributions\r\nContributions are welcome: submit a pull request or open an issue in the repository.\r\n\r\n## License\r\n`FlaxKV` is licensed under the [Apache-2.0 License](./LICENSE).\r\n\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkenyony%2Fflaxkv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkenyony%2Fflaxkv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkenyony%2Fflaxkv/lists"}