{"id":25953976,"url":"https://github.com/abdelhai/oakdb","last_synced_at":"2025-03-04T15:40:04.364Z","repository":{"id":277160007,"uuid":"931206602","full_name":"abdelhai/oakdb","owner":"abdelhai","description":"🌳 a local-first database with built-in vector and full-text search","archived":false,"fork":false,"pushed_at":"2025-02-15T14:46:26.000Z","size":38,"stargazers_count":12,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-28T00:20:39.854Z","etag":null,"topics":["database","full-text","search","sqlite","vector"],"latest_commit_sha":null,"homepage":"https://discord.gg/JhjfywPr","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/abdelhai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-11T22:18:48.000Z","updated_at":"2025-02-25T16:57:38.000Z","dependencies_parsed_at":"2025-02-12T13:42:57.498Z","dependency_job_id":"2f3402f6-b133-4f12-be81-aca189c71c71","html_url":"https://github.com/abdelhai/oakdb","commit_stats":null,"previous_names":["abdelhai/oakdb"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abdelhai%2Foakdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abdelhai%2Foakdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abdelhai%2Foakdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abdelhai%2Foakdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/owners/abdelhai","download_url":"https://codeload.github.com/abdelhai/oakdb/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241876001,"owners_count":20035368,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","full-text","search","sqlite","vector"],"created_at":"2025-03-04T15:40:03.479Z","updated_at":"2025-03-04T15:40:04.348Z","avatar_url":"https://github.com/abdelhai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OakDB\n\nA nifty local-first database with full-text and vector similarity search. Ideal for desktop apps and personal web apps.\n\nOakDB is powered by SQLite (and [`sqlite-vec`](https://github.com/asg017/sqlite-vec)) and runs completely locally, with embeddings generated on-device using [`llama.cpp`](https://github.com/ggerganov/llama.cpp).\n\n## Install\n\n_Note: OakDB is still new software and has not been thoroughly tested. Caution is advised and feedback is encouraged!_\n\n**Default (only NoSQL and full-text search)**\n\n```sh\npip install oakdb\n```\n\n**With vector similarity search:**\n\n```sh\npip install \"oakdb[vector]\"\n```\n\n\u003e Note: Vector search requires a Python installation that supports SQLite extensions. The recommended installation method is through [Homebrew](https://brew.sh): `brew install python`\n\n## Use\n### Default (only NoSQL and full-text search)\n\n```py\nfrom oakdb import Oak\n\noak = Oak()\n# Create your first Oak Base\nideas = oak.Base(\"ideas\")\n\nideas.enable_search() # Optional. 
Enables full-text search\n\n\nideas.add(\"make a database\")\nideas.add(\"build a rocket\")\nideas.add(\"حواسيب ذاتية الطيران\")\n\nideas.fetch() # Fetch all ideas\nideas.search(\"rocket\")\n\n# Create/use another Base\nthings = oak.Base(\"things\")\n\n# Add multiple at once\nthings.adds([\n    {\"name\": \"pen\", \"price\": 10},\n    {\"name\": \"notebook\", \"price\": 5, \"pages\": 200},\n    {\"name\": \"calculator\", \"price\": 100, \"used\": True},\n])\n\n# Provide filters\nthings.fetch({\"price__gte\": 5})\nthings.fetch({\"price\": 100, \"used\": True})\n```\n\n### With vector similarity search\n\n```py\nfrom oakdb import Oak\n\noak = Oak()\nideas = oak.Base(\"ideas\")\n\n# Read the installation section first\nideas.enable_vector() # Enables similarity search. Takes a few minutes the first time to download the model\n\nideas.add(\"make a database\")\nideas.add(\"build a rocket\")\nideas.add(\"حواسيب ذاتية الطيران\")\n\nideas.similar(\"flying vehicles\")\n```\n\n\u003cdetails\u003e\n  \u003csummary\u003eUsing alternative embedding providers\u003c/summary\u003e\n\n1. Install the required package:\n\n```sh\npip install langchain-community\n```\n\n2. Configure Oak with your preferred embedding provider:\n\n```py\nfrom oakdb import Oak\nfrom langchain_community.embeddings import FakeEmbeddings # import your provider\n\noak = Oak()\noak.backend.set_embedder(FakeEmbeddings(...))\n```\n\nImportant: don't mix embedding providers. Use one per Oak instance. More flexibility is planned.\n\u003c/details\u003e\n\n\n## Plan/wishlist\n\n- [ ] Add missing features and refine API\n- [ ] Add support for file storage and indexing\n- [ ] Support more backends like libsql, Cloudflare D1, etc.\n- [ ] Release JavaScript, browser, Go, and Rust versions.\n- [ ] Implement in C and/or create a SQLite extension.\n\n## API Reference\n\u003e Note: Some parts of the API might change. 
This applies especially to error returns.\n\n\n### `Oak` Class\n\nThe primary entry point for creating and managing databases.\n\n#### Constructor\n```python\nOak(backend: Union[SQLiteBackend, str] = \"./oak.db\")\n```\n- `backend`: Either a SQLiteBackend instance or a file path for the database\n- Default creates a SQLite database at \"./oak.db\"\n\n#### Methods\n\n##### `Base(name: str) -\u003e Base`\nCreate or retrieve a named database instance.\n- `name`: Unique identifier for the database\n- Returns a `Base` instance\n\n## `Base` Class\n\nRepresents a specific database with various data operations.\n\n### Methods\n\n#### Data Manipulation\n\n##### `add(data, key=None, *, override=False) -\u003e AddResponse`\nAdd a single item to the database. Returns an error if the key already exists, unless `override=True`\n- `data`: The data to store (dict, list, str, int, bool, float)\n- `key`: Optional custom key (auto-generated if not provided). A custom key can also be passed in the `data` dict using `\"key\": \"...\"`\n- `override`: Optional. 
Replace existing items if keys exist\n\n##### `get(key) -\u003e GetResponse`\nRetrieve an item by its key\n- `key`: str, int, or float (converted to a string)\n\n##### `delete(key) -\u003e DeleteResponse`\nDelete an item by its key\n- `key`: str, int, or float (converted to a string)\n\n##### `deletes(keys) -\u003e DeletesResponse`\nDelete multiple items by their keys\n- `keys`: a list of str, int, or float values (converted to strings)\n\n#### Query Methods\n\n##### `fetch(filters=None, *, limit=1000, order=\"created__desc\", page=1) -\u003e ItemsResponse`\nFetch items with advanced filtering and pagination\n- `filters`: Filtering criteria. Check [Query Language](#query-language) for filter syntax\n- `limit`: Maximum items per page\n- `order`: Sorting order. Options:\n  - `key__asc`\n  - `key__desc`\n  - `data__asc`\n  - `data__desc`\n  - `created__asc`\n  - `created__desc`\n  - `updated__asc`\n  - `updated__desc`\n- `page`: Page number for pagination.\n\n##### `search(query, *, filters=None, limit=10, page=1, order=\"rank__desc\") -\u003e ItemsResponse`\nPerform full-text search (requires search to be enabled)\n- `query`: Search text\n- `filters`: Optional additional filtering. Check [Query Language](#query-language) for filter syntax\n- `limit`: Maximum results\n- `page`: Pagination page number\n- `order`: Sorting order. Options:\n  - `rank__asc`\n  - `rank__desc`\n  - `key__asc`\n  - `key__desc`\n  - `data__asc`\n  - `data__desc`\n  - `created__asc`\n  - `created__desc`\n  - `updated__asc`\n  - `updated__desc`\n\n##### `similar(query, *, filters=None, limit=3, distance=\"cosine\", order=\"distance__desc\") -\u003e ItemsResponse`\nPerform vector similarity search (requires vector search to be enabled)\n- `query`: Search vector/text\n- `filters`: Optional additional filtering. 
Check [Query Language](#query-language) for filter syntax\n- `limit`: Maximum results\n- `distance`: Distance metric (\"L1\", \"L2\", \"cosine\"). Case-sensitive\n- `order`: Sorting order. Options:\n  - `distance__asc`\n  - `distance__desc`\n  - `key__asc`\n  - `key__desc`\n  - `data__asc`\n  - `data__desc`\n  - `created__asc`\n  - `created__desc`\n  - `updated__asc`\n  - `updated__desc`\n\n#### Search and Vector Management\n\n##### `enable_search() -\u003e str`\nEnable full-text search for the database\n\n##### `disable_search(erase_index=True) -\u003e bool`\nDisable full-text search\n\n##### `enable_vector() -\u003e str`\nEnable vector similarity search capabilities\n\n##### `disable_vector(erase_index=True) -\u003e bool`\nDisable vector similarity search\n\n##### `drop(name, main_only=False) -\u003e bool`\nDrop the entire database or main table\n\n## Query Language\n\nOakDB supports a powerful, flexible query language for filtering and searching.\n\n\n### Basic Filtering\n\n```python\n# Exact match\ndb.fetch({\"score\": 25})\n\n# Multiple conditions (AND)\ndb.fetch({\"score__gte\": 18, \"game\": \"Mario Kart\"})\n\n# Multiple conditions (OR)\ndb.fetch([{\"score__gte\": 18}, {\"game\": \"Mario Kart\"}])\n\n# With full-text search (filters is keyword-only)\ndb.search(\"zelda\", filters={\"tag__in\": [\"rpg\"]})\n\n# With vector similarity search (filters is keyword-only)\ndb.similar(\"flying turtles\", filters={\"console\": \"3ds\"})\n```\n\n### Operators\n\n| Operator | Description | Example |\n|----------|-------------|---------|\n| `eq` | Equal to | `{\"score__eq\": 25}` or `{\"score\": 25}` |\n| `ne` | Not equal to | `{\"score__ne\": 25}` |\n| `lt` | Less than | `{\"score__lt\": 30}` |\n| `gt` | Greater than | `{\"score__gt\": 18}` |\n| `lte` | Less than or equal | `{\"score__lte\": 25}` |\n| `gte` | Greater than or equal | `{\"score__gte\": 18}` |\n| `starts` | Starts with | `{\"name__starts\": \"Nintendo\"}` |\n| `ends` | Ends with | `{\"name__ends\": \"Switch\"}` |\n| `contains` | Contains substring | 
`{\"description__contains\": \"Racing\"}` |\n| `!contains` | Does not contain substring | `{\"description__!contains\": \"Adventure\"}` |\n| `range` | Between two values | `{\"score__range\": [18, 30]}` |\n| `in` | In a list of values | `{\"status__in\": [\"active\", \"pending\"]}` |\n| `!in` | Not in a list of values | `{\"status__!in\": [\"active\", \"pending\"]}` |\n\n### Column Queries\n\nUse `_` prefix for direct column queries:\n```python\ndb.fetch({\"_created__gte\": \"2023-01-01\"})\n```\n\n### More examples\n\n#### Basic Search with multiple (OR) Filters\n```python\n# Multiple condition sets (OR)\ndb.fetch([\n    {\"score__gte\": 18, \"game__contains\": \"Mario\"},\n    {\"status\": \"active\"}\n])\n```\n\n#### Basic Search with Filters\n```python\n# Search for products in a specific category\nresults = db.search(\"laptop\",\n    filters={\n        \"category\": \"electronics\",\n        \"price__lte\": 1000\n    },\n    limit=10\n)\n```\n\n#### Nested JSON Filtering\n```python\n# Complex nested condition queries\nresults = db.fetch({\n    \"user.profile.age__gte\": 21,\n    \"user.settings.notifications__eq\": True,\n    \"user.addresses.0.city__contains\": \"Maputo\"\n})\n```\n\n#### Filters alongside similarity search\n```python\n# Find similar documents or products\nresults = db.similar(\"data science trends\",\n    filters={\n        \"year__gte\": 2020,\n        \"tags__in\": [\"AI\", \"ML\"],\n        \"region__ne\": \"restricted\"\n    },\n    limit=3,\n    distance=\"L2\"\n)\n```\n\n\n## Database Management\n\n### Database Configuration and Maintenance\n\nOakDB provides several methods to manage and configure your databases:\n\n#### Enabling and Disabling Features\n\n```python\n# Enable full-text search for a base\nideas.enable_search()\n\n# Disable full-text search\nideas.disable_search()\n\n# Enable vector similarity search\nideas.enable_vector()\n\n# Disable vector similarity search\nideas.disable_vector()\n```\n\nFull-text search and vector 
search can be enabled at the same time.\n\n#### Dropping Databases\n\n```python\n# Drop entire database (requires confirming database name)\nideas.drop(\"ideas\")\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabdelhai%2Foakdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fabdelhai%2Foakdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabdelhai%2Foakdb/lists"}