{"id":15089360,"url":"https://github.com/yaroslaff/SashimiDB","last_synced_at":"2025-09-26T09:31:43.949Z","repository":{"id":173408563,"uuid":"650590222","full_name":"yaroslaff/SashimiDB","owner":"yaroslaff","description":"Database with HTTP interface for structured in-memory datasets (json, yaml, SQL queries)","archived":false,"fork":false,"pushed_at":"2024-09-26T18:35:14.000Z","size":2171,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-11-27T20:32:52.607Z","etag":null,"topics":["api","backend","backend-api","cms","database","docker","headless","jamstack","json","python","rest","search","secure","secure-by-default","simple","yaml","yml"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yaroslaff.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"docs/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-07T11:45:11.000Z","updated_at":"2024-09-26T18:35:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"d51dbc61-c372-4c6d-ba05-01bb31a2ca41","html_url":"https://github.com/yaroslaff/SashimiDB","commit_stats":{"total_commits":85,"total_committers":1,"mean_commits":85.0,"dds":0.0,"last_synced_commit":"33639dabb690778aeeaa9753c36e11b8abbe6780"},"previous_names":["yaroslaff/exact","yaroslaff/sashimidb","yaroslaff/sashimi-server"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yaroslaff%2FSashimiDB","tags_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/yaroslaff%2FSashimiDB/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yaroslaff%2FSashimiDB/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yaroslaff%2FSashimiDB/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yaroslaff","download_url":"https://codeload.github.com/yaroslaff/SashimiDB/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234303251,"owners_count":18811041,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","backend","backend-api","cms","database","docker","headless","jamstack","json","python","rest","search","secure","secure-by-default","simple","yaml","yml"],"created_at":"2024-09-25T08:45:12.098Z","updated_at":"2025-09-26T09:31:38.374Z","avatar_url":"https://github.com/yaroslaff.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SashimiDB\nImagine you could query your MySQL database right from your website's JavaScript! Secure and fast.\n\nSashimiDB is a simple and very fast database with a REST API for reactive JavaScript/JAMstack websites. SashimiDB can work as a headless CMS. See it as an alternative to Strapi or Algolia.\n\n## Features\n\nSashimiDB allows anonymous read-only queries (with cross-origin control only) from web page JavaScript. 
\n\nThe query format is simple and very flexible (as flexible as a Python expression).\n\nWrite operations (such as UPDATE or DELETE of records, or reloading a dataset) can optionally be allowed from whitelisted IPs with HTTP Bearer authentication. \n\nExact is simple, [secure](doc/SECURITY.md) and very fast (less than half a second for a heavy search query on a dataset of 1 million records).\n\nAlmost any dataset structure is supported. The only requirement is that the dataset is tabular (a list of dicts), like database tables and spreadsheets.\n\n## Limitations\nExact cannot work with private data. For instance, for an online store, you may use Exact to serve a dataset of products and feedback, but you will need your own backend to serve the profile page. Everything you put into a dataset is public and almost as easy to download as if you had put dataset.json on the website.\n\nExact is not a good fit (not very fast) if the dataset changes frequently: you have to make one HTTP request for each update, which is probably slower than a simple SQL query to a local database server (although you can update many records at once, e.g. set 'id in [1, 22, 333]'). Even in this situation you can still use Exact (example: a search engine for an online store that should not recommend out-of-stock products):\n- You may update records after each purchase (e.g. send UPDATE to set the `onstock` field to `123` or to `onstock-1`)\n- You may update records only after important changes (e.g. when an item goes out of stock; most likely this will happen quite rarely)\n- You may delete records \n- You may use Exact for searching but adjust some details on the page (such as how many products are in stock in real time) with your backend.\n\n\n## Why use Exact?\n\n### Save development time and money\nMaybe your project is an online electronics store or an IMDB-like movie database. Either way, you need a fast, flexible and secure search backend for it. 
Not just for the simplest queries like \"smartphones from lowest price to highest\" or \"smartphones of brand X and price between Y and Z\", but for any complex search query, such as \"Smartphones with price from X to Y, brand Samsung or Apple, with Retina screen, and what is the min/max price?\". If customers cannot find a specific product, they cannot buy it from you. \n\nHow long would it take to develop and debug this kind of search API (and what would it cost)? What if you could get it in a minute? A fast, secure and very flexible search API, good for a computer store, a dating site, IMDB and (*almost?*) anything. \n\nExamples: \"Smartphones with price from X to Y, brand Samsung or Apple, with Retina screen\" (`category==\"smartphones\" and price\u003e1 and price\u003c1000 and brand in [\"Apple\", \"Samsung\"] and \"retina\" in description.lower()`), \"Movies, where Jack Nicholson played with Audrey Hepburn\", \"Green or red t-shirts, XXL size, cotton\u003e80, sorted by price, min and max price\".\n\nIf you later add more data to search, there is no need to modify the backend: Exact can already search it. You only need to write front-end JS code to send queries to Exact.\n\nBut Exact is not only for \"search\". If you need to display info about the IMDB movie with id 1234567, you can just make a query to get the dataset element with `id==1234567` and render the page. Thus, you can have one HTML/JS web page to display data about any record in the dataset.\n\n### Get a high Lighthouse score and a better position in SERP\n\nWith Exact you can use reactive search and other database operations from JavaScript, and thus avoid server-side rendering and serve pre-rendered HTML pages to get a very high performance rating (Google and users love fast sites!).\n\nOur demo site [shinhub.ru](https://shinhub.ru/) got a [100/100/100/100 Lighthouse rating for desktop](https://pagespeed.web.dev/analysis/https-shinhub-ru/g39tdd1xg1?form_factor=desktop) and a 99/100/100/100 mobile Lighthouse rating. 
(It is possible to alter the web page design a little to get a 100 performance rating, but I like the current approach.)\n\n\n### Secure by design: Isolation from main database\nAll software products are developed to be secure. Many of them are developed by brilliant, highly paid programmers and security specialists. And most of them have had [at least one](https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=google) vulnerability. Errare humanum est.\n\nWith Exact it's possible to isolate the search backend as strictly as you want (even put it on another server without database access if you are paranoid like me). Even if (in theory) there is a vulnerability in Exact or [Evalidate](https://github.com/yaroslaff/evalidate), an attacker can only get access to public data. \n\nSee [doc/SECURITY.md](doc/SECURITY.md) for more.\n\n\n## Quick start\n\nTo play with Exact, you can use our demo server at [back4app](https://www.back4app.com/) ([httpie](https://github.com/httpie/httpie) is recommended):\n~~~\nhttp POST https://exact-yaroslaff.b4a.run/ds/dummy limit=3\n~~~\n\nThis is a free virtual docker container. If there is no reply, it is sleeping: just repeat the request in a few seconds and it will reply very quickly. 
Or run the container locally.\n\nOr, if you prefer curl:\n~~~\ncurl -H 'Content-Type: application/json' -X POST https://exact-yaroslaff.b4a.run/ds/dummy -d '{\"expr\": \"price\u003c800 and brand==\\\"Apple\\\"\"}'\n~~~\n\n(pipe output to [jq](https://github.com/jqlang/jq) to get it formatted and colored)\n\nSee [QUERY.md](doc/QUERY.md) for example queries.\n\n## Running your own instance (Alternative 1 (recommended): docker container)\n\nIf you want to run your own instance of Exact, it is best to start with the docker image.\n\nCreate the following directory structure (/tmp/data):\n~~~\nmkdir -p /tmp/data/data\nmkdir /tmp/data/etc\n\n# make example dataset\nwget -O /tmp/data/data/test.json https://fakestoreapi.com/products\n~~~\n\nCreate a basic config file `/tmp/data/etc/exact.yml`:\n~~~yaml\nlimit: 20\ndatadir:\n  - /data/data\n\ndatasets:\n  dummy:\n    url: https://dummyjson.com/products?limit=100\n    keypath:\n      - products\n    format: json\n    limit: 20\n~~~\n\nThis will create an Exact instance with two datasets: \"dummy\" (loaded from the network) and \"test\" (loaded from the local file test.json in datadir).\n\n\nNow you can start the docker container:\n~~~\nsudo docker run --rm --name exact -p 8000:80 -it -v /tmp/data/:/data/  yaroslaff/exact\n~~~\n\nAnd make a test query: `http POST http://localhost:8000/ds/test 'expr=price\u003c10' limit=5`\n\n## Running your own instance (Alternative 2: as python app)\n1. Clone the repo: `git clone https://github.com/yaroslaff/exact.git`\n2. Install dependencies: `cd exact; poetry install`\n3. Activate the virtualenv: `poetry shell`\n4. Start the app: `uvicorn exact:app`\n\n## Documentation\nPlease see files in `doc/`:\n- [QUERY.md](doc/QUERY.md)\n- [CONFIG.md](doc/CONFIG.md)\n- [SECURITY.md](doc/SECURITY.md)\n- [TROUBLESHOOTING.md](doc/TROUBLESHOOTING.md)\n\n## Memory usage\nA docker container with a small JSON dataset consumes 41 MB (use the plain python app, \"alternative 2\", if you need an even smaller memory footprint). When loading a large file (1mil.json, 
500+ MB), the container takes 1.5 GB. Rule of thumb: the container will use about 3x the JSON file size for large datasets.\n\n## Performance\nFor testing, we use the 1mil.json file, a list of 1 million products (each of 100 unique items is duplicated 10 000 times, see below). Searching for items with `price\u003c200` and limit=10 (820 000 matches) takes a little more than 0.2 seconds. An aggregation request to find the min and max price across the whole 1 million record dataset takes 0.43 seconds.\n\n## Tips and tricks\n- If you always use the same case (upper or lower) in JSON datasets and in the frontend, you can disable the `upper`/`lower` functions and save a few milliseconds on each request.\n- Remove all sensitive or unneeded fields when exporting to JSON. Leave only key fields and fields used for searching, such as price, size, color.\n- Use `limit` for every dataset, and set a default `limit` globally in `exact.yml`. Sending your full database in a response is probably never needed, and such requests consume RAM/CPU/bandwidth.\n\n\n## MySQL, MariaDB, PostgreSQL and other databases support\nExact uses [SQLAlchemy](https://www.sqlalchemy.org/) to work with databases, so it can work with any SQLAlchemy-compatible RDBMS, but you need to install the proper python modules, e.g. 
`pip install mysqlclient` (for mysql/mariadb).\n\nSee https://docs.sqlalchemy.org/en/20/core/engines.html for database URL formats.\n\nExample config:\n~~~yaml\ndatasets:\n  contact:\n    db: mysql://scott:tiger@127.0.0.1/contacts\n    sql: SELECT * FROM contact\n~~~\n\nThis creates the dataset `contact` from the `contacts.contact` table.\n\n## Build docker image\n~~~\nsudo docker build -t yaroslaff/exact ./\n~~~\n\n## Sample data sources\n- https://fakestoreapi.com/products\n- https://www.mockaroo.com/\n- https://dummyjson.com/\n- https://github.com/prust/wikipedia-movie-data/\n\n\nPrepare a list of 1 million items, '1mil.json':\n~~~\n$ python\nPython 3.9.2 (default, Feb 28 2021, 17:03:44) \n[GCC 10.2.1 20210110] on linux\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n\u003e\u003e\u003e import requests\n\u003e\u003e\u003e import json\n\u003e\u003e\u003e data = requests.get('https://dummyjson.com/products?limit=100').json()['products'] * 10000\n\u003e\u003e\u003e with open('1mil.json', 'w') as fh:\n...   fh.write(json.dumps(data))\n... \n~~~\nThis creates the file `1mil.json` (568 MB).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyaroslaff%2FSashimiDB","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyaroslaff%2FSashimiDB","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyaroslaff%2FSashimiDB/lists"}