Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dynajoe/tinydb
Database server written from scratch
https://github.com/dynajoe/tinydb
binary-data computer-science database first-principles parser server
Last synced: 21 days ago
JSON representation
Database server written from scratch
- Host: GitHub
- URL: https://github.com/dynajoe/tinydb
- Owner: dynajoe
- Created: 2018-06-11T21:39:55.000Z (over 6 years ago)
- Default Branch: main
- Last Pushed: 2023-02-25T07:37:25.000Z (almost 2 years ago)
- Last Synced: 2024-04-15T02:15:12.221Z (9 months ago)
- Topics: binary-data, computer-science, database, first-principles, parser, server
- Language: Go
- Homepage:
- Size: 311 KB
- Stars: 13
- Watchers: 3
- Forks: 2
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# TinyDB
![build](https://github.com/dynajoe/tinydb/actions/workflows/go.yml/badge.svg)
## Overview
A relational database engine that implements a subset of ANSI SQL.### Purpose
This project came to be as a learning exercise for digging into implementation details of database engines. Ultimately, the learnings from making TinyDB will be used to construct a class or series of conference talks on implementation details. A major benefit of writing a database engine that’s not production worthy is that it is exponentially easier to grok what’s going on and improve the learning experience.### Capabilities
* Go SQL Driver over net.Conn
* TCP server
* SQL Queries
* Parsing
* Transactions
* Prepared statements
* Raw
* Virtual machine with simple instruction set and unlimited (in theory) registers
* BTree binary file format compatible with SQLite### Future
* Programmability similar to that of plpgsql or plv8.
## Running
### Configuration
### TCP Server
### CLI## Internals
### Parsing
The query parser uses a set of simple parser combinators. The advantage of this approach is arguably its simplicity. The drawback is exponential time complexity in worst case. With the addition of "checkpoints" the amortized time complexity is polynomial.#### Lexer / Tokenizer
Input is broken into tokens that consist of one or more characters except EOF which is zero. For example:tsqlString: everything between two `'`'s. e.g. `'a simple string'` would produce a token containing the text `a simple string`.
tsqlSelect: SELECT in any case. e.g. SeLeCT.
tsqlBoolean: A token with true or false.
#### Scanner
#### Parser combinators
Some of the combinators defined in this project:
**required**: only succeeds if the parser succeeds otherwise, no input is consumed and the parser fails.
**optional**: always succeeds and may consume input if the parser succeeds.
**all**: requires that all parsers succeed or no input in consumed.
**zeroOrMore**: runs parser until it doesn't match anymore and always succeeds.
**separatedBy1**: this combinator is useful for parsing comma separated lists.
**oneOf**: executes each parser until a success. one parser must succeed.
**lazy**: calls a parser producing function each time it's invoked. This combinator is useful when a parser refers to itself.
**chainl**: requires at least one expression followed by an optional series of [op expression]
this combinator is used to eliminate left recursion and build a left-associative expression.Left recursion:
e.g. Expression -> Expression + Term
pseudo code: (would never terminate)```
void Expression() {
Expression();
match('+');
Term();
}
```left-to-right recursive descent parsers can't handle left recursion and this is just one of several possible
ways to eliminate left recursion. Though, this particular way doesn't handle indirect left recurision which
is a series of substitutions that ultimately lead to an infinite recursive call.Left associativity:
e.g. (from wikipedia) Consider the expression a ~ b ~ c.
If the operator ~ has left associativity, this expression would be interpreted as (a ~ b) ~ c.
If the operator has right associativity, the expression would be interpreted as a ~ (b ~ c).#### AST
#### Descriptive errors
### Storage
#### Binary format
#### Internal/Leaf Pages
### Virtual Machine
#### OpCodes
#### Code Generation
#### Optimizer## Bibliography
SQLite file format
ChiDB architecture