https://github.com/answerdotai/fastsql
- Host: GitHub
- URL: https://github.com/answerdotai/fastsql
- Owner: AnswerDotAI
- License: apache-2.0
- Created: 2024-08-01T19:40:43.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-01-09T23:50:11.000Z (about 1 month ago)
- Last Synced: 2026-01-10T21:43:08.459Z (about 1 month ago)
- Language: Jupyter Notebook
- Homepage: https://answerdotai.github.io/fastsql
- Size: 569 KB
- Stars: 64
- Watchers: 9
- Forks: 5
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# fastsql
## Install
pip install fastsql
## Overview
``` python
from fastcore.utils import *
from fastcore.net import urlsave
from fastsql import *
from fastsql.core import NotFoundError
```
We demonstrate `fastsql`’s features here using the ‘chinook’ sample
database.
``` python
url = 'https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite'
path = Path('chinook.sqlite')
if not path.exists(): urlsave(url, path)
```
``` python
db = database("chinook.sqlite"); db
```
Database(sqlite:///chinook.sqlite)
Databases have a `t` property that lists all tables:
``` python
dt = db.t
dt
```
Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
You can use this to grab a single table…:
``` python
# artist = dt.artists
# artist
```
``` python
artist = dt.Artist
artist
```
…or multiple tables at once:
``` python
dt['Artist','Album','Track','Genre','MediaType']
```
It also provides auto-complete in Jupyter, IPython, and nearly any other
interactive Python environment.

You can check if a table is in the database already:
``` python
'Artist' in dt
```
True
Columns work in a similar way to tables, using the `c` property:
``` python
ac = artist.c
ac
```
ArtistId, Name
Auto-complete works for columns too.

Columns, tables, and views stringify in a format suitable for including
in SQL statements. That means you can use auto-complete in f-strings.
``` python
qry = f"select * from {artist} where {ac.Name} like 'AC/%'"
print(qry)
```
select * from "Artist" where "Artist"."Name" like 'AC/%'
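To make the stringification idea concrete, here is a minimal stdlib-only sketch. The `Ident` class is hypothetical and is not fastsql’s implementation; it only illustrates how an object whose `__str__` returns a quoted identifier can be dropped straight into an f-string.

``` python
class Ident:
    "Hypothetical stand-in for a table or column that stringifies as a quoted SQL identifier."
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent

    def __str__(self):
        # Double any embedded quotes, then wrap the name in double quotes
        quoted = '"' + self.name.replace('"', '""') + '"'
        # Columns are qualified with their parent table; tables stand alone
        return f"{self.parent}.{quoted}" if self.parent else quoted

artist_t = Ident("Artist")
name_c = Ident("Name", parent=artist_t)
sketch_qry = f"select * from {artist_t} where {name_c} like 'AC/%'"
print(sketch_qry)
```

This prints the same statement as the `fastsql` example above. Only identifiers are quoted this way; values from user input should still go through query parameters rather than string formatting.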
You can view the results of a select query using `q`:
``` python
db.q(qry)
```
[{'ArtistId': 1, 'Name': 'AC/DC'}]
Views can be accessed through the `v` property:
``` python
album = dt.Album
acca_sql = f"""select {album}.*
from {album} join {artist} using (ArtistId)
where {ac.Name} like 'AC/%'"""
db.create_view("AccaDaccaAlbums", acca_sql, replace=True)
acca_dacca = db.q(f"select * from {db.v.AccaDaccaAlbums}")
acca_dacca
```
[{'AlbumId': 1,
'Title': 'For Those About To Rock We Salute You',
'ArtistId': 1},
{'AlbumId': 4, 'Title': 'Let There Be Rock', 'ArtistId': 1}]
## Dataclass support
A `dataclass` type with the names, types, and defaults of a table’s
columns is created using `dataclass()`:
``` python
album_dc = album.dataclass()
```
``` python
album_dc
```
fastsql.core.Album
Let’s try it:
``` python
album_obj = album_dc(**acca_dacca[0])
album_obj
```
Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1)
You can get the definition of the dataclass using fastcore’s
`dataclass_src` – everything is treated as nullable, in order to handle
auto-generated database values:
``` python
src = dataclass_src(album_dc)
hl_md(src, 'python')
```
``` python
@dataclass
class Album:
    AlbumId: int | None = UNSET
    Title: str | None = UNSET
    ArtistId: int | None = UNSET
```
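As a rough illustration of how a dataclass can be derived from a table schema, the sketch below uses only `sqlite3` and `dataclasses.make_dataclass`. The `table_dataclass` helper and `TYPE_MAP` are hypothetical names, and the sketch defaults every field to `None` rather than fastcore’s `UNSET`; it is not how fastsql does it internally.

``` python
import sqlite3
from dataclasses import make_dataclass, field, fields
from typing import Optional

# Rough mapping from declared SQLite types to Python types (an assumption for this sketch)
TYPE_MAP = {"INTEGER": int, "TEXT": str, "VARCHAR": str, "REAL": float, "FLOAT": float}

def table_dataclass(conn, table):
    "Build a dataclass whose fields mirror `table`'s columns, all nullable."
    # Each row of table_info is (cid, name, type, notnull, dflt_value, pk)
    cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    specs = [(name, Optional[TYPE_MAP.get(typ.upper(), str)], field(default=None))
             for cid, name, typ, *rest in cols]
    return make_dataclass(table, specs)

conn = sqlite3.connect(":memory:")
conn.execute("create table Album (AlbumId integer primary key, Title text, ArtistId integer)")
AlbumSketch = table_dataclass(conn, "Album")
obj = AlbumSketch(AlbumId=1, Title='Let There Be Rock')
print([f.name for f in fields(AlbumSketch)])
print(obj)
```

Because every field has a default, a record can be constructed with only some columns set, which is what makes auto-generated values such as row ids workable.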
Because `dataclass()` is dynamic, you won’t get auto-complete in editors
like vscode – it’ll only work in dynamic environments like Jupyter and
IPython. For editor support, you can export the full set of dataclasses
to a module, which you can then import from:
``` python
create_mod(db, 'db_dc')
```
``` python
import sys
sys.path.insert(0, '.')
from db_dc import Track
Track()
```
Track(TrackId=UNSET, Name=UNSET, AlbumId=UNSET, MediaTypeId=UNSET, GenreId=UNSET, Composer=UNSET, Milliseconds=UNSET, Bytes=UNSET, UnitPrice=UNSET)
Indexing into a table does a query on primary key:
``` python
dt.Track[1]
```
Track(TrackId=1, Name='For Those About To Rock (We Salute You)', AlbumId=1, MediaTypeId=1, GenreId=1, Composer='Angus Young, Malcolm Young, Brian Johnson', Milliseconds=343719, Bytes=11170334, UnitPrice=Decimal('0.99'))
There’s a shortcut to select from a table: just call it as a function.
If you’ve previously called `dataclass()`, returned items will be
constructed using that class by default. There are many parameters you
can check out, such as `limit`:
``` python
album(limit=2)
```
[Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1),
Album(AlbumId=2, Title='Balls to the Wall', ArtistId=2)]
Pass a truthy value as `with_pk` and you’ll get tuples of primary keys
and records:
``` python
album(with_pk=1, limit=2)
```
[(1,
Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1)),
(2, Album(AlbumId=2, Title='Balls to the Wall', ArtistId=2))]
Indexing also uses the dataclass by default:
``` python
album[5]
```
Album(AlbumId=5, Title='Big Ones', ArtistId=3)
If you set `xtra` fields, indexing is also filtered by them. In this
case, for instance, the lookup raises `NotFoundError`, since album 5 was
not created by artist 1:
``` python
album.xtra(ArtistId=1)
try: album[5]
except NotFoundError: print("Not found")
```
Not found
The same filtering is done when using the table as a callable:
``` python
album()
```
[Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1),
Album(AlbumId=4, Title='Let There Be Rock', ArtistId=1)]
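One way to picture `xtra` is as a fixed set of constraints that is ANDed into every query. The following stdlib sketch shows that idea; `ScopedTable` is a hypothetical class written for this illustration (it raises `KeyError` where fastsql raises `NotFoundError`) and does not reflect fastsql’s internals.

``` python
import sqlite3

class ScopedTable:
    "Hypothetical helper: every lookup is ANDed with a fixed set of extra constraints."
    def __init__(self, conn, name, pk):
        self.conn, self.name, self.pk, self.extra = conn, name, pk, {}

    def xtra(self, **kw): self.extra = kw

    def _where(self, conds):
        # Merge the per-query conditions with the standing extra constraints
        conds = {**conds, **self.extra}
        clause = " and ".join(f'"{k}" = ?' for k in conds)
        return (f" where {clause}" if conds else ""), list(conds.values())

    def __getitem__(self, key):
        where, args = self._where({self.pk: key})
        row = self.conn.execute(f'select * from "{self.name}"{where}', args).fetchone()
        if row is None: raise KeyError(key)
        return row

    def __call__(self):
        where, args = self._where({})
        return self.conn.execute(f'select * from "{self.name}"{where}', args).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("create table Album (AlbumId integer primary key, Title text, ArtistId integer)")
conn.executemany("insert into Album values (?,?,?)",
                 [(1, 'For Those About To Rock We Salute You', 1),
                  (4, 'Let There Be Rock', 1),
                  (5, 'Big Ones', 3)])
albums = ScopedTable(conn, "Album", "AlbumId")
albums.xtra(ArtistId=1)
scoped_rows = albums()
try:
    albums[5]
    found = True
except KeyError:
    found = False
print(scoped_rows, found)
```

Scoping a table like this is handy for multi-user apps, where each request should only ever see rows belonging to the current user.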
## Core design
The following methods accept `**kwargs`, passing them along to the first
`dict` param:
- `create`
- `transform`
- `transform_sql`
- `update`
- `insert`
- `upsert`
- `lookup`
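The keyword-argument pattern behind these methods can be sketched in a few lines. `merge_record` is a hypothetical helper, and letting keyword arguments override dict entries on a clash is an assumption of this sketch, not documented fastsql behaviour.

``` python
def merge_record(record=None, **kwargs):
    "Hypothetical helper: combine an optional dict with keyword arguments into one record."
    # Keyword arguments win on a key clash (an assumption made for this sketch)
    return {**(record or {}), **kwargs}

from_dict = merge_record({'name': 'meow', 'weight': 6})
from_kwargs = merge_record(name='meow', weight=6)
mixed = merge_record({'name': 'meow'}, weight=6)
print(from_dict == from_kwargs == mixed)
```

All three calls produce the same record, which is why `cats.insert(name='meow', weight=6)` and `cats.insert({'name': 'meow', 'weight': 6})` can be treated as interchangeable.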
We can access a table that doesn’t actually exist yet:
``` python
dt
```
Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
``` python
cats = dt.Cats
cats
```
We can use keyword arguments to now create that table:
``` python
cats.create(id=int, name=str, weight=float, uid=int, pk='id')
hl_md(cats.schema, 'sql')
```
``` sql
CREATE TABLE "Cats" (
id INTEGER,
name VARCHAR,
weight FLOAT,
uid INTEGER,
PRIMARY KEY (id)
)
```
If we set `xtra` then the additional fields are used for `insert`,
`update`, and `delete`:
``` python
cats.xtra(uid=2)
cat = cats.insert(name='meow', weight=6)
```
The inserted row is returned, including the xtra ‘uid’ field.
``` python
cat
```
{'id': 1, 'name': 'meow', 'weight': 6.0, 'uid': 2}
Using `**` in `update` here doesn’t actually achieve anything, since we
can just pass a `dict` directly – it’s just to show that it works:
``` python
cat['name'] = "moo"
cat['uid'] = 1
cats.update(**cat)
cats()
```
[{'id': 1, 'name': 'moo', 'weight': 6.0, 'uid': 2}]
Attempts to change `xtra` fields through `insert` or `update` are ignored: `uid` is still 2 above, even though the dict passed to `update` set it to 1.
An error is raised if there’s an attempt to update a record not matching
`xtra` fields:
``` python
cats.xtra(uid=1)
try: cats.update(**cat)
except NotFoundError: print("Not found")
```
Not found
This all also works with dataclasses:
``` python
cats.xtra(uid=2)
cats.dataclass()
cat = cats[1]
cat
```
Cats(id=1, name='moo', weight=6.0, uid=2)
``` python
cats.drop()
cats
```
Alternatively, you can create a table from a class. If it’s not already
a dataclass, it will be converted into one. In either case, the
dataclass will be created (or modified) so that `None` can be passed to
any field (this is needed to support fields such as automatic row ids).
``` python
class Cat: id:int; name:str; weight:float; uid:int
```
``` python
cats = db.create(Cat)
```
``` python
hl_md(cats.schema, 'sql')
```
``` sql
CREATE TABLE cat (
id INTEGER,
name VARCHAR,
weight FLOAT,
uid INTEGER,
PRIMARY KEY (id)
)
```
``` python
cat = Cat(name='咪咪', weight=9)
cats.insert(cat)
```
Cat(id=1, name='咪咪', weight=9.0, uid=None)
## Manipulating data
We try to make the following methods as flexible as possible. Wherever
possible, they support Python dictionaries, dataclasses, and classes.
### .insert()
Creates a record. Returns an instance of the inserted record.
Insert using a dictionary.
``` python
cats.insert({'name': 'Rex', 'weight': 12.2})
```
Cat(id=2, name='Rex', weight=12.2, uid=None)
Insert using a dataclass.
``` python
CatDC = cats.dataclass()
cats.insert(CatDC(name='Tom', weight=10.2))
```
Cat(id=3, name='Tom', weight=10.2, uid=None)
Insert using a standard Python class:
``` python
cat = cats.insert(Cat(name='Jerry', weight=5.2))
```
### .update()
Updates a record using a Python dict, dataclass, or object, and returns
an instance of the updated record.
Updating from a Python dict:
``` python
cats.update(dict(id=cat.id, name='Jerry', weight=6.2))
```
Cat(id=4, name='Jerry', weight=6.2, uid=None)
Updating from a dataclass:
``` python
cats.update(CatDC(id=cat.id, name='Jerry', weight=6.3))
```
Cat(id=4, name='Jerry', weight=6.3, uid=None)
Updating using a class:
``` python
cats.update(Cat(id=cat.id, name='Jerry', weight=5.7))
```
Cat(id=4, name='Jerry', weight=5.7, uid=None)
### .delete()
Removing data is done by providing the primary key value of the record.
``` python
# Farewell Jerry!
cats.delete(cat.id)
```
Cat(id=4, name='Jerry', weight=5.7, uid=None)
### Multi-field primary keys
Pass a collection of strings to create a multi-field pk:
``` python
class PetFood: catid:int; food:str; qty:int
petfoods = db.create(PetFood, pk=['catid','food'])
print(petfoods.schema)
```
CREATE TABLE pet_food (
catid INTEGER,
food VARCHAR,
qty INTEGER,
PRIMARY KEY (catid, food)
)
You can index into these using multiple values:
``` python
pf = petfoods.insert(PetFood(1, 'tuna', 2))
petfoods[1,'tuna']
```
PetFood(catid=1, food='tuna', qty=2)
Updates work in the usual way:
``` python
pf.qty=3
petfoods.update(pf)
```
PetFood(catid=1, food='tuna', qty=3)
You can also use `upsert` to update if the key exists, or insert
otherwise:
``` python
pf.qty=1
petfoods.upsert(pf)
petfoods()
```
[PetFood(catid=1, food='tuna', qty=1)]
``` python
pf.food='salmon'
petfoods.upsert(pf)
petfoods()
```
[PetFood(catid=1, food='tuna', qty=1), PetFood(catid=1, food='salmon', qty=1)]
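For readers curious what an upsert looks like at the SQL level, here is a stdlib sketch using SQLite’s `ON CONFLICT ... DO UPDATE` clause (available in SQLite 3.24 and later). The `upsert_food` function is hypothetical, and this is not necessarily the statement fastsql generates.

``` python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table pet_food (catid integer, food text, qty integer, "
             "primary key (catid, food))")

def upsert_food(conn, catid, food, qty):
    "Insert a row, or update `qty` if a row with the same (catid, food) key exists."
    conn.execute(
        "insert into pet_food (catid, food, qty) values (?, ?, ?) "
        "on conflict (catid, food) do update set qty = excluded.qty",
        (catid, food, qty))

upsert_food(conn, 1, 'tuna', 3)    # no such key yet: inserts
upsert_food(conn, 1, 'tuna', 1)    # key exists: updates qty
upsert_food(conn, 1, 'salmon', 1)  # different key: inserts
food_rows = conn.execute("select * from pet_food order by rowid").fetchall()
print(food_rows)
```

The result mirrors the `petfoods` output above: one `tuna` row whose `qty` was overwritten, plus a new `salmon` row.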
`delete` takes a tuple of keys:
``` python
petfoods.delete((1, 'tuna'))
petfoods()
```
[PetFood(catid=1, food='salmon', qty=1)]
## Migrations
FastSQL supports schema migrations to evolve your database over time.
Migrations are SQL or Python files stored in a migrations directory,
numbered sequentially.
The database tracks the current schema version in a `_meta` table. When
you run migrations, only unapplied migrations are executed.
Let’s create a migration to add a `color` column to our `cat` table:
``` python
# Create migrations directory
mig_dir = Path('cat_migrations')
mig_dir.mkdir(exist_ok=True)
# Create a migration to add a color column
migration_sql = 'alter table cat add column color text default "unknown";'
(mig_dir / '1-add_color_to_cat.sql').write_text(migration_sql)
```
56
Check the current schema version (will be 0 initially):
``` python
print(f"Current version: {db.version}")
```
Current version: 0
Run the migration:
``` python
db.migrate('cat_migrations')
```
Applied migration 1: 1-add_color_to_cat.sql
The database version is now updated, and the table structure reflects
the change:
``` python
print(f"New version: {db.version}")
print(f"\nUpdated schema:")
cats = dt.cat
hl_md(cats.schema, 'sql')
```
New version: 1
Updated schema:
``` sql
CREATE TABLE cat (
id INTEGER,
name VARCHAR,
weight FLOAT,
uid INTEGER,
color TEXT DEFAULT "unknown",
PRIMARY KEY (id)
)
```
Existing records now have the `color` column with the default value, and
new records can use it too:
``` python
cats.insert({'name': 'Mr. Snuggles', 'weight': 8.5, 'color': 'tuxedo'})
cats()
```
[Cat(id=1, name='咪咪', weight=9.0, uid=None, color='unknown'),
Cat(id=2, name='Rex', weight=12.2, uid=None, color='unknown'),
Cat(id=3, name='Tom', weight=10.2, uid=None, color='unknown'),
Cat(id=4, name='Mr. Snuggles', weight=8.5, uid=None, color='tuxedo')]
If you run `migrate()` again, it won’t reapply migrations that have
already been applied:
``` python
db.migrate('cat_migrations') # No output - migration already applied
```
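The version-tracking behaviour described above can be sketched with the standard library alone. The `_meta` table layout, `get_version`, and `run_migrations` below are assumptions made for illustration; fastsql’s actual schema and runner may differ.

``` python
import re, sqlite3, tempfile
from pathlib import Path

def get_version(conn):
    "Read the schema version, creating a single-row _meta table on first use."
    conn.execute("create table if not exists _meta "
                 "(id integer primary key check (id = 1), version integer)")
    conn.execute("insert or ignore into _meta (id, version) values (1, 0)")
    return conn.execute("select version from _meta").fetchone()[0]

def run_migrations(conn, mig_dir):
    "Apply numbered .sql files newer than the stored version; return the names applied."
    current, applied, files = get_version(conn), [], []
    for p in Path(mig_dir).glob("*.sql"):
        m = re.match(r"(\d+)-", p.name)
        if m: files.append((int(m.group(1)), p))
    for num, p in sorted(files):
        if num <= current: continue  # already applied
        conn.executescript(p.read_text())
        conn.execute("update _meta set version = ?", (num,))
        conn.commit()
        applied.append(p.name)
    return applied

with tempfile.TemporaryDirectory() as d:
    (Path(d) / "1-add_color_to_cat.sql").write_text(
        "alter table cat add column color text default 'unknown';")
    conn = sqlite3.connect(":memory:")
    conn.execute("create table cat (id integer primary key, name text)")
    first = run_migrations(conn, d)   # applies migration 1
    second = run_migrations(conn, d)  # nothing left to apply
    version = get_version(conn)
print(first, second, version)
```

Sorting by the numeric prefix rather than by filename keeps `10-...` after `2-...`, which a plain string sort would get wrong.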
Migrations can also be Python scripts. Create a file like
`2-update_data.py` that accepts the database connection string as a
command line argument to perform more complex data transformations.
Python migration scripts must handle their own commits:
``` python
# migrations/2-update_data.py
import sys
from fastsql import database
conn_str = sys.argv[1]
db = database(conn_str)
# Perform complex data transformations
# Assumes an earlier migration added a `priority` column to `cat`
for cat in db.t.cat():
    if cat.weight > 10:
        db.t.cat.update({'id': cat.id, 'priority': 1})
# Python migrations must commit their own changes
db.conn.commit()
```