https://github.com/answerdotai/fastsql

Host: GitHub
URL: https://github.com/answerdotai/fastsql
Owner: AnswerDotAI
License: apache-2.0
Created: 2024-08-01T19:40:43.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2026-01-09T23:50:11.000Z (about 1 month ago)
Last Synced: 2026-01-10T21:43:08.459Z (about 1 month ago)
Language: Jupyter Notebook
Homepage: https://answerdotai.github.io/fastsql
Size: 569 KB
Stars: 64
Watchers: 9
Forks: 5
Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

# fastsql

## Install

    pip install fastsql

## Overview

``` python
from fastcore.utils import *
from fastcore.net import urlsave
from fastsql import *
from fastsql.core import NotFoundError
```

We demonstrate `fastsql`’s features here using the ‘chinook’ sample database.

``` python
url = 'https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite'
path = Path('chinook.sqlite')
if not path.exists(): urlsave(url, path)
```

``` python

db = database("chinook.sqlite"); db

```

    Database(sqlite:///chinook.sqlite)

Databases have a `t` property that lists all tables:

``` python
dt = db.t
dt
```

    Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track

You can use this to grab a single table…:

``` python

# artist = dt.artists

# artist

```

``` python
artist = dt.Artist
artist
```

    

…or multiple tables at once:

``` python

dt['Artist','Album','Track','Genre','MediaType']

```


The `t` property also provides auto-complete in Jupyter, IPython, and nearly any other interactive Python environment.

You can check if a table is in the database already:

``` python

'Artist' in dt

```

    True
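
As a rough illustration of how attribute access, auto-complete, and the `in` check can coexist on one object, here is a minimal sketch built only on the standard library’s `sqlite3`. The `Tables` class and the in-memory database are hypothetical stand-ins, not fastsql’s implementation: `__dir__` is what interactive shells consult for tab completion, and `__contains__` backs the membership test.

```python
import sqlite3

class Tables:
    "Hypothetical accessor exposing table names as attributes (not fastsql's code)."
    def __init__(self, conn): self.conn = conn

    def _names(self):
        # sqlite_master lists every table in a SQLite database
        rows = self.conn.execute(
            "select name from sqlite_master where type='table' order by name")
        return [r[0] for r in rows]

    def __dir__(self): return self._names()            # consulted by tab completion
    def __contains__(self, name): return name in self._names()
    def __getattr__(self, name):
        # Only reached for attributes that normal lookup did not find
        if name.startswith('_'): raise AttributeError(name)
        return name   # a real accessor would return a table object here
    def __repr__(self): return ", ".join(self._names())

conn = sqlite3.connect(":memory:")
conn.execute("create table Artist (ArtistId integer primary key, Name text)")
conn.execute("create table Album (AlbumId integer primary key, Title text)")
t = Tables(conn)
print(t)               # Album, Artist
print('Artist' in t)   # True
print('Cats' in t)     # False
```

Because `__getattr__` in this sketch never fails for public names, it also mirrors the idea that a table can be referenced before it exists.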

Columns work in a similar way to tables, using the `c` property:

``` python
ac = artist.c
ac
```

    ArtistId, Name

Auto-complete works for columns too.

Columns, tables, and views stringify in a format suitable for including in SQL statements. That means you can use auto-complete in f-strings.

``` python
qry = f"select * from {artist} where {ac.Name} like 'AC/%'"
print(qry)
```

    select * from "Artist" where "Artist"."Name" like 'AC/%'
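
The quoting shown above can be imitated with a small class whose `__str__` returns a quoted identifier. This is only a sketch: `Ident` is a hypothetical name, and fastsql’s own table and column objects do considerably more.

```python
class Ident:
    "Hypothetical SQL identifier that renders quoted inside f-strings."
    def __init__(self, name, table=None): self.name, self.table = name, table

    def __str__(self):
        # Double any embedded quote, then wrap in double quotes (standard SQL quoting)
        q = '"' + self.name.replace('"', '""') + '"'
        return f'{self.table}.{q}' if self.table else q

artist = Ident('Artist')
name = Ident('Name', table=artist)
qry = f"select * from {artist} where {name} like 'AC/%'"
print(qry)
# select * from "Artist" where "Artist"."Name" like 'AC/%'
```

Only identifiers are handled this way; values such as `'AC/%'` are still better passed as bound parameters when they come from user input.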

You can view the results of a select query using `q`:

``` python

db.q(qry)

```

    [{'ArtistId': 1, 'Name': 'AC/DC'}]

Views can be accessed through the `v` property:

``` python
album = dt.Album
acca_sql = f"""select {album}.*
from {album} join {artist} using (ArtistId)
where {ac.Name} like 'AC/%'"""
db.create_view("AccaDaccaAlbums", acca_sql, replace=True)
acca_dacca = db.q(f"select * from {db.v.AccaDaccaAlbums}")
acca_dacca
```

    [{'AlbumId': 1,

      'Title': 'For Those About To Rock We Salute You',

      'ArtistId': 1},

     {'AlbumId': 4, 'Title': 'Let There Be Rock', 'ArtistId': 1}]
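
For readers who want to see the underlying SQL, the same view can be reproduced with plain `sqlite3`. The `create_view` helper below is a hypothetical stand-in: SQLite has no `CREATE OR REPLACE VIEW`, so this sketch drops the view first, which is one plausible reading of `replace=True` rather than a statement of how fastsql does it. The three album rows are a small hand-copied subset of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row   # rows can be converted to dicts
conn.executescript("""
create table Artist (ArtistId integer primary key, Name text);
create table Album (AlbumId integer primary key, Title text, ArtistId integer);
insert into Artist values (1, 'AC/DC'), (2, 'Accept');
insert into Album values (1, 'For Those About To Rock We Salute You', 1),
                         (2, 'Balls to the Wall', 2),
                         (4, 'Let There Be Rock', 1);
""")

def create_view(conn, name, sql, replace=False):
    "Hypothetical helper: drop-then-create emulates replacing a view in SQLite."
    if replace: conn.execute(f'drop view if exists "{name}"')
    conn.execute(f'create view "{name}" as {sql}')

sql = """select Album.* from Album join Artist using (ArtistId)
where Artist.Name like 'AC/%'"""
create_view(conn, "AccaDaccaAlbums", sql, replace=True)
create_view(conn, "AccaDaccaAlbums", sql, replace=True)   # safe to repeat
rows = [dict(r) for r in
        conn.execute('select * from "AccaDaccaAlbums" order by AlbumId')]
print(rows)
```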

## Dataclass support

A `dataclass` type with the names, types, and defaults of the table’s columns is created using `dataclass()`:

``` python

album_dc = album.dataclass()

```

``` python

album_dc

```

    fastsql.core.Album

Let’s try it:

``` python

album_obj = album_dc(**acca_dacca[0])

album_obj

```

    Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1)

You can get the definition of the dataclass using fastcore’s `dataclass_src`. Everything is treated as nullable, in order to handle auto-generated database values:

``` python
src = dataclass_src(album_dc)
hl_md(src, 'python')
```

``` python
@dataclass
class Album:
    AlbumId: int | None = UNSET
    Title: str | None = UNSET
    ArtistId: int | None = UNSET
```
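
A dataclass of this shape can be generated at runtime with the standard library’s `make_dataclass`. The sketch below is an assumption-laden imitation, not fastsql’s code: `table_dataclass` is a hypothetical helper, and the `UNSET` sentinel defined here is a local stand-in for the one shown above.

```python
from dataclasses import make_dataclass, field
from typing import Optional

class _Unset:
    "Sentinel meaning 'no value supplied', distinct from None."
    def __repr__(self): return "UNSET"
UNSET = _Unset()   # local stand-in for the UNSET shown above

def table_dataclass(name, columns):
    "Build a dataclass from {column: type}; each field is optional and defaults to UNSET."
    specs = [(col, Optional[typ], field(default=UNSET))
             for col, typ in columns.items()]
    return make_dataclass(name, specs)

Album = table_dataclass("Album", {"AlbumId": int, "Title": str, "ArtistId": int})
print(Album())                      # Album(AlbumId=UNSET, Title=UNSET, ArtistId=UNSET)
print(Album(AlbumId=1, Title="x"))  # Album(AlbumId=1, Title='x', ArtistId=UNSET)
```

Using a sentinel rather than `None` as the default lets code distinguish "not provided" from an explicit SQL `NULL`.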

Because `dataclass()` is dynamic, you won’t get auto-complete in editors like VS Code; it only works in dynamic environments like Jupyter and IPython. For editor support, you can export the full set of dataclasses to a module, which you can then import from:

``` python
create_mod(db, 'db_dc')
```

``` python
import sys
sys.path.insert(0, '.')
from db_dc import Track
Track()
```

    Track(TrackId=UNSET, Name=UNSET, AlbumId=UNSET, MediaTypeId=UNSET, GenreId=UNSET, Composer=UNSET, Milliseconds=UNSET, Bytes=UNSET, UnitPrice=UNSET)

Indexing into a table does a query on primary key:

``` python

dt.Track[1]

```

    Track(TrackId=1, Name='For Those About To Rock (We Salute You)', AlbumId=1, MediaTypeId=1, GenreId=1, Composer='Angus Young, Malcolm Young, Brian Johnson', Milliseconds=343719, Bytes=11170334, UnitPrice=Decimal('0.99'))

There’s a shortcut to select from a table: just call it as a function. If you’ve previously called `dataclass()`, returned items will be constructed using that class by default. There are many parameters you can check out, such as `limit`:

``` python

album(limit=2)

```

    [Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1),

     Album(AlbumId=2, Title='Balls to the Wall', ArtistId=2)]

Pass a truthy value as `with_pk` and you’ll get tuples of primary keys and records:

``` python

album(with_pk=1, limit=2)

```

    [(1,

      Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1)),

     (2, Album(AlbumId=2, Title='Balls to the Wall', ArtistId=2))]

Indexing also uses the dataclass by default:

``` python

album[5]

```

    Album(AlbumId=5, Title='Big Ones', ArtistId=3)

If you set `xtra` fields, then indexing is also filtered by those. In this case, for instance, the lookup raises `NotFoundError` because album 5 was not created by artist 1:

``` python
album.xtra(ArtistId=1)
try: album[5]
except NotFoundError: print("Not found")
```

    Not found

The same filtering is done when using the table as a callable:

``` python

album()

```

    [Album(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1),

     Album(AlbumId=4, Title='Let There Be Rock', ArtistId=1)]
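
The filtering behaviour can be pictured as extra equality conditions that are silently appended to every read. Below is a minimal sketch with `sqlite3`; `FilteredTable` and `NotFound` are hypothetical names used for illustration and do not reflect fastsql’s internals.

```python
import sqlite3

class NotFound(Exception): pass   # hypothetical stand-in for NotFoundError

class FilteredTable:
    "Hypothetical wrapper whose reads always include the extra equality filters."
    def __init__(self, conn, name, pk):
        self.conn, self.name, self.pk, self.extra = conn, name, pk, {}

    def xtra(self, **kw): self.extra = kw

    def _where(self, conds):
        # Values are bound as parameters; only trusted column names are interpolated
        conds = {**conds, **self.extra}
        clause = " and ".join(f'"{k}" = ?' for k in conds) or "1"
        return clause, list(conds.values())

    def __getitem__(self, key):
        clause, args = self._where({self.pk: key})
        row = self.conn.execute(
            f'select * from "{self.name}" where {clause}', args).fetchone()
        if row is None: raise NotFound(key)
        return row

    def __call__(self):
        clause, args = self._where({})
        return self.conn.execute(
            f'select * from "{self.name}" where {clause}', args).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("create table Album (AlbumId integer primary key, Title text, ArtistId integer)")
conn.executemany("insert into Album values (?,?,?)", [
    (1, 'For Those About To Rock We Salute You', 1),
    (4, 'Let There Be Rock', 1),
    (5, 'Big Ones', 3)])

album = FilteredTable(conn, "Album", "AlbumId")
print(album[5])          # (5, 'Big Ones', 3)
album.xtra(ArtistId=1)
try: album[5]
except NotFound: print("Not found")
print(len(album()))      # 2
```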

## Core design

The following methods accept `**kwargs`, passing them along to the first `dict` param:

- `create`

- `transform`

- `transform_sql`

- `update`

- `insert`

- `upsert`

- `lookup`
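
The calling convention in the list above can be sketched in a few lines. The function here is a hypothetical illustration, not one of fastsql’s methods, and it assumes that keyword arguments take precedence when a key appears in both places, which the text above does not specify.

```python
def update(record=None, **kwargs):
    "Hypothetical method: keyword arguments are merged into the leading dict parameter."
    # Assumption: on a key clash the keyword argument wins
    return {**(record or {}), **kwargs}

cat = {'id': 1, 'name': 'moo'}
print(update(cat))               # {'id': 1, 'name': 'moo'}
print(update(**cat))             # same result, passed as keywords
print(update(cat, weight=6.0))   # {'id': 1, 'name': 'moo', 'weight': 6.0}
```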

We can access a table that doesn’t actually exist yet:

``` python

dt

```

    Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track

``` python

cats = dt.Cats

cats

```

    

We can now use keyword arguments to create that table:

``` python
cats.create(id=int, name=str, weight=float, uid=int, pk='id')
hl_md(cats.schema, 'sql')
```

``` sql
CREATE TABLE "Cats" (
    id INTEGER,
    name VARCHAR,
    weight FLOAT,
    uid INTEGER,
    PRIMARY KEY (id)
)
```

If we set `xtra` then the additional fields are used for `insert`, `update`, and `delete`:

``` python
cats.xtra(uid=2)
cat = cats.insert(name='meow', weight=6)
```

The inserted row is returned, including the xtra ‘uid’ field.

``` python

cat

```

    {'id': 1, 'name': 'meow', 'weight': 6.0, 'uid': 2}

Using `**` in `update` here doesn’t actually achieve anything, since we can just pass a `dict` directly; it’s just to show that it works:

``` python
cat['name'] = "moo"
cat['uid'] = 1
cats.update(**cat)
cats()
```

    [{'id': 1, 'name': 'moo', 'weight': 6.0, 'uid': 2}]

Attempts to change `xtra` fields in an update or insert are ignored, which is why `uid` is still 2 in the output above.

An error is raised if there’s an attempt to update a record not matching `xtra` fields:

``` python
cats.xtra(uid=1)
try: cats.update(**cat)
except NotFoundError: print("Not found")
```

    Not found
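
The write-side rules just described (extra fields are forced on insert, cannot be changed by update, and restrict which rows an update may touch) can be sketched with `sqlite3` as follows. The `insert` and `update` functions and the `NotFound` exception are hypothetical and only approximate the behaviour shown; they are not fastsql’s implementation.

```python
import sqlite3

class NotFound(Exception): pass   # hypothetical stand-in for NotFoundError

conn = sqlite3.connect(":memory:")
conn.execute("create table cats (id integer primary key, name text, weight float, uid integer)")

def insert(conn, extra, **rec):
    rec = {**rec, **extra}                  # extra fields override caller-supplied values
    cols = ", ".join(rec)
    marks = ", ".join("?" for _ in rec)
    cur = conn.execute(f"insert into cats ({cols}) values ({marks})", list(rec.values()))
    return cur.lastrowid

def update(conn, extra, pk, **rec):
    rec = {k: v for k, v in rec.items() if k not in extra}   # extra fields cannot be changed
    sets = ", ".join(f"{k} = ?" for k in rec)
    where = " and ".join(["id = ?"] + [f"{k} = ?" for k in extra])
    cur = conn.execute(f"update cats set {sets} where {where}",
                       [*rec.values(), pk, *extra.values()])
    if cur.rowcount == 0: raise NotFound(pk)   # nothing matched pk plus the extra filter

cat_id = insert(conn, {'uid': 2}, name='meow', weight=6)
update(conn, {'uid': 2}, cat_id, name='moo', uid=1)     # the uid=1 change is dropped
print(conn.execute("select * from cats").fetchall())     # [(1, 'moo', 6.0, 2)]
try: update(conn, {'uid': 1}, cat_id, name='moo')
except NotFound: print("Not found")
```

Column names are interpolated directly in this sketch, so it is only safe when they come from trusted code rather than user input.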

This all also works with dataclasses:

``` python
cats.xtra(uid=2)
cats.dataclass()
cat = cats[1]
cat
```

    Cats(id=1, name='moo', weight=6.0, uid=2)

``` python

cats.drop()

cats

```

    

Alternatively, you can create a table from a class. If it’s not already a dataclass, it will be converted into one. In either case, the dataclass will be created (or modified) so that `None` can be passed to any field (this is needed to support fields such as automatic row ids).

``` python

class Cat: id:int; name:str; weight:float; uid:int

```

``` python

cats = db.create(Cat)

```

``` python

hl_md(cats.schema, 'sql')

```

``` sql
CREATE TABLE cat (
    id INTEGER,
    name VARCHAR,
    weight FLOAT,
    uid INTEGER,
    PRIMARY KEY (id)
)
```

``` python
cat = Cat(name='咪咪', weight=9)
cats.insert(cat)
```

    Cat(id=1, name='咪咪', weight=9.0, uid=None)
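
To make the class-to-table step more concrete, here is a sketch that derives a `CREATE TABLE` statement from a class’s type annotations. The `SQL_TYPES` mapping and the `create_sql` helper are assumptions chosen to match the schema shown above; the naming rule is simplified to lower-casing the class name.

```python
import sqlite3

# Assumed mapping from Python types to SQL column types
SQL_TYPES = {int: "INTEGER", str: "VARCHAR", float: "FLOAT", bytes: "BLOB"}

def create_sql(cls, pk="id"):
    "Hypothetical generator: build a CREATE TABLE statement from class annotations."
    pks = [pk] if isinstance(pk, str) else list(pk)
    cols = [f"    {name} {SQL_TYPES[typ]}" for name, typ in cls.__annotations__.items()]
    cols.append(f"    PRIMARY KEY ({', '.join(pks)})")
    return f"CREATE TABLE {cls.__name__.lower()} (\n" + ",\n".join(cols) + "\n)"

class Cat: id:int; name:str; weight:float; uid:int

sql = create_sql(Cat)
print(sql)

# Check that the generated statement is accepted by SQLite
conn = sqlite3.connect(":memory:")
conn.execute(sql)
columns = [row[1] for row in conn.execute("pragma table_info(cat)")]
print(columns)   # ['id', 'name', 'weight', 'uid']
```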

## Manipulating data

We try to make the following methods as flexible as possible. Wherever possible, they support Python dictionaries, dataclasses, and classes.
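
One way to accept all three kinds of input is to normalise them to a plain `dict` before building SQL. The `to_record` helper below is a hypothetical sketch of that idea using only the standard library; it is not taken from fastsql.

```python
from dataclasses import dataclass, asdict, is_dataclass

def to_record(obj):
    "Hypothetical normaliser: turn a dict, dataclass instance, or plain object into a dict."
    if isinstance(obj, dict): return dict(obj)
    if is_dataclass(obj) and not isinstance(obj, type): return asdict(obj)
    return dict(vars(obj))   # plain object: use its instance attributes

@dataclass
class CatDC:
    name: str
    weight: float

class Cat:
    def __init__(self, name, weight): self.name, self.weight = name, weight

print(to_record({'name': 'Rex', 'weight': 12.2}))   # {'name': 'Rex', 'weight': 12.2}
print(to_record(CatDC('Tom', 10.2)))                # {'name': 'Tom', 'weight': 10.2}
print(to_record(Cat('Jerry', 5.2)))                 # {'name': 'Jerry', 'weight': 5.2}
```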

### .insert()

Creates a record and returns an instance of the inserted record.

Insert using a dictionary.

``` python

cats.insert({'name': 'Rex', 'weight': 12.2})

```

    Cat(id=2, name='Rex', weight=12.2, uid=None)

Insert using a dataclass.

``` python
CatDC = cats.dataclass()
cats.insert(CatDC(name='Tom', weight=10.2))
```

    Cat(id=3, name='Tom', weight=10.2, uid=None)

Insert using a standard Python class.

``` python

cat = cats.insert(Cat(name='Jerry', weight=5.2))

```

### .update()

Updates a record using a Python dict, dataclass, or object, and returns an instance of the updated record.

Updating from a Python dict:

``` python

cats.update(dict(id=cat.id, name='Jerry', weight=6.2))

```

    Cat(id=4, name='Jerry', weight=6.2, uid=None)

Updating from a dataclass:

``` python

cats.update(CatDC(id=cat.id, name='Jerry', weight=6.3))

```

    Cat(id=4, name='Jerry', weight=6.3, uid=None)

Updating using a class:

``` python

cats.update(Cat(id=cat.id, name='Jerry', weight=5.7))

```

    Cat(id=4, name='Jerry', weight=5.7, uid=None)

### .delete()

Removing data is done by providing the primary key value of the record.

``` python
# Farewell Jerry!
cats.delete(cat.id)
```

    Cat(id=4, name='Jerry', weight=5.7, uid=None)

### Multi-field primary keys

Pass a collection of strings to create a multi-field pk:

``` python
class PetFood: catid:int; food:str; qty:int
petfoods = db.create(PetFood, pk=['catid','food'])
print(petfoods.schema)
```

    CREATE TABLE pet_food (
        catid INTEGER,
        food VARCHAR,
        qty INTEGER,
        PRIMARY KEY (catid, food)
    )

You can index into these using multiple values:

``` python
pf = petfoods.insert(PetFood(1, 'tuna', 2))
petfoods[1,'tuna']
```

    PetFood(catid=1, food='tuna', qty=2)

Updates work in the usual way:

``` python
pf.qty=3
petfoods.update(pf)
```

    PetFood(catid=1, food='tuna', qty=3)

You can also use `upsert` to update if the key exists, or insert otherwise:

``` python
pf.qty=1
petfoods.upsert(pf)
petfoods()
```

    [PetFood(catid=1, food='tuna', qty=1)]

``` python
pf.food='salmon'
petfoods.upsert(pf)
petfoods()
```

    [PetFood(catid=1, food='tuna', qty=1), PetFood(catid=1, food='salmon', qty=1)]

`delete` takes a tuple of keys:

``` python
petfoods.delete((1, 'tuna'))
petfoods()
```

    [PetFood(catid=1, food='salmon', qty=1)]
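
For comparison, the same composite-key operations can be written directly against SQLite. This sketch uses `INSERT ... ON CONFLICT ... DO UPDATE`, which needs SQLite 3.24 or newer; whether fastsql’s `upsert` is implemented this way is not stated above, so treat the `upsert` function here as a hypothetical equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table pet_food (
    catid integer, food varchar, qty integer,
    primary key (catid, food))""")

def upsert(conn, catid, food, qty):
    # ON CONFLICT targets the composite key; 'excluded' is the row that failed to insert
    conn.execute("""insert into pet_food (catid, food, qty) values (?, ?, ?)
                    on conflict (catid, food) do update set qty = excluded.qty""",
                 (catid, food, qty))

upsert(conn, 1, 'tuna', 2)     # new key: inserts
upsert(conn, 1, 'tuna', 1)     # existing key: updates qty
upsert(conn, 1, 'salmon', 1)   # new key: inserts
rows = conn.execute("select * from pet_food order by rowid").fetchall()
print(rows)                    # [(1, 'tuna', 1), (1, 'salmon', 1)]

# Lookups and deletes must supply every part of the key
tuna = conn.execute("select qty from pet_food where catid = ? and food = ?",
                    (1, 'tuna')).fetchone()
conn.execute("delete from pet_food where catid = ? and food = ?", (1, 'tuna'))
remaining = conn.execute("select * from pet_food").fetchall()
print(remaining)               # [(1, 'salmon', 1)]
```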

## Migrations

FastSQL supports schema migrations to evolve your database over time. Migrations are SQL or Python files stored in a migrations directory, numbered sequentially.

The database tracks the current schema version in a `_meta` table. When you run migrations, only unapplied migrations are executed.

Let’s create a migration to add a color column to our `cat` table:

``` python
# Create migrations directory
mig_dir = Path('cat_migrations')
mig_dir.mkdir(exist_ok=True)

# Create a migration to add a color column
migration_sql = 'alter table cat add column color text default "unknown";'
(mig_dir / '1-add_color_to_cat.sql').write_text(migration_sql)
```

    56

Check the current schema version (will be 0 initially):

``` python

print(f"Current version: {db.version}")

```

    Current version: 0

Run the migration:

``` python

db.migrate('cat_migrations')

```

    Applied migration 1: 1-add_color_to_cat.sql

The database version is now updated, and the table structure reflects the change:

``` python
print(f"New version: {db.version}")
print(f"\nUpdated schema:")
cats = dt.cat
hl_md(cats.schema, 'sql')
```

    New version: 1

    Updated schema:

``` sql
CREATE TABLE cat (
    id INTEGER,
    name VARCHAR,
    weight FLOAT,
    uid INTEGER,
    color TEXT DEFAULT "unknown",
    PRIMARY KEY (id)
)
```

Existing records now have the color field with the default value, and new records can use it too:

``` python
cats.insert({'name': 'Mr. Snuggles', 'weight': 8.5, 'color': 'tuxedo'})
cats()
```

    [Cat(id=1, name='咪咪', weight=9.0, uid=None, color='unknown'),
     Cat(id=2, name='Rex', weight=12.2, uid=None, color='unknown'),
     Cat(id=3, name='Tom', weight=10.2, uid=None, color='unknown'),
     Cat(id=4, name='Mr. Snuggles', weight=8.5, uid=None, color='tuxedo')]

If you run `migrate()` again, it won’t reapply migrations that have already been applied:

``` python

db.migrate('cat_migrations')  # No output - migration already applied

```
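
The version-tracking idea can be illustrated with a small runner built on `sqlite3`. Everything below is a sketch under assumptions: the single-row layout of `_meta`, the `get_version` and `migrate` functions, and the file-name parsing are hypothetical and cover SQL files only; they are not fastsql’s implementation.

```python
import sqlite3, tempfile
from pathlib import Path

def get_version(conn):
    # Assumed layout: a single row in _meta holding the current version
    conn.execute("create table if not exists _meta (id integer primary key, version integer)")
    row = conn.execute("select version from _meta where id = 1").fetchone()
    return row[0] if row else 0

def migrate(conn, mig_dir):
    "Apply, in numeric order, each '<n>-<name>.sql' file whose n exceeds the stored version."
    current = get_version(conn)
    files = sorted(Path(mig_dir).glob("*.sql"), key=lambda p: int(p.name.split("-")[0]))
    for f in files:
        n = int(f.name.split("-")[0])
        if n <= current: continue              # already applied
        conn.executescript(f.read_text())
        conn.execute("insert or replace into _meta (id, version) values (1, ?)", (n,))
        conn.commit()
        print(f"Applied migration {n}: {f.name}")
        current = n

conn = sqlite3.connect(":memory:")
conn.execute("create table cat (id integer primary key, name text)")
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "1-add_color_to_cat.sql").write_text(
        "alter table cat add column color text default 'unknown';")
    migrate(conn, d)   # prints: Applied migration 1: 1-add_color_to_cat.sql
    migrate(conn, d)   # prints nothing: version is already 1
print(get_version(conn))   # 1
```

Sorting by the integer prefix rather than by file name keeps `10-...` after `2-...`.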

Migrations can also be Python scripts. Create a file like `2-update_data.py` that accepts the database connection string as a command line argument to perform more complex data transformations. Python migration scripts must handle their own commits:

``` python
# migrations/2-update_data.py
import sys
from fastsql import database

conn_str = sys.argv[1]
db = database(conn_str)

# Perform complex data transformations
# (assumes an earlier migration added a `priority` column to `cat`)
for cat in db.t.cat():
    if cat.weight > 10:
        db.t.cat.update({'id': cat.id, 'priority': 1})

# Python migrations must commit their own changes
db.conn.commit()
```