https://github.com/frictionlessdata/tableschema-sql-py
Generate SQL tables, load and extract data, based on JSON Table Schema descriptors.
https://github.com/frictionlessdata/tableschema-sql-py
Last synced: 7 months ago
JSON representation
Generate SQL tables, load and extract data, based on JSON Table Schema descriptors.
- Host: GitHub
- URL: https://github.com/frictionlessdata/tableschema-sql-py
- Owner: frictionlessdata
- License: mit
- Created: 2015-08-17T17:50:23.000Z (over 10 years ago)
- Default Branch: main
- Last Pushed: 2023-07-19T17:43:48.000Z (almost 3 years ago)
- Last Synced: 2025-09-09T04:57:58.753Z (8 months ago)
- Language: Python
- Homepage:
- Size: 320 KB
- Stars: 62
- Watchers: 16
- Forks: 17
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- jimsghstars - frictionlessdata/tableschema-sql-py - Generate SQL tables, load and extract data, based on JSON Table Schema descriptors. (Python)
README
# tableschema-sql-py
[](https://travis-ci.org/frictionlessdata/tableschema-sql-py)
[](https://coveralls.io/r/frictionlessdata/tableschema-sql-py?branch=master)
[](https://pypi.python.org/pypi/tableschema-sql)
[](https://github.com/frictionlessdata/tableschema-sql-py)
[](https://gitter.im/frictionlessdata/chat)
Generate and load SQL tables based on [Table Schema](http://specs.frictionlessdata.io/table-schema/) descriptors.
## Features
- implements `tableschema.Storage` interface
- provides additional features like indexes and updating
## Contents
- [Getting Started](#getting-started)
- [Installation](#installation)
- [Documentation](#documentation)
- [API Reference](#api-reference)
- [`Storage`](#storage)
- [Contributing](#contributing)
- [Changelog](#changelog)
## Getting Started
### Installation
The package use semantic versioning. It means that major versions could include breaking changes. It's highly recommended to specify `package` version range in your `setup/requirements` file e.g. `package>=1.0,<2.0`.
```bash
pip install tableschema-sql
```
## Documentation
```python
from datapackage import Package
from tableschema import Table
from sqlalchemy import create_engine
# Create sqlalchemy engine
engine = create_engine('sqlite://')
# Save package to SQL
package = Package('datapackage.json')
package.save(storage='sql', engine=engine)
# Load package from SQL
package = Package(storage='sql', engine=engine)
package.resources
```
## API Reference
### `Storage`
```python
Storage(self, engine, dbschema=None, prefix='', reflect_only=None, autoincrement=None)
```
SQL storage
Package implements
[Tabular Storage](https://github.com/frictionlessdata/tableschema-py#storage)
interface (see full documentation on the link):

> Only additional API is documented
__Arguments__
- __engine (object)__: `sqlalchemy` engine
- __dbschema (str)__: name of database schema
- __prefix (str)__: prefix for all buckets
- __reflect_only (callable)__:
a boolean predicate to filter the list of table names when reflecting
- __autoincrement (str/dict)__:
add autoincrement column at the beginning.
- if a string it's an autoincrement column name
- if a dict it's an autoincrements mapping with column
names indexed by bucket names, for example,
`{'bucket1': 'id', 'bucket2': 'other_id}`
#### `storage.create`
```python
storage.create(self, bucket, descriptor, force=False, indexes_fields=None)
```
Create bucket
__Arguments__
- __indexes_fields (str[])__:
list of tuples containing field names, or list of such lists
#### `storage.write`
```python
storage.write(self, bucket, rows, keyed=False, as_generator=False, update_keys=None, buffer_size=1000, use_bloom_filter=True)
```
Write to bucket
__Arguments__
- __keyed (bool)__:
accept keyed rows
- __as_generator (bool)__:
returns generator to provide writing control to the client
- __update_keys (str[])__:
update instead of inserting if key values match existent rows
- __buffer_size (int=1000)__:
maximum number of rows to try and write to the db in one batch
- __use_bloom_filter (bool=True)__:
should we use a bloom filter to optimize DB update performance
(in exchange for some setup time)
## Contributing
> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
Recommended way to get started is to create and activate a project virtual environment.
To install package and development dependencies into active environment:
```bash
$ make install
```
To run tests with linting and coverage:
```bash
$ make test
```
## Changelog
Here described only breaking and the most important changes. The full changelog and documentation for all released versions could be found in nicely formatted [commit history](https://github.com/frictionlessdata/tableschema-sql-py/commits/master).
#### v1.3
- Implemented constraints loading to a database
#### v1.2
- Add option to configure buffer size, bloom filter use (#77)
#### v1.1
- Added support for the `autoincrement` parameter to be a mapping
- Fixed autoincrement support for SQLite and MySQL
#### v1.0
- Initial driver implementation.