https://github.com/frictionlessdata/tableschema-sql-py
Generate SQL tables, load and extract data, based on JSON Table Schema descriptors.
https://github.com/frictionlessdata/tableschema-sql-py
Last synced: 5 months ago
JSON representation
Generate SQL tables, load and extract data, based on JSON Table Schema descriptors.
- Host: GitHub
- URL: https://github.com/frictionlessdata/tableschema-sql-py
- Owner: frictionlessdata
- License: mit
- Created: 2015-08-17T17:50:23.000Z (over 9 years ago)
- Default Branch: main
- Last Pushed: 2023-07-19T17:43:48.000Z (over 1 year ago)
- Last Synced: 2024-11-06T11:04:26.826Z (5 months ago)
- Language: Python
- Homepage:
- Size: 320 KB
- Stars: 61
- Watchers: 17
- Forks: 19
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- jimsghstars - frictionlessdata/tableschema-sql-py - Generate SQL tables, load and extract data, based on JSON Table Schema descriptors. (Python)
README
# tableschema-sql-py
[](https://travis-ci.org/frictionlessdata/tableschema-sql-py)
[](https://coveralls.io/r/frictionlessdata/tableschema-sql-py?branch=master)
[](https://pypi.python.org/pypi/tableschema-sql)
[](https://github.com/frictionlessdata/tableschema-sql-py)
[](https://gitter.im/frictionlessdata/chat)Generate and load SQL tables based on [Table Schema](http://specs.frictionlessdata.io/table-schema/) descriptors.
## Features
- implements `tableschema.Storage` interface
- provides additional features like indexes and updating## Contents
- [Getting Started](#getting-started)
- [Installation](#installation)
- [Documentation](#documentation)
- [API Reference](#api-reference)
- [`Storage`](#storage)
- [Contributing](#contributing)
- [Changelog](#changelog)## Getting Started
### Installation
The package use semantic versioning. It means that major versions could include breaking changes. It's highly recommended to specify `package` version range in your `setup/requirements` file e.g. `package>=1.0,<2.0`.
```bash
pip install tableschema-sql
```## Documentation
```python
from datapackage import Package
from tableschema import Table
from sqlalchemy import create_engine# Create sqlalchemy engine
engine = create_engine('sqlite://')# Save package to SQL
package = Package('datapackage.json')
package.save(storage='sql', engine=engine)# Load package from SQL
package = Package(storage='sql', engine=engine)
package.resources
```## API Reference
### `Storage`
```python
Storage(self, engine, dbschema=None, prefix='', reflect_only=None, autoincrement=None)
```
SQL storagePackage implements
[Tabular Storage](https://github.com/frictionlessdata/tableschema-py#storage)
interface (see full documentation on the link):
> Only additional API is documented
__Arguments__
- __engine (object)__: `sqlalchemy` engine
- __dbschema (str)__: name of database schema
- __prefix (str)__: prefix for all buckets
- __reflect_only (callable)__:
a boolean predicate to filter the list of table names when reflecting
- __autoincrement (str/dict)__:
add autoincrement column at the beginning.
- if a string it's an autoincrement column name
- if a dict it's an autoincrements mapping with column
names indexed by bucket names, for example,
`{'bucket1': 'id', 'bucket2': 'other_id}`#### `storage.create`
```python
storage.create(self, bucket, descriptor, force=False, indexes_fields=None)
```
Create bucket__Arguments__
- __indexes_fields (str[])__:
list of tuples containing field names, or list of such lists#### `storage.write`
```python
storage.write(self, bucket, rows, keyed=False, as_generator=False, update_keys=None, buffer_size=1000, use_bloom_filter=True)
```
Write to bucket__Arguments__
- __keyed (bool)__:
accept keyed rows
- __as_generator (bool)__:
returns generator to provide writing control to the client
- __update_keys (str[])__:
update instead of inserting if key values match existent rows
- __buffer_size (int=1000)__:
maximum number of rows to try and write to the db in one batch
- __use_bloom_filter (bool=True)__:
should we use a bloom filter to optimize DB update performance
(in exchange for some setup time)## Contributing
> The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
Recommended way to get started is to create and activate a project virtual environment.
To install package and development dependencies into active environment:```bash
$ make install
```To run tests with linting and coverage:
```bash
$ make test
```## Changelog
Here described only breaking and the most important changes. The full changelog and documentation for all released versions could be found in nicely formatted [commit history](https://github.com/frictionlessdata/tableschema-sql-py/commits/master).
#### v1.3
- Implemented constraints loading to a database
#### v1.2
- Add option to configure buffer size, bloom filter use (#77)
#### v1.1
- Added support for the `autoincrement` parameter to be a mapping
- Fixed autoincrement support for SQLite and MySQL#### v1.0
- Initial driver implementation.