https://github.com/catherinedevlin/ddl-generator

Guesses table DDL based on data

Host: GitHub
URL: https://github.com/catherinedevlin/ddl-generator
Owner: catherinedevlin
License: mit
Created: 2014-03-08T17:15:32.000Z (about 11 years ago)
Default Branch: master
Last Pushed: 2022-09-09T08:16:26.000Z (over 2 years ago)
Last Synced: 2025-03-31T08:07:46.634Z (16 days ago)
Language: HTML
Size: 226 KB
Stars: 276
Watchers: 16
Forks: 35
Open Issues: 20
Metadata Files:
- Readme: README.rst
- Changelog: HISTORY.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE

README

=============
DDL Generator
=============

Infers SQL DDL (Data Definition Language) from table data.

Use at command line::

    $ ddlgenerator -i postgresql '[{"Name": "Alfred", "species": "wart hog", "kg": 22}]'

    DROP TABLE generated_table;
    CREATE TABLE generated_table (
        name VARCHAR(6) NOT NULL,
        kg INTEGER NOT NULL,
        species VARCHAR(8) NOT NULL
    )
    ;
    INSERT INTO generated_table (kg, Name, species) VALUES (22, 'Alfred', 'wart hog');

Reads data from files::

    $ ddlgenerator postgresql mydata.yaml > mytable.sql

Enables one-line creation of tables with their data::

    $ ddlgenerator --inserts postgresql mydata.json | psql 

To use in Python::

    >>> from ddlgenerator.ddlgenerator import Table
    >>> table = Table([{"Name": "Alfred", "species": "wart hog", "kg": 22}])
    >>> sql = table.sql('postgresql', inserts=True)

Supported data formats
----------------------

- Pure Python

- YAML

- JSON

- CSV

- Pickle

- HTML

Features
--------

- Supports all SQL dialects supported by SQLAlchemy

- Coerces data into the most specific data type that is valid for all of a column's values

- Takes table name from file name

- Guesses format of input data if unspecified by file extension

- With the ``-i``/``--inserts`` flag, adds INSERT statements

- With the ``-u``/``--uniques`` flag, surmises UNIQUE constraints from data

- Handles nested data, creating child tables as needed

- Reads HTML tables, including those embedded in noisy websites
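
The type-coercion feature listed above can be pictured with a small, self-contained sketch. This is not ``ddlgenerator``'s actual implementation: the helper names (``infer_column_type``, ``infer_columns``), the set of candidate types, and their ordering are all assumptions chosen for illustration. The idea is to try candidate types from most to least specific and keep the first one that every non-null value in the column satisfies, falling back to a ``VARCHAR`` sized to the longest value.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def _is_int(value):
    try:
        int(str(value))
        return True
    except ValueError:
        return False


def _is_decimal(value):
    try:
        return Decimal(str(value)).is_finite()
    except InvalidOperation:
        return False


def _is_date(value):
    try:
        datetime.strptime(str(value), "%Y-%m-%d")
        return True
    except ValueError:
        return False


# Candidate types, tried from most to least specific (an assumed ordering,
# not taken from ddlgenerator itself).
CHECKS = [("INTEGER", _is_int), ("DECIMAL", _is_decimal), ("DATE", _is_date)]


def infer_column_type(values):
    """Return the most specific SQL type that fits every non-null value."""
    present = [v for v in values if v is not None]
    sql_type = None
    if present:
        for name, check in CHECKS:
            if all(check(v) for v in present):
                sql_type = name
                break
    if sql_type is None:
        # Fall back to a VARCHAR just wide enough for the longest value.
        width = max((len(str(v)) for v in present), default=1)
        sql_type = f"VARCHAR({width})"
    if len(present) == len(values):
        sql_type += " NOT NULL"
    return sql_type


def infer_columns(rows):
    """Map each key seen in a list of dicts to an inferred column type."""
    keys = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)
    return {key: infer_column_type([row.get(key) for row in rows]) for key in keys}


rows = [{"Name": "Alfred", "species": "wart hog", "kg": 22}]
print(infer_columns(rows))
```

For the single-row example used throughout this README, the sketch arrives at the same column types as the generated DDL shown earlier (``VARCHAR(6)``, ``VARCHAR(8)`` and ``INTEGER``, all ``NOT NULL``).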

Options
-------

::

      -h, --help            show this help message and exit
      -k KEY, --key KEY     Field to use as primary key
      -r, --reorder         Reorder fields alphabetically, ``key`` first
      -u, --uniques         Include UNIQUE constraints where data is unique
      -t, --text            Use variable-length TEXT columns instead of VARCHAR
      -d, --drops           Include DROP TABLE statements
      -i, --inserts         Include INSERT statements
      --no-creates          Do not include CREATE TABLE statements
      --save-metadata-to FILENAME
                            Save table definition in FILENAME for later
                            --use-saved-metadata run
      --use-metadata-from FILENAME
                            Use metadata saved in FROM for table definition, do
                            not re-analyze table structure
      -l LOG, --log LOG     log level (CRITICAL, FATAL, ERROR, DEBUG, INFO, WARN)
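
The ``-u``/``--uniques`` option includes UNIQUE constraints where the data is unique. As a rough illustration of what that means, the sketch below reports the columns whose non-null values never repeat in a sample. The function name ``unique_columns`` and the sample rows are hypothetical, and this is not how ``ddlgenerator`` is necessarily implemented. Keep in mind that uniqueness in a sample does not guarantee uniqueness in later data, so such constraints are a guess.

```python
def unique_columns(rows):
    """Return the column names whose non-null values never repeat.

    Assumes every row is a dict of hashable scalar values.
    """
    keys = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)

    result = []
    for key in keys:
        values = [row[key] for row in rows if row.get(key) is not None]
        # A column qualifies only if it has data and no value occurs twice.
        if values and len(set(values)) == len(values):
            result.append(key)
    return result


# Hypothetical sample data: names are distinct, species repeat.
rows = [
    {"name": "Alfred", "species": "wart hog"},
    {"name": "Gertrude", "species": "polar bear"},
    {"name": "Emily", "species": "wart hog"},
]
for column in unique_columns(rows):
    print(f"UNIQUE ({column})")
```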

Generate SQLAlchemy models
--------------------------

Pass ``sqlalchemy`` in place of the SQL dialect to generate Python code that
defines SQLAlchemy models::

    $ ddlgenerator sqlalchemy '[{"Name": "Alfred", "species": "wart hog", "kg": 22}]'

    Table0 = Table('Table0', metadata,
      Column('species', Unicode(length=8), nullable=False),
      Column('kg', Integer(), nullable=False),
      Column('name', Unicode(length=6), nullable=False),
      schema=None)

Generate Django models
----------------------

If Django is installed on the path, passing ``django`` in place of the SQL dialect
runs the generated DDL through Django's ``inspectdb`` management command to
produce a model file::

    $ ddlgenerator django '[{"Name": "Alfred", "species": "wart hog", "kg": 22}]'

    # This is an auto-generated Django model module.
    # You'll have to do the following manually to clean this up:
    #   * Rearrange models' order
    #   * Make sure each model has one field with primary_key=True
    #   * Remove `managed = False` lines if you wish to allow Django to create and delete the table
    # Feel free to rename the models, but don't rename db_table values or field names.
    #
    # Also note: You'll have to insert the output of 'django-admin.py sqlcustom [appname]'
    # into your database.
    from __future__ import unicode_literals

    from django.db import models

    class Table0(models.Model):
        species = models.CharField(max_length=8)
        kg = models.IntegerField()
        name = models.CharField(max_length=6)
        class Meta:
            managed = False
            db_table = 'Table0'

Large tables
------------

``ddlgenerator`` is currently not well suited to tables whose size approaches
your system's available memory.

One way to save time and memory with large tables is to split your input data
into multiple files, then run ``ddlgenerator`` with ``--save-metadata-to``
against a small but representative sample. Then run with ``--no-creates`` and
``--use-metadata-from`` to generate INSERTs from the remaining files without
re-determining the column types each time.
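
The two-pass workflow can be sketched as a small Python helper that only builds the command lines; it does not run them. The helper name ``build_commands``, the file names, and the metadata file name ``table_metadata.yaml`` are hypothetical. The option names come from the Options section, and placing options before the dialect and file follows the command-line examples above. Adding ``--inserts`` on the later runs is an assumption based on the stated goal of generating INSERTs.

```python
import shlex


def build_commands(dialect, sample_file, remaining_files,
                   metadata_file="table_metadata.yaml"):
    """Build argv lists for a two-pass ddlgenerator run over split input files.

    Pass 1 analyzes a small sample and saves the table definition.
    Pass 2 reuses that definition for each remaining file and emits only INSERTs.
    """
    first_pass = ["ddlgenerator", "--save-metadata-to", metadata_file,
                  dialect, sample_file]
    later_passes = [
        ["ddlgenerator", "--no-creates", "--inserts",
         "--use-metadata-from", metadata_file, dialect, path]
        for path in remaining_files
    ]
    return [first_pass] + later_passes


# Hypothetical file names, printed as shell-ready command lines.
for argv in build_commands("postgresql", "sample.csv", ["part2.csv", "part3.csv"]):
    print(shlex.join(argv))
```

Each printed line can be redirected to a ``.sql`` file or piped to a database client, as in the one-line examples near the top of this README.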

Installing
----------

Requires Python 3.

From PyPI::

    pip3 install ddlgenerator

From source::

    git clone https://github.com/catherinedevlin/ddl-generator.git
    cd ddl-generator
    pip3 install .

Alternatives
------------

* csvkit.csvsql

* ``pandas.read_*`` methods

* prequel, for SQLite

Credits
-------

- Mike Bayer for sqlalchemy

- coldfix and Mark Ransom for their StackOverflow answers

- Audrey Roy for cookiecutter

- Brandon Lorenz for Django model generation
