{"id":13586006,"url":"https://github.com/simonw/sqlite-transform","last_synced_at":"2025-04-07T14:33:08.644Z","repository":{"id":57470599,"uuid":"219372133","full_name":"simonw/sqlite-transform","owner":"simonw","description":"Tool for running transformations on columns in a SQLite database","archived":true,"fork":false,"pushed_at":"2021-08-02T22:07:57.000Z","size":64,"stargazers_count":30,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-10-30T02:38:18.248Z","etag":null,"topics":["datasette-io","datasette-tool","sqlite"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simonw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-03T22:07:53.000Z","updated_at":"2023-01-28T07:04:45.000Z","dependencies_parsed_at":"2022-09-26T17:40:34.329Z","dependency_job_id":null,"html_url":"https://github.com/simonw/sqlite-transform","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsqlite-transform","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsqlite-transform/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsqlite-transform/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsqlite-transform/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/simonw","download_url":"https://codeload.github.com/simonw/sqlite-transform/tar.gz/refs/heads/main","
host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223284964,"owners_count":17119813,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datasette-io","datasette-tool","sqlite"],"created_at":"2024-08-01T15:05:16.127Z","updated_at":"2024-11-06T04:30:30.495Z","avatar_url":"https://github.com/simonw.png","language":"Python","readme":"# sqlite-transform\n\n![No longer maintained](https://img.shields.io/badge/no%20longer-maintained-red)\n[![PyPI](https://img.shields.io/pypi/v/sqlite-transform.svg)](https://pypi.org/project/sqlite-transform/)\n[![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-transform?include_prereleases\u0026label=changelog)](https://github.com/simonw/sqlite-transform/releases)\n[![Tests](https://github.com/simonw/sqlite-transform/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-transform/actions?query=workflow%3ATest)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/sqlite-transform/blob/main/LICENSE)\n\nTool for running transformations on columns in a SQLite database.\n\n\u003e **:warning: This tool is no longer maintained**\n\u003e\n\u003e I added a new tool to [sqlite-utils](https://sqlite-utils.datasette.io/) called [sqlite-utils convert](https://sqlite-utils.datasette.io/en/stable/cli.html#converting-data-in-columns) which provides a superset of the functionality originally provided here. 
`sqlite-transform` is no longer maintained, and I recommend switching to `sqlite-utils convert` instead.\n\n## How to install\n\n    pip install sqlite-transform\n\n## parsedate and parsedatetime\n\nThese subcommands will run all values in the specified column through `dateutil.parser.parse()` and replace them with the result, formatted as an ISO timestamp or ISO date.\n\nFor example, if a row in the database has an `opened` column which contains `10/10/2019 08:10:00 PM`, running the following command:\n\n    sqlite-transform parsedatetime my.db mytable opened\n\nwill result in that value being replaced by `2019-10-10T20:10:00`.\n\nUsing the `parsedate` subcommand here would result in `2019-10-10` instead.\n\nIn the case of ambiguous dates such as `03/04/05` these commands both default to assuming American-style `mm/dd/yy` format. You can pass `--dayfirst` to specify that the day should be assumed to be first, or `--yearfirst` for the year.\n\n## jsonsplit\n\nThe `jsonsplit` subcommand takes columns that contain a comma-separated list, for example a `tags` column containing records like `\"trees,park,dogs\"` and converts it into a JSON array `[\"trees\", \"park\", \"dogs\"]`.\n\nThis is useful for taking advantage of Datasette's [Facet by JSON array](https://docs.datasette.io/en/stable/facets.html#facet-by-json-array) feature.\n\n    sqlite-transform jsonsplit my.db mytable tags\n\nIt defaults to splitting on commas, but you can specify a different delimiter character using the `--delimiter` option, for example:\n\n    sqlite-transform jsonsplit \\\n        my.db mytable tags --delimiter ';'\n\nValues within the array will be treated as strings, so a column containing `123,552,775` will be converted into the JSON array `[\"123\", \"552\", \"775\"]`.\n\nYou can specify a different type for these values using `--type int` or `--type float`, for example:\n\n    sqlite-transform jsonsplit \\\n        my.db mytable tags --type int\n\nThis will result in that 
column being converted into `[123, 552, 775]`.\n\n## lambda for executing your own code\n\nThe `lambda` subcommand lets you specify Python code which will be executed against the column.\n\nHere's how to convert a column to uppercase:\n\n    sqlite-transform lambda my.db mytable mycolumn --code='str(value).upper()'\n\nThe code you provide will be compiled into a function that takes `value` as a single argument. You can break your function body into multiple lines, provided the last line is a `return` statement:\n\n    sqlite-transform lambda my.db mytable mycolumn --code='value = str(value)\n    return value.upper()'\n\nYou can also specify Python modules that should be imported and made available to your code using one or more `--import` options:\n\n    sqlite-transform lambda my.db mytable mycolumn \\\n        --code='\"\\n\".join(textwrap.wrap(value, 10))' \\\n        --import=textwrap\n\nThe `--dry-run` option will output a preview of the transformation against the first ten rows, without modifying the database.\n\n## Saving the result to a separate column\n\nEach of these commands accepts optional `--output` and `--output-type` options. These can be used to save the result of the transformation to a separate column, which will be created if the column does not already exist.\n\nTo save the result of `jsonsplit` to a new column called `json_tags`, use the following:\n\n    sqlite-transform jsonsplit my.db mytable tags \\\n      --output json_tags\n\nThe type of the created column defaults to `text`, but a different column type can be specified using `--output-type`. 
This example will create a new floating point column called `float_id` with a copy of each item's ID increased by 0.5:\n\n    sqlite-transform lambda my.db mytable id \\\n      --code 'float(value) + 0.5' \\\n      --output float_id \\\n      --output-type float\n\nYou can drop the original column at the end of the operation by adding `--drop`.\n\n## Splitting a column into multiple columns\n\nSometimes you may wish to convert a single column into multiple derived columns. For example, you may have a `location` column containing `latitude,longitude` values which you wish to split out into separate `latitude` and `longitude` columns.\n\nYou can achieve this using the `--multi` option to `sqlite-transform lambda`. This option expects your `--code` function to return a Python dictionary: new columns will be created and populated for each of the keys in that dictionary.\n\nFor the `latitude,longitude` example you would use the following:\n\n    sqlite-transform lambda demo.db places location \\\n      --code 'return {\n        \"latitude\": float(value.split(\",\")[0]),\n        \"longitude\": float(value.split(\",\")[1]),\n      }' --multi\n\nThe type of the returned values will be taken into account when creating the new columns. In this example, the resulting database schema will look like this:\n\n```sql\nCREATE TABLE [places] (\n    [location] TEXT,\n    [latitude] FLOAT,\n    [longitude] FLOAT\n);\n```\n\nThe code function can also return `None`, in which case its output will be ignored.\n\nYou can drop the original column at the end of the operation by adding `--drop`.\n\n## Disabling the progress bar\n\nBy default each command will show a progress bar. 
Pass `-s` or `--silent` to hide that progress bar.\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonw%2Fsqlite-transform","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimonw%2Fsqlite-transform","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonw%2Fsqlite-transform/lists"}