{"id":13557510,"url":"https://github.com/etalab/csvapi","last_synced_at":"2025-04-14T19:41:58.391Z","repository":{"id":36936111,"uuid":"129385788","full_name":"etalab/csvapi","owner":"etalab","description":"An instant JSON API for your CSV","archived":false,"fork":false,"pushed_at":"2023-07-25T16:23:15.000Z","size":1867,"stargazers_count":31,"open_issues_count":66,"forks_count":7,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-28T08:03:48.578Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/etalab.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-04-13T10:15:51.000Z","updated_at":"2025-03-22T10:30:24.000Z","dependencies_parsed_at":"2024-01-19T18:02:53.016Z","dependency_job_id":null,"html_url":"https://github.com/etalab/csvapi","commit_stats":null,"previous_names":[],"tags_count":24,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/etalab%2Fcsvapi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/etalab%2Fcsvapi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/etalab%2Fcsvapi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/etalab%2Fcsvapi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/etalab","download_url":"https://codeload.github.com/etalab/csvapi/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":2489
48715,"owners_count":21187922,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T12:04:23.585Z","updated_at":"2025-04-14T19:41:58.359Z","avatar_url":"https://github.com/etalab.png","language":"Python","readme":"# csvapi\n\n\"Instantly\" publish an API for a CSV hosted anywhere on the internet. Also supports Excel files.\n\nThis tool is used by [data.gouv.fr](https://www.data.gouv.fr) to show a preview of hosted CSV and XLS files.\n\n## Installation\n\nRequires Python 3.9+ and a Unix OS with the `file` command available.\n\n```shell\npython3 -m venv pyenv \u0026\u0026 . pyenv/bin/activate\npip install csvapi\n```\n\nFor development:\n\n```shell\npoetry install\n```\n\n## Quickstart\n\n```shell\npoetry run csvapi serve -h 0.0.0.0 -p 8000\n```\n\n## Command line options\n\n```shell\n$ poetry run csvapi serve --help\nUsage: csvapi serve [OPTIONS]\n\nOptions:\n    --ssl-key TEXT             Path to SSL key\n    --ssl-cert TEXT            Path to SSL certificate\n    --cache / --no-cache       Do not parse CSV again if DB already exists\n    --reload                   Automatically reload if code change detected\n    --debug                    Enable debug mode - useful for development\n    -p, --port INTEGER         port for server, defaults to 8001\n    -h, --host TEXT            host for server, defaults to 127.0.0.1\n    --dbs DIRECTORY            Where to store sqlite DBs\n    --help                     Show this message and exit.\n```\n\n## Deploy\n\nWith SSL, using [Hypercorn](https://pgjones.gitlab.io/hypercorn/):\n\n```shell\nhypercorn 
csvapi.webservice:app -b 0.0.0.0:443 --keyfile key.pem --ca-certs cert.pem\n```\n\nSee [the documentation](https://pgjones.gitlab.io/hypercorn/usage.html) for more options.\n\nYou can use the environment variable `CSVAPI_CONFIG_FILE` to point to a custom configuration file.\n\n## API usage\n\n### Conversion\n\n`/apify?url=http://somewhere.com/a/file.csv`\n\nThis converts a CSV to an SQLite database (using `agate`) and returns the following response:\n\n```json\n{\"ok\": true, \"endpoint\": \"http://localhost:8001/api/cde857960e8dc24c9cbcced673b496bb\"}\n```\n\n### Parameters\n\nSome parameters can be used in the query string.\n\n#### `encoding`\n\n**default**: _automatic detection_\n\nYou can force an encoding (e.g. `utf-8`) using this parameter, instead of relying on the automatic detection.\n\n\n### Data API\n\nThis is the `endpoint` attribute of the previous response.\n\n`/api/\u003cmd5-url-hash\u003e`\n\nThis queries a previously converted file and returns the first 100 rows like this:\n\n```json\n    {\n        \"ok\": true,\n        \"rows\": [[], []],\n        \"columns\": [],\n        \"query_ms\": 1\n    }\n```\n\n### Parameters\n\nSome parameters can be used in the query string.\n\n#### `_size`\n\n**default**: `100`\n\nThis limits the query to a given number of rows. For instance, to get only 250 rows:\n\n`/api/\u003cmd5-url-hash\u003e?_size=250`\n\n#### `_sort` and `_sort_desc`\n\nUse these to sort by a column: `_sort` sorts in ascending order, `_sort_desc` in descending order.\n\n`/api/\u003cmd5-url-hash\u003e?_sort=\u003ccolumn-name\u003e`\n\n#### `_offset`\n\nUse this to add an offset. Combined with `_size`, it allows pagination.\n\n`/api/\u003cmd5-url-hash\u003e?_size=1\u0026_offset=1`\n\n#### `_shape`\n\n**default**: `lists`\n\nThe `_shape` argument is used to specify the output format of the JSON. 
It can take the value `objects` to get an array of objects instead of an array of arrays:\n\n`/api/\u003cmd5-url-hash\u003e?_shape=objects`\n\nFor instance, instead of returning:\n\n```json\n{\n    \"ok\": true,\n    \"query_ms\": 0.4799365997,\n    \"rows\": [\n        [1, \"Justice\", \"0101\", 57663310],\n        [2, \"Justice\", \"0101\", 2255129],\n        [3, \"Justice\", \"0101\", 36290]\n    ],\n    \"columns\": [\"rowid\", \"Mission\", \"Programme\", \"Consommation de CP\"]\n}\n```\n\nIt will return:\n\n```json\n{\n    \"ok\": true,\n    \"query_ms\": 2.681016922,\n    \"rows\": [\n    {\n        \"rowid\": 1,\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 57663310\n    },\n    {\n        \"rowid\": 2,\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 2255129\n    },\n    {\n        \"rowid\": 3,\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 36290\n    }],\n    \"columns\": [\"rowid\", \"Mission\", \"Programme\", \"Consommation de CP\"]\n}\n```\n\n#### `_rowid`\n\n**default**: `show`\n\nThe `_rowid` argument is used to display or hide rowids in the returned data. 
Use `_rowid=hide` to hide.\n\n`/api/\u003cmd5-url-hash\u003e?_shape=objects\u0026_rowid=hide`\n\n```json\n{\n    \"ok\": true,\n    \"query_ms\": 2.681016922,\n    \"rows\": [\n    {\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 57663310\n    },\n    {\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 2255129\n    },\n    {\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 36290\n    }],\n    \"columns\": [\"Mission\", \"Programme\", \"Consommation de CP\"]\n}\n```\n\n#### `_total`\n\n**default**: `show`\n\nThe `_total` argument is used to display or hide the total number of rows (independent of pagination) in the returned data. Use `_total=hide` to hide.\n\n```json\n{\n    \"ok\": true,\n    \"query_ms\": 2.681016922,\n    \"rows\": [\n    {\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 57663310\n    },\n    {\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 2255129\n    },\n    {\n        \"Mission\": \"Justice\",\n        \"Programme\": \"0101\",\n        \"Consommation de CP\": 36290\n    }],\n    \"columns\": [\"Mission\", \"Programme\", \"Consommation de CP\"],\n    \"total\": 3\n}\n```\n\n#### Column based filters\n\nBy adding `{column}__{comparator}={value}` to the query string, you can filter the results based on the following criteria:\n- `{column}` must be a valid column in your CSV\n- `{comparator}` is `exact` (SQL `= {value}`) or `contains` (SQL `LIKE %{value}%`)\n- `{value}` is the value you're filtering the column against\n\nYou can add multiple filters; they are joined with an `AND` at the SQL level.\n\n## Credits\n\nInspired by the excellent 
[Datasette](https://github.com/simonw/datasette).\n","funding_links":[],"categories":["Python","API","others"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fetalab%2Fcsvapi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fetalab%2Fcsvapi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fetalab%2Fcsvapi/lists"}