{"id":17361112,"url":"https://github.com/pacman82/odbc2parquet","last_synced_at":"2025-04-14T08:53:44.772Z","repository":{"id":37234785,"uuid":"290899681","full_name":"pacman82/odbc2parquet","owner":"pacman82","description":"A command line tool to query an ODBC data source and write the result into a parquet file.","archived":false,"fork":false,"pushed_at":"2025-04-02T16:16:12.000Z","size":1352,"stargazers_count":235,"open_issues_count":4,"forks_count":20,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-04-07T01:09:23.776Z","etag":null,"topics":["odbc","parquet"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pacman82.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"Contributing.md","funding":".github/FUNDING.yml","license":"License","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["pacman82"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":null}},"created_at":"2020-08-27T23:01:01.000Z","updated_at":"2025-04-05T17:24:24.000Z","dependencies_parsed_at":"2023-10-25T18:34:57.702Z","dependency_job_id":"ef280521-0b0b-4170-906d-61fe559c4f65","html_url":"https://github.com/pacman82/odbc2parquet","commit_stats":{"total_commits":1072,"total_committers":10,"mean_commits":107.2,"dds":"0.49626865671641796","last_synced_commit":"1c9df415bb6f6bd6f9372343f1532aad1ee662b5"},"previous_names":[],"tags_count":123,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacman82%2Fodbc
2parquet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacman82%2Fodbc2parquet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacman82%2Fodbc2parquet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacman82%2Fodbc2parquet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pacman82","download_url":"https://codeload.github.com/pacman82/odbc2parquet/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248852108,"owners_count":21171839,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["odbc","parquet"],"created_at":"2024-10-15T19:30:52.309Z","updated_at":"2025-04-14T08:53:44.744Z","avatar_url":"https://github.com/pacman82.png","language":"Rust","funding_links":["https://github.com/sponsors/pacman82"],"categories":["Tools"],"sub_categories":["Command-line"],"readme":"# ODBC to Parquet\n\n[![Licence](https://img.shields.io/crates/l/odbc2parquet)](https://github.com/pacman82/odbc2parquet/blob/master/License)\n[![Crates.io](https://img.shields.io/crates/v/odbc2parquet)](https://crates.io/crates/odbc2parquet)\n\nA command line tool to query an ODBC data source and write the result into a parquet file.\n\n* Small memory footprint. Only holds one batch at a time in memory.\n* Fast. Makes efficient use of ODBC bulk reads, to lower IO overhead.\n* Flexible. Query any ODBC data source you have a driver for. 
MySQL, MS SQL, Excel, ...\n\n## Mapping of types in queries\n\nThe tool queries the ODBC data source for type information and maps each ODBC SQL type to a Parquet type as follows:\n\n| ODBC SQL Type              | Parquet Type                 |\n|----------------------------|------------------------------|\n| Decimal(p \u003c 39, s)         | Decimal(p,s)                 |\n| Numeric(p \u003c 39, s)         | Decimal(p,s)                 |\n| Bit                        | Boolean                      |\n| Double                     | Double                       |\n| Real                       | Float                        |\n| Float(p: 0..24)            | Float                        |\n| Float(p \u003e= 25)             | Double                       |\n| Tiny Integer Signed        | Int8 Signed                  |\n| Tiny Integer Unsigned      | Int8 Unsigned                |\n| Small Integer              | Int16                        |\n| Integer                    | Int32                        |\n| Big Int                    | Int64                        |\n| Date                       | Date                         |\n| Time(p: 0..3)*             | Time Milliseconds            |\n| Time(p: 4..6)*             | Time Microseconds            |\n| Time(p: 7..9)*             | Time Nanoseconds             |\n| Timestamp(p: 0..3)         | Timestamp Milliseconds       |\n| Timestamp(p: 4..6)         | Timestamp Microseconds       |\n| Timestamp(p \u003e= 7)          | Timestamp Nanoseconds        |\n| Datetimeoffset(p: 0..3)    | Timestamp Milliseconds (UTC) |\n| Datetimeoffset(p: 4..6)    | Timestamp Microseconds (UTC) |\n| Datetimeoffset(p \u003e= 7)     | Timestamp Nanoseconds (UTC)  |\n| Varbinary                  | Byte Array                   |\n| Long Varbinary             | Byte Array                   |\n| Binary                     | Fixed Length Byte Array      |\n| All others                 | Utf8 Byte Array              |\n\n`p` is short for `precision`. 
`s` is short for `scale`. Intervals are inclusive.\n* Time is only supported for Microsoft SQL Server\n\n## Installation\n\n### Prerequisites\n\nTo work with this tool you need an ODBC driver manager and an ODBC driver for the data source you want to access.\n\n#### Windows\n\nAn ODBC driver manager is preinstalled on Windows. So are the `ODBC data sources (64Bit)` and `ODBC data sources (32Bit)` apps, which you can use to discover which drivers are already available on your system.\n\n#### Linux\n\nThis tool links both at runtime and during build against `libodbc.so`. To get it you should install [unixODBC](http://www.unixodbc.org/). You can do this using your system's package manager. On *Ubuntu*, run:\n\n```shell\nsudo apt install unixodbc-dev\n```\n\n#### OS-X\n\nThis tool links both at runtime and during build against `libodbc.so`. To get it you should install [unixODBC](http://www.unixodbc.org/). To install it, I recommend the [Homebrew](https://brew.sh/) package manager, which lets you install it using:\n\n```shell\nbrew install unixodbc\n```\n\n### Via scoop package manager\n\nIf you have the [scoop package manager](https://scoop.sh) installed (Windows only), you can install this tool with:\n\n```shell\nscoop install odbc2parquet\n```\n\n### Download binary from GitHub\n\n\u003chttps://github.com/pacman82/odbc2parquet/releases/latest\u003e\n\n*Note*: Download the 32 Bit version if you want to connect to data sources using 32 Bit drivers and download the 64 Bit version if you want to connect via 64 Bit drivers. 
It won't work vice versa.\n\n### Via Cargo\n\nIf you have a Rust toolchain installed, you can install this tool via cargo. You can install `cargo` from \u003chttps://rustup.rs/\u003e.\n\n```shell\ncargo install odbc2parquet\n```\n\n### Build in docker `from scratch`\n\n```dockerfile\nFROM rust:alpine AS builder\n\nRUN apk add --no-cache musl-dev unixodbc-static\n# In addition to unixodbc you also want to install the database drivers you need and `COPY` them over to the `runner`\n\nWORKDIR /src/odbc2parquet\nCOPY . .\nRUN cargo build --release\n\nFROM scratch AS runner\n\nCOPY --from=builder /src/odbc2parquet/target/release/odbc2parquet /usr/local/bin/\n\nENTRYPOINT [\"/usr/local/bin/odbc2parquet\"]\n```\n\n## Usage\n\nUse `odbc2parquet --help` to see all commands.\n\n### Query\n\nUse `odbc2parquet help query` to see all options related to fetching data.\n\n#### Query using connection string\n\n```bash\nodbc2parquet query \\\n--connection-string \"Driver={ODBC Driver 18 for SQL Server};Server=localhost;UID=SA;PWD=\u003cYourStrong@Passw0rd\u003e;TrustServerCertificate=yes;\" \\\nout.par \\\n\"SELECT * FROM Birthdays\"\n```\n\n#### Query using data source name\n\n```bash\nodbc2parquet query \\\n--dsn my_db \\\n--password \"\u003cYourStrong@Passw0rd\u003e\" \\\n--user \"SA\" \\\nout.par \\\n\"SELECT * FROM Birthdays\"\n```\n\n#### Use parameters in query\n\n```shell\nodbc2parquet query \\\n--connection-string \"Driver={ODBC Driver 18 for SQL Server};Server=localhost;UID=SA;PWD=\u003cYourStrong@Passw0rd\u003e;TrustServerCertificate=yes;\" \\\nout.par \\\n\"SELECT * FROM Birthdays WHERE year \u003e ? 
and year \u003c ?\" \\\n1990 2010\n```\n\n### List available ODBC drivers\n\n```bash\nodbc2parquet list-drivers\n```\n\n### List available ODBC data sources\n\n```bash\nodbc2parquet list-data-sources\n```\n\n### Inserting data into a database\n\n```shell\nodbc2parquet insert \\\n--connection-string \"Driver={ODBC Driver 18 for SQL Server};Server=localhost;UID=SA;PWD=\u003cYourStrong@Passw0rd\u003e;TrustServerCertificate=yes;\" \\\ninput.par \\\nMyTable\n```\n\nUse `odbc2parquet help insert` to see all options related to inserting data.\n\n### Shell Completions\n\nThe `completions` subcommand generates completions for various shells. For example, this is how you might add shell completions for PowerShell:\n\n```powershell\nif (!(Test-Path -Path $PROFILE)) {\n  New-Item -ItemType File -Path $PROFILE -Force\n}\nAdd-Content -Path $PROFILE -Value '(\u0026 odbc2parquet completions powershell) | Out-String | Invoke-Expression'\n```\n\n## Links\n\nThanks to @samaguire, there is a script for PowerShell users that helps you download a bunch of tables to a folder: \u003chttps://github.com/samaguire/odbc2parquet-PSscripts\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpacman82%2Fodbc2parquet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpacman82%2Fodbc2parquet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpacman82%2Fodbc2parquet/lists"}