{"id":27933452,"url":"https://github.com/timescale/outflux","last_synced_at":"2025-05-07T04:58:50.296Z","repository":{"id":46881439,"uuid":"162480401","full_name":"timescale/outflux","owner":"timescale","description":"Export data from InfluxDB to TimescaleDB","archived":false,"fork":false,"pushed_at":"2023-10-11T09:05:04.000Z","size":328,"stargazers_count":94,"open_issues_count":11,"forks_count":24,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-05-07T04:58:44.847Z","etag":null,"topics":["influx","migration","time-series","timescaledb"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/timescale.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-12-19T19:15:05.000Z","updated_at":"2025-05-02T01:02:29.000Z","dependencies_parsed_at":"2024-06-18T22:39:58.280Z","dependency_job_id":"3e7651d4-9fda-4476-89bd-9daa08b393d0","html_url":"https://github.com/timescale/outflux","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timescale%2Foutflux","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timescale%2Foutflux/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timescale%2Foutflux/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timescale%2Foutflux/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/timescale","download_url":
"https://codeload.github.com/timescale/outflux/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252816948,"owners_count":21808704,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["influx","migration","time-series","timescaledb"],"created_at":"2025-05-07T04:58:49.777Z","updated_at":"2025-05-07T04:58:50.289Z","avatar_url":"https://github.com/timescale.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Outflux - Migrate InfluxDB to TimescaleDB\n[![Go Report Card](https://goreportcard.com/badge/github.com/timescale/outflux)](https://goreportcard.com/report/github.com/timescale/outflux)\n\n\nThis repo contains code for exporting complete InfluxDB databases or selected measurements to TimescaleDB.\n\n## Table of Contents\n\n1. [Installation](#installation)\n  - [Installing from source](#installing-from-source)\n  - [Binary releases](#binary-releases)\n2. [How to use](#how-to-use)\n  - [Before using it](#before-using-it)\n  - [Connection params](#connection-params)\n  - [Schema Transfer](#schema-transfer)\n  - [Migrate](#migrate)\n  - [Examples](#examples)\n3. [Connection](#connection)\n  - [TimescaleDB connection params](#timescaledb-connection-params)\n  - [InfluxDB connection params](#influxdb-connection-params)\n4. [Known limitations](#known-limitations)\n\n## Installation\n\n### Binary releases\n\nWe provide binaries for GNU/Linux, Windows and MacOS with each release, these\ncan be found under [releases]. 
To use outflux, download the binary, extract the\ncompressed tarball and run the executable.\n\n```bash\nwget https://github.com/timescale/outflux/releases/download/v0.3.0/outflux_0.3.0_Linux_x86_64.tar.gz\ntar xf outflux_0.3.0_Linux_x86_64.tar.gz\n./outflux --help\n```\n\n[releases]: https://github.com/timescale/outflux/releases\n\n### Installing from source\n\nOutflux is a Go project managed by `go modules`. You can download it \nin any directory and on the first build it will download its required dependencies.\n\nDepending on where you downloaded it and the Go version you're using, you may \n need to set the `GO111MODULE` to `auto`, `on` or `off`. Learn about the `GO111MODULE` \n environment variable [here](https://golang.org/cmd/go/#hdr-Module_support).\n\n```bash\n# Fetch the source code of Outflux in any directory\n$ git clone git@github.com:timescale/outflux.git\n$ cd ./outflux\n\n# Install the Outflux binary (will automatically detect and download\n# dependencies).\n$ cd cmd/outflux\n$ GO111MODULE=auto go install\n\n# Building without installing will also fetch the required dependencies\n$ GO111MODULE=auto go build ./... \n```\n\n## How to use\n\nOutflux supports InfluxDB 1.x.\n\nOutflux should support using the 1.x query APIs for InfluxDB 2.x and 3.x. You\nwill need to enable the 1.x APIs to use them. Consult the InfluxDB\ndocumentation for more details.\n\n### Before using it\n\nIt is recommended that you have an InfluxDB database with some data. For\ntesting purposes you can check out the [TSBS Data Loader Tool], which is part\nof the Time Series Benchmark Suite. It can generate large amounts of data to\nload into InfluxDB. 
Data can be generated with [one command], just specify the\nformat as 'influx', and then load it in with [another command].\n\n[TSBS Data Loader Tool]: https://github.com/timescale/tsbs\n[one command]: https://github.com/timescale/tsbs#data-generation\n[another command]: https://github.com/timescale/tsbs#data-generation\n\n### Connection params\n\nDetailed information about how to pass the connection parameters to Outflux can be found in the [Connection](#connection) section at the bottom of this document.\n\n### Schema Transfer\n\nThe Outflux CLI has two commands. The first one is `schema-transfer`. This\ncommand discovers the schema of an InfluxDB database, or specific measurements\nin an InfluxDB database, and (depending on the strategy selected) creates or\nverifies a TimescaleDB database that could hold the data.\n\nThe possible flags for the command can be seen by running: \n\n```bash\n$ cd $GOPATH/bin/\n$ ./outflux schema-transfer --help\n```\n\nUsage is `outflux schema-transfer database [measure1 measure2 ...] [flags]`,\nwhere `database` is the name of the InfluxDB database you wish to export,\n`[measure1 ...] ` are optional and if specified will export only those\nmeasurements from the selected database. Additionally, you can specify the\nretention policy with the `retention-policy` flag.\n\nFor example `outflux schema-transfer benchmark cpu mem` will discover the\nschema for the `cpu` and `mem` measurements from the `benchmark` database.\n\nAvailable flags for schema-transfer are:\n\n| flag                      | type    | default               | description |\n|---------------------------|---------|-----------------------|-------------|\n| input-server              | string  | http://localhost:8086 | Location of the input database, http(s)://location:port. 
|\n| input-pass                | string  |                       | Password to use when connecting to the input database |\n| input-user                | string  |                       | Username to use when connecting to the input database |\n| input-unsafe-https        | bool    | false                 | Should 'InsecureSkipVerify' be passed to the input connection |\n| retention-policy          | string  | autogen               | The retention policy to select the tags and fields from |\n| output-conn               | string  | sslmode=disable       | Connection string to use to connect to the output database|\n| output-schema             | string  |                       | The schema of the output database that the data will be inserted into |\n| schema-strategy           | string  | CreateIfMissing       | Strategy to use for preparing the schema of the output database. Valid options: ValidateOnly, CreateIfMissing, DropAndCreate, DropCascadeAndCreate |\n| tags-as-json              | bool    | false                 | If this flag is set to true, then the Tags of the influx measures being exported will be combined into a single JSONb column in Timescale |\n| tags-column               | string  | tags                  | When `tags-as-json` is set, this column specifies the name of the JSON column for the tags |\n| fields-as-json            | bool    | false                 | If this flag is set to true, then the Fields of the influx measures being exported will be combined into a single JSONb column in Timescale |\n| fields-column             | string  | fields                | When `fields-as-json` is set, this column specifies the name of the JSON column for the fields |\n| multishard-int-float-cast | bool    | false                 | If a field is Int64 in one shard, and Float64 in another, with this flag it will be cast to Float64 despite possible data loss |\n| quiet                     | bool    | false                 | If specified will suppress any log 
to STDOUT |\n\n### Migrate\n\nThe second command of the Outflux CLI is `migrate`. The possible flags for the command can be seen by running:\n\n```bash\n$ cd $GOPATH/bin/\n$ ./outflux migrate --help\n```\n\nUsage is `outflux migrate database [measure1 measure2 ...] [flags]`, where\n`database` is the name of the InfluxDB database you wish to export,\n`[measure1 measure2 ...]` are optional and if specified will export only those\nmeasurements from the selected database. \n\nThe retention policy can be specified with the `retention-policy` flag. By\ndefault, the 'autogen' retention policy is used.\n\nFor example `outflux migrate benchmark cpu mem` will export the `cpu` and `mem`\nmeasurements from the `benchmark` database. On the other hand\n`outflux migrate benchmark` will export all measurements in the `benchmark`\ndatabase.\n\nAvailable flags are:\n\n| flag                       | type    | default               | description|\n|----------------------------|---------|-----------------------|------------|\n| input-server               | string  | http://localhost:8086 | Location of the input database, http(s)://location:port. |\n| input-pass                 | string  |                       | Password to use when connecting to the input database |\n| input-user                 | string  |                       | Username to use when connecting to the input database |\n| input-unsafe-https         | bool    | false                 | Should 'InsecureSkipVerify' be passed to the input connection |\n| retention-policy           | string  | autogen               | The retention policy to select the data from |\n| limit                      | uint64  | 0                     | If specified will limit the export points to its value. 0 = NO LIMIT |\n| from                       | string  |                       | If specified will export data with a timestamp \u003e= of its value. 
Accepted format: RFC3339 |\n| to                         | string  |                       | If specified will export data with a timestamp \u003c= of its value. Accepted format: RFC3339 |\n| output-conn                | string  | sslmode=disable       | Connection string to use to connect to the output database|\n| output-schema              | string  | public                | The schema of the output database that the data will be inserted into. |\n| schema-strategy            | string  | CreateIfMissing       | Strategy to use for preparing the schema of the output database. Valid options: ValidateOnly, CreateIfMissing, DropAndCreate, DropCascadeAndCreate |\n| chunk-size                 | uint16  | 15000                 | The export query will request data in chunks of this size. Must be \u003e 0 |\n| batch-size                 | uint16  | 8000                  | The size of the batch inserted into the output database |\n| data-buffer                | uint16  | 15000                 | Size of the buffer holding exported data ready to be inserted in the output database |\n| max-parallel               | uint8   | 2                     | Number of parallel measure extractions. One InfluxDB measure is exported using 1 worker |\n| rollback-on-external-error | bool    | true                  | If set, when an error occurs while extracting the data, the insertion will be rolled back. 
Otherwise it will try to commit |\n| tags-as-json               | bool    | false                 | If this flag is set to true, then the Tags of the influx measures being exported will be combined into a single JSONb column in Timescale |\n| tags-column                | string  | tags                  | When `tags-as-json` is set, this column specifies the name of the JSON column for the tags |\n| fields-as-json             | bool    | false                 | If this flag is set to true, then the Fields of the influx measures being exported will be combined into a single JSONb column in Timescale |\n| fields-column              | string  | fields                | When `fields-as-json` is set, this column specifies the name of the JSON column for the fields |\n| multishard-int-float-cast  | bool    | false                 | If a field is Int64 in one shard, and Float64 in another, with this flag it will be cast to Float64 despite possible data loss |\n| quiet                      | bool    | false                 | If specified will suppress any log to STDOUT |\n\n### Examples\n\n* Use environment variables for determining output db connection\n```bash\n$ PGPORT=5433\n$ PGDATABASE=test\n$ PGUSER=test\n...\n$ ./outflux schema-transfer benchmark\n```\n\n* Export the complete 'benchmark' database on 'localhost:8086' to the 'targetdb' database on localhost:5432. Use an environment variable to set the InfluxDB password\n\n```bash\n$ PGDATABASE=some_default_db\n$ INFLUX_PASSWORD=test\n...\n$ outflux migrate benchmark \\\n\u003e --input-user=test \\\n\u003e --input-pass=test \\\n\u003e --output-conn='dbname=targetdb user=test password=test'\n```\n\n* Export only measurement 'cpu' from 'two_week' retention policy in the 'benchmark' database. 
\nDrop the existing '\"two_week.cpu\"' table in 'targetdb' if it exists, then create it\n```bash\n$ outflux migrate benchmark two_week.cpu \\\n\u003e --input-user=test \\\n\u003e --input-pass=test \\\n\u003e --output-conn='dbname=targetdb user=test password=test' \\\n\u003e --schema-strategy=DropAndCreate\n```\n\n* Export at most 1,000,000 rows from measurements 'cpu' and 'mem' from 'benchmark', starting from Jan 1st 2019 09:00\n```bash\n$ ./outflux migrate benchmark cpu mem \\\n\u003e --input-user=test \\\n\u003e --input-pass=test \\\n\u003e --limit=1000000 \\\n\u003e --from=2019-01-01T09:00:00Z\n```\n\n\n## Connection \n\n### TimescaleDB connection params\n\nThe connection parameters to the TimescaleDB instance can be passed to Outflux in several ways. One is through the Postgres Environment Variables. Supported environment variables are: `PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD, PGSSLMODE, PGSSLCERT, PGSSLKEY, PGSSLROOTCERT, PGAPPNAME, PGCONNECT_TIMEOUT`. If they are not specified, the defaults used are: host=localhost, dbname=postgres, user=$USER, and sslmode=disable.\n\nThe values of the environment variables can be **OVERRIDDEN** by specifying the '--output-conn' flag when executing Outflux. \n\nThe connection string can be in URI or DSN format:\n* example URI: \"postgresql://username:password@host:port/dbname?connect_timeout=10\"\n* example DSN: \"user=username password=password host=1.2.3.4 port=5432 dbname=mydb sslmode=disable\"\n\n### InfluxDB connection params\n\nThe connection parameters to the InfluxDB instance can also be passed through flags or environment variables. Supported/Expected environment variables are: `INFLUX_USERNAME, INFLUX_PASSWORD`.\nThese are the same environment variables that the InfluxDB CLI uses. \n\nIf they are not set, or if you wish to override them, you can do so with the `--input-user` and `--input-pass` flags. 
\nYou can also tell Outflux to skip HTTPS verification when communicating with the InfluxDB server by setting the \n`--input-unsafe-https` flag to `true`. \n\n## Known limitations\n\n### Fields with different data types across shards\n\nOutflux doesn't support fields that have the same name but different data types across shards in InfluxDB, \n**UNLESS** the field is an `integer` in some shards and a `float` in others. \nInfluxDB can store the fields as `integer` (64bit integer), `float` (64bit float), `string`, and `boolean`.\nYou can specify the `multishard-int-float-cast` flag. This will tell Outflux to cast the `integer` values to \n`float` values. A 64bit float can't hold all the int64 values, so this may result in a loss of precision (for values \u003e 2^53). \n\nIf the same field is of any of the other possible InfluxDB types, an error will be thrown, since the values can't be \nconverted.\n\nThis is also an issue even if you select a time interval in which a field has a consistent type, but exists as a different type\nin a shard outside of that interval. This is because the `SHOW FIELD KEYS FROM measurement_name` query doesn't accept a time interval\nto restrict its results.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftimescale%2Foutflux","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftimescale%2Foutflux","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftimescale%2Foutflux/lists"}