Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/artie-labs/transfer
Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real-time.
- Host: GitHub
- URL: https://github.com/artie-labs/transfer
- Owner: artie-labs
- License: other
- Created: 2022-11-06T05:50:02.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2024-11-27T07:12:28.000Z (15 days ago)
- Last Synced: 2024-11-27T07:31:59.233Z (15 days ago)
- Topics: apache-kafka, bigquery, cdc, change-data-capture, data-integration, data-pipelines, database, debezium, elt, golang, kafka, redshift, snowflake
- Language: Go
- Homepage: https://artie.com
- Size: 3.05 MB
- Stars: 602
- Watchers: 9
- Forks: 30
- Open Issues: 10
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
- awesome-repositories - artie-labs/transfer - Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real-time. (Go)
README
Artie Transfer is a real-time data replication solution for databases and data warehouses/lakes.
Typical ETL solutions rely on batched processes or schedulers (e.g. DAGs, Airflow), which means the data in the downstream data warehouse is often several hours to days old. This problem is exacerbated as data volumes grow, because batched processes take increasingly longer to run.
Artie leverages change data capture (CDC) and stream processing to perform data syncs in a more efficient way, which enables sub-minute latency.
Benefits of Artie Transfer:
- Sub-minute data latency: always have access to live production data.
- Ease of use: just set up a simple configuration file, and you're good to go!
- Automatic table creation and schema detection: Artie infers schemas and automatically merges changes to downstream destinations.
- Reliability: Artie has automatic retries and processing is idempotent.
- Scalability: handle anywhere from 1 GB to 100+ TB of data.
- Monitoring: built-in error reporting along with rich telemetry statistics.

Take a look at this [guide](#getting-started) to get started!
## Architecture
## Examples
To run Artie Transfer's stack locally, please refer to the [examples folder](https://github.com/artie-labs/transfer/tree/master/examples).
## Getting started
[Getting started guide](https://artie.com/docs/open-source/running-artie/overview)
## What is currently supported?
Transfer aims to provide coverage across all OLTP and OLAP databases. Currently Transfer supports:
- Message Queues:
  - Kafka (default)
  - Google Pub/Sub
- [Destinations](https://artie.com/docs/destinations):
  - BigQuery
  - Databricks
  - Microsoft SQL Server
  - Redshift
  - S3
  - Snowflake
- [Sources](https://artie.com/docs/sources):
  - DocumentDB
  - DynamoDB
  - Microsoft SQL Server
  - MongoDB
  - MySQL
  - PostgreSQL

_If the database you are using is not on the list, feel free to file a [feature request](https://github.com/artie-labs/transfer/issues/new)._
## Configuration File
* [Artie Transfer configuration file guide](https://artie.com/docs/open-source/running-artie/options)
* [Examples of configuration files](https://artie.com/docs/open-source/running-artie/examples)

## Telemetry
[Artie Transfer's telemetry guide](https://artie.com/docs/telemetry/overview)
## Tests
Transfer is written in Go and uses [counterfeiter](https://github.com/maxbrunsfeld/counterfeiter) to mock.
To run the tests, run the following commands:

```sh
make generate
make test
```

## Release
Artie Transfer is released through [GoReleaser](https://goreleaser.com/), which we use to cross-compile the binaries published on the [releases](https://github.com/artie-labs/transfer/releases) page as well as on our Docker Hub. If your operating system or architecture is not supported, please file a feature request!
## License
Artie Transfer is licensed under the Elastic License 2.0 (ELv2). Please see the [LICENSE](https://github.com/artie-labs/transfer/blob/master/LICENSE.txt) file for additional information. If you have any licensing questions, please email [email protected].