https://github.com/apache/doris-streamloader

Stream Loader for Apache Doris

# Apache Doris Streamloader

A robust, high-performance and user-friendly alternative to the traditional curl-based Stream Load.

## Key Features

- **Parallel Loading**: Automatically splits data files and loads the pieces in parallel
- **Support for Multiple Files and Directories**: Loads multiple files and directories in a single run
- **Path Traversal Support**: Traverses directories when the source files are inside them
- **Resilience and Continuity**: Resumes loading after previous failures and cancellations
- **Automatic Retry Mechanism**: Retries automatically on failure
- **Comprehensive and Concise Input Parameters**

## Usage

```shell
doris-streamloader --source_file={FILE_LIST} --url={FE_OR_BE_SERVER_URL}:{PORT} --header={STREAMLOAD_HEADER} --db={TARGET_DATABASE} --table={TARGET_TABLE}
```

- `FILE_LIST`: a directory or a list of files; the `*` wildcard is supported
- `FE_OR_BE_SERVER_URL` & `PORT`: the hostname or IP address of a Doris FE or BE, and its HTTP port
- `STREAMLOAD_HEADER`: supports all the headers that `curl` Stream Load does; multiple headers are separated by `?`
- `TARGET_DATABASE` & `TARGET_TABLE`: the target database and table into which the data is loaded

For example:

```shell
doris-streamloader --source_file="data.csv" --url="http://localhost:8330" --header="column_separator:|?columns:col1,col2" --db="testdb" --table="testtbl"
```
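
Because `--header` takes several Stream Load headers joined by `?`, it can be convenient to build that value from a list. The following is a minimal sketch, not part of Streamloader itself: `join_headers` is a hypothetical helper, and the script only prints the resulting command so it can be reviewed before running. It assumes no header value itself contains `?`.

```shell
#!/bin/sh
# join_headers (hypothetical helper): join all arguments with '?',
# the separator that --header expects between Stream Load headers.
# The function body runs in a subshell so the IFS change stays local.
join_headers() (
    IFS='?'
    printf '%s\n' "$*"
)

# Same headers as in the example above.
header=$(join_headers "column_separator:|" "columns:col1,col2")

# Print the command instead of executing it.
printf 'doris-streamloader --source_file="%s" --url="%s" --header="%s" --db="%s" --table="%s"\n' \
    "data.csv" "http://localhost:8330" "$header" "testdb" "testtbl"
```

Keeping each header as a separate argument makes it easier to add or remove one without editing a long quoted string.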

For additional details and options, refer to our comprehensive docs below.

## Docs

[User Guide](https://doris.apache.org/docs/ecosystem/doris-streamloader)

[中文使用文档](https://doris.apache.org/zh-CN/docs/ecosystem/doris-streamloader)

## Build

To build Streamloader, make sure Go (golang) version 1.19.9 or later is installed. For example, on CentOS:

```shell
yum install golang
```
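
Distribution packages sometimes ship an older Go, so it may be worth checking the installed version against the 1.19.9 minimum before building. The sketch below is an illustration only: `version_ge` is a hypothetical helper, and the script assumes `go version` prints the version as its third field (for example `go1.21.0`). Pre-release suffixes such as `rc1` are simply dropped before comparing.

```shell
#!/bin/sh
# version_ge HAVE WANT (hypothetical helper): succeed when dotted version
# HAVE is greater than or equal to WANT. Fields must be numeric;
# missing fields count as 0.
version_ge() {
    have=$1
    want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        h=${have%%.*}
        w=${want%%.*}
        if [ "${h:-0}" -gt "${w:-0}" ]; then return 0; fi
        if [ "${h:-0}" -lt "${w:-0}" ]; then return 1; fi
        # Drop the field just compared; clear the string when none remain.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
    done
    return 0
}

if command -v go >/dev/null 2>&1; then
    # Third field of `go version`, minus the "go" prefix and any non-numeric suffix.
    installed=$(go version | awk '{print $3}' | sed 's/^go//; s/[^0-9.].*$//')
    if version_ge "$installed" "1.19.9"; then
        echo "Go $installed is new enough to build Streamloader"
    else
        echo "Go $installed is older than 1.19.9; upgrade before building"
    fi
else
    echo "go not found on PATH; install Go 1.19.9 or later first"
fi
```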

Then change into the `doris-streamloader` directory and run the build script:

```shell
cd doris-streamloader && sh build.sh
```

## License

[Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)