https://github.com/gigapi/gigapi-docs
GigAPI: DuckDB Powered Parquet Storage Engine & Data Pond
- Host: GitHub
- URL: https://github.com/gigapi/gigapi-docs
- Owner: gigapi
- License: agpl-3.0
- Created: 2025-04-11T11:22:14.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-11T15:53:37.000Z (about 1 year ago)
- Last Synced: 2025-05-12T17:27:51.381Z (about 1 year ago)
- Homepage: https://gigapipe.com
- Size: 32.2 KB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# GigAPI Storage Engine
Like a durable parquet floor, GigAPI provides a rock-solid data foundation, so you can focus on queries.
> GigAPI eliminates classic storage and server limits, unlocking virtually infinite cardinality without compromising query speed and performance. Its DuckDB and Arrow powered engine handles massively parallel ingestion and self-compaction of data for heavy aggregations and complex SQL queries, delivering consistent performance as your system scales, without storage or cardinality limitations or their price tag.
## Write Support
As write requests come in to GigAPI, they are parsed and progressively appended to parquet files alongside their metadata. The ingestion buffer is flushed to disk at configurable intervals using a hive partitioning schema. The generated parquet files and their metadata are then progressively compacted and sorted over time, based on configuration parameters.
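The buffering step can be pictured with a minimal sketch. This is not GigAPI's implementation: the `IngestBuffer` class, the `flush_interval_s` setting and the choice of UTC for partition keys are all assumptions made for illustration, and the sketch stops short of encoding parquet or writing metadata.

```python
from collections import defaultdict
from datetime import datetime, timezone


class IngestBuffer:
    """Hypothetical in-memory buffer that groups rows by hive partition."""

    def __init__(self, flush_interval_s: float = 1.0):
        self.flush_interval_s = flush_interval_s  # assumed setting name
        self._rows = defaultdict(list)  # (db, table, date, hour) -> rows
        self._last_flush = 0.0

    def append(self, db: str, table: str, ts: datetime, row: dict) -> None:
        # Partition keys are derived in UTC (an assumption); pass
        # timezone-aware datetimes to avoid local-time surprises.
        ts = ts.astimezone(timezone.utc)
        key = (db, table, ts.strftime("%Y-%m-%d"), ts.hour)
        self._rows[key].append(row)

    def due(self, now: float) -> bool:
        """True once the configured interval has elapsed since the last flush."""
        return now - self._last_flush >= self.flush_interval_s

    def flush(self, now: float) -> dict:
        """Return buffered rows keyed by partition and reset the buffer.

        A real writer would encode each group as a parquet file and
        update that partition's metadata at this point.
        """
        batches = dict(self._rows)
        self._rows = defaultdict(list)
        self._last_flush = now
        return batches
```

Grouping by partition key at append time means each flush produces one batch per `date=`/`hour=` directory, which is what a hive-partitioned writer needs.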
### API
GigAPI provides an HTTP API for clients to write data, currently supporting the InfluxDB Line Protocol format.
* _more ingestion protocols coming soon_
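For readers unfamiliar with the format, here is a minimal sketch of encoding one point as InfluxDB Line Protocol. The helper names (`to_line`, `_escape`, `_field`) and the `location` tag are hypothetical. The write endpoint is not named in this README, so the sketch only builds the payload and does not send it.

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape each character of `chars` inside `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _field(value) -> str:
    """Render a field value using Line Protocol type conventions."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line(measurement: str, tags: dict, fields: dict, ts_ns=None) -> str:
    """Build 'measurement,tag=v field=v [timestamp]' for a single point."""
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(str(tags[key]), ',= ')}"
    body = ",".join(f"{_escape(k, ',= ')}={_field(v)}" for k, v in fields.items())
    line = f"{head} {body}"
    return line if ts_ns is None else f"{line} {ts_ns}"
```

For example, `to_line("weather", {"location": "us-midwest"}, {"temperature": 82.5}, 1465839830100400200)` yields `weather,location=us-midwest temperature=82.5 1465839830100400200`.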
### Data Schema
GigAPI is a schema-on-write database that manages databases, tables and schemas on the fly. Columns can be added or removed over time, leaving reconciliation up to readers.
```bash
/data
  /mydb
    /weather
      /date=2025-04-10
        /hour=14
          *.parquet
          metadata.json
        /hour=15
          *.parquet
          metadata.json
```
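The layout above maps a timestamp to a directory in a predictable way. The following sketch shows that mapping and how a time range expands into hourly partitions. The function names are hypothetical, and two details are assumptions not stated here: partitions are keyed in UTC, and single-digit hours are zero-padded.

```python
from datetime import datetime, timedelta, timezone
from pathlib import PurePosixPath


def partition_dir(root: str, db: str, table: str, ts: datetime) -> PurePosixPath:
    """Return the hive partition directory holding rows stamped `ts`."""
    ts = ts.astimezone(timezone.utc)  # assumption: partitions use UTC
    # Assumption: hours are zero-padded to two digits (hour=09).
    return PurePosixPath(root, db, table, f"date={ts:%Y-%m-%d}", f"hour={ts.hour:02d}")


def partitions_between(root: str, db: str, table: str,
                       start: datetime, end: datetime) -> list:
    """List the hourly partition directories overlapping [start, end]."""
    cur = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end = end.astimezone(timezone.utc)
    out = []
    while cur <= end:
        out.append(partition_dir(root, db, table, cur))
        cur += timedelta(hours=1)
    return out
```

A reader that knows the requested time range can use this kind of expansion to skip every directory outside the range instead of scanning the whole table.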
## Read Support
As read requests come in to GigAPI, they are parsed and transpiled using the GigAPI Metadata catalog to resolve data locations based on the database, table and time range in the request. Series can be queried with or without time ranges, e.g. for calculating averages.
```bash
$ curl -X POST "http://localhost:9999/query?db=mydb" \
-H "Content-Type: application/json" \
-d '{"query": "SELECT count(*), avg(temperature) FROM weather"}'
```
```json
{"results":[{"avg(temperature)":87.025,"count_star()":"40"}]}
```
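The same request can be issued from code. This is a small sketch using only the Python standard library. It mirrors the `curl` call above (POST to `/query?db=...` with a JSON body); the function names `build_query` and `run_query` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_query(base_url: str, db: str, sql: str) -> urllib.request.Request:
    """Build the POST request for a SQL query against one database."""
    url = f"{base_url}/query?" + urllib.parse.urlencode({"db": db})
    return urllib.request.Request(
        url,
        data=json.dumps({"query": sql}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(base_url: str, db: str, sql: str) -> list:
    """Send the query and return the rows under the 'results' key."""
    with urllib.request.urlopen(build_query(base_url, db, sql)) as resp:
        return json.load(resp)["results"]
```

With a server listening on `localhost:9999`, `run_query("http://localhost:9999", "mydb", "SELECT count(*), avg(temperature) FROM weather")` would return the list shown under `results`. Note that in the sample response the count arrives as the string `"40"`, so callers may need to cast it.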
> GigAPI readers can be implemented in any language and with any OLAP engine supporting Parquet files.
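As one illustration of an external reader, the sketch below builds a DuckDB query that scans a table's parquet files directly. It assumes the `/data/<db>/<table>/date=.../hour=...` layout shown earlier and ignores the `metadata.json` catalog entirely; `parquet_scan_sql` is a hypothetical helper. DuckDB's `read_parquet` options `hive_partitioning` and `union_by_name` expose `date` and `hour` as columns and reconcile files whose columns differ.

```python
def parquet_scan_sql(root: str, db: str, table: str, select: str = "*") -> str:
    """Build a DuckDB query scanning every parquet file of one table.

    The glob assumes two partition levels (date=..., hour=...) under the table.
    """
    glob = f"{root}/{db}/{table}/*/*/*.parquet"
    return (
        f"SELECT {select} "
        f"FROM read_parquet('{glob}', hive_partitioning = true, union_by_name = true)"
    )


# With the `duckdb` package installed, the query could be run as:
#   import duckdb
#   rows = duckdb.sql(parquet_scan_sql("/data", "mydb", "weather")).fetchall()
```

Because the inputs are interpolated into the SQL text, this helper is only suitable for trusted, locally supplied names.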
## GigAPI Diagram
```mermaid
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#6a329f',
'primaryTextColor': '#fff',
'primaryBorderColor': '#7C0000',
'lineColor': '#6f329f',
'secondaryColor': '#006100',
'tertiaryColor': '#fff'
}
}
}%%
graph TD;
GigAPI-->ParquetWriter;
ParquetWriter-->Storage;
ParquetWriter-->Metadata;
Storage-->Compactor;
Compactor-->Storage;
Compactor-->Metadata;
Storage-.->LocalFS;
Storage-.->S3;
HTTP-API-- GET/POST --> GigAPI;
DuckDB-->Storage;
DuckDB-->Metadata;
subgraph GigAPI[GigAPI Server]
ParquetWriter
Compactor
Metadata;
DuckDB;
end
```
## License
> Gigapipe is released under the GNU Affero General Public License v3.0 ©️ HEPVEST BV, All Rights Reserved.