An open API service indexing awesome lists of open source software.

https://github.com/alessandro-schmidt/duckdb

This is a showcase of what can DuckDB do.
https://github.com/alessandro-schmidt/duckdb

csv duckdb python python3

Last synced: about 1 year ago
JSON representation

This is a showcase of what can DuckDB do.

Awesome Lists containing this project

README

          

# 🦆 DuckDB - The Fast & Lightweight Analytical Database

![DuckDB](https://miro.medium.com/v2/resize:fit:1112/0*m9FEabk7UPOOtfr1.png)
*A lightweight, high-performance database for analytics and data science.*

## 📌 What is DuckDB?
**DuckDB** is an **in-process, columnar, OLAP database** designed for fast analytical queries on structured data. Unlike traditional databases, DuckDB operates **fully in-memory** and requires **no external server**, making it ideal for **data analysis, machine learning, and ETL workflows**.

---

## 🚀 Why Use DuckDB?

✅ **Lightweight & Serverless** – No need for setup; just install and use.
✅ **Blazing Fast Analytics** – Optimized for columnar storage and vectorized query execution.
✅ **SQL Support** – Works like PostgreSQL, with full SQL compatibility.
✅ **Seamless Integration** – Compatible with **Python, R, Pandas, Parquet, and CSV files**.
✅ **Great for ETL** – Load, transform, and export data efficiently.

📀 **Example Use Case:**
> DuckDB is perfect for handling **large datasets in memory** while keeping the query execution extremely fast.

---

## 🛠️ Installation

### **Python (Recommended)**
```sh
pip install duckdb
```

### **CLI (Command Line Interface)**
For standalone usage:
```sh
curl -LO https://duckdb.org/releases/latest/duckdb_cli-linux-amd64.zip
unzip duckdb_cli-linux-amd64.zip
chmod +x duckdb
./duckdb
```

### **Other Languages**
DuckDB also supports **R, Java, Node.js, and C++**. Check the [official installation guide](https://duckdb.org/docs/installation/) for more details.

---

## 🔥 How to Use DuckDB

### **1️⃣ Querying a CSV File Directly**
```sql
SELECT * FROM read_csv_auto('life_expectancy.csv') LIMIT 5;
```

### **2️⃣ Creating a Table and Inserting Data**
```sql
CREATE TABLE life_expectancy (
Country VARCHAR,
Female_Life_Expectancy DOUBLE,
Both_Sexes_Life_Expectancy DOUBLE,
Male_Life_Expectancy DOUBLE
);

INSERT INTO life_expectancy VALUES
('Zanthera', 82.3, 78.9, 75.1),
('Drakonia', 85.7, 80.5, 75.8);
```

### **3️⃣ Exporting Data to a CSV**
```sql
COPY life_expectancy TO 'updated_life_expectancy.csv' (HEADER, DELIMITER ',');
```

---

## 💊 Use Cases

### **💩 Data Science & Analytics**
- Work with **CSV, JSON, and Parquet files** directly.
- Use DuckDB as a **Pandas alternative** for large data processing.
- Perform **complex SQL queries on structured data** without a database server.

### **💩 ETL Pipelines**
- Load and process large datasets **efficiently**.
- Easily export transformed data to different formats (CSV, Parquet).
- Integrate with **Apache Arrow & Pandas** for high-performance workflows.

### **💩 Embedded Analytics**
- Use DuckDB inside **Jupyter Notebooks** for quick data analysis.
- Deploy DuckDB within **Flask / FastAPI applications** for fast query execution.

---

## 📚 Resources & Documentation

👉 [Official Website](https://duckdb.org/)
👉 [DuckDB GitHub](https://github.com/duckdb/duckdb)
👉 [Python API Documentation](https://duckdb.org/docs/api/python.html)

For questions or contributions, feel free to open an **Issue** or a **Pull Request**! 🚀

---

## 🤝 Contributing
Want to contribute? Fork this repository and submit a Pull Request!

📩 **Contact:** [Alessandro's Github](https://github.com/Alessandro-Schmidt)

---

🚀 **Happy Coding with DuckDB!** 🦆🎯