https://github.com/alessandro-schmidt/duckdb
This is a showcase of what can DuckDB do.
https://github.com/alessandro-schmidt/duckdb
csv duckdb python python3
Last synced: about 1 year ago
JSON representation
This is a showcase of what can DuckDB do.
- Host: GitHub
- URL: https://github.com/alessandro-schmidt/duckdb
- Owner: Alessandro-Schmidt
- Created: 2025-03-11T20:08:08.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-11T20:57:25.000Z (about 1 year ago)
- Last Synced: 2025-03-11T21:34:32.673Z (about 1 year ago)
- Topics: csv, duckdb, python, python3
- Language: Python
- Homepage: https://duckdb.org/
- Size: 18.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# 🦆 DuckDB - The Fast & Lightweight Analytical Database

*A lightweight, high-performance database for analytics and data science.*
## 📌 What is DuckDB?
**DuckDB** is an **in-process, columnar, OLAP database** designed for fast analytical queries on structured data. Unlike traditional databases, DuckDB operates **fully in-memory** and requires **no external server**, making it ideal for **data analysis, machine learning, and ETL workflows**.
---
## 🚀 Why Use DuckDB?
✅ **Lightweight & Serverless** – No need for setup; just install and use.
✅ **Blazing Fast Analytics** – Optimized for columnar storage and vectorized query execution.
✅ **SQL Support** – Works like PostgreSQL, with full SQL compatibility.
✅ **Seamless Integration** – Compatible with **Python, R, Pandas, Parquet, and CSV files**.
✅ **Great for ETL** – Load, transform, and export data efficiently.
📀 **Example Use Case:**
> DuckDB is perfect for handling **large datasets in memory** while keeping the query execution extremely fast.
---
## 🛠️ Installation
### **Python (Recommended)**
```sh
pip install duckdb
```
### **CLI (Command Line Interface)**
For standalone usage:
```sh
curl -LO https://duckdb.org/releases/latest/duckdb_cli-linux-amd64.zip
unzip duckdb_cli-linux-amd64.zip
chmod +x duckdb
./duckdb
```
### **Other Languages**
DuckDB also supports **R, Java, Node.js, and C++**. Check the [official installation guide](https://duckdb.org/docs/installation/) for more details.
---
## 🔥 How to Use DuckDB
### **1️⃣ Querying a CSV File Directly**
```sql
SELECT * FROM read_csv_auto('life_expectancy.csv') LIMIT 5;
```
### **2️⃣ Creating a Table and Inserting Data**
```sql
CREATE TABLE life_expectancy (
Country VARCHAR,
Female_Life_Expectancy DOUBLE,
Both_Sexes_Life_Expectancy DOUBLE,
Male_Life_Expectancy DOUBLE
);
INSERT INTO life_expectancy VALUES
('Zanthera', 82.3, 78.9, 75.1),
('Drakonia', 85.7, 80.5, 75.8);
```
### **3️⃣ Exporting Data to a CSV**
```sql
COPY life_expectancy TO 'updated_life_expectancy.csv' (HEADER, DELIMITER ',');
```
---
## 💊 Use Cases
### **💩 Data Science & Analytics**
- Work with **CSV, JSON, and Parquet files** directly.
- Use DuckDB as a **Pandas alternative** for large data processing.
- Perform **complex SQL queries on structured data** without a database server.
### **💩 ETL Pipelines**
- Load and process large datasets **efficiently**.
- Easily export transformed data to different formats (CSV, Parquet).
- Integrate with **Apache Arrow & Pandas** for high-performance workflows.
### **💩 Embedded Analytics**
- Use DuckDB inside **Jupyter Notebooks** for quick data analysis.
- Deploy DuckDB within **Flask / FastAPI applications** for fast query execution.
---
## 📚 Resources & Documentation
👉 [Official Website](https://duckdb.org/)
👉 [DuckDB GitHub](https://github.com/duckdb/duckdb)
👉 [Python API Documentation](https://duckdb.org/docs/api/python.html)
For questions or contributions, feel free to open an **Issue** or a **Pull Request**! 🚀
---
## 🤝 Contributing
Want to contribute? Fork this repository and submit a Pull Request!
📩 **Contact:** [Alessandro's Github](https://github.com/Alessandro-Schmidt)
---
🚀 **Happy Coding with DuckDB!** 🦆🎯