https://github.com/zazza123/hamana

A Python library for seamless data extraction, storage, and SQL-based analysis using pandas and SQLite.
hamana



Illustrations by @gaiaparte




---


Documentation: https://zazza123.github.io/hamana



**hamana** (*Hamster Analysis*) is a Python library designed to simplify data analysis by combining the practicality of **pandas** and **SQL** in an open-source environment. The library grew out of experience at a large company where tools like `SAS` were often used as "shortcuts" for running SQL queries across different data sources, without fully leveraging their potential. `hamana` aims to provide a free and accessible alternative that replicates this functionality in an open-source context.

## Why Choose `hamana`?


- **Support for Multiple Data Sources**: Connect to various data sources such as relational databases, CSV files, mainframes, and more.
- **SQLite Integration**: Save data locally in an SQLite database, either as a file or in memory.
- **SQL + pandas**: Combine the power of `SQL` with the flexibility of `pandas` for advanced analysis.
- **Open Source**: Available to everyone without licensing costs.
- **Why "Hamster"?**: Because hamsters are awesome!
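
The two storage modes mentioned above (a file or in memory) come from SQLite itself. The following is a minimal sketch using only Python's standard `sqlite3` module, not hamana's own API; the `store_rows` helper, the `hamsters` table, and the sample rows are hypothetical and exist only for illustration.

```python
import os
import sqlite3
import tempfile


def store_rows(db_path, rows):
    """Insert rows into a small table at db_path and return the row count.

    Hypothetical helper for this sketch; it is not part of hamana.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS hamsters (name TEXT, weight_g REAL)"
        )
        conn.executemany("INSERT INTO hamsters VALUES (?, ?)", rows)
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM hamsters").fetchone()[0]
    finally:
        conn.close()


rows = [("Nibbles", 32.5), ("Peanut", 41.0)]

# In-memory database: the data disappears when the connection closes.
memory_count = store_rows(":memory:", rows)

# File-backed database: the data stays on disk and can be reopened later.
with tempfile.TemporaryDirectory() as tmp_dir:
    db_file = os.path.join(tmp_dir, "example.db")
    file_count = store_rows(db_file, rows)

    # Reopen the file with a fresh connection to confirm the rows persisted.
    conn = sqlite3.connect(db_file)
    persisted = conn.execute(
        "SELECT name FROM hamsters ORDER BY name"
    ).fetchall()
    conn.close()
```

An in-memory database suits throwaway analysis, while a file is the better fit when extractions should be reused across sessions.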

## Key Features

### 1. Data Extraction

Hamana allows you to extract data from a variety of sources:

- Relational databases (SQLite, Oracle, etc.)
- CSV, Excel, JSON, and other common file formats
- Legacy sources like mainframes

Extractions are automatically saved as `pandas` **DataFrames**, making data manipulation simple and intuitive.

### 2. SQLite Storage

Each extraction can be saved in an **SQLite** database, enabling you to:

- Store data locally for future use
- Perform `SQL` queries to combine extractions from different sources
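
To illustrate what combining extractions from different sources can look like at the SQLite level, here is a minimal sketch that uses only the standard library (`csv` and `sqlite3`). It does not use hamana's API; the table names, columns, and sample data are assumptions made for this example.

```python
import csv
import io
import sqlite3

# Source 1: CSV text, standing in for a file such as customers.csv.
customers_csv = io.StringIO(
    "customer_id,customer_name\n"
    "1,Alice\n"
    "2,Bob\n"
)
customers = [
    (int(row["customer_id"]), row["customer_name"])
    for row in csv.DictReader(customers_csv)
]

# Source 2: rows as they might come back from a relational database query.
orders = [
    (1, "2024-01-05", 120.0),
    (1, "2024-02-11", 80.0),
    (2, "2024-01-20", 45.5),
]

# Store both extractions in one SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, customer_name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT, total REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Once both tables live in SQLite, a single query can join them.
totals = conn.execute(
    """
    SELECT c.customer_name, SUM(o.total) AS total_spent
    FROM customers c
    JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_name
    ORDER BY c.customer_name
    """
).fetchall()
conn.close()
```

Here `totals` holds one row per customer with the summed order value, even though the two inputs started in different formats.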

### 3. Data Analysis

With Hamana, you can:

- Use `pandas` to quickly and flexibly manipulate data
- Write `SQL` queries directly on datasets stored in SQLite
- Integrate `SQL` and `pandas` into a single workflow for advanced analysis
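
As a rough picture of a single workflow that mixes `SQL` and `pandas`, the sketch below uses plain `pandas` with the standard `sqlite3` module rather than hamana's API. The `orders` table, its columns, and the cutoff date are assumptions made for illustration.

```python
import sqlite3

import pandas as pd

# A small DataFrame, standing in for an extraction.
orders = pd.DataFrame(
    {
        "customer_id": [1, 1, 2],
        "order_date": ["2024-01-05", "2024-02-11", "2024-01-20"],
        "total": [120.0, 80.0, 45.5],
    }
)

# Store the DataFrame as a table in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
orders.to_sql("orders", conn, index=False)

# SQL step: filter rows inside SQLite and read the result back as a DataFrame.
recent = pd.read_sql_query(
    "SELECT customer_id, order_date, total "
    "FROM orders WHERE order_date >= '2024-01-10'",
    conn,
)
conn.close()

# pandas step: parse dates and aggregate the filtered rows.
recent["order_date"] = pd.to_datetime(recent["order_date"])
summary = recent.groupby("customer_id")["total"].sum()
```

The split shown here (set-based filtering in `SQL`, then typed manipulation in `pandas`) is one reasonable division of labor, not the only one.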

## Installation

Hamana is available on [PyPI](https://pypi.org/project/hamana/), and you can install it easily with pip:

```bash
pip install hamana
```

## Usage Example

Here is an example of how to use Hamana to connect to a data source, extract information, and combine it with another table:

```python
import hamana as hm

# connect hamana database
hm.connect()

# connect to Oracle database
oracle_db = hm.connector.db.Oracle.new(
    host = "localhost",
    port = 1521,
    user = "user",
    password = "password"
)

# define, execute and store a query
orders = hm.Query("SELECT * FROM orders")
oracle_db.to_sqlite(orders, table_name = "orders")

# load a CSV file and store it in SQLite
customers = hm.connector.file.CSV("customers.csv")
customers.to_sqlite(table_name = "customers")

# combine the two tables using SQL
customers_orders = hm.execute("""
    SELECT
          c.customer_name
        , o.order_date
        , o.total
    FROM customers c
    JOIN orders o ON
        c.customer_id = o.customer_id"""
)

# use `pandas` for further analysis
print(customers_orders.result.head())

# close connection
hm.disconnect()
```

## How to Contribute

If you want to contribute to Hamana:

1. Fork the repository.
2. Create a branch for your changes:

```bash
git checkout -b feature/your-feature-name
```

3. Submit a pull request describing the changes.

All contributions are welcome!

## License

This project is distributed under the **BSD 3-Clause "New" or "Revised"** license.

## Contact

For questions or suggestions, you can open an **Issue** on GitHub or contact me directly.

---
Thank you for choosing Hamana!