Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hadarsharon/grizzlys
User-friendly Python DataFrames 🔵🟡 powered by Julia 🔴🟢🟣
big-data data data-analysis data-engineering data-frame data-frames data-science dataframe dataframe-library dataframes dataframes-jl julia python
Last synced: 18 days ago
- Host: GitHub
- URL: https://github.com/hadarsharon/grizzlys
- Owner: hadarsharon
- License: mit
- Created: 2024-03-18T21:56:12.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-04-07T07:15:52.000Z (9 months ago)
- Last Synced: 2024-04-07T18:44:11.784Z (9 months ago)
- Topics: big-data, data, data-analysis, data-engineering, data-frame, data-frames, data-science, dataframe, dataframe-library, dataframes, dataframes-jl, julia, python
- Language: Python
- Homepage:
- Size: 470 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
![grizzlys](https://github.com/hadarsharon/grizzlys/blob/main/docs/logos/grizzlys-logo-cubes-with-text.png?raw=true "grizzlys")
[![Code style: Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json&label=Formatter)](https://github.com/charliermarsh/ruff)
[![Linting: Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json&label=Linter)](https://github.com/charliermarsh/ruff)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)

# grizzlys: User-friendly Python DataFrames powered by Julia
**grizzlys** is a Python package that provides a native interface on top of Julia's popular
[___DataFrames.jl___](https://github.com/JuliaData/DataFrames.jl) package.

As a user-friendly alternative to existing Python packages such as __pandas__ and __polars__, it is designed to be a
convenient, easy-to-use DataFrame tool for data analysts, data engineers and data scientists alike, while still
providing high performance and abstractions, thanks to Julia's high-performance computing capabilities.

## Why you might consider using grizzlys
✅ You are transitioning into Python from a **Julia** or **R** programming background

✅ You are accustomed to working with **Jupyter notebooks** (or a REPL) and performing exploratory data
analysis **(EDA)** on-the-fly

✅ You need a quick-and-dirty data wrangling tool that provides readymade **macros** and **convenience
functions** out of the box

✅ You work with **statistics** or **linear algebra** often and require a wide range of
statistical/algebraic functions to be well-integrated with your DataFrames

## What is grizzlys (currently) NOT well-suited for
❌ __Larger-than-memory datasets__ - grizzlys' current implementation relies on data being stored in-memory, and therefore
it is not a good choice if you work with datasets that don't fit in your machine's RAM.
For such cases, using [__Polars__](https://github.com/pola-rs/polars) or
[__Dask DataFrames__](https://docs.dask.org/en/stable/dataframe.html) would be a much better choice as of now.

❌ __Lazy Evaluation__ - Similar to the above, grizzlys is currently designed to be fully eager, which means it always
executes your code immediately, as opposed to building a task/computation graph and delaying execution
until the result is needed.

❌ __Backwards compatibility__ - grizzlys is based on Julia, a relatively new programming language, and is developed
using a recent version of Python, with little regard for end-of-life versions or any compatibility with Python 2.7,
for example.
You should therefore not rely on grizzlys for integrations with very old code or any other legacy/deprecated tools and
implementations.

❌ __Best-in-class Performance__ - Though Julia is widely considered a very high-performance language (it is actually a
major reason why it's used under the hood here), grizzlys is still a work-in-progress (WIP) and therefore does not
currently aim to compete with, or outperform, other high-performance DataFrame libraries, such as
[__Polars__](https://github.com/pola-rs/polars) (written in Rust) or
[__Modin__](https://github.com/modin-project/modin) (Multi-threaded pandas).
This, of course, might no longer be a limitation in the future, once __grizzlys__ has undergone optimization and
maturation.
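
The eager/lazy distinction mentioned above can be illustrated with a minimal plain-Python sketch. It does not use grizzlys, Polars or Dask, and every name in it (`eager_filter`, `LazyPipeline`, `collect`) is hypothetical and chosen for illustration only. It is not how any of those libraries is implemented.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def eager_filter(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Eager style: the filtering work happens as soon as this is called."""
    return [row for row in rows if predicate(row)]


class LazyPipeline:
    """Lazy style (hypothetical): calls only record steps until collect()."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self._steps: list = []  # recorded (kind, argument) pairs

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyPipeline":
        self._steps.append(("filter", predicate))  # nothing is executed here
        return self

    def select(self, *columns: str) -> "LazyPipeline":
        self._steps.append(("select", columns))  # nothing is executed here
        return self

    def collect(self) -> List[Row]:
        """Run every recorded step, in order, and return the result."""
        rows = self._rows
        for kind, arg in self._steps:
            if kind == "filter":
                rows = [row for row in rows if arg(row)]
            else:  # "select"
                rows = [{col: row[col] for col in arg} for row in rows]
        return rows


data = [{"name": "a", "x": 1}, {"name": "b", "x": 5}]

# Eager: the result exists immediately after the call.
eager_result = eager_filter(data, lambda row: row["x"] > 2)

# Lazy: the pipeline is only a description until collect() is called.
pipeline = LazyPipeline(data).filter(lambda row: row["x"] > 2).select("name")
lazy_result = pipeline.collect()
```

A fully eager library behaves like `eager_filter` at every step, which keeps interactive work simple but leaves no opportunity to reorder or fuse steps before they run.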
[Go to Top](#grizzlys-user-friendly-python-dataframes-powered-by-julia)