https://github.com/jrycw/turtle-island
  
  
    A Utility Kit for Polars Expressions. 
    https://github.com/jrycw/turtle-island
  
polars python
        Last synced: 4 months ago 
        JSON representation
    
A Utility Kit for Polars Expressions.
- Host: GitHub
 - URL: https://github.com/jrycw/turtle-island
 - Owner: jrycw
 - License: mit
 - Created: 2024-09-12T08:02:28.000Z (about 1 year ago)
 - Default Branch: master
 - Last Pushed: 2025-07-19T10:31:05.000Z (4 months ago)
 - Last Synced: 2025-07-19T15:37:05.400Z (4 months ago)
 - Topics: polars, python
 - Language: Python
 - Homepage: https://jrycw.github.io/turtle-island/
 - Size: 603 KB
 - Stars: 3
 - Watchers: 1
 - Forks: 0
 - Open Issues: 0
 - 
            Metadata Files:
            
- Readme: README.md
 - License: LICENSE
 
 
Awesome Lists containing this project
- awesome-polars - turtle-island - A lightweight utility library for writing Polars Expressions by [@jrycw](https://github.com/jrycw). (Libraries/Packages/Scripts / Polars plugins)
 
README
          [](https://github.com/jrycw/turtle-island/actions/workflows/ci_tests.yml)
[](https://codecov.io/gh/jrycw/turtle-island)
[](https://github.com/jrycw/turtle-island/actions/workflows/ci_docs.yml)
[](https://img.shields.io/github/license/jrycw/turtle-island)
# 🐢 Turtle Island
**Turtle Island** is a lightweight utility library that provides helper functions to reduce boilerplate when writing [Polars](https://pola.rs) expressions. It aims to simplify common expression patterns and improve developer productivity when working with the Polars API.
> ⚠️ **Disclaimer**: This project is in early development. The API is still evolving and may change without notice. Use with caution in production environments.
## 🚀 Installation
**Turtle Island** is not yet published on PyPI. The recommended way to install it is using `uv add`:
```bash
uv add git+https://github.com/jrycw/turtle-island.git
```
## 📦 Recommended Import
To keep your code clean and idiomatic, it's recommended to import **Turtle Island** as a top-level module:
```python
import turtle_island as ti
```
## ⚙️ Core Spirit
The core spirit of **Turtle Island** is to embrace **expressions over columns**.
When wrangling data, it's common to create temporary helper columns as part of the transformation process. However, many of these columns are just intermediate artifacts — not part of the final output we actually want. They exist solely to assist with intermediate steps.
Polars offers a powerful distinction between contexts and expressions, allowing us to focus on expression-based transformations without needing to materialize every intermediate result as a column. **Turtle Island** builds on this principle, encouraging users to rely more on expressions — flexible, composable, and context-aware — rather than temporary columns.
## ✨ Selected Functions
### case_when()
A more ergonomic way to write chained `when-then-otherwise` logic in Polars:
```python
df = pl.DataFrame({"x": [1, 2, 3, 4]})
expr_ti = ti.case_when(
    caselist=[(pl.col("x") < 2, pl.lit("small")),
              (pl.col("x") < 4, pl.lit("medium"))],
    otherwise=pl.lit("large"),
).alias("size_ti")
expr_pl = (
    pl.when(pl.col("x") < 2)
    .then(pl.lit("small"))
    .when(pl.col("x") < 4)
    .then(pl.lit("medium"))
    .otherwise(pl.lit("large"))
    .alias("size_pl")
)
df.with_columns(expr_ti, expr_pl)
```
```
shape: (4, 3)
┌─────┬─────────┬─────────┐
│ x   ┆ size_ti ┆ size_pl │
│ --- ┆ ---     ┆ ---     │
│ i64 ┆ str     ┆ str     │
╞═════╪═════════╪═════════╡
│ 1   ┆ small   ┆ small   │
│ 2   ┆ medium  ┆ medium  │
│ 3   ┆ medium  ┆ medium  │
│ 4   ┆ large   ┆ large   │
└─────┴─────────┴─────────┘
```
### make_index()
Adds a sequential index column to the DataFrame:
```python
df = pl.DataFrame({"a": [1, 3, 5], "b": [2, 4, 6]})
df.select(ti.make_index(), pl.all())
```
```
shape: (3, 3)
┌───────┬─────┬─────┐
│ index ┆ a   ┆ b   │
│ ---   ┆ --- ┆ --- │
│ u32   ┆ i64 ┆ i64 │
╞═══════╪═════╪═════╡
│ 0     ┆ 1   ┆ 2   │
│ 1     ┆ 3   ┆ 4   │
│ 2     ┆ 5   ┆ 6   │
└───────┴─────┴─────┘
```
### bucketize()
Assign values to rows in a round-robin pattern using Polars expressions:
```python
df = pl.DataFrame({"x": [1, 2, 3, 4, 5]})
df.with_columns(
    ti.bucketize(pl.col("x"), pl.col("x").add(100)).alias("bucketized")
)
```
```
shape: (5, 2)
┌─────┬────────────┐
│ x   ┆ bucketized │
│ --- ┆ ---        │
│ i64 ┆ i64        │
╞═════╪════════════╡
│ 1   ┆ 1          │
│ 2   ┆ 102        │
│ 3   ┆ 3          │
│ 4   ┆ 104        │
│ 5   ┆ 5          │
└─────┴────────────┘
```
### bucketize_lit()
Assign values to rows in a round-robin pattern using literal values:
```python
df = pl.DataFrame({"x": [1, 2, 3, 4, 5]})
df.with_columns(ti.bucketize_lit(True, False).alias("bucketized"))
```
```
shape: (5, 2)
┌─────┬────────────┐
│ x   ┆ bucketized │
│ --- ┆ ---        │
│ i64 ┆ bool       │
╞═════╪════════════╡
│ 1   ┆ true       │
│ 2   ┆ false      │
│ 3   ┆ true       │
│ 4   ┆ false      │
│ 5   ┆ true       │
└─────┴────────────┘
```
### is_every_nth_row()
Mark every second row:
```python
df = pl.DataFrame({"x": [1, 2, 3, 4, 5]})
df.with_columns(ti.is_every_nth_row(2))
```
```
shape: (5, 2)
┌─────┬──────────────┐
│ x   ┆ bool_nth_row │
│ --- ┆ ---          │
│ i64 ┆ bool         │
╞═════╪══════════════╡
│ 1   ┆ true         │
│ 2   ┆ false        │
│ 3   ┆ true         │
│ 4   ┆ false        │
│ 5   ┆ true         │
└─────┴──────────────┘
```
To invert the result:
```python
df.with_columns(~ti.is_every_nth_row(2))
```
```
shape: (5, 2)
┌─────┬──────────────┐
│ x   ┆ bool_nth_row │
│ --- ┆ ---          │
│ i64 ┆ bool         │
╞═════╪══════════════╡
│ 1   ┆ false        │
│ 2   ┆ true         │
│ 3   ┆ false        │
│ 4   ┆ true         │
│ 5   ┆ false        │
└─────┴──────────────┘
```
### move_cols_to_start()
Reorder columns so that selected columns appear first:
```python
df = pl.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"], "c": [4.4, 5.5, 6.6]})
df.select(ti.move_cols_to_start(["b", "c"]))
```
Or by data type:
```python
df.select(ti.move_cols_to_start([pl.Float64, pl.String]))
```
```
shape: (3, 3)
┌─────┬─────┬─────┐
│ b   ┆ c   ┆ a   │
│ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ i64 │
╞═════╪═════╪═════╡
│ x   ┆ 4.4 ┆ 1   │
│ y   ┆ 5.5 ┆ 2   │
│ z   ┆ 6.6 ┆ 3   │
└─────┴─────┴─────┘
```
### move_cols_to_end()
Reorder columns so that selected columns appear last:
```python
df = pl.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"], "c": [4.4, 5.5, 6.6]})
df.select(ti.move_cols_to_end(["a", "b"]))
```
Or by data type:
```python
df.select(ti.move_cols_to_end([pl.String, pl.Int64]))
```
```
shape: (3, 3)
┌─────┬─────┬─────┐
│ c   ┆ a   ┆ b   │
│ --- ┆ --- ┆ --- │
│ f64 ┆ i64 ┆ str │
╞═════╪═════╪═════╡
│ 4.4 ┆ 1   ┆ x   │
│ 5.5 ┆ 2   ┆ y   │
│ 6.6 ┆ 3   ┆ z   │
└─────┴─────┴─────┘
```
### make_hyperlink()
Create an HTML anchor tag (``) from two columns — link text and URL:
```python
df = pl.DataFrame({"name": ["GitHub"], "url": ["https://github.com/"]})
df.select(ti.make_hyperlink("name", "url").alias("link"))
```
```
shape: (1, 1)
┌──────────────────────────────────────────────────────────┐
│ link                                                     │
│ ---                                                      │
│ str                                                      │
╞══════════════════════════════════════════════════════════╡
│ GitHub │
└──────────────────────────────────────────────────────────┘
```