https://github.com/thanos/fastset-nif

This is a NIF wrapping Rust's HashSet - POC: NOT FOR PRODUCTION
data-structures-and-algorithms elixir nif rust rust-lang


# FastsetNif

FastsetNif is an Elixir module that provides set operations through Rust NIFs built with Rustler. It wraps the Rust standard library's `HashSet` and exposes an API similar to Elixir's `MapSet`. It is a proof of concept and is not intended for production use; see Performance Benchmarks for how it currently compares with `MapSet`.

## Features

- **High Performance**: Uses Rust's optimized HashSet implementation
- **Memory Safe**: Leverages Rust's memory safety guarantees
- **Rich API**: Comprehensive set operations including union, intersection, and difference
- **Type Safe**: Proper struct-based API with Rustler
- **Cross-platform**: Works on all platforms supported by Rust

## Installation

Add `fastset_nif` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:fastset_nif, "~> 0.1.0"}
  ]
end
```

## Prerequisites

- Rust toolchain (install via [rustup](https://rustup.rs/))
- Elixir 1.14 or later
- Erlang/OTP 24 or later

## Usage

```elixir
# Create a new set
set = FastsetNif.new()

# Add elements
set = FastsetNif.add(set, "hello")
set = FastsetNif.add(set, "world")

# Check membership
FastsetNif.member?(set, "hello") # Returns true
FastsetNif.member?(set, "missing") # Returns false

# Get set size
FastsetNif.size(set) # Returns 2

# Convert to list
FastsetNif.to_list(set) # Returns the elements, e.g. ["hello", "world"] (order may vary, since HashSet is unordered)

# Set operations
set2 = FastsetNif.new()
set2 = FastsetNif.add(set2, "world")
set2 = FastsetNif.add(set2, "elixir")

# Union
union_set = FastsetNif.union(set, set2)

# Intersection
intersection_set = FastsetNif.intersection(set, set2)

# Difference
difference_set = FastsetNif.difference(set, set2)
```

## API Reference

### Core Operations

- `new()` - Creates a new empty set
- `add(set, element)` - Adds an element to the set
- `remove(set, element)` - Removes an element from the set
- `member?(set, element)` - Checks if an element exists in the set
- `size(set)` - Gets the number of elements in the set
- `to_list(set)` - Converts the set to a list

### Set Operations

- `union(set1, set2)` - Returns a new set with all elements from both sets
- `intersection(set1, set2)` - Returns a new set with elements present in both sets
- `difference(set1, set2)` - Returns a new set with elements from set1 not in set2
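
Because the API is described as similar to `MapSet`, the expected results of the three set operations can be checked against `MapSet` itself. The sketch below uses only the standard library, so it runs without compiling the NIF. It reuses the elements from the Usage section and assumes that `FastsetNif.union/2`, `intersection/2`, and `difference/2` return the same elements as their `MapSet` counterparts.

```elixir
# Reference semantics using MapSet from the standard library.
# Assumption: the FastsetNif functions of the same name return the same elements.
set1 = MapSet.new(["hello", "world"])
set2 = MapSet.new(["world", "elixir"])

union = MapSet.union(set1, set2)
intersection = MapSet.intersection(set1, set2)
difference = MapSet.difference(set1, set2)

# Sort before printing: set iteration order should not be relied on.
IO.inspect(Enum.sort(MapSet.to_list(union)), label: "union")
IO.inspect(Enum.sort(MapSet.to_list(intersection)), label: "intersection")
IO.inspect(Enum.sort(MapSet.to_list(difference)), label: "difference")
# union: ["elixir", "hello", "world"]
# intersection: ["world"]
# difference: ["hello"]
```

Note that `difference(set1, set2)` is not symmetric: swapping the arguments here would yield `["elixir"]` instead.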

## Performance Benchmarks

### Intersection Operation Performance

Recent benchmark results comparing FastsetNif with Elixir's built-in MapSet for intersection operations:

#### System Information
- **Operating System**: macOS
- **CPU**: Apple M1 Max (10 cores)
- **Memory**: 64 GB
- **Elixir Version**: 1.18.4
- **Erlang Version**: 28.0.1
- **JIT**: Enabled

#### Benchmark Configuration
- **Warmup**: 1 second
- **Execution Time**: 5 seconds
- **Memory Measurement Time**: 2 seconds
- **Reduction Time**: 2 seconds
- **Parallel**: 1 process
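
The settings above correspond to options of the Benchee library, so a comparable run could look roughly like the sketch below. This is not the author's benchmark script: the input sizes, the use of string elements, and the `build` helper are assumptions for illustration, and the script expects to run inside the project with Benchee added as a dependency.

```elixir
# Hypothetical benchmark script; run it inside the project, for example with
# `mix run`. Input sizes and element types are assumptions.

# Build a FastsetNif set using only new/0 and add/2 from the API reference.
build = fn elements ->
  Enum.reduce(elements, FastsetNif.new(), fn element, set ->
    FastsetNif.add(set, element)
  end)
end

left = Enum.map(1..10_000, &Integer.to_string/1)
right = Enum.map(5_001..15_000, &Integer.to_string/1)

# Construct the sets up front so only the intersection itself is measured.
fast_left = build.(left)
fast_right = build.(right)
map_left = MapSet.new(left)
map_right = MapSet.new(right)

Benchee.run(
  %{
    "FastsetNif.intersection/2" => fn -> FastsetNif.intersection(fast_left, fast_right) end,
    "MapSet.intersection/2" => fn -> MapSet.intersection(map_left, map_right) end
  },
  warmup: 1,
  time: 5,
  memory_time: 2,
  reduction_time: 2,
  parallel: 1
)
```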

#### Performance Results

| Metric | FastsetNif | MapSet | Comparison |
|--------|------------|---------|------------|
| **Speed** | 0.25 K ips | 1.19 K ips | **4.75x slower** |
| **Average Time** | 3.99 ms | 0.84 ms | +3.15 ms |
| **Median Time** | 3.86 ms | 0.83 ms | +3.03 ms |
| **99th Percentile** | 4.70 ms | 0.93 ms | +3.77 ms |

#### Memory Usage

| Metric | FastsetNif | MapSet | Comparison |
|--------|------------|---------|------------|
| **Memory Usage** | 167.91 KB | 476.88 KB | **0.35x memory usage** (-308.98 KB) |

#### Reduction Count

| Metric | FastsetNif | MapSet | Comparison |
|--------|------------|---------|------------|
| **Average** | 0.87 K | 74.82 K | **0.01x reduction count** |
| **Median** | 0.80 K | 74.81 K | -74.01 K |
| **99th Percentile** | 3.07 K | 75.07 K | -72.00 K |
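
For context, a reduction is the BEAM's unit of work accounting for a process. The sketch below shows one way to observe the reductions spent on a pure-Elixir `MapSet.intersection/2` call using `Process.info/2`. The input sizes and the `count_reductions` helper are illustrative assumptions, and the exact number printed will vary between runs and Erlang/OTP versions.

```elixir
# Count the reductions the current process spends while running `fun`.
count_reductions = fn fun ->
  {:reductions, start} = Process.info(self(), :reductions)
  fun.()
  {:reductions, finish} = Process.info(self(), :reductions)
  finish - start
end

# Hypothetical inputs: two overlapping sets of 10_000 integers each.
left = MapSet.new(1..10_000)
right = MapSet.new(5_001..15_000)

reductions = count_reductions.(fn -> MapSet.intersection(left, right) end)
IO.puts("MapSet.intersection/2 used roughly #{reductions} reductions")
```

Work performed inside a NIF happens in native code and is generally not reflected in this counter to the same degree, which is worth keeping in mind when comparing the two columns above.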

#### Summary

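- **Performance**: FastsetNif is currently **4.75x slower** than MapSet for intersection operations (3.99 ms vs. 0.84 ms on average)
- **Memory**: FastsetNif shows **significantly lower measured memory usage** (0.35x) than MapSet. This measurement most likely covers only memory allocated on the BEAM process heap, so memory allocated by the Rust `HashSet` inside the NIF may not be included
- **Reduction Count**: FastsetNif shows much lower reduction counts. Reductions measure work done by the BEAM, and work performed inside a NIF is largely not counted, so this reflects where the work happens rather than more efficient processing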

> **Note**: These benchmarks represent the current implementation state. Performance optimizations are ongoing, and the memory efficiency advantage may be beneficial for memory-constrained environments.

## Building from Source

### Prerequisites

- Rust toolchain (install via [rustup](https://rustup.rs/))
- Elixir 1.14 or later
- Erlang/OTP 24 or later

### Build Steps

1. Clone the repository:
```bash
git clone https://github.com/thanos/fastset-nif
cd fastset-nif
```

2. Install dependencies:
```bash
mix deps.get
```

3. Build the project (this will compile the Rust code):
```bash
mix compile
```

4. Run tests:
```bash
mix test
```

## Development

### Project Structure

```
fastset-nif/
├── lib/
│   └── fastset_nif.ex      # Main Elixir module
├── native/
│   └── fastset_nif/        # Rust NIF implementation
│       ├── Cargo.toml      # Rust dependencies
│       └── src/
│           └── lib.rs      # Rust NIF code
├── mix.exs                 # Mix project configuration
└── README.md               # This file
```

### Adding New Operations

To add new set operations:

1. Add the function declaration to `lib/fastset_nif.ex`
2. Implement the Rust function in `native/fastset_nif/src/lib.rs`
3. Add the function to the `rustler::init!` macro
4. Rebuild the project with `mix compile`

### Testing

The project includes basic tests to verify functionality:

```bash
mix test
```
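
The repository's own tests are not reproduced here. As an illustration, a test along the following lines exercises only functions listed in the API reference. The module name `FastsetNifSketchTest` is hypothetical, and the sketch assumes that `remove/2` returns the updated set in the same way `add/2` does.

```elixir
# Hypothetical test file, e.g. placed under test/ and run with `mix test`.
defmodule FastsetNifSketchTest do
  use ExUnit.Case, async: true

  test "add, member?, size and remove" do
    set =
      FastsetNif.new()
      |> FastsetNif.add("hello")
      |> FastsetNif.add("world")

    assert FastsetNif.member?(set, "hello")
    refute FastsetNif.member?(set, "missing")
    assert FastsetNif.size(set) == 2

    # Assumption: remove/2 returns the updated set, like add/2.
    set = FastsetNif.remove(set, "hello")
    refute FastsetNif.member?(set, "hello")
  end

  test "intersection keeps only shared elements" do
    set1 = FastsetNif.new() |> FastsetNif.add("hello") |> FastsetNif.add("world")
    set2 = FastsetNif.new() |> FastsetNif.add("world") |> FastsetNif.add("elixir")

    result = FastsetNif.intersection(set1, set2)

    # Sort before comparing, since element order is not guaranteed.
    assert Enum.sort(FastsetNif.to_list(result)) == ["world"]
  end
end
```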

## Performance Considerations

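- A regular NIF runs on the scheduler thread that is executing the calling Erlang process
- Long-running NIFs block that scheduler; the Erlang documentation advises that a NIF should return within about 1 millisecond
- Use NIFs for CPU-intensive operations, not I/O operations
- Consider using dirty NIFs (in Rustler, `#[rustler::nif(schedule = "DirtyCpu")]`) for operations that run longer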

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Ensure all tests pass
6. Submit a pull request

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Acknowledgments

- Built with [Rustler](https://github.com/rusterlium/rustler)
- Uses Rust's standard library HashSet for optimal performance
- Leverages Rust's memory safety and performance guarantees