[![CI](https://github.com/lc-rezende/demo-thefuzz/actions/workflows/main.yml/badge.svg?branch=main)](https://github.com/lc-rezende/demo-thefuzz/actions/workflows/main.yml)

# TheFuzz Lab - Fuzzy String Matching in Python

This repository contains a Jupyter Lab notebook designed as a **lab environment** for exploring and experimenting with [`TheFuzz`](https://github.com/seatgeek/thefuzz), a Python library formerly known as *fuzzywuzzy*.

TheFuzz provides intuitive and flexible tools for **fuzzy string matching**, making it useful for approximate string comparisons, data deduplication, and search applications.

---

## 🧠 Why Use Fuzzy Matching?

Fuzzy string matching is useful when:

- Strings may have typos or inconsistent formatting
- You need to find "close enough" matches instead of exact ones
- You're deduplicating records (e.g., names, addresses) from noisy data

---

## 📘 What You'll Find Here

- Basic usage examples of the `fuzz` and `process` modules
- Experiments with partial ratio, token sort ratio, and token set ratio
- Examples of matching strings against a list of choices

---

## 🚀 Getting Started

1. Clone this repository and open the notebook in Jupyter Lab.
2. Explore and modify the examples to understand how fuzzy matching works.
3. Try your own datasets or string inputs to see how TheFuzz behaves.

---

## 🔗 References

- GitHub: [TheFuzz by SeatGeek](https://github.com/seatgeek/thefuzz)