https://github.com/fsndzomga/anonLLM

anonLLM: Anonymize Personally Identifiable Information (PII) for Large Language Model APIs

Host: GitHub
URL: https://github.com/fsndzomga/anonLLM
Owner: fsndzomga
Created: 2023-09-08T19:27:26.000Z (almost 2 years ago)
Default Branch: master
Last Pushed: 2024-08-12T00:14:18.000Z (11 months ago)
Last Synced: 2025-03-13T22:46:18.995Z (4 months ago)
Language: Python
Homepage:
Size: 34.2 KB
Stars: 55
Watchers: 1
Forks: 8
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome-MLSecOps - AnonLLM

README

# anonLLM: Anonymize Personally Identifiable Information (PII) for Large Language Model APIs

![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)

anonLLM is a Python package that anonymizes personally identifiable information (PII) in text before it is sent to large language model (LLM) APIs such as GPT-3. The goal is to protect user privacy by ensuring that sensitive data such as names, email addresses, and phone numbers is anonymized.

# Features

- Anonymize names
- Anonymize email addresses
- Anonymize phone numbers
- Support for multiple country-specific phone number formats
- Reversible anonymization (de-anonymization)

# Installation

To install anonLLM, run:

```bash
pip install anonLLM
```

# Quick Start

Here's how to get started with anonLLM:

```python
from anonLLM.llm import OpenaiLanguageModel
from dotenv import load_dotenv

# Load environment variables from a .env file
load_dotenv()

# A prompt that contains PII (name, email address, phone number)
text = "Write a CV for me: My name is Alice Johnson, "\
    "email: [email protected], phone: +1 234-567-8910. "\
    "I am a machine learning engineer."

# Anonymization is handled under the hood
llm = OpenaiLanguageModel()
response = llm.generate(text)
print(response)
```

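In this example, the response contains the real name given in the prompt, because the anonymization is reversed once the model has replied (the reversible anonymization listed under Features).

At the same time, the name, email address, and phone number are replaced before the prompt leaves your machine, so this PII is not sent to OpenAI.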
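To make the round trip more concrete, here is a minimal sketch of reversible anonymization using only the standard library. It is not anonLLM's actual implementation: the function names `anonymize` and `deanonymize`, the placeholder format, and both regular expressions are assumptions made for illustration. It covers only email addresses and one phone number layout; names would need a separate detection step that is not shown.

```python
import re

# Hypothetical, deliberately simple patterns. Real PII detection needs more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+\d{1,3}[\s-]?\d{3}[\s-]?\d{3}[\s-]?\d{4}")


def anonymize(text):
    """Replace emails and phone numbers with placeholders.

    Returns the masked text and a mapping from placeholder to original value.
    """
    mapping = {}

    def substitute(pattern, label, source):
        def repl(match):
            # Number placeholders in the order the values are found
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = match.group(0)
            return placeholder
        return pattern.sub(repl, source)

    text = substitute(EMAIL_RE, "EMAIL", text)
    text = substitute(PHONE_RE, "PHONE", text)
    return text, mapping


def deanonymize(text, mapping):
    """Put the original values back wherever a placeholder appears."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


prompt = "Contact jane.doe@example.com or +1 234-567-8910."
masked, mapping = anonymize(prompt)
print(masked)
print(deanonymize(masked, mapping))
```

Only the masked text would be sent to the API. The mapping stays local and is applied to the model's reply, which is how a response can mention the original values without the provider having seen them.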

You can also use anonLLM to generate structured output in JSON format. Define a Pydantic model for your output and pass it as the `output_format` argument, like this:

```python
from pydantic import BaseModel
from anonLLM.llm import OpenaiLanguageModel
from dotenv import load_dotenv

load_dotenv()

# Anonymization is turned off for this call
llm = OpenaiLanguageModel(anonymize=False, temperature=1)


# Schema describing the expected output
class Person(BaseModel):
    name: str
    sex: str
    age: int
    email: str


response = llm.generate(
    prompt="Generate a person",
    output_format=Person
)

print(response)
# Returns: {'name': 'Alex', 'sex': 'Male', 'age': 32, 'email': '[email protected]'}
```
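
The README does not say how the model's reply is checked against the Pydantic model. As a rough illustration of the general idea, the sketch below validates a raw JSON string against the same `Person` schema. The helper `parse_person` is hypothetical and not part of anonLLM, and the sample strings stand in for model output.

```python
import json

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    sex: str
    age: int
    email: str


def parse_person(raw):
    """Return a Person if raw is valid JSON matching the schema, else None."""
    try:
        return Person(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError, TypeError):
        # Malformed JSON, missing or mistyped fields, or a non-object payload
        return None


good = '{"name": "Alex", "sex": "Male", "age": 32, "email": "alex@example.com"}'
bad = '{"name": "Alex"}'

print(parse_person(good))
print(parse_person(bad))
```

Returning `None` on failure keeps the sketch short. A caller could just as well retry the request or raise an error when the reply does not match the schema.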

# Contributing

We welcome contributions!

# License

This project is licensed under the MIT License.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=fsndzomga/anonLLM&type=Date)](https://star-history.com/#fsndzomga/anonLLM&Date)
