https://github.com/ismielabir/txtcleanen

# txtcleanen

**txtcleanen** is a simple Python package for cleaning English text. It removes HTML tags, URLs, emojis, numbers, punctuation, and extra whitespace, which makes it a good fit for Natural Language Processing (NLP) and text preprocessing tasks.

---

## ✨ Features

- Remove HTML tags
- Remove URLs
- Remove emojis
- Remove digits and punctuation
- Normalize Unicode text
- Compact multiple spaces into one

---

## 🚀 Installation

```bash
pip install txtcleanen
```

## Example
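```python
# Assumption: the package exposes a callable named `txtcleanen`. A bare
# `import txtcleanen` binds the module itself, and calling a module raises TypeError.
from txtcleanen import txtcleanen

text = "Hello 😊 World! Visit https://example.com now!"
clean_text = txtcleanen(text)

print(clean_text)
# Output: Hello World Visit now
```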