https://github.com/mrqadeer/internet_words_remover
Python module designed to replace common internet slang and abbreviations with their full forms, enhancing the readability of informal text. It efficiently cleans text data from chats, social media, and online communication. The module also supports tokenization and integrates seamlessly with pandas for batch processing of text in DataFrames.
- Host: GitHub
- URL: https://github.com/mrqadeer/internet_words_remover
- Owner: mrqadeer
- License: mit
- Created: 2024-03-26T09:32:46.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2024-06-25T19:40:45.000Z (almost 2 years ago)
- Last Synced: 2025-01-28T21:16:51.863Z (over 1 year ago)
- Topics: pandas, python3, text-preprocessing
- Language: Python
- Homepage: https://pypi.org/project/internet-words-remover/
- Size: 31.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# Internet Words Remover
Internet Words Remover is a Python module that replaces common internet slang and abbreviations with their full forms. It can be used to clean text data containing informal language commonly used in chats, social media, and online communication.
## Installation
You can install Internet Words Remover using pip:
```bash
pip install internet_words_remover
```
## How to Use
```python
from internet_words_remover import words_remover

text = "OMG! It works! Osm"
# Replace slang and abbreviations with their full forms
cleaned = words_remover(text)
print(cleaned)
```
**Output**
```
oh my god It works! Awesome
```
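The README does not describe how the replacement works internally. As a rough illustration of the general idea only, the sketch below expands slang with a plain dictionary lookup. The `SLANG` mapping and the `expand_slang` function are hypothetical names invented for this example, not part of the package, and details such as punctuation handling may differ from what `words_remover` actually does.

```python
import string

# Hypothetical mapping for illustration; the real package ships its own word list.
SLANG = {
    "omg": "oh my god",
    "osm": "Awesome",
    "gm": "good morning",
}


def expand_slang(text, is_token=False):
    """Replace known slang words in `text` using a dictionary lookup."""
    words = []
    for word in text.split():
        # Strip surrounding punctuation so "OMG!" matches the "omg" key.
        key = word.strip(string.punctuation).lower()
        # Keep the original word when there is no known expansion.
        words.append(SLANG.get(key, word))
    if is_token:
        # Expansions may contain several words, so split again into tokens.
        return " ".join(words).split()
    return " ".join(words)


print(expand_slang("OMG! It works! Osm"))
```

With this toy mapping, unknown words such as `works!` pass through unchanged, while matched words are replaced by the whole expansion.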
### Tokenization
To get the result as a list of tokens instead of a single string, pass `is_token=True`:
```python
from internet_words_remover import words_remover

text = "OMG! It works! Osm"
# is_token=True returns a list of tokens instead of a string
cleaned = words_remover(text, is_token=True)
print(cleaned)
```
**Output**
```
['oh', 'my', 'god', 'It', 'works!', 'Awesome']
```
### Bonus
**It also works on a pandas Series**
```python
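from internet_words_remover import words_remover
import pandas as pd

data = {
    'Name': ['Qadeer'],
    'Message': ['Hi gm TIL something new. PTL']
}
df = pd.DataFrame(data)
# Extra keyword arguments to Series.apply are forwarded to words_remover
tokens = df['Message'].apply(words_remover, is_token=True)
# Print the token list for the first (and only) row
print(tokens[0])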
```
**Output**
```
['Hi', 'good', 'morning', 'today', 'I', 'learned', 'something', 'new.', 'praise', 'the', 'lord']
```
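To keep the results next to the original data, the output of `apply` can be assigned to new columns. The sketch below is self-contained: `clean_message` and its two-entry `ABBREVIATIONS` mapping are hypothetical stand-ins so the example runs without the package installed, and the column names `Cleaned` and `Tokens` are arbitrary. In practice you would pass `words_remover` to `apply` instead, assuming it behaves as shown above.

```python
import pandas as pd

# Hypothetical stand-in for words_remover so this sketch runs on its own.
ABBREVIATIONS = {"gm": "good morning", "til": "today I learned"}


def clean_message(text, is_token=False):
    """Expand known abbreviations; return a string, or a token list if is_token."""
    expanded = " ".join(ABBREVIATIONS.get(w.lower(), w) for w in text.split())
    return expanded.split() if is_token else expanded


df = pd.DataFrame({
    "Message": ["Hi gm TIL something new.", "gm everyone"],
})

# One cleaned string per row.
df["Cleaned"] = df["Message"].apply(clean_message)
# One token list per row; is_token is forwarded by Series.apply.
df["Tokens"] = df["Message"].apply(clean_message, is_token=True)

print(df[["Cleaned", "Tokens"]])
```

Storing both columns keeps the original `Message` untouched, which makes it easy to compare the raw and cleaned text afterwards.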
#### Catch me on
[GitHub](https://github.com/mrqadeer)
[LinkedIn](https://www.linkedin.com/in/mr-qadeer-3499a4205/)
#### Thanks
##### Keep Learning and Exploring!
##### License: [MIT](https://mit-license.org/)