https://github.com/whosgriffith/datamizer
Python package that lets you change sensitive data from a .CSV file, creating a new file with fake data. This allows the new data to be used for training, testing or analytics, without compromising private information.
- Host: GitHub
- URL: https://github.com/whosgriffith/datamizer
- Owner: whosgriffith
- License: mit
- Created: 2023-01-22T08:16:43.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2023-01-23T21:02:52.000Z (about 3 years ago)
- Last Synced: 2025-09-27T15:24:41.496Z (6 months ago)
- Topics: csv, pandas, python
- Language: Python
- Homepage:
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Datamizer for Python
Datamizer is a simple package that replaces the sensitive data in a CSV file with fake data and writes the result to a new file.
This allows the new data to be used for training, testing or analytics, without compromising private information.
## Installation
Run the following command to install the package:
```
pip install datamizer
```
## Usage
1- Instantiate the `Datamizer` class, passing the path to the CSV file and, optionally, the CSV delimiter.
```python
from datamizer import Datamizer

# Point Datamizer at the CSV file that holds the sensitive data
csv_datamize = Datamizer('file.csv')
```
2- Use `fake()` to anonymize each column that holds sensitive data, passing the `column` and `provider` args and, optionally, `consistent`.
```python
csv_datamize.fake('Username', 'user_name', consistent=True)
csv_datamize.fake('First name', 'first_name', consistent=True)
csv_datamize.fake('Last name', 'last_name', consistent=True)
csv_datamize.fake('email', 'email', consistent=True)
csv_datamize.fake('Money', 'pricetag')
```
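The README does not spell out what `consistent` does. One plausible reading, stated here as an assumption, is that with `consistent=True` every occurrence of the same original value receives the same fake value, so relationships between rows survive anonymization. The sketch below illustrates that idea with the standard library only; `fake_column`, `make_fake`, and `numbered_users` are hypothetical names and not part of Datamizer's API.

```python
from itertools import count
from typing import Callable, Dict, List


def fake_column(values: List[str], make_fake: Callable[[], str],
                consistent: bool = False) -> List[str]:
    """Replace every value with a fake one (hypothetical helper, not Datamizer's code).

    With consistent=True, equal originals share one replacement.
    """
    cache: Dict[str, str] = {}
    result = []
    for value in values:
        if consistent:
            # Reuse the replacement chosen the first time this value appeared.
            if value not in cache:
                cache[value] = make_fake()
            result.append(cache[value])
        else:
            # Draw a fresh replacement for every row.
            result.append(make_fake())
    return result


def numbered_users() -> Callable[[], str]:
    """Stand-in generator that returns user_1, user_2, ... on successive calls."""
    counter = count(1)
    return lambda: f"user_{next(counter)}"


originals = ["alice", "bob", "alice"]
print(fake_column(originals, numbered_users(), consistent=True))
# ['user_1', 'user_2', 'user_1']
print(fake_column(originals, numbered_users(), consistent=False))
# ['user_1', 'user_2', 'user_3']
```

Under this assumed behavior, both "alice" rows map to the same replacement in the consistent case, while the default draws a new value for each row.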
3- Write a new CSV file with the fake data, passing the path to the new file and optionally `index=True` to include the index.
```python
csv_datamize.write_csv('users.csv')
```
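Once the new file exists, it can be worth checking that no original value survived in an anonymized column. The following is a minimal sketch using only the standard library; `column_values` and `leaked_values` are hypothetical helpers, not part of Datamizer, and the sketch assumes both files share the same header and use a comma delimiter unless told otherwise.

```python
import csv
from typing import Set


def column_values(path: str, column: str, delimiter: str = ",") -> Set[str]:
    """Collect the distinct values found in one column of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {row[column] for row in csv.DictReader(handle, delimiter=delimiter)}


def leaked_values(original_path: str, anonymized_path: str, column: str,
                  delimiter: str = ",") -> Set[str]:
    """Return original values that still appear in the anonymized column."""
    original = column_values(original_path, column, delimiter)
    anonymized = column_values(anonymized_path, column, delimiter)
    return original & anonymized
```

For the files used above, `leaked_values('file.csv', 'users.csv', 'email')` returning an empty set would suggest that every email address was replaced. A fake value can coincide with a real one by chance, so a non-empty result is a prompt to inspect the data rather than proof of a bug.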