https://github.com/erdogant/df2onehot
Convert an unstructured array into a structured dataframe.
onehot-encoding preprocessing python structuring
- Host: GitHub
- URL: https://github.com/erdogant/df2onehot
- Owner: erdogant
- License: other
- Created: 2020-03-04T21:02:27.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2025-04-25T09:32:07.000Z (about 1 year ago)
- Last Synced: 2026-01-10T18:08:28.210Z (5 months ago)
- Topics: onehot-encoding, preprocessing, python, structuring
- Language: Python
- Homepage: https://erdogant.github.io/df2onehot
- Size: 6.67 MB
- Stars: 3
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Citation: CITATION.cff
README
# df2onehot
``df2onehot`` is a Python package that converts unstructured DataFrames into structured DataFrames, such as one-hot dense arrays.
#
**⭐️ Star this repo if you like it ⭐️**
#
#### Install df2onehot from PyPI
```bash
pip install df2onehot
```
#### Import df2onehot package
```python
from df2onehot import df2onehot
```
#
### [Documentation pages](https://erdogant.github.io/df2onehot/)
The [documentation pages](https://erdogant.github.io/df2onehot/) describe in detail how ``df2onehot`` works and include many examples.
### Examples
```python
results = df2onehot(df)
```
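For readers new to the term, the sketch below shows what a one-hot (dense) encoding of a single categorical column looks like. It uses only the standard library. The helper `one_hot_column` is hypothetical and is not part of ``df2onehot``; the structure that ``df2onehot`` actually returns is described on the documentation pages.

```python
def one_hot_column(name, values):
    """One-hot encode one categorical column (illustrative helper, not df2onehot API)."""
    # One output column per distinct category, sorted for a stable column order.
    categories = sorted(set(values))
    return {
        f"{name}_{cat}": [value == cat for value in values]
        for cat in categories
    }


encoded = one_hot_column("color", ["red", "blue", "red", "green"])
for column, flags in encoded.items():
    print(column, flags)
```

Each distinct value becomes its own boolean column, and every row has exactly one `True` across those columns.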
```python
# Force features (int or float) to be numeric if the fraction of unique non-zero values is above the given percentage (here 80%).
out = df2onehot(df, perc_min_num=0.8)
```
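One possible reading of this rule is sketched below: for an int or float column, take the number of unique non-zero values as a fraction of all samples, and keep the column numeric when that fraction exceeds the threshold. This is an assumption about the intent of ``perc_min_num``, not the package's code. The helper name `looks_numeric` is hypothetical, and the exact computation in ``df2onehot`` may differ.

```python
def looks_numeric(values, perc_min_num=0.8):
    """Return True if the share of unique non-zero values exceeds perc_min_num.

    Hypothetical helper: an assumed reading of the rule, not df2onehot's code.
    """
    if not values:
        return False
    unique_nonzero = {v for v in values if v != 0}
    return len(unique_nonzero) / len(values) > perc_min_num


ages = [23, 35, 41, 52, 67, 29, 38, 44, 51, 60]  # 10 unique non-zero values in 10 samples
flags = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1]           # 1 unique non-zero value in 10 samples

print(looks_numeric(ages))
print(looks_numeric(flags))
```

Under this reading, a column such as `ages` stays numeric, while a low-cardinality column such as `flags` would be left for categorical handling.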
```python
# Remove categorical features for which fewer than 2 values exist.
out = df2onehot(df, y_min=2)
```
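The idea behind a minimum-count filter like ``y_min`` can be sketched as follows: count how often each category occurs and keep only those that reach the minimum. The helper `frequent_categories` is hypothetical and only illustrates the idea. How ``df2onehot`` treats the filtered categories internally is not shown here.

```python
from collections import Counter


def frequent_categories(values, y_min=2):
    """Return the categories that occur at least y_min times (hypothetical helper)."""
    counts = Counter(values)
    return sorted(cat for cat, n in counts.items() if n >= y_min)


cities = ["Paris", "Rome", "Paris", "Oslo", "Rome", "Paris"]
kept = frequent_categories(cities, y_min=2)
print(kept)  # "Oslo" occurs only once, so it is filtered out
```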
```python
# Combine the two rules above.
out = df2onehot(df, y_min=2, perc_min_num=0.8)
```
#
* [Example: Process Mixed dataset](https://erdogant.github.io/df2onehot/pages/html/Examples.html#)
#
* [Example: Extracting nested columns](https://erdogant.github.io/df2onehot/pages/html/Examples.html#extracting-nested-columns)
#
* [Example: Setting custom dtypes](https://erdogant.github.io/df2onehot/pages/html/Examples.html#custom-dtypes)
#
#### Maintainers
* Erdogan Taskesen, github: [erdogant](https://github.com/erdogant)
* Contributions are welcome.
* If you wish to buy me a coffee for this work, it is much appreciated :)