https://github.com/coatless-py-pkg/pybabynames
Python port of the R data package {babynames}, with the data loadable as either a Pandas or a Polars DataFrame.
data-package data-science pandas polars python python-data-package python3
- Host: GitHub
- URL: https://github.com/coatless-py-pkg/pybabynames
- Owner: coatless-py-pkg
- Created: 2024-09-14T22:36:24.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-15T23:00:44.000Z (over 1 year ago)
- Last Synced: 2025-10-29T05:48:52.926Z (6 months ago)
- Topics: data-package, data-science, pandas, polars, python, python-data-package, python3
- Language: Python
- Homepage:
- Size: 19.9 MB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG
# pybabynames
[PyPI](https://pypi.org/project/pybabynames/)
[Tests](https://github.com/coatless-py-pkg/pybabynames/actions/workflows/test.yml)
[Releases](https://github.com/coatless-py-pkg/pybabynames/releases)
[License](https://github.com/coatless-py-pkg/pybabynames/blob/main/LICENSE)
Python port of the R data package [`babynames`](https://cran.r-project.org/package=babynames). This package provides US baby names data from the Social Security Administration (SSA). It contains all names used for at least 5 children of either sex in the United States. The data can be loaded as a Polars DataFrame (default) or as a Pandas DataFrame; the choice is made by setting an environment variable.
> [!NOTE]
>
> Please note that the `pybabynames` package is a community-driven initiative and is not affiliated with Posit, Tidyverse, or the main babynames R package.
> Its evolution and maintenance stem solely from the collective efforts of community members.
## Installation
Install this library using `pip` into an environment that **already has** either Pandas or Polars installed.
```bash
pip install pybabynames
```
Missing Pandas or Polars? Only one of them is required. You can install either package (or both) using:
```bash
pip install polars
pip install pandas
```
## Usage
```python
import pybabynames as bn
# Retrieve the DataFrame of baby names
babynames = bn.babynames
# Retrieve the DataFrame of Social Security number (SSN) applicant data
applicants = bn.applicants
# Retrieve the DataFrame of birth data
births = bn.births
# Retrieve the DataFrame of life expectancy data
lifetables = bn.lifetables
```
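Because each dataset may arrive as either a Polars or a Pandas DataFrame, a quick way to see what was loaded is to rely on attributes that both libraries expose, namely `.shape` and `.columns`. The helper below is a minimal sketch under that assumption; `summarize` is a hypothetical name and is not part of `pybabynames`.

```python
def summarize(df):
    """Return the backend name, row count, and column names of a DataFrame.

    Hypothetical helper, not part of pybabynames. It uses only `.shape`
    and `.columns`, which both Polars and Pandas DataFrames provide.
    """
    # The top-level module of the object's class, e.g. "polars" or "pandas"
    backend = type(df).__module__.split(".")[0]
    n_rows, _ = df.shape
    return {"backend": backend, "rows": n_rows, "columns": list(df.columns)}
```

With the package imported as above, `summarize(bn.babynames)` would report which framework supplied the DataFrame along with its size and column names.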
> [!IMPORTANT]
>
> By default, the package attempts to use the `polars` module. To use `pandas` instead, set the
> `DATAFRAME_FRAMEWORK` environment variable before the `pybabynames` import statement, like so:
>
> ```python
> # Specify desired DataFrame framework
> import os
> os.environ["DATAFRAME_FRAMEWORK"] = "pandas"
>
> # Load the package
> import pybabynames as bn
> ```
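The selection described in the note can be pictured with the sketch below. It is not the package's actual implementation: `resolve_framework` is a hypothetical function, the Polars default is taken from the note, and rejecting other values is an assumption made for illustration. If the variable is read once when the package is imported, as the note's ordering suggests, setting it afterwards would have no effect.

```python
import os


def resolve_framework(environ=None):
    """Pick a DataFrame framework name from an environment mapping.

    Hypothetical sketch: defaults to "polars" when DATAFRAME_FRAMEWORK
    is unset, and rejects values other than "polars" or "pandas".
    """
    env = os.environ if environ is None else environ
    choice = env.get("DATAFRAME_FRAMEWORK", "polars").strip().lower()
    if choice not in ("polars", "pandas"):
        raise ValueError(f"Unsupported DATAFRAME_FRAMEWORK: {choice!r}")
    return choice


print(resolve_framework({}))                                  # polars
print(resolve_framework({"DATAFRAME_FRAMEWORK": "pandas"}))   # pandas
```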
## Development
To contribute to this library, first check out the code. Then create a new virtual environment:
```bash
cd pybabynames
python -m venv venv
source venv/bin/activate
```
Now install the dependencies and test dependencies:
```bash
python -m pip install -e '.[test]'
```
To run the tests:
```bash
python -m pytest
```
## Acknowledgement
This Python package is a port of the R data package [`babynames`](https://cran.r-project.org/package=babynames) by [Hadley Wickham](https://hadley.nz/).