https://github.com/franciscojavierarceo/pyipums
A Python Library for working with IPUMS
- Host: GitHub
- URL: https://github.com/franciscojavierarceo/pyipums
- Owner: franciscojavierarceo
- License: mit
- Created: 2023-04-09T01:17:35.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-30T17:26:12.000Z (9 months ago)
- Last Synced: 2024-09-15T22:06:00.074Z (about 2 months ago)
- Topics: ipums, ipumsr
- Language: Jupyter Notebook
- Homepage: https://pypi.org/project/pyipums/
- Size: 43.2 MB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PyIPUMS
![pypums (1)](https://github.com/franciscojavierarceo/pyipums/assets/4163062/398bb2b5-8974-4eb9-84a0-92938da184c1)
PyIPUMS is a library for working with data from [IPUMS](https://www.ipums.org/).
# Example
This example reads an IPUMS DDI codebook (XML) into a dictionary of metadata, then uses that metadata to load the fixed-width data file with pandas.
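As a rough illustration of how such a metadata dictionary can drive fixed-width parsing, here is a minimal, standard-library-only sketch. The key names `columns` and `column_specs` mirror the ones used in this README, but the values, the three-column layout, and the `parse_record` helper are all hypothetical and not part of PyIPUMS. It assumes each column spec is a half-open `(start, end)` character offset, which is the convention pandas' `read_fwf` uses for `colspecs`.

```python
# Hypothetical metadata dictionary. The key names mirror the ones used in this
# README's example; the values and their exact types are assumptions.
ddi = {
    "columns": ["YEAR", "SERIAL", "AGE"],
    # Half-open [start, end) character offsets, one pair per column.
    "column_specs": [(0, 4), (4, 10), (10, 13)],
}

# Made-up fixed-width records: 4 chars YEAR, 6 chars SERIAL, 3 chars AGE.
raw_lines = ["2019000001034", "2019000002007", "2020000003061"]


def parse_record(line, columns, column_specs):
    """Slice one fixed-width line into a {column: int} dict (illustrative helper)."""
    return {
        name: int(line[start:end])
        for name, (start, end) in zip(columns, column_specs)
    }


records = [
    parse_record(line, ddi["columns"], ddi["column_specs"]) for line in raw_lines
]
for record in records:
    print(record)
```

In practice `pd.read_fwf` does this slicing and type conversion for the whole file at once, which is what the README's own example relies on.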
```python
import json

import pandas as pd
from ipumspy import readers, ddi
from src.pyipums.parse_xml import read_ipums_ddi


def read_ipums_micro(ddi, data_file_path, n_max=None):
    # Read the fixed-width data file using the extracted column information.
    df = pd.read_fwf(
        data_file_path,
        dtype=ddi["column_dtypes"],  # pandas' keyword is `dtype`, not `dtypes`
        colspecs=ddi["column_specs"],
        header=None,
        names=ddi["columns"],
        nrows=n_max,  # None reads every row
        compression="gzip",
    )
    return df


def main():
    ddi_file_path = "./usa_00003.xml"
    data_file_path = "./usa_00003.dat.gz"
    # Parse the DDI codebook into a metadata dictionary.
    cps_ddi = read_ipums_ddi(ddi_file_path)
    print(json.dumps(cps_ddi["file_metadata"], indent=2))
    # Load the first 100 records of the extract.
    cps_data = read_ipums_micro(cps_ddi, data_file_path, n_max=100)
    print(cps_data.head())


if __name__ == "__main__":
    main()
```
# Modifying
If you want to make changes to the library, I recommend using [poetry](https://python-poetry.org/docs/) to manage the environment.
```
poetry env use 3.8
pyenv shell 3.8
poetry shell
```
# License
MIT