https://github.com/deepwaterpaladin/statscanpy
Basic package for querying & downloading StatsCan data by table name.
- Host: GitHub
- URL: https://github.com/deepwaterpaladin/statscanpy
- Owner: deepwaterpaladin
- License: apache-2.0
- Created: 2024-07-22T15:54:56.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-21T12:33:17.000Z (almost 2 years ago)
- Last Synced: 2025-08-27T15:54:31.357Z (10 months ago)
- Topics: api, data
- Language: Python
- Homepage: https://pypi.org/project/statscanpy/
- Size: 107 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# StatsCanPy
[QA tests workflow](https://github.com/deepwaterpaladin/statscanpy/actions/workflows/qa-tests.yml)
[Python publish workflow](https://github.com/deepwaterpaladin/statscanpy/actions/workflows/python-publish.yml)
Basic package for querying and downloading [StatsCan](https://www.statcan.gc.ca/en/start) data by table name. Data is saved into a dataframe (`pandas` or `PySpark`).
Datasets can be queried via plain-text search or by table ID.
## Installation
`pip install statscanpy`
## Usage
### Basic Usage
```python
from statscanpy import StatsCanPy
# If isSpark is True, data is returned as a PySpark DataFrame; otherwise it is returned as a pandas.DataFrame.
statscan = StatsCanPy(path="/data/saved/here", isSpark=True)
```
### Getting Table ID from Table Name
```python
statscan.get_table_id_from_name("Railway industry operating statistics by mainline companies")
# Output:
# TOP MATCH:
# Railway industry operating statistics by mainline companies: 23-10-0055-01
# Accessible at: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=2310005501
```
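In the example outputs in this README, the `pid` query parameter is the table ID with its hyphens removed (for example, `23-10-0055-01` becomes `2310005501`). A minimal sketch of that mapping, inferred from those examples rather than from the package itself; `table_url` is a hypothetical helper and is not part of `statscanpy`:

```python
import re

# Base URL copied from the example output in this README.
BASE_URL = "https://www150.statcan.gc.ca/t1/tbl1/en/tv.action"

# Table IDs in the examples have the shape NN-NN-NNNN-NN (assumed, not confirmed by the package).
TABLE_ID_PATTERN = re.compile(r"\d{2}-\d{2}-\d{4}-\d{2}")


def table_url(table_id: str) -> str:
    """Build the table viewer URL for a table ID (hypothetical helper)."""
    if not TABLE_ID_PATTERN.fullmatch(table_id):
        raise ValueError(f"unexpected table ID format: {table_id!r}")
    # The pid is the table ID with the hyphens stripped.
    pid = table_id.replace("-", "")
    return f"{BASE_URL}?pid={pid}"


print(table_url("23-10-0055-01"))
```

The format check is deliberately strict so that a mistyped ID fails early instead of producing a URL that points nowhere.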
### Getting Table Data from Table Name
```python
# Top-level await works in a notebook or the asyncio REPL; in a script, call this from inside a coroutine.
await statscan.get_table_from_name("Household spending, Canada, regions and provinces")
# Output:
# TOP MATCH:
# Household spending, Canada, regions and provinces: 11-10-0222-01
# Accessible at: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1110022201
# DataFrame[REF_DATE: date, GEO: string, ...]
```
### Searching for Table(s) by String
```python
statscan.find_table_id_from_name("GDP", limit=15)
# Output:
# TOP 15 MATCHES:
# 1. Gross domestic product (GDP) at basic prices, by industry, monthly, growth rates: 36-10-0434-02
# Accessible at: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3610043402
# 2. Gross domestic product, expenditure-based, provincial and territorial, annual: 36-10-0222-01
# Accessible at: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3610022201
# ...
```
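When only the table IDs are needed, they can be pulled out of search output text with a regular expression. A minimal sketch, assuming the `NN-NN-NNNN-NN` ID shape seen in the examples above and assuming the matches are available as a string; `extract_table_ids` is a hypothetical helper, not part of `statscanpy`, and the sample text is copied from the output above:

```python
import re

# Matches IDs such as 36-10-0434-02; the hyphen-free pid in the URL does not match.
TABLE_ID_PATTERN = re.compile(r"\b\d{2}-\d{2}-\d{4}-\d{2}\b")

SAMPLE_OUTPUT = """\
1. Gross domestic product (GDP) at basic prices, by industry, monthly, growth rates: 36-10-0434-02
Accessible at: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3610043402
2. Gross domestic product, expenditure-based, provincial and territorial, annual: 36-10-0222-01
Accessible at: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3610022201
"""


def extract_table_ids(text: str) -> list[str]:
    """Return every table ID found in the text, in order of appearance."""
    return TABLE_ID_PATTERN.findall(text)


print(extract_table_ids(SAMPLE_OUTPUT))
```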
## Further Reading
- [StatsCan Data](https://www150.statcan.gc.ca/n1/en/type/data?MM=1)
- [StatsCan API](https://www.statcan.gc.ca/en/developers/wds/user-guide)