Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ryanfobel/utility-bill-scraper
Download energy usage data and estimate CO2 emissions from utility websites or pdf bills.
- Host: GitHub
- URL: https://github.com/ryanfobel/utility-bill-scraper
- Owner: ryanfobel
- License: bsd-3-clause
- Created: 2020-03-28T02:42:02.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2023-01-06T05:00:15.000Z (almost 2 years ago)
- Last Synced: 2024-10-30T20:19:33.241Z (about 2 months ago)
- Topics: carbon-footprint, climate-crisis, web-scraper
- Language: Jupyter Notebook
- Homepage:
- Size: 11.4 MB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 17
Metadata Files:
- Readme: README.ipynb
- License: LICENSE.md
README
{
"cells": [
{
"cell_type": "markdown",
"id": "fe91fb5d",
"metadata": {},
"source": [
"# Utility bill scraper\n",
"\n",
"[![build](https://github.com/ryanfobel/utility-bill-scraper/actions/workflows/build.yml/badge.svg?branch=main)](https://github.com/ryanfobel/utility-bill-scraper/actions/workflows/build.yml)\n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ryanfobel/utility-bill-scraper/main)\n",
"[![PyPI version shields.io](https://img.shields.io/pypi/v/utility-bill-scraper.svg)](https://pypi.python.org/pypi/utility-bill-scraper/)\n",
"\n",
"Download energy usage data and estimate CO2 emissions from utility websites or pdf bills.\n",
"\n",
"## What is this?\n",
"\n",
"The science is clear — global temperatures are rising and we need to drastically reduce our use of fossil fuels if we want to keep our planet habitable for future generations. Many governments around the world are declaring [climate emergencies](https://qz.com/1786781/which-cities-have-declared-climate-emergencies/) and are setting ambitious targets to reduce emissions (e.g., [net zero by 2050](https://www.ipcc.ch/sr15/), [50% reduction by 2030](https://www.npr.org/2021/04/16/987667828/how-the-u-s-could-halve-climate-emissions-by-2030)). While broad systemic changes are clearly required, individual action is also important. For those living in the [Global North](https://en.wikipedia.org/wiki/Global_North_and_Global_South), the majority of fossil-fuel emissions arise from heating/cooling our homes, using electricity, transportation, and the food we eat. It's obvious that we need to rapidly transition off fossil fuels, which will require (1) **clear targets**, (2) **a plan to achieve them**, and (3) **tools for measuring progress**.\n",
"\n",
"There are [many](https://app.projectneutral.org/) [existing](https://coolclimate.berkeley.edu/calculator) [carbon](https://www.nature.org/en-us/get-involved/how-to-help/carbon-footprint-calculator/) [footprint](https://www.carbonfootprint.com/calculator.aspx) [calculators](https://www3.epa.gov/carbon-footprint-calculator/), but they often require manual data entry, leading most people to try them once to get a static snapshot at a point in time. While useful for gaining a high-level understanding of your personal emission sources, it would be much better if this footprint could be automatically updated over time to provide people with **feedback** on the impact of their actions. This project aims to do just that — to assist individuals with collecting data from utility companies (e.g., electricity and natural gas) by automatically downloading their data and converting usage into CO2 emissions.\n",
"\n",
"![monthly_co2_emissions](https://raw.githubusercontent.com/ryanfobel/utility-bill-scraper/main/notebooks/canada/on/images/monthly_co2_emissions_natural_gas.png)\n",
"\n",
"\n",
"\n",
"## Table of contents\n",
"\n",
"- [Supported utilities](#supported-utilities)\n",
"- [Install](#install)\n",
"- [Data storage](#data-storage)\n",
"- [Getting and plotting data using the Python API](#getting-and-plotting-data-using-the-python-api)\n",
" - [Update data](#update-data)\n",
" - [Plot monthly gas consumption](#plot-monthly-gas-consumption)\n",
" - [Convert gas consumption to CO2 emissions](#convert-gas-consumption-to-co2-emissions)\n",
" - [Plot CO2 emissions versus previous years](#plot-co2-emissions-versus-previous-years)\n",
"- [Command line utilities](#command-line-utilities)\n",
" - [Update data](#update-data-1)\n",
" - [Export data](#export-data)\n",
" - [Options](#options)\n",
" - [Environment variables](#environment-variables)\n",
"- [Contributors](#contributors)\n",
"\n",
"\n",
"\n",
"## Supported utilities\n",
"\n",
"The simplest way to get started is to click on one of the following links, which will open a session on https://mybinder.org where you can try downloading some data. **Note: after you click on the link, it will take a couple of minutes to load an interactive Jupyter notebook.** Then follow the instructions (e.g., provide your `username` and `password`) to run the notebook directly from your browser.\n",
"\n",
" * [Kitchener Utilities (gas & water)](https://mybinder.org/v2/gh/ryanfobel/utility-bill-scraper/main?labpath=notebooks%2Fcanada%2Fon%2Fkitchener_utilities.ipynb)\n",
" * [Kitchener-Wilmot Hydro](https://mybinder.org/v2/gh/ryanfobel/utility-bill-scraper/main?labpath=notebooks%2Fcanada%2Fon%2Fkitchener_wilmot_hydro.ipynb)\n",
" \n",
"## Install\n",
"\n",
"```sh\n",
"pip install utility-bill-scraper\n",
"```\n",
"\n",
"## Data storage\n",
"\n",
"All data is stored in a file located at `$DATA_PATH/$UTILITY_NAME/monthly.csv`. The path to this file can be set as input argument when initializing an API object via the `data_path` argument.\n",
"\n",
"```\n",
"└───data\n",
" └───Kitchener Utilities\n",
" └───monthly.csv\n",
" └───statements\n",
" │───2021-10-18 - Kitchener Utilities - $102.30.pdf\n",
" ...\n",
" └───2021-06-15 - Kitchener Utilities - $84.51.pdf\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "d948345a",
"metadata": {},
"source": [
"## Getting and plotting data using the Python API\n",
"\n",
"### Update data\n",
"\n",
"```python\n",
"import utility_bill_scraper.canada.on.kitchener_utilities as ku\n",
"\n",
"api = ku.KitchenerUtilitiesAPI(username='username', password='password')\n",
"\n",
"# Get new statements.\n",
"updates = api.update()\n",
"if updates is not None:\n",
" print(f\"{ len(updates) } statements_downloaded\")\n",
"api.history(\"monthly\").tail()\n",
"```\n",
"![history tail](https://raw.githubusercontent.com/ryanfobel/utility-bill-scraper/main/notebooks/canada/on/images/history_tail.png)"
]
},
{
"cell_type": "markdown",
"id": "53a55126",
"metadata": {},
"source": [
"\n"
]
},
{
"cell_type": "markdown",
"id": "4e325c48",
"metadata": {
"lines_to_next_cell": 0
},
"source": [
"### Plot monthly gas consumption\n",
"\n",
"```python\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = ku_api.history(\"monthly\")\n",
"\n",
"plt.figure()\n",
"plt.bar(df.index, df[\"Gas Consumption\"], width=0.9, alpha=0.5)\n",
"plt.xticks(rotation=90)\n",
"plt.title(\"Monthly Gas Consumption\")\n",
"plt.ylabel(\"m$^3$\")\n",
"```\n",
"\n",
"![monthly gas consumption](https://raw.githubusercontent.com/ryanfobel/utility-bill-scraper/main/notebooks/canada/on/images/monthly_gas_consumption.png)"
]
},
{
"cell_type": "markdown",
"id": "14c83ce4",
"metadata": {},
"source": [
"### Convert gas consumption to CO2 emissions\n",
"\n",
"```python\n",
"from utility_bill_scraper import GAS_KGCO2_PER_CUBIC_METER\n",
"\n",
"df[\"kgCO2\"] = df[\"Gas Consumption\"] * GAS_KGCO2_PER_CUBIC_METER\n",
"```\n",
"\n",
"### Plot Annual CO2 emissions\n",
"\n",
"```python\n",
"from utility_bill_scraper import GAS_KGCO2_PER_CUBIC_METER\n",
"\n",
"df[\"kgCO2\"] = df[\"Gas Consumption\"] * GAS_KGCO2_PER_CUBIC_METER\n",
"df[\"year\"] = [int(x[0:4]) for x in df.index]\n",
"df[\"month\"] = [int(x[5:7]) for x in df.index]\n",
"\n",
"plt.figure()\n",
"df.groupby(\"year\").sum()[\"Gas Consumption\"].plot.bar(width=bin_width, alpha=alpha)\n",
"plt.ylabel(\"m$^3$\")\n",
"ylim = plt.ylim()\n",
"ax = plt.gca()\n",
"ax2 = ax.twinx()\n",
"plt.ylabel(\"tCO$_2$e\")\n",
"plt.ylim([GAS_KGCO2_PER_CUBIC_METER * y / 1e3 for y in ylim])\n",
"plt.title(\"Annual CO$_2$e emissions from natural gas\")\n",
"```\n",
"\n",
"![annual co2_emissions](https://raw.githubusercontent.com/ryanfobel/utility-bill-scraper/main/notebooks/canada/on/images/annual_co2_emissions_natural_gas.png)"
]
},
{
"cell_type": "markdown",
"id": "e920d813",
"metadata": {},
"source": [
"## Command line utilities\n",
"\n",
"Update and export your utility data from the command line.\n",
"\n",
"### Update data\n",
"\n",
"```sh\n",
 ubs --utilty-name">
"> ubs --utility-name \"Kitchener Utilities\" update --user $USER --password $PASSWORD\n",
"```\n",
"\n",
"### Export data\n",
"\n",
"```sh\n",
 ubs --utilty-name">
"> ubs --utility-name \"Kitchener Utilities\" export --output monthly.csv\n",
"```\n",
"\n",
"### Options\n",
"\n",
"```sh\n",
"> ubs --help\n",
"usage: ubs [-h] [-e ENV] [--data-path DATA_PATH] [--utility-name UTILITY_NAME]\n",
" [--google-sa-credentials GOOGLE_SA_CREDENTIALS]\n",
" {update,export} ...\n",
"\n",
"ubs (Utility bill scraper)\n",
"\n",
"optional arguments:\n",
" -h, --help show this help message and exit\n",
" -e ENV, --env ENV path to .env file\n",
" --data-path DATA_PATH\n",
" folder containing the data file and statements\n",
" --utility-name UTILITY_NAME\n",
" name of the utility\n",
" --google-sa-credentials GOOGLE_SA_CREDENTIALS\n",
" google service account credentials\n",
"\n",
"subcommands:\n",
" {update,export} available sub-commands\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "73b9f438",
"metadata": {},
"source": [
"### Environment variables\n",
"\n",
"Note that many options can be set via environment variables (useful for continuous integration and/or working with containers). The following can be set in your shell or via a `.env` file passed using the `-e` option.\n",
"\n",
"```sh\n",
"DATA_PATH=\"folder containing the data file and statements\"\n",
"UTILITY_NAME=\"name of the utility\"\n",
"GOOGLE_SA_CREDENTIALS=\"google service account credentials\"\n",
"USER=\"username\"\n",
"PASSWORD=\"password\"\n",
"SAVE_STATEMENTS=\"save downloaded statements (default=True)\"\n",
"MAX_DOWNLOADS=\"maximum number of statements to download\"\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "e76afd7f",
"metadata": {},
"source": [
"## Contributors\n",
"\n",
"* Ryan Fobel ([@ryanfobel](https://github.com/ryanfobel))"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "tags,-all",
"notebook_metadata_filter": "-all"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}