https://github.com/climateimpactlab/impactlab_api_mockup_2
https://github.com/climateimpactlab/impactlab_api_mockup_2
Last synced: 2 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/climateimpactlab/impactlab_api_mockup_2
- Owner: ClimateImpactLab
- Created: 2017-03-08T23:24:15.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2017-03-09T01:15:09.000Z (about 9 years ago)
- Last Synced: 2025-09-10T03:51:39.462Z (7 months ago)
- Language: Python
- Size: 25.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.ipynb
Awesome Lists containing this project
README
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from __future__ import absolute_import\n",
"from impactlab import impactlab\n",
"from impactlab.mockfs import DataAPI, Variable, SuperIndex\n",
"\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# initialize the mockup of a DataFS api with variable support\n",
"api = DataAPI()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Specifying a simple pipeline job"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The goal is to have the ability to write atomic pipeline components and then map them across our data sets."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Example: multiply `popop` by `tas`"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def compute_mortality(popop, tas):\n",
" '''\n",
" Demonstrates a simple atomic computation\n",
" '''\n",
"\n",
" return popop.value * tas.value\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This can be parameterized using the `impactlab.uses` functions and run using `impactlab.updates`:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"@impactlab.uses(popop=api.get_variable('/GCP/socioeconomics/popop'), tas=api.get_variable('/GCP/climate/tas'))\n",
"@impactlab.iters()\n",
"@impactlab.updates(api.get_variable('/GCP/impacts/mortality'))\n",
"def mortality(popop, tas):\n",
" '''\n",
" Demonstrates a simple computation job\n",
"\n",
" The `impactlab.uses` decorator accepts keyword arguments of the form \n",
" {name: obj}, where name is the name of argument to pass and obj is a mockfs\n",
" Variable object.\n",
"\n",
" The `impactlab.iters` decorator drives a for-loop over all combinations of\n",
" indices for the given variables.\n",
"\n",
" The value returned by the decorated function is used to update the value of\n",
" the variable specified by `impactlab.updates` for the given indices.\n",
" '''\n",
"\n",
" return compute_mortality(popop, tas)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see this in action, simply call `mortality`:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n",
" bumped 0.0.1 --> 0.0.2\n"
]
}
],
"source": [
"mortality()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To run a subset of the jobs, slice the variables in the `@impactlab.uses` calls:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"@impactlab.uses(popop=api.get_variable('/GCP/socioeconomics/popop'))\n",
"@impactlab.uses(tas=api.get_variable('/GCP/climate/tas')[{'rcp': 'rcp85'}])\n",
"@impactlab.iters()\n",
"@impactlab.updates(api.get_variable('/GCP/impacts/mortality'))\n",
"def mortality_rcp85(popop, tas):\n",
" '''\n",
" `impactlab_uses` may be supplied as many times as desired (but must be above\n",
" impactlab.updates or other functional decorators).\n",
" \n",
" Slicing variables is done with a dictionary specifying the index to be sliced\n",
" '''\n",
"\n",
" return compute_mortality(popop, tas)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that this function only iterates over `rcp85`:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" bumped 0.0.2 --> 0.0.3\n",
" bumped 0.0.2 --> 0.0.3\n",
" bumped 0.0.2 --> 0.0.3\n",
" bumped 0.0.2 --> 0.0.3\n",
" bumped 0.0.2 --> 0.0.3\n"
]
}
],
"source": [
"mortality_rcp85()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Running a complex ETL Job"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's say you want to perform a job that loads one class of variables into memory, then iterates over another set while keeping the first set alive."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This can be done using a two-stage job using the `iters` decorator:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"@impactlab.uses(tas=api.get_variable('/GCP/climate/tas'))\n",
"@impactlab.iters()\n",
"def tas2_ir(tas):\n",
" '''\n",
" Demonstrates a two-stage ETL process\n",
" '''\n",
"\n",
" with tas.open('r') as f:\n",
" tas_data = pd.read_csv(f)\n",
"\n",
" @impactlab.uses(tas=tas)\n",
" @impactlab.uses(popop=api.get_variable('/GCP/socioeconomics/popop'))\n",
" @impactlab.iters()\n",
" @impactlab.updates(api.get_variable('/GCP/climate/tas2_ir'))\n",
" def inner(popop, tas):\n",
" '''\n",
" The inner loop's uses() decorator is given tas as an argument. The\n",
" `iters` decorator sees that tas is an archive rather than a variable\n",
" and simply passes the value through rather than attempting to loop over\n",
" it.\n",
" '''\n",
"\n",
" with popop.open('r') as f:\n",
" popop_data = pd.read_csv(f)\n",
"\n",
" return (tas_data**2) * popop_data\n",
"\n",
" inner()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When this is run, note how the climate data is only loaded once per outer loop:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"loading \n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n",
"loading \n",
" bumped 0.0.1 --> 0.0.2\n"
]
}
],
"source": [
"tas2_ir()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"nifty!"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [conda env:datafs]",
"language": "python",
"name": "conda-env-datafs-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 1
}