https://github.com/brookisme/tfr2human

Easy parsing of TensorFlow Records (TFRecords) into dictionaries and NumPy arrays.

### TFR2Human

_easy parsing of TFRecords_

---

#### INSTALL

```bash
git clone https://github.com/brookisme/tfr2human.git
pip install -e tfr2human
```

---

#### PARSER

Usage (see complete [example below](#example)):

```python
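# NOTE: `tfp` is the tfr2human parser module; its import line is not shown in the source.
import numpy as np

TFR_LIST=...       # list of TFRecord file paths
FEATURE_PROPS=...  # dict mapping feature names to tf dtypes (see example below)
BANDS=...          # list of band names stored in the records
SIZE=...           # tile height/width in pixels

parser=tfp.TFRParser(
    TFR_LIST,
    specs=FEATURE_PROPS,
    band_specs=BANDS,
    dims=[SIZE,SIZE])

for i,element in enumerate(parser.dataset):
    ...
    some_image=parser.image(element,bands=SOME_IM_BANDS,dtype=np.uint8)
    some_data=parser.data(element,keys=SOME_KEYS)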
```

---

#### UTILS

Here is a quick rundown of the methods:

* get_batches: breaks datasets into batches.
  - This differs from TF's [batch](https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset#batch) in that it returns batches of datasets to be parsed, rather than parsing a batch at a time.
* image_profile: returns an image (rasterio) profile for a given lon/lat/crs/resolution/np.array.
* gcs_service: returns a Google Cloud Storage client.
* save_to_gcs: saves a generic file to Google Cloud Storage.
* csv/image_to_gcs: saves a csv/image to Google Cloud Storage.
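
To illustrate the distinction drawn for get_batches above, here is a minimal sketch of splitting a collection into batches that are each handed off for parsing later. It is not the library's implementation: `chunk_paths` and the file names are hypothetical, and it assumes the unit being batched is a list of TFRecord file paths.

```python
from itertools import islice


def chunk_paths(paths, batch_size):
    """Yield successive lists of at most `batch_size` items (hypothetical helper)."""
    iterator = iter(paths)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Made-up file names, for illustration only
tfr_paths = [f'tiles-{i:03d}.tfrecord' for i in range(7)]
batches = list(chunk_paths(tfr_paths, batch_size=3))

for index, batch in enumerate(batches):
    # Each batch could be passed to its own parser instead of being parsed here
    print(index, batch)
```

Each batch is still unparsed at this point, which is the difference from a batch of already parsed elements.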

---


#### EXAMPLE

```python
#
# IMPORTS
#
import numpy as np
import tensorflow as tf

#
# CONFIG
#
NOISY=True
NOISE_REDUCER=10
RESOLUTION=20
SIZE=384
MIN_WATER_RATIO=0.005
MAX_WATER_RATIO=0.96
MAX_WATER_NO_DATA_COUNT=int((SIZE**2)*(0.25))
MAX_S1_NAN_COUNT=int((SIZE**2)*(0.01))
MAX_S1_ZERO_COUNT=int((SIZE**2)*(0.1))

WATER_COLUMNS={
    0: 'no_data_count',
    1: 'not_water_count',
    2: 'water_count'
}

#
# TFR Feature Specs
#
WATER_BANDS=['water']
S1_BANDS=['VV','VH','angle','VV_mean','VH_mean']
BANDS=S1_BANDS+WATER_BANDS

FEATURE_PROPS={
    'tile_id': tf.string,
    'crs': tf.string,
    'year': tf.float32,
    'month': tf.float32,
    'lon': tf.float32,
    'lat': tf.float32,
    'x_offset': tf.float32,
    'y_offset': tf.float32,
    'biome_num': tf.float32,
    'biome_name': tf.string,
    'eco_id': tf.float32,
    'eco_name': tf.string,
    'grid': tf.string,
    'grid_index': tf.int64
    # 'nb_s1_images': tf.float32
}
```
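
The three `MAX_*_COUNT` thresholds in the config above are fractions of the tile area, rounded down to whole pixels. As a quick check of that arithmetic for `SIZE=384`, the sketch below recomputes them; `max_count` is a hypothetical helper introduced only for this illustration.

```python
SIZE = 384


def max_count(fraction, size=SIZE):
    """Pixel count for a fraction of a size x size tile, rounded down (hypothetical helper)."""
    return int((size ** 2) * fraction)


thresholds = {
    'MAX_WATER_NO_DATA_COUNT': max_count(0.25),
    'MAX_S1_NAN_COUNT': max_count(0.01),
    'MAX_S1_ZERO_COUNT': max_count(0.1),
}

print('pixels per tile:', SIZE ** 2)
for name, value in thresholds.items():
    print(name, value)
```

With 147456 pixels per tile, the limits come to 36864, 1474, and 14745 pixels respectively.
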
```python
#
# HELPERS
#
def process_water(parser,element):
    # read the water band as an integer class mask
    water=parser.image(element,bands=WATER_BANDS,dtype=np.uint8)
    # count pixels per class value (0: no data, 1: not water, 2: water)
    values,counts=np.unique(water,return_counts=True)
    props={v: c for (v,c) in zip(values,counts)}
    # rename keys to column names, using 0 for classes that do not occur
    props={WATER_COLUMNS[i]: props.get(i,0) for i in range(3)}
    water_ratio=props['water_count']/props['not_water_count']
    props['water_ratio']=water_ratio
    props['valid_water']=((MIN_WATER_RATIO<=water_ratio) and
                          (water_ratio