# streamvis - interactive visualizations of streaming data with Bokeh

# Install

```sh
pip install git+https://github.com/hrbigelow/streamvis.git
```

## Quick Start

```sh
# start the server
# streamvis serve PORT SCHEMA PATH
streamvis serve 5006 data/demo.yaml gs://bucket/path/to/file
streamvis serve 5006 data/demo.yaml s3://bucket/path/to/file
streamvis serve 5006 data/demo.yaml hdfs://bucket/path/to/file
streamvis serve 5006 data/demo.yaml /path/to/file

# run a test data producing demo app
# streamvis demo SCOPE PATH
streamvis demo run24 gs://bucket/path/to/file

# summarize the data in a log file
# streamvis list PATH
streamvis list gs://bucket/path/to/file
```

This starts the web server on localhost:PORT, using the YAML SCHEMA file to configure
how the data in PATH is plotted. PATH may be any locator accepted by
[tf.io.gfile.GFile](https://www.tensorflow.org/api_docs/python/tf/io/gfile/GFile).
Visit localhost:PORT to see the interactive plots, and watch the data progressively
appear as your data-producing application runs.

The non-local (`gs://` etc) forms of PATH enable you to run your data producing
application and the server on different machines, and communicate through the shared
resource at PATH. (To create a GCS bucket for example, see [creating a
project](https://developers.google.com/workspace/guides/create-project) and [enabling
APIs](https://developers.google.com/workspace/guides/enable-apis).)
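For example (hypothetical script name, reusing the bucket path from the commands above), a training job and the viewer can run on separate machines:

```sh
# on the machine running your data-producing job:
python train.py    # the app logs to gs://bucket/path/to/file via DataLogger

# on your local machine:
streamvis serve 5006 data/demo.yaml gs://bucket/path/to/file
# then open http://localhost:5006 in a browser
```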

In your data-producing application, instantiate a single `DataLogger` object and call
its `write` method to log any data you produce. Logging is buffered, so there is no
need to worry about how frequently you call it. The `write` method accepts either
unbatched or batched data; batching is purely a convenience, and the batched forms of
`write` are logically identical to multiple calls of the unbatched form.

```python
from streamvis.logger import DataLogger

# `scope` is a name that will be applied to all data points produced by this process
logger = DataLogger(scope='run24')
logger.init(path='gs://bucket/path/to/file', buffer_max_elem=100)

...
# generate some data and log it.
# This is the 0D (scalar) logging form
for step in range(100):
    logger.write('kldiv', x=step, y=some_kldiv_val)
    logger.write('weight_norm', x=step, y=some_norm_val)

# or, log in batches of points (1D logging)
# accepts Python lists, numpy arrays, and jax/pytorch/tensorflow tensors
for step in range(100, 200, 10):
    logger.write('kldiv', x=list(range(step, step+10)), y=kldiv_val_list)
    logger.write('weight_norm', x=list(range(step, step+10)), y=norm_val_list)

# or, log to a series of plots
for step in range(200, 300, 10):
    # attn[layer, point]: the value at `layer` for a particular step
    logger.write('attn_layer', x=list(range(step, step+10)), y=attn)

# the buffer is flushed automatically every `buffer_max_elem` data points, but
# you may also call this at the end or in an interrupt handler:
logger.flush_buffer()
```
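The claimed equivalence between batched and unbatched writes can be sketched without the library. Here a stand-in recorder (hypothetical, not the streamvis API) shows that one batched call reduces to one scalar call per `(x, y)` pair:

```python
# Sketch of the batched/unbatched equivalence, using a stand-in
# recorder rather than the real DataLogger.
calls = []

def write_scalar(name, x, y):
    calls.append((name, x, y))

def write_batched(name, xs, ys):
    # a batched write is logically identical to one scalar write per point
    for x, y in zip(xs, ys):
        write_scalar(name, x, y)

xs = list(range(100, 110))
ys = [x * 0.5 for x in xs]
write_batched('kldiv', xs, ys)

assert len(calls) == 10
assert calls[0] == ('kldiv', 100, 50.0)
```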

The SCHEMA file is in YAML format. The format is still in development, but here is a
current example with some explanation.

```yaml
kldiv:
  # these two fields are required; both must be valid regexes selecting which
  # groups of data will be included in this plot
  scope_pattern: .*
  group_pattern: kldiv
  # optional: keyword arguments passed as-is to the bokeh.plotting.figure
  # constructor, as listed here:
  # https://docs.bokeh.org/en/latest/docs/reference/plotting/figure.html#figure
  figure_kwargs:
    title: KL Divergence (bits)
    x_axis_label: SGD Steps
    y_axis_label: D[q(x_t|x_