https://github.com/andrie/py-chronicle

Scaffolding to query chronicle data files

chronicle
================

> **Experimental - Work in progress**
>
> The purpose of this experimental package is to expose functionality to
> make it easy to read, filter and manipulate chronicle parquet files.

## Install

The package is not yet available on PyPI.

~~pip install py_chronicle~~

You can install from GitHub:

``` sh
pip install git+https://github.com/andrie/py-chronicle
```

## How Chronicle stores data

Chronicle collects and stores logs and metrics in a series of parquet
files.

Use `read_chronicle()` to read either logs or metrics, by specifying the
path to the parquet set you need.

The file tree looks like this, with `logs` and `metrics` in separate
folders inside `v1`.

``` bash
.
└── v1/
    ├── logs/
    └── metrics/
```

Inside both `logs` and `metrics` the data is stored by date, separated
by year, month and day.

``` bash
.
└── v1/
    ├── logs/
    │   └── 2023/
    │       ├── 02/
    │       │   ├── 01
    │       │   ├── 02
    │       │   ├── 03
    │       │   ├── 04
    │       │   ├── 05
    │       │   └── ...
    │       ├── 03
    │       ├── 04
    │       └── ...
    └── metrics/
        └── 2023/
            ├── 02/
            │   ├── 01
            │   ├── 02
            │   ├── 03
            │   ├── 04
            │   ├── 05
            │   └── ...
            ├── 03
            ├── 04
            └── ...
```
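
The layout above implies that the folder for a single day can be derived from a base directory, a kind (`logs` or `metrics`) and a date. The following is a minimal sketch using only the standard library. `chronicle_day_path` is a hypothetical helper, not part of this package, and it assumes the `v1/<kind>/<year>/<month>/<day>` layout with zero-padded month and day shown in the tree.

``` python
from datetime import date
from pathlib import PurePosixPath


def chronicle_day_path(base: str, kind: str, day: date) -> PurePosixPath:
    """Build the path for one day of data (hypothetical helper).

    Assumes the layout <base>/v1/<kind>/<YYYY>/<MM>/<DD> shown in the tree.
    """
    if kind not in ("logs", "metrics"):
        raise ValueError(f"kind must be 'logs' or 'metrics', got {kind!r}")
    # Zero-pad month and day so that 3 April 2023 becomes 2023/04/03.
    return PurePosixPath(base) / "v1" / kind / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


print(chronicle_day_path("./data", "metrics", date(2023, 4, 3)))
```

Whether the day-level entry is a directory of parquet files or a single file is not stated here, so the sketch stops at building the path.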

## Working with metrics

The examples below scan the metrics for `2023/04/03` under `./data`.

``` python
scan_chronicle_metrics("./data", "2023/04/03").head().collect()
```

shape: (5, 13)

| service | host | os | attributes | name | description | unit | type | timestamp | value_float | value_int | value_uint | value_column |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str | str | str | list[struct[2]] | str | str | str | str | datetime[ms] | f64 | i64 | u64 | str |
| "workbench-metr…" | "rstudio-workbe…" | "linux" | [] | "scrape_samples…" | "The number of …" | "" | "gauge" | 2023-04-03 16:02:20.574 | 69.0 | 0 | 0 | "value_float" |
| "workbench-metr…" | "rstudio-workbe…" | "linux" | [{"version","go1.14.6"}] | "go_info" | "Information ab…" | "" | "gauge" | 2023-04-03 16:02:20.574 | 1.0 | 0 | 0 | "value_float" |
| "workbench-metr…" | "rstudio-workbe…" | "linux" | [] | "go_memstats_mc…" | "Number of byte…" | "" | "gauge" | 2023-04-03 16:02:20.574 | 16384.0 | 0 | 0 | "value_float" |
| "workbench-metr…" | "rstudio-workbe…" | "linux" | [{"host","rstudio-workbench-6b9658c77f-mn8hj"}] | "rstudio_system…" | "Graphite metri…" | "" | "gauge" | 2023-04-03 16:02:20.574 | 0.0 | 0 | 0 | "value_float" |
| "workbench-metr…" | "rstudio-workbe…" | "linux" | [] | "go_memstats_ms…" | "Number of byte…" | "" | "gauge" | 2023-04-03 16:02:20.574 | 65536.0 | 0 | 0 | "value_float" |
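
Each row in the output above carries three value columns (`value_float`, `value_int`, `value_uint`) plus a `value_column` field. One reading, assumed here rather than confirmed by the package, is that `value_column` names the column holding the actual measurement. Under that assumption, a plain-Python sketch for picking the value out of a row could look like this, with the row represented as a dict and `metric_value` being a hypothetical helper:

``` python
def metric_value(row: dict):
    """Return the measurement from the column named by row["value_column"].

    Assumption: value_column names which of value_float / value_int /
    value_uint holds the real value; this is inferred from the sample output.
    """
    column = row["value_column"]
    if column not in ("value_float", "value_int", "value_uint"):
        raise KeyError(f"unexpected value_column: {column!r}")
    return row[column]


# Row modelled on the value fields of the first row in the sample output.
row = {
    "value_float": 69.0,
    "value_int": 0,
    "value_uint": 0,
    "value_column": "value_float",
}
print(metric_value(row))
```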

``` python
scan_chronicle_metrics("./data", "2023/04/03").metrics.describe()
```

|     | service | name | description | value_column |
|---|---|---|---|---|
| 0 |  | system.cpu.time | Total CPU seconds broken down by different sta... | value_float |
| 1 |  | system.memory.usage | Bytes of memory in use. | value_int |
| 2 | connect-metrics | go_goroutines | Number of goroutines that currently exist. | value_float |
| 3 | connect-metrics | go_info | Information about the Go environment. | value_float |
| 4 | connect-metrics | go_memstats_alloc_bytes | Number of bytes allocated and still in use. | value_float |
| ... | ... | ... | ... | ... |
| 176 | workbench-metrics | scrape_series_added | The approximate number of new series in this s... | value_float |
| 177 | workbench-metrics | statsd_metric_mapper_cache_gets_total | The count of total metric cache gets. | value_float |
| 178 | workbench-metrics | statsd_metric_mapper_cache_hits_total | The count of total metric cache hits. | value_float |
| 179 | workbench-metrics | statsd_metric_mapper_cache_length | The count of unique metrics currently cached. | value_float |
| 180 | workbench-metrics | up | The scraping was successful | value_float |

181 rows × 4 columns


``` python
scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used", "memory").head()
```

| host | timestamp | rsconnect_system_memory_used |
|---|---|---|



## Plotting metrics

``` python
from chronicle.plot import *
```

``` python
scan_chronicle_metrics("./data", "2023/04/03").metrics.plot("rsconnect_system_memory_used", alias = "memory")
```

The call produces an interactive Plotly figure (`application/vnd.plotly.v1+json`), which cannot be displayed in this static README.

## Working with logs

The examples below scan the logs for `2023/04/03` under `./data`.

``` python
scan_chronicle_logs("./data", "2023/04/03").head().collect()
```

shape: (5, 6)

| service | host | os | attributes | body | timestamp |
|---|---|---|---|---|---|
| str | str | str | list[struct[2]] | str | datetime[ms] |
| "workbench" | "rstudio-workbe…" | "linux" | [{"data","120"}, {"pid","2.36E+02"}, … {"type","session_suspend"}] | "{"pid":236,"us…" | 2023-04-03 18:01:26.665 |
| "workbench" | "rstudio-workbe…" | "linux" | [{"data",""}, {"pid","2.36E+02"}, … {"type","session_exit"}] | "{"pid":236,"us…" | 2023-04-03 18:01:26.761 |
| "connect" | "rstudio-connec…" | "linux" | [{"user_role","publisher"}, {"user_guid","085ba4be-01b5-478b-877c-321368924c89"}, … {"type","audit"}] | "{"action":"add…" | 2023-04-03 19:30:35.698 |
| "connect" | "rstudio-connec…" | "linux" | [{"log.file.name","audit.json"}, {"actor_description","Auth Provider"}, … {"entry_id","3.032E+03"}] | "{"action":"add…" | 2023-04-03 19:30:35.698 |
| "connect" | "rstudio-connec…" | "linux" | [{"action","add_group_member"}, {"actor_id","0E+00"}, … {"log.file.name","audit.json"}] | "{"action":"add…" | 2023-04-03 19:30:35.698 |
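
The `attributes` column in the output above is a list of key/value pairs, and numeric values such as `pid` appear as strings in scientific notation (`"2.36E+02"`). A small sketch of turning such pairs into a dictionary in plain Python follows. `attributes_to_dict` is a hypothetical helper, the pairs are modelled as 2-tuples rather than polars structs, and only the pairs visible in the first row are used (the elided ones are left out).

``` python
def attributes_to_dict(pairs):
    """Convert [(key, value), ...] into a dict (hypothetical helper)."""
    return {key: value for key, value in pairs}


# Pairs visible in the first row of the sample output; elided pairs are omitted.
attrs = attributes_to_dict([
    ("data", "120"),
    ("pid", "2.36E+02"),
    ("type", "session_suspend"),
])

# Numeric attributes are stored as strings, so convert explicitly when needed.
pid = int(float(attrs["pid"]))
print(attrs["type"], pid)
```

Going through `float` first is needed because `int("2.36E+02")` raises a `ValueError` in Python.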

``` python
scan_chronicle_logs("./data", "2023/04/03").logs.filter_type("username")
```
