https://github.com/andrie/py-chronicle
Scaffolding to query chronicle data files
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/andrie/py-chronicle
- Owner: andrie
- License: mit
- Created: 2023-04-25T11:17:06.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-06-22T17:16:34.000Z (almost 2 years ago)
- Last Synced: 2025-02-14T13:16:55.501Z (3 months ago)
- Language: Jupyter Notebook
- Homepage: https://andrie.github.io/py-chronicle/
- Size: 75.5 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
chronicle
================

> **Experimental - Work in progress**
>
> The purpose of this experimental package is to expose functionality to
> make it easy to read, filter and manipulate chronicle parquet files.

## Install
The package is not yet available on PyPI.

~~pip install py_chronicle~~

You can install it from GitHub:

``` sh
pip install git+https://github.com/andrie/py-chronicle
```

## How Chronicle stores data

Chronicle collects and stores logs and metrics in a series of parquet
files.

Use `read_chronicle()` to read either logs or metrics, by specifying the
path to the parquet set you need.

The file tree looks like this, with `logs` and `metrics` in separate
folders inside `v1`.

``` bash
.
└── v1/
    ├── logs/
    └── metrics/
```

Inside both `logs` and `metrics` the data is stored by date, separated
by year, month and day.

``` bash
.
└── v1/
    ├── logs/
    │   └── 2023/
    │       ├── 02/
    │       │   ├── 01
    │       │   ├── 02
    │       │   ├── 03
    │       │   ├── 04
    │       │   ├── 05
    │       │   └── ...
    │       ├── 03
    │       ├── 04
    │       └── ...
    └── metrics/
        └── 2023/
            ├── 02/
            │   ├── 01
            │   ├── 02
            │   ├── 03
            │   ├── 04
            │   ├── 05
            │   └── ...
            ├── 03
            ├── 04
            └── ...
```

## Working with metrics
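
The date string passed to the functions in this section, such as `"2023/04/03"`, appears to mirror the year/month/day folder layout shown in the tree above. As a minimal sketch of that mapping, the helper below builds the folder path for one day. `partition_path` is a hypothetical name used only for illustration and is not part of this package; the sketch assumes the `v1/<kind>/<year>/<month>/<day>` layout and does not touch the file system.

``` python
from datetime import date
from pathlib import Path


def partition_path(root: str, kind: str, day: date) -> Path:
    """Return the folder expected to hold one day's parquet files.

    Hypothetical helper, not part of py-chronicle. It assumes the
    v1/<kind>/<year>/<month>/<day> layout shown in the tree above.
    """
    if kind not in ("logs", "metrics"):
        raise ValueError(f"kind must be 'logs' or 'metrics', got {kind!r}")
    # Zero-padded month and day match folder names such as 02/ and 01.
    return Path(root) / "v1" / kind / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


print(partition_path("./data", "metrics", date(2023, 4, 3)))
```

Keeping the path logic in one small function makes it easy to adjust if the storage layout changes in a later version.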
Some examples.
``` python
scan_chronicle_metrics("./data", "2023/04/03").head().collect()
```

shape: (5, 13)

| service (str) | host (str) | os (str) | attributes (list[struct[2]]) | name (str) | description (str) | unit (str) | type (str) | timestamp (datetime[ms]) | value_float (f64) | value_int (i64) | value_uint (u64) | value_column (str) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| "workbench-metr… | "rstudio-workbe… | "linux" | [] | "scrape_samples… | "The number of … | "" | "gauge" | 2023-04-03 16:02:20.574 | 69.0 | 0 | 0 | "value_float" |
| "workbench-metr… | "rstudio-workbe… | "linux" | [{"version","go1.14.6"}] | "go_info" | "Information ab… | "" | "gauge" | 2023-04-03 16:02:20.574 | 1.0 | 0 | 0 | "value_float" |
| "workbench-metr… | "rstudio-workbe… | "linux" | [] | "go_memstats_mc… | "Number of byte… | "" | "gauge" | 2023-04-03 16:02:20.574 | 16384.0 | 0 | 0 | "value_float" |
| "workbench-metr… | "rstudio-workbe… | "linux" | [{"host","rstudio-workbench-6b9658c77f-mn8hj"}] | "rstudio_system… | "Graphite metri… | "" | "gauge" | 2023-04-03 16:02:20.574 | 0.0 | 0 | 0 | "value_float" |
| "workbench-metr… | "rstudio-workbe… | "linux" | [] | "go_memstats_ms… | "Number of byte… | "" | "gauge" | 2023-04-03 16:02:20.574 | 65536.0 | 0 | 0 | "value_float" |

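
Each metric row carries three value columns (`value_float`, `value_int`, `value_uint`) plus a `value_column` string. Judging from the output above, `value_column` seems to name the column that holds the actual reading; that interpretation is an assumption, not something this README states. A minimal sketch in plain Python, using a hand-written dictionary in place of a real row and a hypothetical helper `metric_value`:

``` python
VALUE_COLUMNS = ("value_float", "value_int", "value_uint")


def metric_value(row: dict):
    """Return the reading from the column that `value_column` names.

    Hypothetical helper, not part of py-chronicle. Assumes that
    `value_column` always names one of the three value columns.
    """
    column = row["value_column"]
    if column not in VALUE_COLUMNS:
        raise ValueError(f"unexpected value_column: {column!r}")
    return row[column]


# Hand-written stand-in for the `go_info` row shown in the output above.
row = {
    "name": "go_info",
    "value_float": 1.0,
    "value_int": 0,
    "value_uint": 0,
    "value_column": "value_float",
}
print(metric_value(row))
```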
``` python
scan_chronicle_metrics("./data", "2023/04/03").metrics.describe()
```

|     | service | name | description | value_column |
|---|---|---|---|---|
| 0 |  | system.cpu.time | Total CPU seconds broken down by different sta... | value_float |
| 1 |  | system.memory.usage | Bytes of memory in use. | value_int |
| 2 | connect-metrics | go_goroutines | Number of goroutines that currently exist. | value_float |
| 3 | connect-metrics | go_info | Information about the Go environment. | value_float |
| 4 | connect-metrics | go_memstats_alloc_bytes | Number of bytes allocated and still in use. | value_float |
| ... | ... | ... | ... | ... |
| 176 | workbench-metrics | scrape_series_added | The approximate number of new series in this s... | value_float |
| 177 | workbench-metrics | statsd_metric_mapper_cache_gets_total | The count of total metric cache gets. | value_float |
| 178 | workbench-metrics | statsd_metric_mapper_cache_hits_total | The count of total metric cache hits. | value_float |
| 179 | workbench-metrics | statsd_metric_mapper_cache_length | The count of unique metrics currently cached. | value_float |
| 180 | workbench-metrics | up | The scraping was successful | value_float |

181 rows × 4 columns

``` python
scan_chronicle_metrics("./data", "2023/04/03").metrics.filter("rsconnect_system_memory_used", "memory").head()
```

| host | timestamp | rsconnect_system_memory_used |
|---|---|---|

## Plotting metrics
``` python
from chronicle.plot import *
```

``` python
scan_chronicle_metrics("./data", "2023/04/03").metrics.plot("rsconnect_system_memory_used", alias = "memory")
```

The output is an interactive Plotly figure, which cannot be displayed here.

## Working with logs
Some examples.
``` python
scan_chronicle_logs("./data", "2023/04/03").head().collect()
```

shape: (5, 6)

| service (str) | host (str) | os (str) | attributes (list[struct[2]]) | body (str) | timestamp (datetime[ms]) |
|---|---|---|---|---|---|
| "workbench" | "rstudio-workbe… | "linux" | [{"data","120"}, {"pid","2.36E+02"}, … {"type","session_suspend"}] | "{"pid":236,"us… | 2023-04-03 18:01:26.665 |
| "workbench" | "rstudio-workbe… | "linux" | [{"data",""}, {"pid","2.36E+02"}, … {"type","session_exit"}] | "{"pid":236,"us… | 2023-04-03 18:01:26.761 |
| "connect" | "rstudio-connec… | "linux" | [{"user_role","publisher"}, {"user_guid","085ba4be-01b5-478b-877c-321368924c89"}, … {"type","audit"}] | "{"action":"add… | 2023-04-03 19:30:35.698 |
| "connect" | "rstudio-connec… | "linux" | [{"log.file.name","audit.json"}, {"actor_description","Auth Provider"}, … {"entry_id","3.032E+03"}] | "{"action":"add… | 2023-04-03 19:30:35.698 |
| "connect" | "rstudio-connec… | "linux" | [{"action","add_group_member"}, {"actor_id","0E+00"}, … {"log.file.name","audit.json"}] | "{"action":"add… | 2023-04-03 19:30:35.698 |

``` python
scan_chronicle_logs("./data", "2023/04/03").logs.filter_type("username")
```