https://github.com/jaydeluca/repo-metrics
Script that pulls data related to a github repo and visualizes it as timeseries
- Host: GitHub
- URL: https://github.com/jaydeluca/repo-metrics
- Owner: jaydeluca
- Created: 2023-07-16T21:33:03.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-22T10:41:56.000Z (3 months ago)
- Last Synced: 2024-11-05T11:30:34.363Z (2 months ago)
- Language: Python
- Homepage:
- Size: 931 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
# Repo Metrics
Random collection of tools to pull and visualize various data about a repository as timeseries metrics.
Functionality:
* Timeseries file count tracking: for tracking migration projects from one language to another.
* Pull benchmark data and visualize as timeseries

## Setup
A GitHub token is not required, but it is recommended: you will get rate limited if you make too many unauthenticated calls.
```
export GITHUB_TOKEN="insert-your-token"
make install
```

## Test / linting
```
# Make sure you install pytest and ruff
make install

# Tests
make test

# Linting
make lint
```

## File Count Tracking
### Arguments
| Argument | Command | Description | Example |
|------------|-----------------|----------------------------------------------------------------------------|------------------------------------------------------------|
| Repository | -r, --repo | Repository name. | --repo "open-telemetry/opentelemetry-java-instrumentation" |
| Start Date | -s, --start | Starting Date in format %Y-%m-%d (will calculate from this date until now) | --start "2022-11-15" |
| Interval | -i, --interval | Interval (in days) between data points | --interval 14 |
| Languages | -l, --languages | Languages to count (in comma separated list) | --languages "java,groovy" |
| Output | -o, --output | (Optional) Location where file should be saved | --output "./media/example.png" |

### Example Usage:
In the `open-telemetry/opentelemetry-java-instrumentation` repository, track the conversion of tests from Groovy to Java
in the `instrumentation` directory, starting from 2022-11-15 with a data point every 2 weeks.

`python main.py -r "open-telemetry/opentelemetry-java-instrumentation" -l "groovy,java" -s "2022-11-15" -i 14`
Output:
![Example](./media/example_output.png)
The same command, tracking only Groovy:
`python main.py -r "open-telemetry/opentelemetry-java-instrumentation" -l "groovy" -s "2022-11-15" -i 14`
Output:
![Example](./media/example_output2.png)
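
As a rough illustration of the arguments and the counting step above, here is a minimal sketch. It is not the code in `main.py`: the function names `parse_args` and `count_by_language` are hypothetical, which flags are required is an assumption, and it assumes a language name maps directly to a file extension (`java` to `.java`, `groovy` to `.groovy`).

```python
import argparse
from collections import Counter
from pathlib import PurePosixPath


def parse_args(argv=None):
    """Parse the flags from the Arguments table (required/optional is assumed)."""
    parser = argparse.ArgumentParser(description="Track file counts over time")
    parser.add_argument("-r", "--repo", required=True, help="repository name")
    parser.add_argument("-s", "--start", required=True, help="start date, e.g. 2022-11-15")
    parser.add_argument("-i", "--interval", type=int, required=True,
                        help="days between data points")
    parser.add_argument("-l", "--languages", required=True,
                        help="comma separated list, e.g. java,groovy")
    parser.add_argument("-o", "--output", default=None,
                        help="optional location for the generated image")
    return parser.parse_args(argv)


def count_by_language(paths, languages):
    """Count paths per language, treating the language name as the extension."""
    wanted = {lang.strip().lower() for lang in languages}
    counts = Counter({lang: 0 for lang in wanted})
    for path in paths:
        ext = PurePosixPath(path).suffix.lstrip(".").lower()
        if ext in wanted:
            counts[ext] += 1
    return dict(counts)


if __name__ == "__main__":
    args = parse_args()
    # Hypothetical file listing for a single snapshot of the repository.
    snapshot = ["instrumentation/a/FooTest.java", "instrumentation/a/BarTest.groovy"]
    print(count_by_language(snapshot, args.languages.split(",")))
```

Running the counting step once per snapshot would yield one data point per language for each date on the chart.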
## Benchmark Visualization
This is very specific to the `open-telemetry/opentelemetry-java-instrumentation` repo.
### Arguments
| Argument | Command | Description | Example |
|------------|-----------------|----------------------------------------------------------------------------|------------------------------------------------------------|
| Repository | -r, --repo | Repository name. | --repo "open-telemetry/opentelemetry-java-instrumentation" |
| Start Date | -s, --start | Starting Date in format %Y-%m-%d (will calculate from this date until now) | --start "2022-11-15" |
| Interval | -i, --interval | Interval (in days) between data points | --interval 14 |

### Example Usage:
Chart min and max heap starting from 2022-02-14 with a data point every 30 days.
`python benchmark.py -r "open-telemetry/opentelemetry-java-instrumentation" -s "2022-02-14" -i 30`
Output:
![Example](./media/benchmark_output.png)
## Detail Count by Instrumentation module
### Arguments
| Argument | Command | Description | Example |
|------------|----------------|------------------------------------------------|------------------------------------------------------------|
| Repository | -r, --repo | Repository name. | --repo "open-telemetry/opentelemetry-java-instrumentation" |
| Language | -l, --language | Language to get count for | --language "groovy" |
| Output | -o, --output | (Optional) Location where file should be saved | --output "./media/example.png" |

### Example Usage:
In the `open-telemetry/opentelemetry-java-instrumentation` repository, analyze the files in test directories under the
`instrumentation` directory and output counts by module.

`python count_by_instrumentation.py -r "open-telemetry/opentelemetry-java-instrumentation" -l "groovy"`
Output:
| Key | File Count | Total File Size |
|:------------------|-----------:|----------------:|
| servlet | 23 | 143182 |
| aws-sdk | 17 | 127827 |
| jdbc | 8 | 90890 |
| elasticsearch | 15 | 90341 |
| jaxrs | 37 | 72352 |
| vertx | 14 | 56845 |
| ratpack | 16 | 51932 |
| mongo | 10 | 51661 |
| play | 15 | 48986 |
| restlet | 22 | 38226 |
| jaxws | 17 | 21595 |
| kafka | 3 | 21505 |
| twilio-6.6 | 1 | 18936 |
| jaxrs-client | 5 | 16067 |
| spymemcached-2.12 | 1 | 15630 |
| undertow-1.4 | 3 | 12754 |
| hibernate | 1 | 12167 |
| dropwizard | 2 | 10789 |
| rediscala-1.8 | 1 | 3898 |
| grails-3.0 | 3 | 3201 |
| internal | 2 | 2603 |
| Total | 216 | 911387 |

![Example](./media/example_pie_output.png)
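
A table shaped like the one above could be produced by grouping files by module and summing counts and sizes. The sketch below is only an illustration, not the logic in `count_by_instrumentation.py`: the function name `summarize_by_module` is hypothetical, and it assumes the module key is the directory directly under `instrumentation`, with rows sorted by total size in descending order as in the sample output.

```python
from collections import defaultdict


def summarize_by_module(files):
    """Group (path, size) pairs by module and return rows plus a Total row."""
    totals = defaultdict(lambda: [0, 0])  # module -> [file count, total size]
    for path, size in files:
        parts = path.split("/")
        # Assumption: paths look like instrumentation/<module>/.../<file>.
        if len(parts) < 3 or parts[0] != "instrumentation":
            continue
        entry = totals[parts[1]]
        entry[0] += 1
        entry[1] += size
    rows = sorted(
        ((key, count, size) for key, (count, size) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )
    rows.append(("Total", sum(r[1] for r in rows), sum(r[2] for r in rows)))
    return rows


if __name__ == "__main__":
    # Hypothetical input; sizes use whatever unit the file listing reports.
    sample = [
        ("instrumentation/servlet/a/FooTest.groovy", 100),
        ("instrumentation/servlet/b/BarTest.groovy", 50),
        ("instrumentation/jdbc/x/BazTest.groovy", 200),
    ]
    for key, count, size in summarize_by_module(sample):
        print(f"| {key} | {count} | {size} |")
```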
## Approach
- Query GitHub for point-in-time snapshots, using commits near each date across the timeframe
- Get one data point every `interval` (for example, every 14 days)
- Filter each snapshot by the chosen criteria (for example, language)
- Cache this data locally to avoid repeated API calls
- Generate a graph showing the results over the timeframe
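
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: the helper names `snapshot_dates`, `commit_request`, and `cached` are hypothetical, the JSON cache file layout is invented for the example, and the request uses the `until` and `per_page` parameters of GitHub's "list commits" REST endpoint to find the latest commit at or before a date. No network call is made here; the sketch only builds the request.

```python
import json
import os
from datetime import datetime, timedelta, timezone
from pathlib import Path


def snapshot_dates(start, interval_days, now=None):
    """Return UTC datetimes from start until now, one every interval_days."""
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    now = now or datetime.now(timezone.utc)
    current = datetime.strptime(start, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    step = timedelta(days=interval_days)
    dates = []
    while current <= now:
        dates.append(current)
        current += step
    return dates


def commit_request(repo, when):
    """Build URL, params and headers to look up the last commit before `when`."""
    headers = {"Accept": "application/vnd.github+json"}
    token = os.environ.get("GITHUB_TOKEN")  # optional, raises the rate limit
    if token:
        headers["Authorization"] = f"Bearer {token}"
    url = f"https://api.github.com/repos/{repo}/commits"
    params = {"until": when.strftime("%Y-%m-%dT%H:%M:%SZ"), "per_page": 1}
    return url, params, headers


def cached(cache_file, key, compute):
    """Return the cached value for key, computing and saving it on a miss."""
    path = Path(cache_file)
    data = json.loads(path.read_text()) if path.exists() else {}
    if key not in data:
        data[key] = compute()
        path.write_text(json.dumps(data))
    return data[key]


if __name__ == "__main__":
    for when in snapshot_dates("2022-11-15", 14):
        url, params, _ = commit_request(
            "open-telemetry/opentelemetry-java-instrumentation", when
        )
        print(url, params)
```

Keying the cache by repository and date would mean a rerun with the same start date and interval only hits the API for dates added since the last run.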