https://github.com/fliskdata/analyze-tracking
Automatically document your analytics setup by analyzing tracking code and generating data schemas 🚀
amplitude analytics googleanalytics heap javascript mixpanel mparticle pendo posthog rudderstack segment snowplow typescript
- Host: GitHub
- URL: https://github.com/fliskdata/analyze-tracking
- Owner: fliskdata
- License: mit
- Created: 2024-08-17T21:00:26.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-06T14:35:37.000Z (over 1 year ago)
- Last Synced: 2025-03-27T21:48:43.032Z (11 months ago)
- Topics: amplitude, analytics, googleanalytics, heap, javascript, mixpanel, mparticle, pendo, posthog, rudderstack, segment, snowplow, typescript
- Language: JavaScript
- Homepage: https://www.flisk.ai
- Size: 91.8 KB
- Stars: 22
- Watchers: 1
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# @flisk/analyze-tracking
Automatically document your analytics setup by analyzing tracking code and generating data schemas from tools like Segment, Amplitude, Mixpanel, and more 🚀
[npm package](https://www.npmjs.com/package/@flisk/analyze-tracking) | [Tests](https://github.com/fliskdata/analyze-tracking/actions/workflows/tests.yml)
## Why Use @flisk/analyze-tracking?
📊 **Understand Your Tracking** – Effortlessly analyze your codebase for `track` calls so you can see all your analytics events, properties, and triggers in one place. No more guessing what's being tracked!
🔍 **Auto-Document Events** – Generates a complete YAML schema that captures all events and properties, including where they're implemented in your codebase.
🕵️‍♂️ **Track Changes Over Time** – Easily spot unintended changes or ensure your analytics setup remains consistent across updates.
📚 **Populate Data Catalogs** – Automatically generate structured documentation that can help feed into your data catalog, making it easier for everyone to understand your events.
## Quick Start
Run without installation! Just use:
```sh
npx @flisk/analyze-tracking /path/to/project [options]
```
### Key Options
- `-g, --generateDescription`: Generate descriptions of fields (default: `false`)
- `-p, --provider <provider>`: Specify a provider (options: `openai`, `gemini`)
- `-m, --model <model>`: Specify a model (ex: `gpt-4.1-nano`, `gpt-4o-mini`, `gemini-2.0-flash-lite-001`)
- `-o, --output <file>`: Name of the output file (default: `tracking-schema.yaml`)
- `-c, --customFunction <signature>`: Specify the signature of your custom tracking function (see [instructions here](#custom-functions))
- `--format <format>`: Output format, either `yaml` (default) or `json`. If an invalid value is provided, the CLI will exit with an error.
- `--stdout`: Print the output to the terminal instead of writing to a file (works with both YAML and JSON)
🔑 **Important:** If you are using `generateDescription`, you must set the appropriate credentials for the LLM provider you are using as an environment variable. OpenAI uses `OPENAI_API_KEY` and Google Vertex AI uses `GOOGLE_APPLICATION_CREDENTIALS`.
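Before running with `--generateDescription`, it can help to confirm that the credential is actually set. The following is a minimal sketch and not part of the CLI: `requiredEnvVar` and `missingCredential` are hypothetical helpers, and mapping the `gemini` provider to `GOOGLE_APPLICATION_CREDENTIALS` is an assumption based on the note above.

```javascript
// Hypothetical preflight check; not part of @flisk/analyze-tracking.
// Assumption: the `gemini` provider authenticates through Google Vertex AI.
const ENV_VAR_BY_PROVIDER = new Map([
  ['openai', 'OPENAI_API_KEY'],
  ['gemini', 'GOOGLE_APPLICATION_CREDENTIALS'],
]);

// Returns the environment variable a provider needs; throws for unknown providers.
function requiredEnvVar(provider) {
  const name = ENV_VAR_BY_PROVIDER.get(provider);
  if (!name) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return name;
}

// Returns the name of the missing variable, or null when the credential is set.
function missingCredential(provider, env = process.env) {
  const name = requiredEnvVar(provider);
  return env[name] ? null : name;
}

const missing = missingCredential('openai');
if (missing) {
  console.warn(`Set ${missing} before using --generateDescription.`);
}
```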
### Custom Functions
If you have your own in-house tracker or a wrapper function that calls other tracking libraries, you can specify the function signature with the `-c` or `--customFunction` option.
#### Standard Custom Function Format
Your function signature should be in the following format:
```js
yourCustomTrackFunctionName(EVENT_NAME, PROPERTIES, customFieldOne, customFieldTwo)
```
- `EVENT_NAME` is the name of the event you are tracking. It should be a string or a pointer to a string. This is required.
- `PROPERTIES` is an object of properties for that event. It should be an object / dictionary. This is optional.
- Any additional parameters are other fields you are tracking. They can be of any type. The names you provide for these parameters will be used as the property names in the output.
For example, if your function has a userId parameter at the beginning, followed by the event name and properties, you would pass in the following:
```js
yourCustomTrackFunctionName(userId, EVENT_NAME, PROPERTIES)
```
If your function follows the standard format `yourCustomTrackFunctionName(EVENT_NAME, PROPERTIES)`, you can simply pass in `yourCustomTrackFunctionName` to `--customFunction` as a shorthand.
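To make the signature format concrete, here is a sketch of a hypothetical in-house wrapper. The names `trackEvent`, `destinations`, and `sent` are illustrative assumptions and do not come from the package.

```javascript
// Hypothetical in-house wrapper; all names here are illustrative only.
// Each destination is a function that receives the normalized event.
const destinations = [];

function trackEvent(userId, eventName, properties = {}) {
  if (typeof eventName !== 'string' || eventName.length === 0) {
    throw new TypeError('eventName must be a non-empty string');
  }
  const event = { userId, eventName, properties };
  for (const send of destinations) {
    send(event);
  }
  return event;
}

// Example destination that records events in memory.
const sent = [];
destinations.push((event) => sent.push(event));

trackEvent('user-123', 'Checkout Started', { cartValue: 100 });
```

For a wrapper shaped like this one, the signature to pass would be `--customFunction "trackEvent(userId, EVENT_NAME, PROPERTIES)"`, because the event name is the second argument rather than the first.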
#### Method-Name-as-Event Format
For tracking patterns where the method name itself is the event name (e.g., `yourClass.yourEventName({...})`), use the special `EVENT_NAME` placeholder in the method position:
```js
yourClass.EVENT_NAME(PROPERTIES)
```
This pattern tells the analyzer that:
- `yourClass` is the object name to match
- The method name after the dot (e.g., `viewItemList`, `addToCart`) is the event name
- `PROPERTIES` is the properties object (defaults to the first argument if not specified)
**Example:**
```typescript
// Code in your project:
yourClass.viewItemList({ items: [...] });
yourClass.addToCart({ item: {...}, value: 100 });
yourClass.purchase({ userId: '123', value: 100 });
// Command (run in a shell):
// npx @flisk/analyze-tracking /path/to/project --customFunction "yourClass.EVENT_NAME(PROPERTIES)"
```
This will detect:
- Event: `viewItemList` with properties from the first argument
- Event: `addToCart` with properties from the first argument
- Event: `purchase` with properties from the first argument
_**Note:** This pattern is currently only supported for JavaScript and TypeScript code._
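As a rough illustration of what this pattern selects, the sketch below pulls method names off a known object using a regular expression. It is a simplification for explanation only and makes no claim about how the analyzer itself is implemented; `findMethodEvents` is a hypothetical helper.

```javascript
// Simplified illustration: find `objectName.method(` call sites in source text.
// A regular expression is enough for this demo, but it would misread many
// real-world cases (comments, strings, computed member access).
function findMethodEvents(source, objectName) {
  // Escape regex metacharacters in the object name before building the pattern.
  const escaped = objectName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(`\\b${escaped}\\.([A-Za-z_$][\\w$]*)\\s*\\(`, 'g');
  const events = [];
  for (const match of source.matchAll(pattern)) {
    events.push(match[1]); // the method name is treated as the event name
  }
  return events;
}

const sample = `
yourClass.viewItemList({ items: [] });
yourClass.addToCart({ item: {}, value: 100 });
otherClass.ignored({});
yourClass.purchase({ userId: '123', value: 100 });
`;

console.log(findMethodEvents(sample, 'yourClass'));
```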
#### Multiple Custom Functions
You can also pass in multiple custom function signatures by passing in the `--customFunction` option multiple times or by passing in a space-separated list of function signatures.
```sh
npx @flisk/analyze-tracking /path/to/project --customFunction "yourFunc1" --customFunction "yourFunc2(userId, EVENT_NAME, PROPERTIES)"
npx @flisk/analyze-tracking /path/to/project -c "yourFunc1" "yourFunc2(userId, EVENT_NAME, PROPERTIES)"
npx @flisk/analyze-tracking /path/to/project -c "yourClass.EVENT_NAME(PROPERTIES)" "customTrack(EVENT_NAME, PROPERTIES)"
```
## What's Generated?
A clear YAML schema that shows where your events are tracked, their properties, and more.
Here's an example:
```yaml
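version: 1
source:
  repository: <repository_url>
  commit: <commit_sha>
  timestamp: <commit_timestamp>
events:
  <event_name>:
    description: <event_description>
    implementations:
      - description: <implementation_description>
        path: <path_to_file>
        line: <line_number>
        function: <function_name>
        destination: <platform_name>
    properties:
      <property_name>:
        description: <property_description>
        type: <property_type>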
```
Use this to understand where your events live in the code and how they're being tracked.
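Because `--format json` emits the same structure as JSON, the output is straightforward to post-process. The sketch below groups events by source file. The sample object is invented for illustration and only uses fields shown in the schema above; `eventsByFile` is a hypothetical helper.

```javascript
// Invented sample that mirrors the documented schema shape.
const schema = {
  version: 1,
  events: {
    'Checkout Started': {
      implementations: [
        { path: 'src/cart.js', line: 42, function: 'startCheckout', destination: 'segment' },
      ],
      properties: { cartValue: { type: 'number' } },
    },
    'Item Viewed': {
      implementations: [
        { path: 'src/cart.js', line: 10, function: 'viewItem', destination: 'segment' },
        { path: 'src/search.js', line: 7, function: 'openResult', destination: 'mixpanel' },
      ],
      properties: {},
    },
  },
};

// Builds a Map of file path -> list of "event @ line N" strings.
function eventsByFile(doc) {
  const byFile = new Map();
  for (const [eventName, event] of Object.entries(doc.events ?? {})) {
    for (const impl of event.implementations ?? []) {
      const entries = byFile.get(impl.path) ?? [];
      entries.push(`${eventName} @ line ${impl.line}`);
      byFile.set(impl.path, entries);
    }
  }
  return byFile;
}

for (const [file, entries] of eventsByFile(schema)) {
  console.log(file, entries);
}
```

In practice the `schema` object would come from parsing the generated file, for example with `JSON.parse`, rather than from an inline literal.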
Your LLM of choice is used for generating descriptions of events, properties, and implementations.
See [schema.json](schema.json) for a JSON Schema of the output.
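One way to track changes over time is to compare two generated schemas, for example from two different commits. The following is a sketch under that assumption: `diffEvents` is a hypothetical helper, and the two inline objects stand in for parsed output files.

```javascript
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Compares two parsed schema documents and reports which events were added,
// removed, or had their set of property names change.
function diffEvents(before, after) {
  const oldEvents = before.events ?? {};
  const newEvents = after.events ?? {};
  const added = Object.keys(newEvents).filter((name) => !has(oldEvents, name));
  const removed = Object.keys(oldEvents).filter((name) => !has(newEvents, name));
  const changed = [];
  for (const name of Object.keys(newEvents)) {
    if (!has(oldEvents, name)) continue;
    const oldProps = Object.keys(oldEvents[name].properties ?? {}).sort();
    const newProps = Object.keys(newEvents[name].properties ?? {}).sort();
    if (oldProps.join('\n') !== newProps.join('\n')) {
      changed.push(name);
    }
  }
  return { added, removed, changed };
}

// Invented snapshots for illustration.
const previous = {
  events: {
    'Item Viewed': { properties: { itemId: { type: 'string' } } },
    'Cart Cleared': { properties: {} },
  },
};
const current = {
  events: {
    'Item Viewed': { properties: { itemId: { type: 'string' }, price: { type: 'number' } } },
    'Checkout Started': { properties: {} },
  },
};

console.log(diffEvents(previous, current));
```

This only compares event and property names. Detecting type changes or moved implementations would need extra comparisons on the `type` and `implementations` fields.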
## Supported tracking libraries & languages
| Library | JavaScript/TypeScript | Python | Ruby | Go | Swift |
|---------|:---------------------:|:------:|:----:|:--:|:--:|
| Google Analytics | ✅ | ❌ | ❌ | ❌ | ✅ |
| Google Tag Manager | ✅ | ❌ | ❌ | ❌ | ✅ |
| Segment | ✅ | ✅ | ✅ | ✅ | ✅ |
| Mixpanel | ✅ | ✅ | ✅ | ✅ | ✅ |
| Amplitude | ✅ | ✅ | ❌ | ✅ | ✅ |
| Rudderstack | ✅ | ✅ | ✳️ | ✳️ | ✅ |
| mParticle | ✅ | ❌ | ❌ | ❌ | ✅ |
| PostHog | ✅ | ✅ | ✅ | ✅ | ✅ |
| Pendo | ✅ | ✅ | ❌ | ❌ | ✅ |
| Heap | ✅ | ✅ | ❌ | ❌ | ✅ |
| Snowplow | ✅ | ✅ | ✅ | ✅ | ❌ |
| Datadog RUM | ✅ | ❌ | ❌ | ❌ | ❌ |
| Custom Function | ✅ | ✅ | ✅ | ✅ | ✅ |
✳️ Rudderstack's SDKs often use the same format as Segment, so Rudderstack events may be detected as Segment events.
## SDKs for supported libraries
### Google Analytics
**JavaScript/TypeScript**
```js
gtag('event', '<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Swift**
```swift
Analytics.logEvent("<event_name>", parameters: [
  "<property_name>": "<property_value>"
])
```
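### Google Tag Manager
**JavaScript/TypeScript**
```js
dataLayer.push({
  event: '<event_name>',
  '<property_name>': '<property_value>'
});
// Or via window
window.dataLayer.push({
  event: '<event_name>',
  '<property_name>': '<property_value>'
});
```
**Swift**
```swift
dataLayer.push(["event": "<event_name>", "<property_name>": "<property_value>"])
```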
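### Segment
**JavaScript/TypeScript**
```js
analytics.track('<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Python**
```python
analytics.track('<event_name>', {
    '<property_name>': '<property_value>'
})
```
**Ruby**
```ruby
Analytics.track(
  event: '<event_name>',
  properties: {
    '<property_name>': '<property_value>'
  }
)
```
**Go**
```go
client.Enqueue(analytics.Track{
  UserId: "user-id",
  Event:  "<event_name>",
  Properties: analytics.NewProperties().
    Set("<property_name>", "<property_value>"),
})
```
**Swift**
```swift
analytics.track(name: "<event_name>", properties: TrackProperties("<property_name>": "<property_value>"))
```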
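### Mixpanel
**JavaScript/TypeScript**
```js
mixpanel.track('<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Python**
```python
mixpanel.track('<event_name>', {
    '<property_name>': '<property_value>'
})
```
**Ruby**
```ruby
tracker.track('<distinct_id>', '<event_name>', {
  '<property_name>': '<property_value>'
})
```
**Go**
```go
ctx := context.Background()
mp := mixpanel.NewApiClient("YOUR_PROJECT_TOKEN")
mp.Track(ctx, []*mixpanel.Event{
  mp.NewEvent("<event_name>", "<distinct_id>", map[string]any{
    "<property_name>": "<property_value>",
  }),
})
```
**Swift**
```swift
Mixpanel.mainInstance().track(event: "<event_name>", properties: [
  "<property_name>": "<property_value>"
])
```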
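### Amplitude
**JavaScript/TypeScript**
```js
amplitude.track('<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Python**
```python
client.track(
    BaseEvent(
        event_type="<event_name>",
        user_id="<user_id>",
        event_properties={
            "<property_name>": "<property_value>",
        },
    )
)
```
**Go**
```go
client.Track(amplitude.Event{
  UserID:    "<user_id>",
  EventType: "<event_name>",
  EventProperties: map[string]any{
    "<property_name>": "<property_value>",
  },
})
```
**Swift**
```swift
amplitude.track(
  eventType: "<event_name>",
  eventProperties: ["<property_name>": "<property_value>"]
)
```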
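### Rudderstack
**JavaScript/TypeScript**
```js
rudderanalytics.track('<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Python**
```python
rudder_analytics.track('<event_name>', {
    '<property_name>': '<property_value>'
})
```
**Ruby**
```ruby
analytics.track(
  user_id: '<user_id>',
  event: '<event_name>',
  properties: {
    '<property_name>': '<property_value>'
  }
)
```
**Go**
```go
client.Enqueue(analytics.Track{
  UserId: "<user_id>",
  Event:  "<event_name>",
  Properties: analytics.NewProperties().
    Set("<property_name>", "<property_value>"),
})
```
**Swift**
```swift
RSClient.sharedInstance()?.track("<event_name>", properties: [
  "<property_name>": "<property_value>"
])
```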
### mParticle
**JavaScript/TypeScript**
```js
mParticle.logEvent('<event_name>', mParticle.EventType.<event_type>, {
  '<property_name>': '<property_value>'
});
```
**Swift**
```swift
let event = MPEvent(name: "<event_name>", type: .other)
event.customAttributes = [
  "<property_name>": "<property_value>"
]
MParticle.sharedInstance().logEvent(event)
```
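### PostHog
**JavaScript/TypeScript**
```js
posthog.capture('<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Python**
```python
posthog.capture('distinct_id', '<event_name>', {
    '<property_name>': '<property_value>'
})
# Or
posthog.capture(
    'distinct_id',
    event='<event_name>',
    properties={
        '<property_name>': '<property_value>'
    }
)
```
**Ruby**
```ruby
posthog.capture({
  distinct_id: '<distinct_id>',
  event: '<event_name>',
  properties: {
    '<property_name>': '<property_value>'
  }
})
```
**Go**
```go
client.Enqueue(posthog.Capture{
  DistinctId: "<distinct_id>",
  Event:      "<event_name>",
  Properties: posthog.NewProperties().
    Set("<property_name>", "<property_value>"),
})
```
**Swift**
```swift
PostHogSDK.shared.capture("<event_name>", properties: [
  "<property_name>": "<property_value>"
])
```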
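### Pendo
**JavaScript/TypeScript**
```js
pendo.track('<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Python**
```python
pendo.track('<event_name>', {
    '<property_name>': '<property_value>'
})
```
**Swift**
```swift
PendoManager.shared().track("<event_name>", properties: [
  "<property_name>": "<property_value>"
])
```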
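### Heap
**JavaScript/TypeScript**
```js
heap.track('<event_name>', {
  '<property_name>': '<property_value>'
});
```
**Python**
```python
heap.track('<event_name>', {
    '<property_name>': '<property_value>'
})
```
**Swift**
```swift
Heap.shared.track("<event_name>", properties: [
  "<property_name>": "<property_value>"
])
```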
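### Datadog RUM
**JavaScript/TypeScript**
```js
datadogRum.addAction('<event_name>', {
  '<property_name>': '<property_value>'
});
// Or via window
window.DD_RUM.addAction('<event_name>', {
  '<property_name>': '<property_value>'
});
// Or via global DD_RUM
DD_RUM.addAction('<event_name>', {
  '<property_name>': '<property_value>'
});
```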
### Snowplow (Structured Events)
**JavaScript/TypeScript**
```js
tracker.track(buildStructEvent({
  action: '<action>',
  category: '<category>',
  label: '<label>',
  property: '<property>',
  value: <value>
}));
```
**Python**
```python
tracker.track(StructuredEvent(
    action="<action>",
    category="<category>",
    label="<label>",
    property_="<property>",
    value=<value>,
))
```
**Ruby**
```ruby
tracker.track_struct_event(
  action: '<action>',
  category: '<category>',
  label: '<label>',
  property: '<property>',
  value: <value>
)
```
**Go**
```go
tracker.TrackStructEvent(sp.StructuredEvent{
  Action:   sp.NewString("<action>"),
  Category: sp.NewString("<category>"),
  Label:    sp.NewString("<label>"),
  Property: sp.NewString("<property>"),
  Value:    sp.NewFloat64(<value>),
})
```
## Contribute
We're actively improving this package. Found a bug? Have a feature request? Open an issue or submit a pull request!
[Join our Slack community](https://join.slack.com/t/fliskcommunity/shared_invite/zt-354hesfnm-BbNzveERo9C4JwVQEWvXoA)