https://github.com/apideck-libraries/agent-analytics

Track AI agent and bot traffic to your Next.js / Vercel app — PostHog, webhooks, or any custom analytics backend. Detects Claude, ChatGPT, Perplexity, Google-Extended, and more.
https://github.com/apideck-libraries/agent-analytics
ai-agents analytics vercel
Last synced: 3 months ago
JSON representation
Track AI agent and bot traffic to your Next.js / Vercel app — PostHog, webhooks, or any custom analytics backend. Detects Claude, ChatGPT, Perplexity, Google-Extended, and more.
Host: GitHub
URL: https://github.com/apideck-libraries/agent-analytics
Owner: apideck-libraries
License: mit
Created: 2026-04-19T06:55:54.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-04-21T10:14:58.000Z (3 months ago)
Last Synced: 2026-04-23T02:26:25.772Z (3 months ago)
Topics: ai-agents, analytics, vercel
Language: TypeScript
Homepage:
Size: 157 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

          


# agent-analytics

### See the agents your JavaScript can't.

**Drop-in Next.js / Vercel middleware that tracks ClaudeBot, GPTBot, Perplexity, and 20+ AI crawlers in PostHog — or any analytics backend you already pay for.**

[![npm version](https://img.shields.io/npm/v/@apideck/agent-analytics.svg?style=flat-square&color=2563eb)](https://www.npmjs.com/package/@apideck/agent-analytics)

[![npm downloads](https://img.shields.io/npm/dm/@apideck/agent-analytics.svg?style=flat-square&color=2563eb)](https://www.npmjs.com/package/@apideck/agent-analytics)

[![bundle size](https://img.shields.io/bundlephobia/minzip/@apideck/agent-analytics?style=flat-square&color=2563eb&label=gzipped)](https://bundlephobia.com/package/@apideck/agent-analytics)

[![CI](https://img.shields.io/github/actions/workflow/status/apideck-libraries/agent-analytics/ci.yml?style=flat-square&label=ci)](https://github.com/apideck-libraries/agent-analytics/actions)

[![license](https://img.shields.io/npm/l/@apideck/agent-analytics?style=flat-square&color=2563eb)](./LICENSE)

[![typescript](https://img.shields.io/badge/TypeScript-strict-3178c6?style=flat-square)](./tsconfig.json)

[**Install**](#install) · [**Quick start**](#quick-start-60-seconds-to-your-first-event) · [**How it works**](#how-it-works) · [**Adapters**](#built-in-adapters) · [**Markdown mirror**](#advanced-markdown-mirror-for-docs-sites) · [**FAQ**](#faq)



---

## The problem

Client-side analytics libraries run in the browser. AI crawlers don't. That means **every time ClaudeBot, GPTBot, or Perplexity fetches a page on your site, your dashboard stays empty.**

You can't see:

- Which AI agents are reading your docs, marketing pages, or blog

- Which pages they actually fetch (vs. which you *think* they should)

- Which agent-driven referrals convert

- How much of your "traffic" is actually LLM training pipelines

Server logs have the data, but turning them into analytics is a pipeline project. This library is the one-line version.

## What you get

```ts

import { trackVisit, posthogAnalytics } from '@apideck/agent-analytics'

const analytics = posthogAnalytics({ apiKey: process.env.POSTHOG_KEY! })

export function middleware(req: NextRequest) {

  void trackVisit(req, { analytics })   // ← that's the whole thing

  return NextResponse.next()

}

```

One line of middleware. Fire-and-forget. Zero impact on your response latency. Events in PostHog within seconds.

What shows up in your analytics (click to expand)

```jsonc

{

  "event": "doc_view",

  "distinct_id": "anon_7f3a1b2c",          // hashed ip:ua, no person profile

  "timestamp": "2026-04-19T08:30:00.000Z",

  "properties": {

    "$process_person_profile": false,       // PostHog: don't create a person

    "$current_url": "https://example.com/docs/intro",

    "path": "/docs/intro",

    "user_agent": "ClaudeBot/1.0 (+https://claude.ai/bot)",

    "is_ai_bot": true,                      // strict: matches a branded AI crawler

    "bot_name": "Claude",                   // 'Claude' | 'ChatGPT' | ... | 'curl' | 'axios' | 'Electron' | 'Browser' | 'Other'

    "ua_category": "declared-crawler",      // 'declared-crawler' | 'coding-agent-hint' | 'browser' | 'other'

    "coding_agent_hint": false,             // loose: HTTP-library / automation UA (curl, axios, got, colly, Electron, ...)

    "referer": "https://claude.ai/",

    "source": "page-view"                   // whatever label you passed

  }

}

```

Now you can build:

- **AI-vs-human traffic ratio** over time

- **Breakdown by agent** (Claude vs ChatGPT vs Perplexity vs Google-Extended)

- **Top pages for agents** (what do they actually read?)

- **Conversion funnels** from agent referral → human visit → sign-up

- **Anomaly detection** when a new bot starts hammering your site

---

## Install

```bash

npm install @apideck/agent-analytics

# or

pnpm add @apideck/agent-analytics

# or

yarn add @apideck/agent-analytics

```

Zero dependencies. Runs on Node 18+, Edge, Bun, and anywhere the Web Fetch API exists.

## Quick start (60 seconds to your first event)

### 1. Pick an adapter

```ts

import {

  posthogAnalytics

} from '@apideck/agent-analytics'

const analytics = posthogAnalytics({

  apiKey: process.env.POSTHOG_KEY!

})

```

Ships with **PostHog**, **webhook**, and **custom** adapters. BYO analytics.

### 2. Wire the middleware

```ts

// middleware.ts

import {

  trackVisit

} from '@apideck/agent-analytics'

export function middleware(req) {

  void trackVisit(req, {

    analytics,

    source: 'page-view'

  })

  return NextResponse.next()

}

```

Works in any middleware that hands you a `Request`.

### 3. Ship it

```bash

vercel --prod

```

Hit any page with a spoofed UA:

```bash

curl -A "ClaudeBot/1.0" \

  https://yoursite.com/

```

Event lands in PostHog in seconds.

---

## How it works

```text

                   Request                   Response (unchanged)

   Agent ─────────────────────►  middleware ───────────────────► Agent

                                      │

                                      │ fire-and-forget

                                      │ keepalive: true

                                      ▼

                            ┌──────────────────┐

                            │  AnalyticsAdapter │

                            │  (PostHog /       │

                            │   webhook /       │

                            │   custom fn)      │

                            └──────────────────┘

```

The middleware call:

1. **Reads UA** from `req.headers.get('user-agent')`

2. **Matches against** `AI_BOT_PATTERN` (ClaudeBot, GPTBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, Bytespider, Amazonbot, Meta-ExternalAgent, MistralAI-User, Cursor, Windsurf, and more)

3. **Hashes `ip:ua`** with djb2 → stable anon distinct_id (same bot from same network = same visitor, no PII)

4. **Posts to your adapter** with `keepalive: true` so the request survives after the response returns

5. **Swallows errors** — a downed analytics backend never breaks your response

By default every request is captured so coding-agent traffic (axios, curl, Electron, …) surfaces alongside branded crawlers. Pass `onlyBots: true` to restrict capture to UAs matching the built-in AI bot pattern.

---

## Who's detected out of the box

AgentUA signaturebot_name label

AnthropicClaudeBot, Claude-User, Anthropic-*Claude

OpenAIChatGPT-User, GPTBot, OAI-SearchBotChatGPT

PerplexityPerplexityBot, Perplexity-UserPerplexity

GoogleGoogle-Extended, GooglebotGoogle

AppleApplebot-Extended, ApplebotApple

MetaMeta-ExternalAgent, FacebookBotMeta

AmazonAmazonbotAmazon

BytedanceBytespiderBytespider

Common CrawlCCBotCommon Crawl

MistralMistralAI-UserMistral

Coherecohere-aiCohere

DuckDuckGoDuckAssistBotDuckDuckGo

You.comYouBotYou.com

AI2AI2BotAI2

DiffbotDiffbotDiffbot

Coding agentsCursor, WindsurfCursor / Windsurf

New agents appear every month. Patch releases ship as the list grows — watch the repo for updates. Raise a PR if you spot one we're missing.

### Coding agents (loose detection — `coding_agent_hint: true`)

Coding agents like Claude Code, Cline, Cursor, and Windsurf **don't identify themselves by name** in their user agent. They use whatever HTTP library they're built on, so detection is a loose heuristic — the UAs below are *also* used by legitimate curl scripts, CI jobs, and server-to-server traffic.

`is_ai_bot` stays `false` for these so your strict AI-traffic segment is clean. The `coding_agent_hint` property is the wider net; pair it with other signals (path patterns, [JA4 fingerprints via Vercel Log Drains](https://vercel.com/docs/observability/log-drains), HEAD-then-GET request shape) when you need higher confidence.

| Agent | Signature observed | `bot_name` |

|---|---|---|

| Claude Code | `axios/1.8.4` | `axios` |

| Cline / Junie | `curl/8.4.0` | `curl` |

| Cursor | `got (sindresorhus/got)` | `got` |

| Windsurf | `colly` (Go) | `colly` |

| VS Code | `Electron/` marker | `Electron` |

| Other automation | `node-fetch`, `python-requests`, `Go-http-client`, `okhttp`, `aiohttp`, `Deno` | exact library name |

Playwright-based agents (Aider, OpenCode) spoof full Mozilla/Safari UAs and are **indistinguishable from real browsers by UA alone**. They'll show up as `bot_name: Browser`, `ua_category: browser`. Catching those needs TLS fingerprinting (JA4) or behavioural analysis.

Credit: coding-agent signatures catalogued by [Addy Osmani](https://addyosmani.com/blog/agentic-engine-optimization/).

---

## Built-in adapters

### `posthogAnalytics`

```ts

import { posthogAnalytics } from '@apideck/agent-analytics'

const analytics = posthogAnalytics({

  apiKey: process.env.NEXT_PUBLIC_POSTHOG_KEY!,

  host: 'https://eu.i.posthog.com'        // optional; defaults to US cloud

})

```

Host can be the PostHog cloud (`us.i.posthog.com`, `eu.i.posthog.com`) **or** your own reverse-proxy domain (e.g. `https://svc.yourdomain.com`) to dodge ad-blockers. Scheme is optional — both `'https://host'` and `'host'` work.

### `webhookAnalytics`

```ts

import { webhookAnalytics } from '@apideck/agent-analytics'

const analytics = webhookAnalytics({

  url: 'https://collector.example.com/events',

  headers: { Authorization: `Bearer ${process.env.TOKEN}` },

  transform: (event) => ({              // optional: reshape for your backend

    type: event.event,

    user: event.distinctId,

    ...event.properties

  })

})

```

### `customAnalytics`

```ts

import { customAnalytics } from '@apideck/agent-analytics'

import { Mixpanel } from 'mixpanel'

const mp = Mixpanel.init(process.env.MIXPANEL_TOKEN!)

const analytics = customAnalytics((event) => {

  mp.track(event.event, { distinct_id: event.distinctId, ...event.properties })

})

```

Any `{ capture(event): Promise | void }` object is a valid adapter. Compose multiple by fanning out in a custom callback.

---

## Advanced: Markdown mirror for docs sites

Content-heavy sites should serve **clean Markdown** when an agent asks for it — that's what makes your docs actually useful to coding agents, not just indexable. The `/markdown` subpath exports the helpers that power [developers.apideck.com](https://developers.apideck.com)'s agent-readiness stack:

```ts

import {

  markdownServeDecision,   // decide if this request should get Markdown

  markdownHeaders,         // Content-Type, Content-Signal, x-markdown-tokens

  synthesizeMarkdownPointer // fallback for URLs without a mirror

} from '@apideck/agent-analytics/markdown'

```

Three triggers, one decision helper:

| Trigger | Example | `reason` |

|---|---|---|

| AI-bot UA on any URL | `curl -A ClaudeBot /docs/intro` | `ua-rewrite` |

| `.md` suffix | `curl /docs/intro.md` | `md-suffix` |

| `Accept: text/markdown` header | `curl -H "Accept: text/markdown" /docs/intro` | `accept-header` |

Full middleware example: [`README.md → Markdown mirror helpers`](./README.md#markdown-mirror-helpers) section, or copy from [the reference implementation](https://github.com/apideck-io/developer-docs/blob/main/src/middleware.ts).

---

## Compared to…

@apideck/agent-analytics

DIY middleware

Dark Visitors SaaS

Cloudflare AI Labyrinth

Tracks agents in your analytics

✓

✓ (after N hours of glue code)

✓ (external dashboard)

✗ (it blocks them instead)

Reuses your analytics backend

✓ PostHog / webhook / any

✓

✗ (their dashboard)

✗

Zero runtime dependencies

✓

✓

✗ (SaaS)

✗ (Cloudflare)

Ships maintained UA list

✓

✗

✓

✓

Markdown-mirror helpers

✓

✗

✗

✗

Monthly cost

$0

$0 + engineering time

$$$

Requires CF plan

---

## FAQ

Will this slow down my middleware?

No. `trackVisit` returns a promise you don't await, and the underlying `fetch` uses `keepalive: true` — the browser / runtime guarantees the request completes after your response returns. Your critical path is: `req.headers.get('user-agent')` + a regex test + a `void fetch(...)`. Sub-millisecond.

What if my analytics backend is down?

The adapter call is wrapped in try/catch — `trackVisit` never throws, even if PostHog / your webhook / your custom callback crashes. You lose the event, not the response.

Does this create PostHog person profiles for every bot?

No. The event includes `$process_person_profile: false`, which tells PostHog to skip profile creation. Distinct IDs are djb2 hashes of `ip:ua`, so same-bot-same-network collapses into one anonymous visitor for journey analysis, but no "person" row gets created.

How do I detect a bot I added to the UA list in my own code?

```ts

import { isAiBot, parseBotName } from '@apideck/agent-analytics'

if (isAiBot(req.headers.get('user-agent'))) {

  // serve Markdown, skip personalisation, add rate limits, etc.

}

parseBotName('ClaudeBot/1.0')  // → 'Claude'

```

Can I use this outside Next.js?

Yes. The primary API takes a standard Web Fetch `Request` object. Works in Hono, Bun, Cloudflare Workers, Deno Deploy, Node 18+ HTTP handlers — anywhere you can get a `Request`.

Why not just enable PostHog's bot filtering?

PostHog's bot filter excludes bots from your metrics. This library does the opposite: it makes bots *visible* so you can analyse them deliberately. Complementary — segment by `is_ai_bot` to split the populations.

Is the UA list going to go stale?

AI crawlers keep appearing. We publish patch releases whenever the list changes — `npm update @apideck/agent-analytics` picks them up. If you spot a missing agent, send a PR with a link to the bot's official docs; merges ship the same day.

---

## Who uses this

- **[developers.apideck.com](https://developers.apideck.com)** — extracted from and battle-tested on the Apideck developer documentation site

- *Your company here — send a PR*

---

## Roadmap

- [ ] Runtime UA list fetching (opt-in) so patches land without a dependency bump

- [ ] First-class Mixpanel, Amplitude, Segment adapters

- [ ] Vercel Marketplace one-click install

- [ ] Pre-built PostHog dashboards (JSON export) for AI-vs-human, agent leaderboard, top-pages-per-agent

- [ ] `createMarkdownMiddleware()` — a batteries-included Next.js middleware for the full agent-readiness stack

File a [feature request](https://github.com/apideck-libraries/agent-analytics/issues/new) if something's missing from your setup.

---

## Contributing

PRs welcome — especially new UA signatures, adapters, and docs.

```bash

git clone https://github.com/apideck-libraries/agent-analytics

cd agent-analytics

npm install

npm test

```

### Releasing

Publishing to npm is fully automated by two workflows:

1. **`release.yml`** watches `package.json` on `main`. When the version field

   bumps to something without a matching `v` tag, it creates the

   GitHub Release.

2. **`publish.yml`** fires on `release: published`, runs typecheck + tests +

   build, and publishes to npm with `--provenance` via OIDC trusted

   publishing (no secrets required).

So cutting a release is just:

```bash

# Pick a level (patch | minor | major), or edit package.json directly.

npm version patch

git push

```

The push lands on main, `release.yml` notices the new version, cuts the

release, and `publish.yml` publishes. No CLI juggling, no secrets to manage.

OIDC trusted publishing is configured at

[npmjs.com/package/@apideck/agent-analytics/access](https://www.npmjs.com/package/@apideck/agent-analytics/access)

— the GitHub repo + `publish.yml` workflow are registered as the sole

trusted publisher.

## Credits

Built on learnings from:

- [Agentic Engine Optimization](https://addyosmani.com/blog/agentic-engine-optimization/) by Addy Osmani — the case for making sites agent-ready

- [contentsignals.org](https://contentsignals.org) — the Content-Signal spec

- [darkvisitors.com](https://darkvisitors.com) — maintained catalogue of AI user-agents we cross-reference

## License

[MIT](./LICENSE) © [Apideck](https://apideck.com)

---



_{Built by Apideck — the unified API platform for integrations.}
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/apideck-libraries/agent-analytics

Awesome Lists containing this project

README