{"id":49492568,"url":"https://github.com/friday-platform/hubspot-export","last_synced_at":"2026-05-01T07:05:58.938Z","repository":{"id":348366470,"uuid":"1196831982","full_name":"friday-platform/hubspot-export","owner":"friday-platform","description":"HubSpot export","archived":false,"fork":false,"pushed_at":"2026-03-31T22:15:33.000Z","size":1753,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-31T22:24:58.358Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/friday-platform.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-31T04:48:08.000Z","updated_at":"2026-03-31T21:13:05.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/friday-platform/hubspot-export","commit_stats":null,"previous_names":["friday-platform/hubspot-export"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/friday-platform/hubspot-export","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friday-platform%2Fhubspot-export","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friday-platform%2Fhubspot-export/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friday-platform%2Fhubspot-export/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friday-platform%2Fhubspot-export/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/friday-platform","download_url":"https://codeload.github.com/friday-platform/hubspot-export/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friday-platform%2Fhubspot-export/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32487751,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-01T07:05:56.267Z","updated_at":"2026-05-01T07:05:58.926Z","avatar_url":"https://github.com/friday-platform.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HubSpot Ticket \u0026 Conversation Dump\n\nExports all tickets and their full conversation histories (emails, replies, threads) from HubSpot into CSV files.\n\n## What This Does\n\nThis tool connects to your HubSpot account and downloads:\n\n1. **All tickets** with their metadata (every property defined in your account)\n2. **All emails** associated with each ticket (incoming and outgoing)\n3. **All conversation threads** linked to each ticket (chat messages, thread replies)\n\nEverything is saved as CSV and JSONL files that you can open in Excel, import into a database, or feed into a knowledge base.\n\nDesigned for large accounts (100k-750k+ tickets): processes in chunks of 5,000, saves progress after each chunk, and automatically resumes from the last checkpoint if interrupted.\n\n## Prerequisites\n\n- [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed and running\n- A **HubSpot Service Key** (see next section)\n\n## Getting a HubSpot Service Key\n\nA service key allows this tool to read data from your HubSpot account. Follow these steps to create one:\n\n### Step-by-step instructions\n\n**1.** Log in to your HubSpot account and click the **Settings gear icon** in the top navigation bar. In the left sidebar, expand **Integrations** and click **Service Keys**:\n\n![Settings sidebar showing Integrations \u003e Service Keys](docs/step-1-settings-sidebar.png)\n\n**2.** On the Service Keys page, click **\"Create service key\"** in the top right corner:\n\n![Service Keys list with Create button](docs/step-2-service-keys-list.png)\n\n**3.** Enter a **Name** for your key (e.g. \"Ticket Dump\"):\n\n![Create Service Key form](docs/step-3-create-form.png)\n\n**4.** Click **\"+ Add new scope\"**. In the search box, search for each of the three required scopes one at a time and check the box for each:\n\n| Scope | Why it's needed |\n|-------|----------------|\n| `tickets` | Read ticket data and associations |\n| `conversations.read` | Read conversation threads and messages |\n| `sales-email-read` | Read email content associated with tickets |\n\n![Searching and selecting scopes](docs/step-4-add-scope.png)\n\n**5.** Click **\"Update\"** after selecting all three scopes, then click **\"Create\"**. Your key will be shown on the next page. Click **\"Show\"** to reveal it, then **\"Copy\"** to copy it to your clipboard:\n\n![Completed service key showing token and scopes](docs/step-5-completed-key.png)\n\nThe token looks like: `pat-na2-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`\n\n## How to Run\n\n### Step 1: Set up your token\n\nCreate a file called `.env` with your service key and portal ID:\n\n```\nHUBSPOT_ACCESS_TOKEN=pat-na2-your-actual-token-here\nHUBSPOT_PORTAL_ID=12345678\n```\n\nYou can find your portal ID in any HubSpot URL: `app.hubspot.com/contacts/{portal_id}/...`\n\n### Step 2: Run the export\n\n```bash\ndocker run --env-file .env -v \"$(pwd)/output:/app/output\" tempestdx/hubspot-export\n```\n\nThat's it! The tool will:\n- Read your HubSpot token from the `.env` file\n- Download all tickets and their conversations in chunks of 5,000\n- Save a checkpoint after each chunk (so it can resume if interrupted)\n- Save the output files in the `output/` folder on your machine\n\n### Exporting a specific year\n\nFor large accounts, you can filter to a single year to keep export times manageable:\n\n```bash\ndocker run --env-file .env -e YEAR=2025 -v \"$(pwd)/output:/app/output\" tempestdx/hubspot-export\n```\n\nThis uses the HubSpot Search API to only fetch tickets created in the specified year. Only months up to the current date are queried (future months are skipped). Date ranges with more than 10,000 tickets are automatically split into smaller ranges to stay within HubSpot's search API limits.\n\n### Skipping conversations (emails only)\n\nIf you only need email data and want to dramatically reduce API usage (~3,300 calls instead of ~242,000 for 90k tickets):\n\n```bash\ndocker run --env-file .env -e SKIP_CONVERSATIONS=true -v \"$(pwd)/output:/app/output\" tempestdx/hubspot-export\n```\n\nConversation threads (live chat, chatbot, Messenger) are the most API-intensive part of the export since HubSpot has no batch endpoint for them. Email data already captures most support interactions (incoming/outgoing emails with full content, sender, recipient, and timestamps).\n\n### Resuming an interrupted export\n\nIf the export is stopped or crashes, just run the same command again. It will automatically:\n- Load cached ticket IDs and properties (skipping the initial discovery phase)\n- Resume from the last completed chunk\n- Append to the existing output files\n\nTo start fresh, delete the `output/` folder before running.\n\n### Sample output\n\n```\n=== HubSpot Ticket + Conversation Dump ===\n\nFetching ticket property definitions...\nFound 658 ticket properties.\nFetching ticket IDs...\n  ...5000 ticket IDs fetched (15.3s elapsed)\nFetched 50000 ticket IDs in 149.8s.\n\nProcessing 50000 tickets in 10 chunks of 5000 (concurrency: 10)...\n\n--- Chunk 1/10 (5000 tickets) ---\n  Associations batch 1/5 (1000/5000 tickets)...\n  ...\nProgress: 5000/50000 (10.0%) | 8368 emails, 3118 convos | ETA: 82m 5s\n  [Checkpoint saved: 5000 tickets complete]\n\n--- Chunk 2/10 (5000 tickets) ---\n  ...\n\n=== Dump Complete ===\nTickets:      50000\nMessages:     95432\n  Emails:     62100\n  Conversations: 33332\nErrors:       3\nOutput dir:   ./output/\n  tickets.csv   - ticket metadata\n  messages.csv  - all conversation messages\n  dump.jsonl    - full structured data\n```\n\n## Output Files\n\nAfter the export completes, you'll find three files in the `output/` folder:\n\n### `tickets.csv`\n\nOne row per ticket. Columns are dynamically generated from every ticket property defined in your HubSpot account, plus two extra columns appended at the end:\n\n| Column | Description |\n|--------|-------------|\n| *(all property labels)* | Every ticket property in your account (e.g. \"Ticket name\", \"Pipeline\", \"Ticket status\", \"Priority\", \"Create date\", etc.) |\n| `Message Count` | Total emails + conversation messages found for this ticket |\n| `URL` | Direct link to the ticket in HubSpot |\n\n### `messages.csv`\n\nOne row per message. Contains the full conversation history for all tickets.\n\n| Column | Description | Example |\n|--------|-------------|---------|\n| `ticket_id` | Which ticket this belongs to | `12345678` |\n| `message_id` | Unique message ID | `msg_abc123` |\n| `timestamp` | When the message was sent | `2024-01-15T10:30:00Z` |\n| `direction` | `INCOMING` (customer) or `OUTGOING` (agent) | `INCOMING` |\n| `sender` | Sender's email address | `john@example.com` |\n| `recipient` | Recipient's email address | `support@company.com` |\n| `subject` | Email subject line | `Re: Cannot login` |\n| `body` | Message content (plain text) | `I tried resetting my password but...` |\n| `source_type` | `EMAIL` or `CONVERSATION` | `EMAIL` |\n| `thread_id` | Conversation thread ID (conversations only) | `thread_789` |\n\n### `dump.jsonl`\n\nOne JSON object per line, containing the full structured data for each ticket and all its messages. Useful for programmatic processing.\n\n### Cache and checkpoint files\n\nThe `output/` folder also contains files used for caching and resume:\n\n| File | Purpose |\n|------|---------|\n| `ticket_ids.json` | Cached ticket IDs (avoids re-fetching on resume) |\n| `ticket_ids_2025.json` | Cached ticket IDs for year-filtered runs |\n| `properties.json` | Cached property definitions |\n| `checkpoint.json` | Current progress (deleted on successful completion) |\n\nThese are safe to delete if you want to force a fresh export.\n\n## How Long Does It Take?\n\n| Ticket Count | Estimated Time |\n|-------------|---------------|\n| 1-100 | Under 1 minute |\n| 1,000 | 2-5 minutes |\n| 10,000 | 15-25 minutes |\n| 50,000 | 1-2 hours |\n| 100,000 | 3-5 hours |\n| 300,000+ | 10-15 hours |\n\nThe tool uses batch APIs and parallel fetching to maximize throughput while respecting HubSpot's rate limits. Email associations and content are fetched in bulk (up to 1,000 per request), and conversation threads are fetched with configurable concurrent workers. Progress with ETA is printed to the terminal as it runs.\n\nFor very large accounts, use the `YEAR` filter to export one year at a time.\n\n## Troubleshooting\n\n### `HUBSPOT_ACCESS_TOKEN is not set`\n\nMake sure you:\n1. Created the `.env` file\n2. Added your actual token to the `.env` file\n3. Included `--env-file .env` in the `docker run` command\n\n### `HUBSPOT_PORTAL_ID is not set`\n\nAdd your portal ID to the `.env` file. Find it in any HubSpot URL: `app.hubspot.com/contacts/{portal_id}/...`\n\n### `HubSpot API 401` / `Unauthorized`\n\nYour token is invalid or expired. Generate a new one in HubSpot Settings \u003e Integrations \u003e Service Keys.\n\n### `HubSpot API 403` / `Forbidden`\n\nYour token is missing required scopes. Go to your Service Key settings and make sure these scopes are enabled:\n- `tickets`\n- `conversations.read`\n- `sales-email-read`\n\nIf you see 403 errors specifically when fetching emails, you may also need to add the `crm.objects.emails.read` scope.\n\n### `Rate limit exceeded` / `429 Too Many Requests`\n\nThe tool has built-in rate limiting with automatic retry and exponential backoff. If it persists, reduce concurrency:\n\n```bash\ndocker run --env-file .env -e CONCURRENCY=5 -v \"$(pwd)/output:/app/output\" tempestdx/hubspot-export\n```\n\n### The output files are empty\n\nCheck the terminal output for errors. Common causes:\n- No tickets exist in the HubSpot account\n- The token doesn't have the `tickets` scope\n- Network connectivity issues\n\n## Environment Variables\n\n| Variable | Required | Default | Description |\n|----------|----------|---------|-------------|\n| `HUBSPOT_ACCESS_TOKEN` | Yes | — | Your HubSpot service key / PAT |\n| `HUBSPOT_PORTAL_ID` | Yes | — | HubSpot portal ID (for ticket URLs in CSV). Find it in your HubSpot URL: `app.hubspot.com/contacts/{portal_id}/...` |\n| `OUTPUT_DIR` | No | `./output` | Where to save the dump files |\n| `CONCURRENCY` | No | `10` | Number of parallel conversation fetches. Lower if you hit rate limits |\n| `CHUNK_SIZE` | No | `5000` | Number of tickets per processing chunk. Lower to reduce memory usage |\n| `YEAR` | No | — | Filter to tickets created in this year (e.g. `2025`). Uses the Search API; only queries up to the current date and auto-splits large date ranges |\n| `SKIP_CONVERSATIONS` | No | `false` | Set to `true` to skip fetching conversation threads/messages and only export emails. Reduces API calls by ~98% |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffriday-platform%2Fhubspot-export","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffriday-platform%2Fhubspot-export","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffriday-platform%2Fhubspot-export/lists"}