{"id":32668994,"url":"https://github.com/qforge-dev/torque","last_synced_at":"2026-01-20T17:57:09.503Z","repository":{"id":321636471,"uuid":"1086662215","full_name":"qforge-dev/torque","owner":"qforge-dev","description":"Declarative, typesafe DSL for building scalable LLM training datasets — compose conversations like React components.","archived":false,"fork":false,"pushed_at":"2025-10-30T18:32:16.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-30T18:43:30.579Z","etag":null,"topics":["ai","ai-sdk","anthropic","data-tools","dataset-generation","declarative","llm","openai","react-like","typescript","zod"],"latest_commit_sha":null,"homepage":"https://github.com/qforge-dev/torque","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qforge-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-30T18:11:59.000Z","updated_at":"2025-10-30T18:39:09.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/qforge-dev/torque","commit_stats":null,"previous_names":["qforge-dev/torque"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/qforge-dev/torque","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qforge-dev%2Ftorque","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qforge-dev%2Ftorque/tags","releases_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qforge-dev%2Ftorque/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qforge-dev%2Ftorque/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qforge-dev","download_url":"https://codeload.github.com/qforge-dev/torque/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qforge-dev%2Ftorque/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":282041516,"owners_count":26604069,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-31T02:00:07.401Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-sdk","anthropic","data-tools","dataset-generation","declarative","llm","openai","react-like","typescript","zod"],"created_at":"2025-11-01T02:01:55.127Z","updated_at":"2026-01-20T17:57:09.485Z","avatar_url":"https://github.com/qforge-dev.png","language":"TypeScript","readme":"# Torque\n\n**Torque** is a declarative, fully typesafe DSL for quickly building complex LLM synthetic datasets. 
Compose conversations like components and efficiently generate realistic variations with any model.\n\n[![npm version](https://img.shields.io/npm/v/@qforge/torque.svg)](https://www.npmjs.com/package/@qforge/torque)\n[![CI](https://img.shields.io/github/actions/workflow/status/qforge-dev/torque/torque-compile-and-dry-run.yml?branch=main\u0026label=CI)](https://github.com/qforge-dev/torque/actions/workflows/torque-compile-and-dry-run.yml)\n[![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-blue.svg)](https://www.typescriptlang.org/)\n[![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)\n\n## ✨ Features\n\n- **🎯 Declarative DSL** - Compose conversations like components\n- **🔒 Fully Typesafe** - Zod schemas with complete type inference\n- **🔌 Provider Agnostic** - Generate with any AI SDK provider (OpenAI, Anthropic, DeepSeek, vLLM, llama.cpp, etc.)\n- **🤖 AI-Powered Content** - Generate realistic, varied datasets automatically without complicated scripts\n- **🎭 Faker Integration** - Built-in Faker.js with automatic seed synchronization for reproducible fake data\n- **💰 Cache Optimized** - Reuses context across generations to reduce costs\n- **📉 Prompt Optimized** - Concise, optimized structures, prompts, and generation workflow let you use smaller, cheaper models\n- **♻️ Reusable Patterns** - Build libraries of conversation templates\n- **⚡ Concurrent Generation** - Beautiful async CLI with real-time progress tracking while generating concurrently\n\n## 🚀 Quick Example\n\n```typescript\nimport * as T from \"@qforge/torque\";\nimport { openai } from \"@ai-sdk/openai\";\n\nawait T.generateDataset(\n  () =\u003e [\n    T.generatedUser({ prompt: \"Friendly greeting or introduction\" }), // AI generated\n    T.oneOf([\n      // pick one randomly (weights are optional)\n      { value: T.assistant({ content: \"Hello!\" }), weight: 0.3 }, // static\n      T.generatedAssistant({\n        prompt: \"Respond to greeting\",\n        reasoning: 
T.generatedReasoning({\n          prompt: \"Reason about the greeting\",\n        }),\n        // or reasoning: T.reasoning({ content: \"....\" }),\n      }), // AI generated, gets remaining weight\n    ]),\n    T.times(T.between(1, 3), [\n      T.generatedUser({\n        prompt: \"Chat about weather. Optionally mentioning previous message\",\n      }),\n      T.generatedAssistant({ prompt: \"Respond to user. Short and concise.\" }),\n    ]),\n  ],\n  {\n    count: 2, // number of examples\n    model: openai(\"gpt-5-mini\"), // any ai-sdk model\n    seed: 42, // replayable RNG\n    metadata: { example: \"quick-start\" }, // optional per-row metadata\n  }\n);\n```\n\nOutputs:\n\n```json\n{\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Hi there! I'm new here and just wanted to say hello.\"}]},{\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Hello!\"}]},{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"The sunshine today is perfect for a walk in the park.\"}]},{\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Absolutely—warm and bright out there.\"}]},{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Do you think the clouds will roll in later this evening?\"}]},{\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Maybe briefly, but it should stay mostly clear.\"}]}]}\n\n{\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Hey! Hope you're having a great day.\"}]},{\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Hi there! 
I'm doing great—what can I help you with?\"}]},{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"The weather keeps flipping between sun and drizzle lately.\"}]},{\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Totally—it’s been bouncing around all week.\"}]},{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Should I expect rain again tonight?\"}]},{\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Pack an umbrella just in case; there’s a chance of showers.\"}]},{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Thanks! I’ll be prepared if it turns stormy.\"}]},{\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Good call—better to stay dry than sorry.\"}]}]}\n```\n\n\u003e 💡 See full example: [`examples/quick-start.ts`](examples/quick-start.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/quick-start)\n\n## 🤔 Why Torque?\n\nBuilding synthetic datasets for LLMs is tedious:\n\n- Sometimes you don’t have enough real data\n- Manual conversation writing doesn’t scale as conversations get long\n- Maintaining quality and consistency across thousands of examples is extremely time-consuming\n- Tool calling patterns require intricate message sequences and are error‑prone\n- Generating different conversation flows means rewriting everything or juggling assorted hard-to-maintain scripts\n- Designing generators that are random yet reproducible is surprisingly complex\n- Getting AI to understand complex composition scenarios (nested variations, conditional flows) takes significant prompt engineering time\n\n**Torque solves this** with a declarative approach. Just like React transformed UI development from imperative DOM manipulation to composable components, Torque transforms dataset generation from manual JSON editing or writing complicated scripts to declarative conversation schemas. 
Plus, its optimized structure means you can use smaller, cheaper models while benefiting from cache optimization for lower costs.\n\n## 📦 Installation\n\n```bash\nnpm install @qforge/torque\n# or\nbun add @qforge/torque\n```\n\n## 📚 Core Concepts\n\n### Message Schemas\n\nBuild conversations by composing message schemas; reusable parts combine into complex conversations:\n\n```typescript\n// Reusable greeting pattern\nconst greeting = () =\u003e [\n  system({ content: \"You are a helpful assistant.\" }),\n  user({ content: \"Hello!\" }),\n  assistant({ content: \"Hi! How can I help?\" }),\n];\n\n// Compose it with additional conversation\nconst extendedSchema = () =\u003e [\n  ...greeting(),\n  user({ content: \"What's the weather like?\" }),\n  assistant({ content: \"I'd be happy to check that for you!\" }),\n];\n\n// Or create variations\nconst formalGreeting = () =\u003e [\n  system({ content: \"You are a professional assistant.\" }),\n  user({ content: \"Good morning.\" }),\n  assistant({ content: \"Good morning. How may I assist you today?\" }),\n];\n\nconst schema = () =\u003e [\n  // Weighted selection between schema branches\n  oneOf([\n    { value: greeting(), weight: 0.6 },\n    formalGreeting(),\n    extendedSchema(),\n  ]),\n  // Continue with shared conversation flow\n  generatedUser({ prompt: \"Ask a question\" }),\n  generatedAssistant({ prompt: \"Provide helpful answer\" }),\n];\n```\n\n\u003e 💡 See full example: [`examples/schema-composition.ts`](examples/schema-composition.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/schema-composition)\n\n### Row Metadata\n\nUse `metadata({ ... })` inside your schema to hoist custom fields into the generated row. The helper runs during the check phase, so you can safely compute metadata once and reuse it in generation. 
If you pass a function instead, it receives the current metadata object (merged with any top-level metadata you provided) and may mutate it or return a new object, which enables advanced scenarios like simple counters.\n\n```typescript\nconst schema = () =\u003e [\n  system({ content: \"You are a helpful assistant.\" }),\n  oneOf([\n    () =\u003e [metadata({ variant: \"static\" }), assistant({ content: \"Hello!\" })],\n    () =\u003e [\n      metadata({ variant: \"generated\" }),\n      generatedAssistant({ prompt: \"Greet the user warmly\" }),\n    ],\n  ]),\n];\n\nconst withCounter = () =\u003e [\n  metadata((meta) =\u003e {\n    meta.count = meta.count ?? 0;\n    meta.count += 1;\n  }),\n  generatedUser({ prompt: \"Ask a question\" }),\n];\n```\n\nWhen the dataset is saved, you can read these values under `row.meta.metadata`.\n\n#### Automatic ID Generation\n\nWhen generating datasets with a seed, each row automatically receives a unique, deterministic `id` in its metadata. This ID is generated based on the seed value, making it easy to identify and track specific rows across multiple runs.\n\n```typescript\nawait generateDataset(schema, {\n  count: 3,\n  seed: 100,\n  model: openai(\"gpt-4o-mini\"),\n});\n\n// Output in row.meta.metadata:\n// Row 0: { id: \"row_100_1l2dpno\" }  // seed: 100\n// Row 1: { id: \"row_101_txnff9\" }   // seed: 101\n// Row 2: { id: \"row_102_2sx56u\" }   // seed: 102\n```\n\nThe ID combines the seed value with a deterministic hash, ensuring:\n\n- **Reproducibility**: Same seed always generates the same ID\n- **Uniqueness**: Different seeds produce different IDs\n- **Traceability**: Easy to reference specific examples in logs or when combining datasets\n\nIf custom metadata is provided, the ID is automatically merged with it:\n\n```typescript\nawait generateDataset(schema, {\n  count: 2,\n  seed: 100,\n  model: openai(\"gpt-4o-mini\"),\n  metadata: { projectName: \"my-project\" },\n});\n\n// Output: { id: \"row_100_1l2dpno\", projectName: 
\"my-project\" }\n```\n\n\u003e **Note**: IDs are only generated when a seed is provided. Without a seed, no ID is added.\n\n### Composition Utilities\n\nBuild dynamic, varied datasets with composition helpers:\n\n```typescript\nimport { oneOf, times, between, optional } from \"@qforge/torque\";\n\nconst schema = () =\u003e [\n  // Choose randomly from options (weights optional)\n  oneOf([\n    user({ content: \"Hello\" }),\n    { weight: 0.5, value: user({ content: \"Hi there\" }) },\n    user({ content: \"Hey\" }),\n  ]),\n\n  // Repeat pattern 3 times\n  times(3, [\n    generatedUser({ prompt: \"Ask a question\" }),\n    generatedAssistant({ prompt: \"Answer the question\" }),\n  ]),\n\n  // Repeat random number of times (1-5)\n  times(between(1, 5), [generatedUser({ prompt: \"Follow-up question\" })]),\n\n  // Optionally include (50% chance)\n  optional(assistant({ content: \"Anything else I can help with?\" })),\n];\n```\n\n`oneOf` accepts plain schema entries or `{ value, weight }` objects. Provide any subset of weights (summing to ≤ 1) and the remaining probability is spread evenly across unweighted entries.\n\n#### Unique draws across a dataset\n\nPass a `uniqueBy` configuration when you need each option to be used at most once across every row/schema during generation. 
When using `uniqueBy`, each option must be an object with `id`, `value`, and optionally `weight`:\n\n```ts\nconst toolOptions = [\n  { id: \"weather\", value: weatherTool.toolFunction() },\n  { id: \"calendar\", value: calendarTool.toolFunction() },\n  { id: \"flight\", value: flightTool.toolFunction() },\n];\n\nconst schema = () =\u003e [\n  oneOf(toolOptions, {\n    uniqueBy: {\n      collection: \"tools\",\n    },\n  }),\n];\n```\n\nYou can also combine `uniqueBy` with weighted options:\n\n```ts\nconst toolOptions = [\n  { id: \"weather\", value: weatherTool.toolFunction(), weight: 0.5 },\n  { id: \"calendar\", value: calendarTool.toolFunction(), weight: 0.3 },\n  { id: \"flight\", value: flightTool.toolFunction(), weight: 0.2 },\n];\n\nconst schema = () =\u003e [\n  oneOf(toolOptions, {\n    uniqueBy: {\n      collection: \"tools\",\n    },\n  }),\n];\n```\n\nThe `collection` name identifies the shared pool (so multiple `oneOf` calls can coordinate). The `id` property must be a string, number, or boolean and is used to track uniqueness. Torque throws if the pool is exhausted, making it easy to guarantee perfect round-robin coverage.\n\n#### Using `uniqueOneOf` factory function\n\nFor a simpler API, use `uniqueOneOf` to automatically generate IDs and create a reusable function. 
This is especially useful when you want to create the unique selection function outside of your schema:\n\n```ts\nimport { uniqueOneOf } from \"@qforge/torque\";\n\n// Create the factory function outside generation\nconst tools = [weatherTool, calendarTool, flightTool];\nconst oneOfTools = uniqueOneOf(tools);\n\n// Or with weighted options\nconst weightedTools = [\n  { value: weatherTool, weight: 0.5 },\n  { value: calendarTool, weight: 0.3 },\n  flightTool, // unweighted, gets remaining weight\n];\nconst oneOfWeightedTools = uniqueOneOf(weightedTools);\n\nconst schema = () =\u003e {\n  const tool = oneOfTools(); // Returns a unique tool each time\n  return [\n    tool.toolFunction(),\n    generatedUser({ prompt: \"Ask question requiring this tool\" }),\n    generatedToolCall(tool, \"t1\"),\n    generatedToolCallResult(tool, \"t1\"),\n  ];\n};\n```\n\nThe `uniqueOneOf` factory automatically:\n\n- Generates unique IDs for each item\n- Creates a unique collection name\n- Returns a function that enforces uniqueness across calls\n\n\u003e 💡 See weighted example: [`examples/weighted-one-of.ts`](examples/weighted-one-of.ts)  \n\u003e 💡 Full utilities demo: [`examples/composition-utilities.ts`](examples/composition-utilities.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/composition-utilities)\n\n### Tool Definitions\n\nDefine tools with Zod schemas for complete type safety:\n\n```typescript\nimport {\n  tool,\n  generatedToolCall,\n  generatedToolCallResult,\n} from \"@qforge/torque\";\nimport { z } from \"zod\";\n\n// use standard tool schema using zod ensuring complete type safety\nconst weatherTool = tool({\n  name: \"get_weather\",\n  description: \"Get current weather for a location\",\n  parameters: z.object({\n    location: z.string().describe(\"City name\"),\n    units: z.enum([\"C\", \"F\"]).optional(),\n  }),\n  output: z.object({\n    temperature: z.number(),\n    condition: z.string(),\n  
}),\n});\n\nconst schema = () =\u003e [\n  weatherTool.toolFunction(),\n  generatedUser({ prompt: \"Ask about weather in a city\" }),\n  generatedToolCall(weatherTool, \"t1\"), // type-safe, 100% schema-correct generated tool calls\n  generatedToolCallResult(weatherTool, \"t1\"), // similarly 100% correct generated tool results\n  generatedAssistant({ prompt: \"Interpret the weather data for the user\" }),\n];\n```\n\n\u003e 💡 See full example: [`examples/tool-calling.ts`](examples/tool-calling.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/tool-calling)\n\n### 🔐 TypeScript Support\n\nTorque is built with TypeScript and provides complete type safety, both for you and for the AI generating the data, ensuring that tool arguments and results always match their schemas.\n\n```typescript\n// Full type inference for tool parameters\nconst weatherTool = tool({\n  name: \"get_weather\",\n  description: \"Get current weather for a location\",\n  parameters: z.object({\n    location: z.string().describe(\"City name\"),\n    units: z.enum([\"C\", \"F\"]).optional(),\n  }),\n  output: z.object({\n    temperature: z.number(),\n    condition: z.string(),\n  }),\n});\n\n// TypeScript knows the shape of parameters and output\nweatherTool.toolCall(\"t1\", {\n  location: \"NYC\",\n  units: \"C\", // ✅ Type-safe\n  // units: 'K' // ❌ TypeScript error\n});\n\nweatherTool.toolCallResult(\"t1\", {\n  temperature: 72,\n  condition: \"Sunny\", // ✅ Type-safe\n  // humidity: 50 // ❌ TypeScript error\n});\n```\n\n### Two-Phase Execution\n\nTorque executes in two phases:\n\n1. **Check Phase** - Analyzes conversation structure, registers tools\n2. 
**Generate Phase** - Creates actual content with AI generation\n\nThis enables:\n\n- AI awareness of the exact steps in the conversation before content is generated - you can create schemas where the LLM \"fills in the gaps\"\n- Accurate progress tracking\n- Pre-validation of conversation flow\n\n### Reproducible Generation with Seeds\n\nControl randomness for reproducible datasets:\n\n```typescript\nawait generateDataset(schema, {\n  count: 50,\n  model: openai(\"gpt-5-mini\"),\n  output: \"data/dataset.jsonl\",\n  seed: 12345, // Same seed = same output\n});\n```\n\n**How seeds work:**\n\n- The `seed` parameter ensures deterministic generation across runs\n- Same seed + same schema = identical dataset structure every time\n- Useful for debugging, testing, and versioning datasets\n- If omitted, a random seed is generated and displayed in the CLI\n- Seeds control both `torque` random selections and AI model sampling (when supported by the provider)\n\n### Background Token Counting\n\nToken counts for each row are computed off the main thread using a worker pool so dataset generation stays responsive. 
Configure the pool with `tokenCounterWorkers` (default: `3`), or disable counting entirely by setting it to `0`.\n\n```typescript\nawait generateDataset(schema, {\n  count: 20,\n  model: openai(\"gpt-5-mini\"),\n  tokenCounterWorkers: 5, // spawn 5 token-counting workers\n});\n```\n\n### Output Formats\n\nChoose your preferred output file format and data structure:\n\n```typescript\n// Export as JSONL with default ai-sdk structure (default)\nawait generateDataset(schema, {\n  count: 100,\n  model: openai(\"gpt-4o-mini\"),\n  format: \"jsonl\",\n  output: \"data/dataset.jsonl\",\n});\n\n// Export in OpenAI Chat Completions format (tools + messages structure)\nawait generateDataset(schema, {\n  count: 100,\n  model: openai(\"gpt-4o-mini\"),\n  format: \"jsonl\",\n  exportFormat: \"chat_template\",\n  output: \"data/finetune.jsonl\",\n});\n```\n\n**Supported File Formats (`format`):**\n\n- **`jsonl`** (default) - JSON Lines format, one row per line. Best for streaming and line-by-line processing.\n- **`parquet`** - Apache Parquet columnar format. More efficient for large datasets and analytics tools (e.g., Pandas, DuckDB, Apache Spark).\n\n**Supported Data Structures (`exportFormat`):**\n\n- **`ai-sdk`** (default) - Internal Torque format, compatible with Vercel AI SDK. Includes schema metadata, tool definitions, and full message objects.\n- **`chat_template`** - OpenAI Chat Completions compatible format. Flattened message structure with `tools` and `messages` top-level keys. 
Ideal for fine-tuning or direct API usage.\n\nBoth formats write rows incrementally as they're generated, so large datasets won't consume excessive memory.\n\n\u003e 💡 When `format` is specified without `output`, the file extension is automatically set based on the format.\n\n\u003e 💡 See full example: [`examples/parquet-export.ts`](examples/parquet-export.ts)\n\n## 🔧 Advanced Examples\n\n### Async Tool Pattern\n\nModel conversations where tools take time to execute:\n\n```typescript\nimport {\n  generateDataset,\n  generatedUser,\n  generatedAssistant,\n  generatedToolCall,\n  generatedToolCallResult,\n  tool,\n  times,\n  between,\n} from \"@qforge/torque\";\nimport { openai } from \"@ai-sdk/openai\";\nimport { z } from \"zod\";\n\nconst searchTool = tool({\n  name: \"web_search\",\n  description: \"Search the web\",\n  parameters: z.object({ query: z.string() }),\n  output: z.object({ results: z.array(z.string()) }),\n});\n\nawait generateDataset(\n  () =\u003e [\n    searchTool.toolFunction(),\n\n    // Initial request\n    generatedUser({ prompt: \"Ask for information requiring web search\" }),\n\n    // Tool call generated based on the user request\n    generatedToolCall(searchTool, \"search-1\"),\n\n    // Immediate acknowledgment\n    searchTool.toolCallResult(\"search-1\", \"\u003ctool_ack /\u003e\"),\n\n    generatedAssistant({\n      prompt: \"Acknowledge search started, assure user it's in progress\",\n    }),\n\n    // Filler conversation while waiting.\n    // While generating, the AI is aware of how many messages are left.\n    times(between(1, 3), [\n      generatedUser({ prompt: \"Casual conversation, unrelated to search\" }),\n      generatedAssistant({ prompt: \"Respond naturally to casual topic\" }),\n    ]),\n\n    // Actual result arrives with reused arguments\n    generatedToolCall(searchTool, \"search-1-FINAL\", {\n      reuseArgsFrom: \"search-1\",\n    }),\n    // Generated actual result based on previously generated tool call\n    generatedToolCallResult(searchTool, \"search-1-FINAL\"),\n  
  generatedAssistant({ prompt: \"Present search results to user\" }),\n  ],\n  {\n    count: 50,\n    model: openai(\"gpt-5-mini\"),\n    output: \"data/async-tools.jsonl\",\n  }\n);\n```\n\n\u003e 💡 See full example: [`examples/async-tools.ts`](examples/async-tools.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/async-tools)\n\n### Custom Generation Context\n\nGuide the AI's generation style globally:\n\n```typescript\nawait generateDataset(schema, {\n  count: 100,\n  model: openai(\"gpt-5-mini\"),\n  output: \"data/dataset.jsonl\",\n  generationContext: {\n    global: {\n      messages: [\n        {\n          role: \"system\",\n          content:\n            'Keep messages concise and natural. Avoid starting with \"Sure\" or \"Thanks\".',\n        },\n      ],\n    },\n    user: {\n      messages: [\n        {\n          role: \"system\",\n          content:\n            \"Generate diverse user messages with varying levels of technical detail.\",\n        },\n      ],\n    },\n    assistant: {\n      messages: [\n        {\n          role: \"system\",\n          content:\n            \"Assistant should be helpful but concise. 
Use 2-3 sentences max.\",\n        },\n      ],\n    },\n  },\n});\n```\n\n\u003e 💡 See full example: [`examples/custom-generation-context.ts`](examples/custom-generation-context.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/custom-generation-context)\n\n### Multiple Tool Variations\n\nGenerate datasets with different tools:\n\n```typescript\nimport { oneOf } from \"@qforge/torque\";\n\nconst tools = [weatherTool, calculatorTool, searchTool];\n\nawait generateDataset(\n  () =\u003e {\n    const tool = oneOf(tools);\n\n    return [\n      tool.toolFunction(),\n      generatedUser({ prompt: \"Ask question requiring this tool\" }),\n      generatedToolCall(tool, \"t1\"),\n      generatedToolCallResult(tool, \"t1\"),\n      generatedAssistant({ prompt: \"Present the result\" }),\n    ];\n  },\n  {\n    count: 300, // 100 examples per tool\n    model: openai(\"gpt-5-mini\"),\n    output: \"data/multi-tool.jsonl\",\n  }\n);\n```\n\n\u003e 💡 See full example: [`examples/multiple-tool-variations.ts`](examples/multiple-tool-variations.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/multiple-tool-variations)\n\n### Realistic Fake Data with Faker\n\nTorque includes built-in [Faker.js](https://fakerjs.dev/) integration that automatically respects the seed system for reproducible fake data generation:\n\n```typescript\nimport {\n  generateDataset,\n  generatedUser,\n  generatedAssistant,\n  faker,\n} from \"@qforge/torque\";\n\nawait generateDataset(\n  () =\u003e [\n    generatedUser({\n      prompt: `Introduce yourself as ${faker.person.fullName()} from ${faker.location.city()}`,\n    }),\n    generatedAssistant({\n      prompt: \"Greet the user warmly\",\n    }),\n  ],\n  {\n    count: 100,\n    model: openai(\"gpt-5-mini\"),\n    output: \"data/personas.jsonl\",\n    seed: 42, // Same seed = same fake names and cities\n  }\n);\n```\n\n**Faker automatically 
uses Torque's seed system**, so:\n\n- Same seed = identical fake data across runs\n- No manual seed configuration needed\n- Perfect for creating realistic user personas, product data, addresses, emails, etc.\n\n**Common use cases:**\n\n- User personas: `faker.person.fullName()`, `faker.person.jobTitle()`\n- Locations: `faker.location.city()`, `faker.location.country()`\n- E-commerce: `faker.commerce.productName()`, `faker.commerce.price()`\n- Contact info: `faker.internet.email()`, `faker.phone.number()`\n- Dates: `faker.date.future()`, `faker.date.past()`\n\n\u003e 💡 See full example: [`examples/faker-integration.ts`](examples/faker-integration.ts) | [▶️ Try in Browser](https://stackblitz.com/github/qforge-dev/torque/tree/main/stackblitz-templates/faker-integration)\n\n## 🎨 CLI Features\n\nTorque includes a beautiful CLI interface with:\n\n- **Real-time progress bar** showing completed/in-progress generations\n- **Per-generation step tracking** (e.g., \"user message\", \"tool-call (web_search)\")\n- **Token counting** for messages and tools\n- **Concurrent execution** with configurable workers\n- **Seed display** for reproducible runs\n- **Output file location** clearly shown\n\n```\n╭────────────────────────────────────────────────────╮\n│ Dataset Generation                                 │\n├────────────────────────────────────────────────────┤\n│ Total:       100                                   │\n│ Completed:   45                                    │\n│ In Progress: 5                                     │\n│ Seed:        42                                    │\n│ Output:      data/dataset_2025-10-30.jsonl         │\n│ Workers:     5                                     │\n├────────────────────────────────────────────────────┤\n│ ████████████░░░░░░░░░░░░░ 45%                      │\n├────────────────────────────────────────────────────┤\n│ #0: [████████████████░░░░] 80% tool-result (search)│\n│ #1: [██████░░░░░░░░░░░░░░] 30% user message        │\n│ #2: 
[████████████████████] 100% Writing...         │\n│ #3: [██░░░░░░░░░░░░░░░░░░] 10% assistant message   │\n│ #4: [██████████░░░░░░░░░░] 50% tool-call (calc)    │\n╰────────────────────────────────────────────────────╯\n```\n\n## 🤝 Contributing\n\nContributions are welcome! This is part of a larger project exploring async tool patterns in LLMs.\n\n## 📄 License\n\nMIT License - see [LICENSE](LICENSE) for details\n\n## 🔗 Related\n\nBuilt with:\n\n- [Vercel AI SDK](https://sdk.vercel.ai) - Universal AI provider interface\n- [Zod](https://zod.dev) - TypeScript-first schema validation\n- [Bun](https://bun.sh) - Fast JavaScript runtime\n\n---\n\n**Made with ❤️ for the AI tinkerers community**\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqforge-dev%2Ftorque","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqforge-dev%2Ftorque","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqforge-dev%2Ftorque/lists"}