{"id":50138803,"url":"https://github.com/casoon/astro-site-files","last_synced_at":"2026-05-24T00:02:57.804Z","repository":{"id":356083461,"uuid":"1230929860","full_name":"casoon/astro-site-files","owner":"casoon","description":"Astro integration that generates robots.txt, llms.txt, sitemap.xml, security.txt, and humans.txt at build time from typed configuration.","archived":false,"fork":false,"pushed_at":"2026-05-06T15:22:34.000Z","size":68,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-06T15:47:58.391Z","etag":null,"topics":["astro","astro-integration","humans-txt","llms-txt","robots-txt","security-txt","seo","sitemap","typescript"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/casoon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-06T13:09:18.000Z","updated_at":"2026-05-06T15:26:36.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/casoon/astro-site-files","commit_stats":null,"previous_names":["casoon/astro-site-files"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/casoon/astro-site-files","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casoon%2Fastro-site-files","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casoon%2Fastro-site-files/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casoon%2Fastro-site-files/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casoon%2Fastro-site-files/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/casoon","download_url":"https://codeload.github.com/casoon/astro-site-files/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casoon%2Fastro-site-files/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33416316,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-23T22:14:44.296Z","status":"ssl_error","status_checked_at":"2026-05-23T22:14:43.778Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["astro","astro-integration","humans-txt","llms-txt","robots-txt","security-txt","seo","sitemap","typescript"],"created_at":"2026-05-24T00:02:55.279Z","updated_at":"2026-05-24T00:02:57.775Z","avatar_url":"https://github.com/casoon.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# @casoon/astro-site-files\n\nAstro integration that generates all standard site meta-files from typed configuration at build time.\n\n## What it does\n\n- Generates `robots.txt` — crawl rules with per-agent overrides and automatic sitemap reference\n- Generates `llms.txt` — AI model discovery file following the [llmstxt.org](https://llmstxt.org) specification\n- Generates `sitemap.xml` — built-in, enabled by default, with i18n hreflang and sitemap-index support\n- Generates `/.well-known/security.txt` — vulnerability disclosure contact per [RFC 9116](https://www.rfc-editor.org/rfc/rfc9116)\n- Generates `humans.txt` — team and technology credits per [humanstxt.org](https://humanstxt.org)\n\nAll files are written to the build output directory when `astro build` runs.\n\n\u003e **Successor package.** This integration replaces [@casoon/astro-crawler-policy](https://github.com/casoon/astro-crawler-policy) (robots.txt + llms.txt) and [@casoon/astro-sitemap](https://github.com/casoon/astro-sitemap) (sitemap.xml). Both predecessor packages are no longer actively maintained.\n\n## Installation\n\n```sh\nnpm install @casoon/astro-site-files\n```\n\n## Quick start\n\n```ts\n// astro.config.ts\nimport { defineConfig } from 'astro/config'\nimport siteFiles from '@casoon/astro-site-files'\n\nexport default defineConfig({\n  site: 'https://example.com',\n  integrations: [\n    siteFiles({\n      robots: { disallow: ['/admin'] },\n      llms: { title: 'Example', description: 'An example website.' },\n      security: { contact: 'mailto:security@example.com' },\n      humans: {\n        team: [{ name: 'Alice', role: 'Development' }],\n        technology: ['Astro', 'TypeScript']\n      }\n    })\n  ]\n})\n```\n\n`robots.txt` and `sitemap.xml` are enabled by default. The other three files are generated only when their option is configured.\n\n## robots.txt\n\n```ts\nsiteFiles({\n  robots: {\n    disallow: ['/admin', '/private/'],\n    allow: ['/admin/public/'],\n    crawlDelay: 2,\n    sitemap: true,           // auto-derive from astro.config site URL (default)\n    agents: [\n      {\n        userAgent: 'Googlebot',\n        crawlDelay: 1\n      }\n    ]\n  }\n})\n```\n\n**Option reference:**\n\n| Option | Type | Default | Description |\n|---|---|---|---|\n| `disallow` | `string[]` | `[]` | Paths to disallow for `User-agent: *` |\n| `allow` | `string[]` | `[]` | Paths to explicitly allow for `User-agent: *` |\n| `crawlDelay` | `number` | — | Crawl-delay for `User-agent: *` |\n| `sitemap` | `boolean \\| string` | `true` | `true` = derive URL from `astro.config.site`, `string` = explicit URL, `false` = omit |\n| `agents` | `AgentRule[]` | `[]` | Additional per-agent rule blocks |\n\nEach entry in `agents`:\n\n| Field | Type | Description |\n|---|---|---|\n| `userAgent` | `string \\| string[]` | User-agent value(s) |\n| `allow` | `string[]` | Paths to allow |\n| `disallow` | `string[]` | Paths to disallow |\n| `crawlDelay` | `number` | Crawl-delay for this agent |\n\n**Disable:** `robots: false`\n\n**Generated output:**\n\n```\nUser-agent: *\nDisallow: /admin\nDisallow: /private/\nAllow: /admin/public/\nCrawl-delay: 2\n\nUser-agent: Googlebot\nCrawl-delay: 1\n\nSitemap: https://example.com/sitemap.xml\n```\n\n## llms.txt\n\nFollows the [llmstxt.org](https://llmstxt.org) specification. Provides structured metadata for AI models discovering what your site is about.\n\n```ts\nsiteFiles({\n  llms: {\n    title: 'Example',\n    description: 'An example website focused on TypeScript tooling.',\n    details: 'This site documents internal tools and workflows.',\n    sections: [\n      {\n        title: 'Documentation',\n        links: [\n          { title: 'Getting started', url: '/docs/start', description: 'Setup guide' },\n          { title: 'API reference', url: '/docs/api' }\n        ]\n      }\n    ]\n  }\n})\n```\n\n**Option reference:**\n\n| Option | Type | Description |\n|---|---|---|\n| `title` | `string` | **Required.** Site or project name |\n| `description` | `string` | Short description rendered as a blockquote |\n| `details` | `string` | Additional plain-text context |\n| `sections` | `Section[]` | Named sections with link lists |\n\nEach entry in `sections`:\n\n| Field | Type | Description |\n|---|---|---|\n| `title` | `string` | Section heading |\n| `links` | `Link[]` | Optional list of links |\n\nEach entry in `links`:\n\n| Field | Type | Description |\n|---|---|---|\n| `title` | `string` | Link label |\n| `url` | `string` | Absolute or relative URL |\n| `description` | `string` | Optional inline description after the link |\n\n**Disable:** Omit the option or set `llms: false`\n\n**Generated output:**\n\n```md\n# Example\n\n\u003e An example website focused on TypeScript tooling.\n\nThis site documents internal tools and workflows.\n\n## Documentation\n\n- [Getting started](/docs/start): Setup guide\n- [API reference](/docs/api)\n```\n\n## sitemap.xml\n\nSitemap generation is built-in and enabled by default. Static pages are discovered automatically from Astro's build output. Dynamic URLs can be added via `sources`.\n\n```ts\nsiteFiles({\n  sitemap: {\n    exclude: ['/landing/'],\n    priority: [{ pattern: '/blog/', priority: 0.9 }],\n    sources: [\n      async () =\u003e {\n        const posts = await getCollection('blog')\n        return posts.map(p =\u003e ({ loc: `/blog/${p.id}/`, lastmod: p.data.date }))\n      }\n    ]\n  }\n})\n```\n\n**Option reference:**\n\n| Option | Type | Description |\n|---|---|---|\n| `siteUrl` | `string` | Override the site URL (auto-detected from `astro.config.site`) |\n| `sources` | `SitemapSource[]` | Async functions returning additional `SitemapEntry[]` |\n| `exclude` | `(string \\| RegExp)[]` | URL paths or patterns to exclude |\n| `filter` | `(url: string) =\u003e boolean` | Custom filter on the full absolute URL |\n| `priority` | `PriorityRule[]` | Pattern-based priority overrides (first match wins) |\n| `changefreq` | `ChangefreqRule[]` | Pattern-based changefreq overrides (first match wins) |\n| `serialize` | `(entry) =\u003e entry \\| undefined` | Per-item transform or filter hook |\n| `i18n` | `{ defaultLocale, locales }` | Generates `\u003cxhtml:link rel=\"alternate\"\u003e` hreflang entries |\n| `output.mode` | `'single' \\| 'index'` | `index` splits into numbered chunks (auto when \u003e `maxUrls`) |\n| `output.maxUrls` | `number` | Max URLs per file in index mode — default `50 000` |\n| `output.filename` | `string` | Output filename — default `sitemap.xml` |\n| `audit.warnOnEmpty` | `boolean` | Warn when sitemap has zero entries — default `true` |\n| `audit.errorOnDuplicates` | `boolean` | Emit error instead of warning for duplicate URLs — default `false` |\n\n**Built-in exclusions** (always applied): `/404`, `/500`, `/_*`, `/api/`, `/landing/`, `/drafts/`, `sitemap.xml`, `robots.txt`, `llms.txt`, `rss.xml`.\n\n**Built-in priority defaults:** `/` → 1.0, depth 1 → 0.9, depth 2 → 0.8, depth 3+ → 0.7\n\n**Built-in changefreq defaults:** `/` and content paths (`/blog/`, `/artikel/`, etc.) → `weekly`, everything else → `monthly`\n\n**Disable:** `sitemap: false`\n\n## security.txt\n\nGenerated at `/.well-known/security.txt` per [RFC 9116](https://www.rfc-editor.org/rfc/rfc9116). The `contact` field is required by the specification.\n\n```ts\nsiteFiles({\n  security: {\n    contact: 'mailto:security@example.com',\n    policy: 'https://example.com/security-policy',\n    acknowledgments: 'https://example.com/hall-of-fame',\n    preferredLanguages: ['en', 'de'],\n    expires: '2027-01-01T00:00:00.000Z',\n    hiring: 'https://example.com/jobs'\n  }\n})\n```\n\n**Option reference:**\n\n| Option | Type | Description |\n|---|---|---|\n| `contact` | `string \\| string[]` | **Required.** `mailto:` or `https:` URI for reporting vulnerabilities |\n| `policy` | `string` | URL of the security policy |\n| `acknowledgments` | `string` | URL of the acknowledgments or hall-of-fame page |\n| `preferredLanguages` | `string[]` | BCP 47 language tags, e.g. `['en', 'de']` |\n| `expires` | `string \\| Date` | ISO 8601 expiry date — when to renew the file |\n| `encryption` | `string` | URL of the PGP public key |\n| `canonical` | `string` | Canonical URL of this `security.txt` file |\n| `hiring` | `string` | URL of a security-focused jobs page |\n\n**Disable:** Omit the option or set `security: false`\n\n**Generated output:**\n\n```\nContact: mailto:security@example.com\nExpires: 2027-01-01T00:00:00.000Z\nAcknowledgments: https://example.com/hall-of-fame\nPreferred-Languages: en, de\nPolicy: https://example.com/security-policy\nHiring: https://example.com/jobs\n```\n\n## humans.txt\n\nFollows the [humanstxt.org](https://humanstxt.org) convention.\n\n```ts\nsiteFiles({\n  humans: {\n    team: [\n      { name: 'Alice', role: 'Development', location: 'Berlin' },\n      { name: 'Bob', role: 'Design', twitter: '@bob' }\n    ],\n    thanks: ['Open Source Community', 'Our early users'],\n    technology: ['Astro', 'TypeScript', 'Tailwind CSS'],\n    note: 'Built with care.'\n  }\n})\n```\n\n**Option reference:**\n\n| Option | Type | Description |\n|---|---|---|\n| `team` | `TeamMember[]` | List of team members |\n| `thanks` | `string[]` | Acknowledgment entries |\n| `technology` | `string[]` | Technologies used — rendered as a comma-separated list |\n| `note` | `string` | Free-form note |\n| `lastUpdate` | `string \\| Date` | Defaults to the build date |\n\nEach entry in `team`:\n\n| Field | Type | Description |\n|---|---|---|\n| `name` | `string` | **Required.** Full name |\n| `role` | `string` | Job title or role |\n| `twitter` | `string` | Twitter / X handle |\n| `location` | `string` | City or country |\n| `email` | `string` | Contact email |\n\n**Disable:** Omit the option or set `humans: false`\n\n**Generated output:**\n\n```\n/* TEAM */\n    Name: Alice\n    Role: Development\n    Location: Berlin\n\n/* SITE LAST UPDATED */\n    2026-05-06\n\n/* TECHNOLOGY COLOPHON */\n    Astro, TypeScript, Tailwind CSS\n```\n\n## Build-time audit hints\n\nThe integration emits build-time hints when configuration looks incomplete or incorrect. Each hint has a rule ID, a level (`info` / `warn`), and a help message.\n\n**All rule IDs:**\n\n| Rule ID | Level | Triggered when |\n|---|---|---|\n| `robots/legal-pages-blocked` | warn | A legal page (`/privacy`, `/terms`, `/impressum`, …) is in `disallow` |\n| `llms/no-description` | info | `llms` has no `description` |\n| `llms/no-sections` | info | `llms` has no `sections` |\n| `llms/sections-without-links` | info | Sections exist but none have `links` |\n| `security/no-expires` | warn | `security` has no `expires` date (required by RFC 9116) |\n| `security/no-policy` | info | `security` has no `policy` URL |\n| `humans/no-team` | info | `humans` has no `team` entries |\n| `humans/no-technology` | info | `humans` has no `technology` entries |\n\n**Disable all hints:**\n\n```ts\nsiteFiles({ audit: false })\n```\n\n**Suppress specific rules:**\n\n```ts\nsiteFiles({\n  audit: {\n    disable: [\n      'llms/no-description',\n      'security/no-expires',\n    ],\n  },\n})\n```\n\n**`audit` option reference:**\n\n| Option | Type | Description |\n|---|---|---|\n| `enabled` | `boolean` | Set to `false` to silence all hints |\n| `disable` | `string[]` | Rule IDs to suppress individually |\n\nPassing `audit: false` is equivalent to `audit: { enabled: false }`.\n\n## Option defaults\n\n| Option | Default behavior |\n|---|---|\n| `robots` | Enabled — generates `robots.txt` with `Disallow:` (allow all) |\n| `llms` | Disabled — requires `{ title }` |\n| `sitemap` | Enabled — built-in sitemap generation from Astro's build output |\n| `security` | Disabled — requires `{ contact }` |\n| `humans` | Disabled — generates when any option is provided |\n| `audit` | Enabled — emits build-time hints for all generated files |\n\n## Programmatic usage\n\nThe renderer functions are exported for use outside of the Astro integration:\n\n```ts\nimport {\n  renderRobotsTxt,\n  renderLlmsTxt,\n  renderSecurityTxt,\n  renderHumansTxt,\n  renderSitemapXml,\n  renderSitemapIndex,\n  resolveEntry,\n  deduplicateEntries,\n  auditSitemap,\n  auditRobots,\n  auditLlms,\n  auditSecurity,\n  auditHumans,\n  filterIssues,\n} from '@casoon/astro-site-files'\nimport type { AuditOptions, AuditIssue } from '@casoon/astro-site-files'\n\nconst robots = renderRobotsTxt({ disallow: ['/admin'] }, 'https://example.com')\nconst llms = renderLlmsTxt({ title: 'My Site', description: 'A site.' })\nconst security = renderSecurityTxt({ contact: 'mailto:security@example.com' })\nconst humans = renderHumansTxt({ team: [{ name: 'Alice' }], technology: ['Astro'] })\n\nconst entries = [{ loc: '/blog/post/' }].map(e =\u003e resolveEntry(e, {}, 'https://example.com'))\nconst xml = renderSitemapXml(deduplicateEntries(entries))\n```\n\n---\n\n\u003e This package covers static file generation. Actual crawl enforcement depends on whether bots respect these files — many do not.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasoon%2Fastro-site-files","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcasoon%2Fastro-site-files","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasoon%2Fastro-site-files/lists"}