https://github.com/krahd/squarespace-to-astro
Python CLI for migrating Squarespace sites to static-site-friendly content snapshots and Astro projects
astro cli migration python squarespace static-site-generator web-scraping xml-import
- Host: GitHub
- URL: https://github.com/krahd/squarespace-to-astro
- Owner: krahd
- License: mit
- Created: 2026-04-05T20:29:23.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-20T04:05:49.000Z (about 2 months ago)
- Last Synced: 2026-04-20T06:17:07.741Z (about 2 months ago)
- Topics: astro, cli, migration, python, squarespace, static-site-generator, web-scraping, xml-import
- Language: Python
- Homepage: https://krahd.github.io/squarespace-to-astro/
- Size: 229 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Audit: AUDIT-2026-04-20.md
- Roadmap: ROADMAP.md
# squarespace-to-astro (s2a)
`s2a` is a Python CLI for extracting content from Squarespace and generating an editable [Astro](https://github.com/withastro/astro) project. This README is the repository entry point for developers, contributors, and advanced users who want to run the project from source or understand how the codebase is organized.
Current release: `v0.5.2`.
For end-user installation and day-to-day usage:
- Project website: [krahd.github.io/squarespace-to-astro](https://krahd.github.io/squarespace-to-astro/)
- Detailed usage guide: [USER_GUIDE.md](USER_GUIDE.md)
## Documentation map
- [USER_GUIDE.md](USER_GUIDE.md): end-user installation, migration workflow, and generated Astro editing
- [CONTRIBUTING.md](CONTRIBUTING.md): contributor setup and pull request expectations
- [DEVELOPMENT.md](DEVELOPMENT.md): codebase layout, architecture, testing, and distribution tooling
- [RELEASE.md](RELEASE.md): versioning, tagging, binary publishing, and Homebrew tap publication
- [CHANGELOG.md](CHANGELOG.md): released changes by version
## Project status
The current implementation supports these main workflows:
- probing a target site for Squarespace indicators, sitemap availability, robots behavior, password gates, and `?format=json-pretty` support
- crawling a site into a structured snapshot of pages, links, assets, headings, and opportunistic Squarespace JSON data
- estimating and downloading Squarespace-hosted assets during `crawl` and `migrate`, with a confirmation prompt, text progress output, route-based public filenames, and content-hash deduplication recorded in `asset_manifest.json`
- capturing browser-authenticated session state with Playwright and reusing those cookies during probe and crawl runs
- importing Squarespace WordPress XML exports into a normalized JSON format
- generating a buildable [Astro](https://github.com/withastro/astro) project from crawl output plus optional XML content, including `--fidelity-mode`, `--layout-strategy`, `--choose-layout-strategy`, and `--markdown` controls for layout-heavy pages, with `components` reconstruction for portfolio grids, gallery blocks, Fluid Engine sections, and classic-editor layouts
- orchestrating probe, crawl, optional XML import, asset download, and Astro generation in one `s2a migrate` workflow
- upgrading older snapshot-root `asset_manifest.json` files automatically during `generate-astro` so legacy hash-suffixed localized filenames are rewritten to the current route-based naming scheme
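
The asset handling described above combines route-based filenames with content-hash deduplication. The following is a minimal sketch of that idea, not the project's actual implementation: the function names, the `<route>-<index>` naming pattern, the choice of SHA-256, and the manifest fields are all assumptions made for illustration, and the real `asset_manifest.json` schema may differ.

```python
import hashlib
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse


def route_based_name(page_route: str, index: int, source_url: str) -> str:
    """Build a public filename from a page route (hypothetical naming scheme)."""
    slug = page_route.strip("/").replace("/", "-") or "home"
    # Take the extension from the URL path only, ignoring any query string.
    suffix = PurePosixPath(urlparse(source_url).path).suffix or ".bin"
    return f"{slug}-{index}{suffix}"


def build_manifest(assets):
    """Map each source URL to a public filename, storing identical bytes once.

    `assets` is an iterable of (page_route, source_url, content_bytes) tuples.
    """
    by_hash = {}   # content hash -> public filename already assigned
    counters = {}  # page route -> number of files named after that route
    manifest = {}  # source URL -> manifest entry
    for page_route, source_url, content in assets:
        digest = hashlib.sha256(content).hexdigest()
        if digest not in by_hash:
            counters[page_route] = counters.get(page_route, 0) + 1
            by_hash[digest] = route_based_name(
                page_route, counters[page_route], source_url
            )
        manifest[source_url] = {"sha256": digest, "file": by_hash[digest]}
    return manifest


# Example data (invented): two URLs serve the same bytes, so they share a file.
example_assets = [
    ("/work/", "https://images.example/a.jpg?format=100w", b"AAA"),
    ("/about", "https://images.example/b.jpg", b"AAA"),
    ("/", "https://images.example/c.png", b"BBB"),
]
example_manifest = build_manifest(example_assets)
manifest_json = json.dumps(example_manifest, indent=2)
```

Keying the lookup on the content hash rather than the URL means the same image requested at several URLs is written once, while the filename stays readable because it derives from the first route that referenced it.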
Current boundaries:
- full Squarespace admin automation is not implemented
- redirect generation is implemented only through the `--emit-redirects` flag, which writes redirect mappings as JSON and a Netlify `_redirects` file
- commerce, events, forms, and members migration are not implemented
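
To illustrate the two redirect outputs mentioned above, here is a hedged sketch of how a mapping of old paths to new paths could be serialized as JSON and as Netlify `_redirects` lines (one `from to status` rule per line). The mapping shape, the function name, the example paths, and the 301 status are assumptions; the files that `--emit-redirects` actually writes may be structured differently.

```python
import json


def render_redirects(mappings, status=301):
    """Render {old_path: new_path} as Netlify `_redirects` text.

    Entries whose old and new paths are identical are skipped,
    because they need no redirect.
    """
    lines = []
    for old_path, new_path in sorted(mappings.items()):
        if old_path == new_path:
            continue
        lines.append(f"{old_path} {new_path} {status}")
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


# Hypothetical mapping from Squarespace-style URLs to generated Astro routes.
mappings = {
    "/blog/2020/1/1/hello-world": "/blog/hello-world",
    "/about-1": "/about",
    "/contact": "/contact",
}
redirects_text = render_redirects(mappings)
redirects_json = json.dumps(mappings, indent=2, sort_keys=True)
```

Sorting the mapping keeps both outputs stable between runs, which makes the generated files easier to review in version control.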
## Supported interfaces
The supported interface for external use is the `s2a` CLI.
The Python modules under `src/s2a/` are intended primarily as implementation details for the CLI and may change between releases. If you import them directly in your own code, pin the project version and review the relevant module before depending on internal behavior.
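
Because the CLI is the supported interface, one way to script `s2a` from Python without importing its internal modules is to call the executable as a subprocess. The sketch below assumes `s2a` is installed and on `PATH`; the helper names are hypothetical, and the only argument shown is `--help`, which is assumed to be accepted by subcommands as well as by the top-level command.

```python
import shutil
import subprocess

# The six user-facing commands listed in this README.
S2A_COMMANDS = (
    "probe",
    "crawl",
    "auth-browser",
    "import-xml",
    "generate-astro",
    "migrate",
)


def build_s2a_command(subcommand, *args, executable="s2a"):
    """Return the argument list for an s2a invocation (hypothetical helper)."""
    if subcommand not in S2A_COMMANDS:
        raise ValueError(f"unknown s2a subcommand: {subcommand!r}")
    return [executable, subcommand, *args]


def run_s2a(subcommand, *args):
    """Run s2a and return the completed process; raises if s2a is missing."""
    if shutil.which("s2a") is None:
        raise FileNotFoundError("s2a is not on PATH; install it first")
    return subprocess.run(
        build_s2a_command(subcommand, *args),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout and stderr for the caller
        text=True,
    )
```

For example, `run_s2a("probe", "--help").stdout` would return the help text for `probe` under the assumption above. Passing arguments as a list avoids shell quoting issues, and pinning the installed `s2a` version keeps scripted calls stable across releases.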
## Repository layout
- `src/s2a/cli.py`: CLI entry point and command definitions
- `src/s2a/extract/`: HTTP crawling, browser auth capture, XML import, and asset handling
- `src/s2a/normalize/`: report building and data normalization
- `src/s2a/generate/`: Astro project generation
- `scripts/build_binary_release.py`: PyInstaller bundle and standalone release archive builder
- `scripts/render_homebrew_formula.py`: Homebrew formula renderer from release metadata
- `.github/workflows/release-binaries.yml`: GitHub Actions workflow for binary bundle publication
- `.github/workflows/publish-homebrew-tap.yml`: GitHub Actions workflow for tap synchronization
- `tests/`: CLI, generator, auth, XML import, asset, and runtime tests
- `docs/`: static project website published from the repository
## Development quickstart
Python 3.11 or newer is required.
```bash
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m playwright install chromium
```
Once installed, check the CLI entry point:
```bash
s2a --help
```
## Running tests
Run the automated test suite with:
```bash
python -m pytest
```
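For release-related smoke testing, build the standalone bundle after installing Playwright Chromium and pointing `PLAYWRIGHT_BROWSERS_PATH` at the installed browser cache. The path below is Playwright's default cache location on macOS; on Linux the default is `~/.cache/ms-playwright`: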
```bash
PLAYWRIGHT_BROWSERS_PATH="$HOME/Library/Caches/ms-playwright" \
python scripts/build_binary_release.py
```
See [DEVELOPMENT.md](DEVELOPMENT.md) for a fuller explanation of the test layout, output directories, and build artifacts.
## CLI workflows
The CLI exposes six user-facing commands, including the combined migration workflow:
- `s2a probe`
- `s2a crawl`
- `s2a auth-browser`
- `s2a import-xml`
- `s2a generate-astro`
- `s2a migrate`
Usage examples and end-user task flows live in [USER_GUIDE.md](USER_GUIDE.md). The authoritative command definitions and help text live in [src/s2a/cli.py](src/s2a/cli.py).
## Distribution
This project currently ships in three ways. The current release is `v0.5.2`.
- standalone binary bundles attached to GitHub Releases
- a Homebrew formula published through `krahd/homebrew-tap`
- source-based installation from this repository
Distribution automation is documented in [RELEASE.md](RELEASE.md) and [DEVELOPMENT.md](DEVELOPMENT.md).
## Contributing
Contribution guidance lives in [CONTRIBUTING.md](CONTRIBUTING.md). Keep changes focused, add or update tests when behavior changes, and update end-user documentation when install or workflow behavior changes.
## License
This project is licensed under the MIT License. See [LICENSE](LICENSE) for details.
## Disclaimer
This tool is provided "as is", without warranty of any kind. You assume responsibility for its use, migration outcomes, and any downstream issues that may result from running it.