{"id":26572965,"url":"https://github.com/kingsdigitallab/corpus-building","last_synced_at":"2025-03-23T00:36:56.256Z","repository":{"id":246883468,"uuid":"823033267","full_name":"kingsdigitallab/corpus-building","owner":"kingsdigitallab","description":null,"archived":false,"fork":false,"pushed_at":"2025-03-19T14:22:34.000Z","size":1222282,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-03-19T15:27:54.554Z","etag":null,"topics":["inscriptions","itemsjs","sveltekit","tei-xml"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kingsdigitallab.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-02T09:48:01.000Z","updated_at":"2025-03-12T10:09:53.000Z","dependencies_parsed_at":"2024-09-18T10:59:30.471Z","dependency_job_id":"3a9e60e6-883a-444b-acc2-d983398a6e33","html_url":"https://github.com/kingsdigitallab/corpus-building","commit_stats":null,"previous_names":["kingsdigitallab/corpus-building"],"tags_count":23,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingsdigitallab%2Fcorpus-building","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingsdigitallab%2Fcorpus-building/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingsdigitallab%2Fcorpus-building/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingsdigitallab%2Fc
orpus-building/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kingsdigitallab","download_url":"https://codeload.github.com/kingsdigitallab/corpus-building/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245040206,"owners_count":20551299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["inscriptions","itemsjs","sveltekit","tei-xml"],"created_at":"2025-03-23T00:36:55.803Z","updated_at":"2025-03-23T00:36:56.239Z","avatar_url":"https://github.com/kingsdigitallab.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Corpus Building\n\nThis project processes [EpiDoc](https://epidoc.stoa.org) [TEI](https://tei-c.org)\nXML files and presents them as a static website.\n\nIt uses a monorepo structure with two main components: an [ETL](packages/etl)\n(Extract, Transform, Load) process for handling XML files, and a\n[web application](frontend/README.md) for presenting the processed data.\n\n## Project structure\n\nThe main components of the project are:\n\n- `packages/`\n  - `etl/`: ETL package for processing XML\n- `frontend/`: Static site generator web application\n- `data/`\n  - `processed/`: Output data generated by the `etl` package after processing the `raw` data\n  - `raw/`: Git submodule for the EpiDoc files\n- `xslt/`\n  - `epidoc/`: Git submodule for XSLT stylesheets\n  - `start-edition.sef.json`: Compiled version of the XSLT to convert the XML files into HTML\n\n## Workflow\n\n```mermaid\ngraph TD\nA[EpiDoc Submodule] --\u003e B[ETL 
Process]\nX[XSLT Submodule] --\u003e B\nB --\u003e|Transform XML| C[Saxon-JS]\nC --\u003e|HTML Fragments| D[Processed HTML]\nB --\u003e|Extract Corpus Data| E[JSON Data]\nD --\u003e F[Static Site Generator]\nE --\u003e F\nG[Markdown Files] -.-\u003e F\nF --\u003e|Generate Pages| H[Static HTML]\nH -.-\u003e|Index| I[Pagefind]\nE -.-\u003e|Map Data| F\nH -.-\u003e J[Interactive Map]\n```\n\n## Getting started\n\n1. Clone this repository\n1. Initialise and update the submodules\n\n   ```sh\n   git submodule update --init --recursive\n   ```\n\n1. Install dependencies\n\n   ```sh\n   npm install\n   ```\n\n1. Run the etl process\n\n   ```sh\n   npm run etl\n   ```\n\n1. Run the development server\n\n   ```sh\n   npm run frontend:dev\n   ```\n\nThe project should be available at http://localhost:5173/.\n\n## Adding content to the site\n\nStatic pages are added to the site via [markdown files](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax). Markdown support is implemented in the\nproject using [mdsvex](https://mdsvex.pngwn.io/). Pages are added to the site by\nadding a new entry to the [frontend/src/routes/](frontend/src/routes/) directory.\n\nFirst, create a new sub-directory in the routes directory. For example, to add\na new page called \"about\", create a new directory called `about` and add\na `+page.md` file to it.\n\nThe `+page.md` file should contain the markdown content for the page. The page\nwill be added to the site and will be accessible at `http://PROJECT_URL/about`.\n\n## Editorial workflow\n\nNew editorial content should be added in the `research` branch. This branch is\nautomatically deployed to the preview site in [GitHub Pages](https://kingsdigitallab.github.io/corpus-building/), together with the `develop` and `main` branches.\n\nContent that needs to be visible to the public should be added to the `main`\nbranch. 
Changes to the `main` branch must be made via a pull request.\n\n## Deployment\n\nA [GitHub Actions workflow](.github/workflows/frontend.yml) automatically deploys the site to GitHub Pages whenever commits are pushed to the `develop`, `main` or `research` branches.\n\nThe preview site is available at [https://kingsdigitallab.github.io/corpus-building/](https://kingsdigitallab.github.io/corpus-building/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingsdigitallab%2Fcorpus-building","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkingsdigitallab%2Fcorpus-building","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingsdigitallab%2Fcorpus-building/lists"}