https://github.com/eeditiones/tuttle
- Host: GitHub
- URL: https://github.com/eeditiones/tuttle
- Owner: eeditiones
- Created: 2021-11-04T13:21:23.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2025-08-05T10:11:11.000Z (3 months ago)
- Last Synced: 2025-08-05T12:10:59.679Z (3 months ago)
- Topics: exist-db, git, github, gitlab, library, sync, xar
- Language: XQuery
- Homepage:
- Size: 1.65 MB
- Stars: 10
- Watchers: 5
- Forks: 3
- Open Issues: 9
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
# Tuttle - a Git-integration for eXist-db
Synchronizes your data collection with GitHub and GitLab.
## User Documentation
[User Documentation](https://eeditiones.github.io/tuttle-doc/)
## Functionality
* Sync data collection from Git to DB
* Deal with multiple repositories
* Incremental updates
* Works with private or public repositories
* Works with self-hosted instances
* Extendable to other git services
## Requirements
- [node](https://nodejs.org/en/): `v22`
- [exist-db](https://www.exist-db.org): `v5.5.1+ < 7.0.0`
## Installation
Pre-built packages are available
- as [github-releases](https://github.com/eeditiones/tuttle/releases)
```bash
xst package install github-release tuttle --owner eeditiones
```
- and on [exist-db's public package registry](https://exist-db.org/exist/apps/public-repo/packages/tuttle?eXist-db-min-version=5.5.1).
```bash
xst package install from-registry tuttle
```
## Building from source
Tuttle uses Gulp as its build tool, which in turn builds on NPM.
To initialize the project and load its dependencies, run
```
npm install
```
| Run | Description |
|---------|-------------|
| `npm run build` | builds the Tuttle package |
| `npm run deploy` | builds and installs Tuttle in one go |
> Note: the `deploy` command above assumes that you have a local eXist-db running on port 8080. The database connection can be configured differently (see the gulp-exist documentation).
## Testing
To run the local test suite you need
* an instance of eXist running on `localhost:8080` and
* `npm` to be available in your path
* a GitHub personal access token with read access to public repositories
* a GitLab personal access token with read access to public repositories
In CI these access tokens are read from environment variables.
You can do the same with
```bash
export tuttle_token_tuttle_sample_data=; \
export tuttle_token_gitlab_sample_data=; \
path/to/startup.sh
```
Alternatively, you can modify `/db/apps/tuttle/data/tuttle.xml` _and_ `test/fixtures/alt-tuttle.xml`, `test/fixtures/alt-repo-xml-tuttle.xml` to include your tokens. But remember to never commit them!
Run tests with
```
npm test
```
## Configuration
Tuttle is configured in `data/tuttle.xml`.
New with version 2.0.0:
A commented example configuration is available in `data/tuttle-example-config.xml`.
When you update Tuttle, your modified configuration file is backed up to
`/db/tuttle-backup/tuttle.xml` and restored when the new version is installed.
If no backup of an existing configuration file is found, the example configuration is copied to `data/tuttle.xml`.
> [!TIP]
> When migrating from an earlier version you can copy your existing configuration to the backup location:
> `xmldb:copy-resource('/db/apps/tuttle/data', 'tuttle.xml', '/db/tuttle-backup', 'tuttle.xml')`
### Repository configuration
All repositories to keep in sync with a git service are listed under the `repos` element.
The `name` attribute refers to the **destination collection**, also known as the **target collection**.
#### Collection
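An example: `<collection name="tuttle-sample-data">`
The collection `/db/apps/tuttle-sample-data` is now considered to be kept in sync with a git repository.
```xml
<collection name="tuttle-sample-data">
    <default>true</default>
    <type>github</type>
    <baseurl>https://api.github.com/</baseurl>
    <repo>tuttle-sample-data</repo>
    <owner>tuttle-sample-data</owner>
    <token>a-personal-access-token</token>
    <ref>a-branch</ref>
    <hookuser>a-exist-user</hookuser>
    <hookpasswd>that-users-password</hookpasswd>
</collection>
```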
#### type
```xml
<type>gitlab</type>
```
Two git services are currently supported: `github` and `gitlab`.
#### baseurl
```xml
<baseurl>https://api.server/</baseurl>
```
* For GitHub the baseurl is `https://api.github.com/` or your GitHub Enterprise API endpoint
* For GitLab the baseurl is `https://gitlab.com/api/v4/`, but it can also be your private GitLab server, e.g. `https://gitlab.existsolutions.com/api/v4/`
#### repo, owner and project-id
* For github you **have to** specify the owner and the repo
* For gitlab you **have to** specify the project-id of the repository
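The per-service requirements above can be checked before a configuration is deployed. The following is a minimal sketch, not part of Tuttle: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the entry is assumed to be a plain dictionary keyed by the configuration element names.

```python
# Hypothetical helper (not part of Tuttle): reports which mandatory fields
# are missing from one repository entry, based on the rules listed above.
REQUIRED_FIELDS = {
    "github": ("owner", "repo"),
    "gitlab": ("project-id",),
}


def missing_fields(entry: dict) -> list:
    """Return the names of required fields that are absent or empty."""
    service = entry.get("type")
    if service not in REQUIRED_FIELDS:
        raise ValueError(f"unsupported git service: {service!r}")
    return [name for name in REQUIRED_FIELDS[service] if not entry.get(name)]
```

An empty list means the entry carries everything its service type needs; an unknown type raises `ValueError`.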
#### ref
```xml
<ref>main</ref>
```
Defines the branch you want to track.
#### hookuser & hookpasswd
#### token
If a token is specified Tuttle authenticates against GitHub or GitLab. When a token is not defined, Tuttle assumes a public repository without any authentication.
> [!NOTE]
> Be aware of the rate limits for unauthenticated requests.
> GitHub allows 60 unauthenticated requests per hour, but 5,000 authenticated requests.
> [!TIP]
> It is also possible to pass the token via an environment variable. The name of the variable has to be `tuttle_token_` followed by the collection name, with all dashes replaced by underscores. Example: `tuttle_token_tuttle_sample_data`
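
The naming rule for the token environment variable can be expressed in a few lines. This is a small sketch for illustration only; `token_env_name` and `token_from_env` are hypothetical helpers, not functions provided by Tuttle.

```python
import os


def token_env_name(collection: str) -> str:
    """Derive the variable name: prefix plus collection, dashes to underscores."""
    return "tuttle_token_" + collection.replace("-", "_")


def token_from_env(collection: str, environ=os.environ):
    """Look the token up; returns None when the variable is unset or empty."""
    return environ.get(token_env_name(collection)) or None
```

For the collection `tuttle-sample-data` this yields the variable name `tuttle_token_tuttle_sample_data`, matching the example above.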
##### Create API keys for GitHub / GitLab
At this stage of development, the API keys must be generated via the API endpoint `/git/apikey` or, for a specific collection, `/git/{collection}/apikey`.
In the configuration `tuttle.xml`, the `hookuser` defines the database user that executes the update.
Example configuration for GitHub:
* 'Payload URL': https://existdb:8443/exist/apps/tuttle/git/hook
* 'Content type': application/json
Example configuration for GitLab:
* 'URL' : https://46.23.86.66:8443/exist/apps/tuttle/git/hook
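The two API key endpoints named in this section differ only in the optional collection segment. The sketch below builds both URLs; the helper names are hypothetical, and the application base URL (for example `https://existdb:8443/exist/apps/tuttle`, taken from the webhook example) is an assumption about your installation.

```python
from typing import Optional
from urllib.parse import quote


def apikey_path(collection: Optional[str] = None) -> str:
    """Path of the API key endpoint, with or without a collection."""
    if collection is None:
        return "/git/apikey"
    return f"/git/{quote(collection, safe='')}/apikey"


def apikey_url(app_base: str, collection: Optional[str] = None) -> str:
    """Join the (assumed) application base URL and the endpoint path."""
    return app_base.rstrip("/") + apikey_path(collection)
```

The functions only build strings; sending the request and storing the returned key are left to the caller.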
## Dashboard
The dashboard lists all configured collections showing the health
of all of them at a glance.
Here, you can trigger a full deployment or an incremental update for each collection.
Full deployment clones the repository from git at ref and installs it as a `.xar` file or just moves the staging collection.
This is a way to get to a known state in case you encounter issues.
An incremental update only applies those changes to the target collection that happened in the repository after the last synchronization.
> [!NOTE]
> Tuttle is built to keep track of **data collections**
> [!NOTE]
> Tuttle does not run pre- or post install scripts nor change the index configuration on incremental updates!
### Let's start
1) customize the configuration (`data/tuttle.xml`)
2) log in to the dashboard
3) click on 'full' to trigger a full deployment from git to eXist-db
4) now you can update your collection with a click on 'incremental'
Repositories from which a valid XAR package can be generated (those with an existing `expath-pkg.xml`) are installed as a package; all others are created as plain collections in the database.
> [!NOTE]
> Note that there may be index problems if a collection is not installed as a package.
## API
The page below is reachable via [api.html](api.html) in your installed tuttle app.

### API endpoint description
When the API is called without `{collection}`, the collection returned by `config:default-collection()` is used.
#### Fetch to staging collection
`` GET ~/tuttle/{collection}/git``
With this most basic endpoint the complete data repository is pulled from the gitservice.
The data will not directly update the target collection but be stored in a staging
collection.
To update the target collection, send a separate POST request to `/tuttle/git`.
The data collection is stored in `/db/app/sample-collection-staging`.
#### Deploy the collection
`` POST ~/tuttle/{collection}/git``
The staging collection `/db/app/sample-collection-staging` is deployed to `/db/app/sample-collection`. All permissions are set and a pre-install function is called if needed.
#### Incremental update
`` POST ~/tuttle/{collection}/git``
All commits since the last update are applied. To ensure the integrity of the collection, all commits are deployed individually.
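To illustrate what applying commits individually means, the sketch below selects the commits newer than the last synchronized one and orders them oldest first. It is not Tuttle's implementation: the function name is hypothetical, and it assumes the commit history is given as a list of hashes with the newest commit first.

```python
def commits_to_apply(history_newest_first, last_synced_sha):
    """Return the commits after last_synced_sha, ordered oldest first.

    Assumption: history_newest_first lists commit hashes, newest first.
    """
    pending = []
    for sha in history_newest_first:
        if sha == last_synced_sha:
            break
        pending.append(sha)
    else:
        # The loop finished without meeting the last synchronized commit.
        raise LookupError("last synced commit not found in history")
    pending.reverse()
    return pending
```

Each returned hash would then be deployed on its own, in order, so that a failure leaves the collection at a well-defined commit.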
#### Get the repository hashes
`` GET ~/tuttle/{collection}/hash``
Reports the Git hashes of all participating collections and the hash of the remote repository.
#### Get Commits
`` GET ~/tuttle/{collection}/commits``
Displays all commits of the repository together with their commit messages.
#### Hook Trigger
`` GET ~/tuttle/{collection}/hook``
The webhook is usually triggered by GitHub or GitLab.
An incremental update is triggered.
Authentication is done by APIKey. The APIKey must be set in the header of the request.
#### Example for GitLab
```bash
curl --header 'X-Gitlab-Token: RajWFNCILBuQ8SWRfAAAJr7pHxo7WIF8Fe70SGV2Ah' http://127.0.0.1:8080/exist/apps/tuttle/git/hook
```
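The same GitLab-style call can be prepared from a script. This is a minimal sketch using only the Python standard library; `build_hook_request` is a hypothetical helper, and the hook URL and API key are placeholders you supply yourself.

```python
import urllib.request


def build_hook_request(hook_url: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a hook request carrying the API key header."""
    return urllib.request.Request(
        hook_url,
        headers={"X-Gitlab-Token": api_key},  # header name taken from the curl example
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen()` sends the request; keeping construction separate makes it easy to inspect the request first.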
#### Generate the APIKey
`` GET ~/tuttle/{collection}/apikey``
The APIKey is generated and displayed once. If forgotten, it must be generated again.
#### Display the repository configuration and status
`` GET ~/tuttle/config ``
Displays the configuration and the state of the git repository.
States:
- uptodate: Collection is up to date with Git
- behind: Collection is behind Git and needs an update
- new: Collection is not a Tuttle collection; a full deployment is needed
```xml
sample-collection-github
```
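A client that reads the reported state can map it to the matching dashboard action. The sketch below is illustrative only: `ACTION_FOR_STATE` and `next_action` are hypothetical names, and the action labels `incremental` and `full` are borrowed from the dashboard buttons.

```python
# Hypothetical mapping from a reported state to the action it calls for.
ACTION_FOR_STATE = {
    "uptodate": None,         # nothing to do
    "behind": "incremental",  # apply the commits made since the last sync
    "new": "full",            # not yet a Tuttle collection: deploy fully
}


def next_action(state: str):
    """Return 'incremental', 'full', or None for a known state."""
    if state not in ACTION_FOR_STATE:
        raise ValueError(f"unknown state: {state!r}")
    return ACTION_FOR_STATE[state]
```

Unknown states raise `ValueError` rather than being silently ignored, so a change in the reported values is noticed early.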
#### Remove Lockfile
`` POST ~/tuttle/{collection}/lockfile ``
Removes the lockfile after something has gone wrong.
#### Print Lockfile
`` GET ~/tuttle/{collection}/lockfile ``
The running task is stored in the lockfile. It ensures that two tasks do not run at the same time.
## Access token for gitservice (incomplete)
To talk to the configured gitservice Tuttle needs an access token. These can
be obtained from the respective service.
* see [Creating a personal access token](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) for GitHub
* see [Personal access tokens](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html) for GitLab
The token for the git service must be set in the repository configuration as shown above.
## Roadmap
- [ ] DB to Git
## Honorable mentions:

- [Horace Parnell Tuttle - American astronomer](http://www.klima-luft.de/steinicke/ngcic/persons/tuttle.htm)
- [Archibald "Harry" Tuttle - Robert De Niro in Terry Gilliam's 'Brazil'](https://en.wikipedia.org/wiki/Brazil_(1985_film))