{"id":20942420,"url":"https://github.com/unfoldingword-dev/d43-catalog","last_synced_at":"2025-09-23T06:47:52.153Z","repository":{"id":44221092,"uuid":"71858435","full_name":"unfoldingWord-dev/d43-catalog","owner":"unfoldingWord-dev","description":"Lambda functions for the Door43 Catalog.","archived":false,"fork":false,"pushed_at":"2024-10-21T07:58:13.000Z","size":5113,"stargazers_count":1,"open_issues_count":25,"forks_count":7,"subscribers_count":11,"default_branch":"develop","last_synced_at":"2025-08-23T11:11:31.809Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://api.door43.org/v3/catalog","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/unfoldingWord-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-10-25T04:10:26.000Z","updated_at":"2024-10-21T07:58:18.000Z","dependencies_parsed_at":"2025-05-13T23:47:23.913Z","dependency_job_id":"586b93ab-12b9-42ad-a0f5-37c46695985b","html_url":"https://github.com/unfoldingWord-dev/d43-catalog","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/unfoldingWord-dev/d43-catalog","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unfoldingWord-dev%2Fd43-catalog","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unfoldingWord-dev%2Fd43-catalog/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unfoldingWord-dev%2Fd43-catalog/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/unfoldingWord-dev%2Fd43-catalog/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/unfoldingWord-dev","download_url":"https://codeload.github.com/unfoldingWord-dev/d43-catalog/tar.gz/refs/heads/develop","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unfoldingWord-dev%2Fd43-catalog/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":276531303,"owners_count":25658697,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-23T02:00:09.130Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-18T23:26:54.080Z","updated_at":"2025-09-23T06:47:52.136Z","avatar_url":"https://github.com/unfoldingWord-dev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"master:\n[![Build Status](https://travis-ci.org/unfoldingWord-dev/d43-catalog.svg?branch=master)](https://travis-ci.org/unfoldingWord-dev/d43-catalog) \n[![Coverage Status](https://coveralls.io/repos/github/unfoldingWord-dev/d43-catalog/badge.svg?branch=master)](https://coveralls.io/github/unfoldingWord-dev/d43-catalog?branch=master)\n\ndevelop:\n[![Build Status](https://travis-ci.org/unfoldingWord-dev/d43-catalog.svg?branch=develop)](https://travis-ci.org/unfoldingWord-dev/d43-catalog) \n[![Coverage 
Status](https://coveralls.io/repos/github/unfoldingWord-dev/d43-catalog/badge.svg?branch=develop)](https://coveralls.io/github/unfoldingWord-dev/d43-catalog?branch=develop)\n\n# d43-catalog\n\nThese are the AWS Lambda functions for generating the [API catalog endpoint](https://api.door43.org/v3/catalog) from the [Door43 Catalog] organization in our Door43 Git Service.\n\n## Requirements\n* Python 2.7\n* [API Specification](https://github.com/unfoldingWord-dev/api-index)\n\n## Development\n\n* Install [pip](https://pypi.org/project/pip/)\n* Run `pip install -r requirements.txt`\n* Install [Apex](https://apex.run/) and configure it with your AWS credentials\n* Run `apex --env prod deploy` to publish everything to production\n* Or run `apex --env prod deploy ts_v2_catalog`, for example, to publish a single function.\n\n## How it Works\n\nWhen a new repository is added or forked into the [Door43 Catalog] organization a chain reaction is started that eventually adds the content into the [API](https://api.door43.org/v3/catalog), assuming all the checks pass.  Here is an overview:\n\n1. Someone creates a new repository or forks a repository into the [Door43 Catalog] organization.\n2. The organization triggers the `webhook` function, which queues the latest git commit for processing.\n\n\u003e The next few functions run on a fixed schedule.\n\u003e If errors occur they are reported and the process resumes\n\u003e at the next scheduled run.\n\u003e\n\u003e If a function produces errors 4 times in a row an email is sent to administrators.\n\n3. The `signing` function looks for and signs new items in the queue.\n4. The `catalog` function takes everything in the queue and generates a new API catalog file. **The content is now in the API!**\n5. The `ts_v2_catalog` function converts the API catalog file into the legacy translationStudio API.\n6. The `uw_v2_catalog` function converts the API catalog file into the legacy unfoldingWord App Catalog.\n7. 
The `fork` function checks to see if new repositories exist in the organization and executes the `webhook` function if necessary.\n\n\u003e The content in step (1) is now available in all three API endpoints.\n\n8. The `acceptance` function runs when the catalog file is saved in step (4) above and performs acceptance tests on the file to ensure it was generated correctly.\n\n\n## Function Description\n\nThe following provides a functional description of the functions in this repository.\n\n### webhook\n\nRuns when a change is made in the [Door43 Catalog] organization.\n\n* [x] Accepts the webhook from the organization.\n* [x] Reads the manifest from the repository (via HTTPS).\n* [x] Performs some initial manifest validation. See [Manifest Specification](http://resource-container.readthedocs.io/en/latest/manifest.html)\n* [x] Uploads files and adds/updates an entry in the queue.\n\n### signing\n\nThis function is run on a schedule and does the following:\n\n- [x] Identifies items in the queue that require signing.\n- [x] Signs files as necessary.\n- [x] Verifies that the signature checks out.\n- [x] Copies files to the proper location on the CDN as necessary.\n- [x] Uploads the signature file to the CDN.\n- [x] Updates the queued item with appropriate URLs and file metadata as necessary.\n\n### catalog\n\nThis function is run on a schedule and does the following:\n\n- [x] Performs a consistency check on queued items.\n- [x] Generates the new catalog file.\n- [x] Uploads the catalog file to the API.\n- [x] Records the catalog status in the status table.\n- [x] Reports errors and consistency failures.\n\n### acceptance\n\nAfter a new catalog file is written to S3, this function does the following:\n\n- [x] Makes sure the structure of the catalog file is correct.\n- [x] Makes a HEAD request for each resource (every URL) in the catalog to verify it exists.\n- [x] Reports any errors.\n\nTechnically this all duplicates testing that is already done elsewhere in the pipeline.  
This function is the \"oops\" catcher.\n\n### fork\n\nThis function is run on a schedule and does the following:\n\n- [x] Checks if there are new repositories in the [Door43 Catalog] organization.\n- [x] Triggers the webhook function for each new repository found.\n- [x] Triggers the webhook function for queued items that are flagged as `dirty`.\n\n### ts_v2_catalog\n\nThis function is run on a schedule and does the following:\n\n- [x] Checks for a new v3 API catalog in the status table.\n- [x] Builds a v2 tS API from the new/updated v3 catalog.\n\n### uw_v2_catalog\n\nThis function is run on a schedule and does the following:\n\n- [x] Checks for a new v3 API catalog in the status table.\n- [x] Builds a v2 uW API from the new/updated v3 catalog.\n\n### trigger\n\nThis function is run via AWS cron every 5 minutes and does the following:\n\n- [x] Executes those functions that run on a schedule, e.g. catalog, signing, etc.\n\n## AWS Configuration\n\nHere's a high-level overview of the AWS configuration.\nFor Swagger definitions look in the [aws_configuration](./aws_configuration) folder.\nYou can [create an API in API Gateway](http://docs.aws.amazon.com/apigateway/latest/developerguide/create-api-using-swagger.html) by importing these Swagger definitions.\n\n### The following functions are configured as API endpoints within API Gateway:\n\n* webhook: `/webhook`\n* catalog: `/lambda/catalog`\n* fork: `/lambda/fork`\n* signing: `/lambda/signing`\n* ts_v2_catalog: `/lambda/ts-v2-catalog`\n* uw_v2_catalog: `/lambda/uw-v2-catalog`\n\nFor example, you can trigger the fork lambda at `https://api.door43.org/v3/lambda/fork`.\n\n\u003e The functions are not designed to always return useful information in the browser and may time out,\n\u003e but they are still running properly.\n\nThe name of the stage in API Gateway determines the operating environment.\nIf the stage name begins with `prod` the functions will operate on production databases.\nIf the stage name begins with 
anything other than `prod` the functions will\nprefix databases with the stage name.\n\nFor example:\n\n* a stage named `prod` would use the `d43-catalog-errors` db for reporting errors.\n* a stage named `dev` would use the `dev-d43-catalog-errors` db for reporting errors.\n* a stage named `test` would use the `test-d43-catalog-errors` db for reporting errors.\n\n#### Stage Variables\n\nStage variables are configured within the stage defined in API Gateway.\nThese variables are accessible within lambdas from the `event` parameter,\ne.g. `event['stage-variables']`.\n\n* `cdn_bucket`\n* `cdn_url`\n* `to_email`\n* `from_email`\n* `api_bucket`\n* `api_url`\n* `gogs_url`\n* `gogs_org`\n* `gogs_token`\n* `log_level`: how noisy the logger should be (debug|info|warning|error)\n* `version`: the API version\n\n### acceptance function configuration\n\nThe `acceptance` function is run by a CloudWatch rule that fires when the catalog file is added to the API S3 bucket.\n\n### trigger function configuration\n\nThe `trigger` function is run by a CloudWatch rule that is configured to run every 5 minutes via a cron job.\n\n### DynamoDB Configuration\n\nThe following database tables are used by the API pipeline described above.\nPlease note that additional tables may be necessary when catering to multiple stages (described above).\n\n* `d43-catalog-errors` tracks errors encountered in functions. Keyed with `lambda`.\n* `d43-catalog-in-progress` tracks items in the queue. Keyed with `repo_name`.\n* `d43-catalog-running` tracks functions that are running. This prevents certain functions from having multiple instances running at the same time. Keyed with `lambda`.\n* `d43-catalog-status` tracks the status of the catalog generation. 
Keyed with `api_version`.\n\n## Tools\n\n### CSV to USFM3\n\nThis tool will convert a CSV file containing Greek words to USFM3 format.\nYou may execute the following command to learn how to use the tool.\n\n```bash\npython execute.py csvtousfm3 -h\n```\n\n### Map tW to USFM3\n\nThis tool will inject tW links into the USFM generated by `csvtousfm3`.\nThis tool is designed to replace the functionality of the config.yaml found within a tW RC\nwith the newly generated USFM3 content.\nAs such this is mostly a one-time-use tool.\n\n\u003e If you are not sure what to use this tool for, you probably shouldn't use it.\n\nYou may execute the following command to learn how to use the tool.\n\n```bash\npython execute.py maptwtousfm3 -h\n```\n\n### Convert OSIS to USFM3\n\nThis tool will convert a directory of OSIS files (xml) to a new directory of USFM3 files.\n\nYou may execute the following command to learn how to use the tool.\n\n```bash\npython execute.py osistousfm3 -h\n```\n\n## Testing\n\nYou can run tests by executing the following:\n\n```bash\npython -m unittest discover -s tests\n```\n\n## Deploying\n\nIn order to deploy to production you need to run this command:\n\n```bash\napex deploy --env prod\n```\n\nYou can also deploy a specific function with:\n\n```bash\napex deploy --env prod catalog\n```\n\nIf you want to cause a catalog to re-build you can delete the catalog entry from the `d43-catalog-status` table.\nIt will begin re-building within 5 minutes. 
Or you can try to force a re-try now by visiting https://api.door43.org/v3/lambda/catalog.\nThe lambdas are not allowed to run too often, so if you are trying to re-start the catalog lambda right away\nyou may also need to delete the `d43-catalog_catalog` record from the `d43-catalog-running` table.\n\n[Door43 Catalog]:https://git.door43.org/Door43-Catalog\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funfoldingword-dev%2Fd43-catalog","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Funfoldingword-dev%2Fd43-catalog","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funfoldingword-dev%2Fd43-catalog/lists"}