{"id":21016605,"url":"https://github.com/gsa/catalog.data.gov","last_synced_at":"2025-05-15T05:33:02.188Z","repository":{"id":37056311,"uuid":"261567981","full_name":"GSA/catalog.data.gov","owner":"GSA","description":"Development environment for catalog.data.gov","archived":false,"fork":false,"pushed_at":"2025-05-08T20:48:07.000Z","size":12676,"stargazers_count":62,"open_issues_count":4,"forks_count":24,"subscribers_count":16,"default_branch":"main","last_synced_at":"2025-05-08T21:43:33.103Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://catalog.data.gov","language":"CSS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GSA.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-05-05T19:45:54.000Z","updated_at":"2025-05-07T15:53:51.000Z","dependencies_parsed_at":"2023-11-28T23:22:39.307Z","dependency_job_id":"883a56f2-4089-4a1c-8aaf-4593f25fdbbf","html_url":"https://github.com/GSA/catalog.data.gov","commit_stats":{"total_commits":1860,"total_committers":32,"mean_commits":58.125,"dds":0.7252688172043011,"last_synced_commit":"d22f76e16d6c8ad41a581ac4929233dcefd352ca"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GSA%2Fcatalog.data.gov","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GSA%2Fcatalog.data.gov/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GSA%2Fcatalog.data.gov/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/GSA%2Fcatalog.data.gov/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GSA","download_url":"https://codeload.github.com/GSA/catalog.data.gov/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254282739,"owners_count":22045128,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-19T10:14:52.185Z","updated_at":"2025-05-15T05:33:02.165Z","avatar_url":"https://github.com/GSA.png","language":"CSS","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![GitHub Actions](https://github.com/gsa/catalog.data.gov/actions/workflows/publish.yml/badge.svg)](https://github.com/GSA/catalog.data.gov/actions/workflows/publish.yml)\n\n\n# catalog.data.gov\n\n\nThis is a local development harness for catalog.data.gov.  For details on the system architecture, please see our [data.gov systems list](https://github.com/GSA/data.gov/blob/master/SYSTEMS.md)!\n\n\n## Usage\n\nThe _only_  deployable artifact associated with this repository is the\n`requirements.txt` file. See github actions\nfor full configuration in live environments.\n\nThe live environment is different than the development environment in\na number of ways. 
Changes made in this repo that work correctly in the\ndevelopment environment may require additional steps to be taken in\norder to make sure the application is deployable to the live\nenvironment:\n\n- If you need to add or change a dependency, you should make that\n  change in the `ckan/requirements.in`, run `make update-dependencies`\n  and commit the changed files.  (See the section below on\n  requirements for details.)  Good news: no other changes are required!\n  \n- If you need to add or change configuration that lives in the\n  application *ini* file (such as a plugin), you will also need to \n  update the configuration file template at `ckan/setup/ckan.ini`.\n  \n- If you find you need to modify the `ckan/Dockerfile` to add OS\n  packages or install software, other changes may need to be made to\n  the cloud.gov buildpack.  Please bring these situations to the team's\n  attention.\n\n## Development\n\n### Requirements\n\nWe assume your environment is already set up with these tools.\n\n- [GNU Make](https://www.gnu.org/software/make/)\n- [Docker Compose](https://docs.docker.com/compose/overview/)\n\n\n### Getting started\n\nBuild and start the docker containers.\n\n    $ make build up\n\nOpen your web browser to [localhost:5000](http://localhost:5000) (or [ckan:5000](http://ckan:5000) if you add ckan to your `hosts` file).  \nYou can log into your instance with user `admin`, password `password`.\n\nRun the integration tests.\n\n    $ make test\n\nStop and remove the containers and volumes associated with this setup.\n\n    $ make clean\n\n See `.env` to override settings. Some settings may require a re-build (`make\n clean build`).\n\n*Note: the solr configuration has a locking mechanism that only allows one\nsolr to access its data at a time.  There are two methods to recover solr in\nthis state.  `make clear-solr-volume` destroys all of the solr data and starts\nfrom scratch.  `make unlock-solr-volume` unlocks the data to allow another\nsolr to access it. 
 BE CAREFUL when running the `make unlock-solr-volume`\ncommand!  If two solrs are talking to the same volume, the data may corrupt\nand would need to be destroyed anyway.*\n\n### Test extensions\n\nTo test extensions locally you can run\n_TODO: update this for pytest_\n\n```\ndocker compose exec ckan bash\nnosetests --ckan --with-pylons=src/ckan/test-catalog-next.ini src/ckanext-datagovtheme/ckanext/datagovtheme/\nnosetests --ckan --with-pylons=src/ckan/test-catalog-next.ini src/ckanext-datagovtheme/ckanext/datajson/\nnosetests --ckan --with-pylons=src/ckan/test-catalog-next.ini src/ckanext-datagovtheme/ckanext/geodatagov/\n```\n\n### Run Cypress Tests\n\nTo test the UI and e2e user tests, run\n\n    $ make test\n\n#### Run Cypress tests interactively\n\nTo run cypress tests locally, cypress needs to be installed first.\nRun `npm install cypress`.\n\nAt this point, you will need to manually change the .env file to\nhave `CKAN_SITE_URL=http://localhost:5000`. This is to cover for \na docker bug upstream: https://github.com/docker/compose/issues/7423\n\nThen, you can run `make cypress`. For WSL or complex installation, please see\na data.gov team member or follow the steps laid out\n[here](https://nickymeuleman.netlify.app/blog/gui-on-wsl2-cypress#vcxsrv).\n\n## Deploying to cloud.gov\n\nCopy `vars.yml.template` to `vars.yml`, and customize the values in that file. Then, assuming [you're logged in for the Cloud Foundry CLI](https://cloud.gov/docs/getting-started/setup/):\n\nCreate the database used by CKAN itself. 
You have to wait a bit for the datastore DB to be available (see [the cloud.gov instructions on how to know when it's up](https://cloud.gov/docs/services/relational-database/#instance-creation-time)).\n\n    $ cf create-service aws-rds small-psql ${app_name}-db -c '{\"version\": \"15\"}'\n\nCreate the Redis service for caching.\n\n    $ cf create-service aws-elasticache-redis redis-dev ${app_name}-redis\n\nCreate the SOLR service for data search.\n\n    $ cf create-service solr-on-ecs base ${app_name}-solr -c solr/service-config-${space}.json -b \"ssb-solrcloud-gsa-datagov-${space}\"\n\nCreate the secrets service to store secret environment variables. See\n[Secrets](#secrets) below.\n\nYou should now be able to visit `https://[ROUTE]`, where `[ROUTE]` is the route reported by `cf app ${app_name}`.\n\n\n### Secrets\n\nTips on managing\n[secrets](https://github.com/GSA/datagov-deploy/wiki/Cloud.gov-Cheat-Sheet#secrets-management).\nWhen creating the service for the first time, use `create-user-provided-service`\ninstead of `update-user-provided-service`.\n\n    $ cf update-user-provided-service ${app_name}-secrets -p 'CKAN___BEAKER__SESSION__SECRET, SAML2_PRIVATE_KEY'\n\nName | Description | Where to find\n---- | ----------- | -------------\nCKAN___BEAKER__SESSION__SECRET | Session secret for encrypting CKAN sessions. | `pwgen -s 32 1`\nSAML2_PRIVATE_KEY | Base64 encoded SAML2 key matching the certificate configured for Login.gov | [Google Drive](https://drive.google.com/drive/u/0/folders/1VguFPRiRb1Ljnm_6UShryHWDofX0xBnU)\n\n\n## Login.gov integration\n\nWe use Login.gov as our\n[SAML2](https://github.com/GSA/datagov-deploy/wiki/SAML2-authentication)\nIdentity Provider (IdP). Production apps use the production Login.gov instance\nwhile other apps use the Login.gov identity sandbox.\n\nEach year in March, Login.gov rotates its credentials. 
See our\n[wiki](https://github.com/GSA/datagov-deploy/wiki/SAML2-authentication#working-with-logingov)\nfor details.\n\nOur Service Provider (SP) certificate and key are provided through an\nenvironment variable and a user-provided service.\n\nThe Login.gov IdP metadata is stored in a file under `ckan/setup/`.\n\n\n## On Docker CKAN 2.9 images\n\nThe repository extends the Open Knowledge Foundation `ckan-dev:2.9` docker\nimage. The `ckan-base:2.9` image, if needed for some reason, is available via\nDocker Hub with the aforementioned tag, as referenced in [OKF's docker-ckan\nrepository](https://github.com/okfn/docker-ckan).\n\n\n## Public docker image\n\nIf the build passes the tests, a docker image is published to Docker Hub: https://hub.docker.com/r/datagov/catalog-next.  \nThis image is used by the extensions for testing.  \n\n## Note on requirements\n\nPackage dependencies are managed with *pip*; the source of truth is\n`ckan/requirements.txt`.  The base OKFN Docker image we are using,\nthough, doesn't install all the dependencies we need.  We have modified our ckan image\n(`ckan/Dockerfile`) to install frozen requirements from\n`ckan/requirements.txt` at image build time to help ensure all\ndevelopers are working with the same set of requirements.\n\nThe Makefile target *update-dependencies* will use pip to generate a new\n`requirements.txt` and update `ckan/requirements.txt`.\n\n    $ make update-dependencies\n\nTo support cloud.gov installation via the normal Python buildpack, there is a\nsymbolic link `requirements.txt` that references\n`ckan/requirements.txt`.\n\n### Adding new extensions in requirements\nIf adding an extension fails, try running \n`chown user:user -R .` (in the _catalog.data.gov_ repo folder). \nIf you ran docker as superuser and later run it as a regular user, \nthe regular user won't be able to create \nthe folder for the new extension.\n\n### Procedure for updating a dependency\n\n1.  Add/change the dependency in `ckan/requirements.in`\n2.  
Run `make update-dependencies build clean test`\n3.  Make sure to commit `ckan/requirements.txt` and `ckan/requirements.in`\n    to make the change permanent.\n\n## Create an extension\n\nYou can use the ckan template in much the same way as a source install, only\nexecuting the command inside the CKAN container and setting the mounted `src/`\nfolder as output:\n\n    $ docker compose exec ckan /bin/bash -c \\\n    \"ckan generate extension\"\n\nThe new extension will be created in the `src/` folder. You might need to change\nthe owner of its folder to have the appropriate permissions.\n\n\n## Running the debugger (pdb / ipdb)\n\nTo run a container and be able to add a breakpoint with `pdb` or `ipdb`, run the\n`ckan-dev` container with the `--service-ports` option:\n\n    docker compose run --service-ports ckan\n\nThis will start a new container, displaying the standard output in your\nterminal. If you add a breakpoint in a source file in the `src` folder (`import\npdb; pdb.set_trace()`) you will be able to inspect it in this terminal next time\nthe code is executed.\nIf you are testing a harvest process (gather/fetch/run), try disabling the command\nthat starts it in the background in `ckan/docker-entrypoint.d/10-setup-harvest.sh`.\nThen, run the relevant command manually (`make harvest fetch-queue`) after startup.\n\n## SAML2\n\nTo enable the ckanext-saml2 extension, add `saml2auth` to the `CKAN__PLUGINS` list in the `.env` file, then open https://localhost:8443/dataset\nor [localhost:8443](https://localhost:8443) in your web browser.  \nYou can log into your instance with your login.gov user. 
\n\n\n## CI\n\nContinuous Integration via [GitHub Actions](https://github.com/GSA/catalog.data.gov/actions/workflows/commit.yml).\n\nContinuous Deployment via [GitHub Actions](https://github.com/GSA/catalog.data.gov/actions/workflows/publish.yml).\n\n\n## Put site into maintenance mode\n\nTo block access to the catalog apps (`catalog-web`, `catalog-admin`), set the environment variables (`CATALOG_WEB_MODE`, `CATALOG_ADMIN_MODE`) in the `catalog-proxy` app. Use 'MAINTENANCE' for scheduled downtime, 'DOWN' for unscheduled downtime, and 'FEDERAL-SHUTDOWN' for a federal shutdown. Any other value will resume normal operation. Any change to `CATALOG_WEB_MODE` needs to be followed by a CloudFront cache clear.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgsa%2Fcatalog.data.gov","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgsa%2Fcatalog.data.gov","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgsa%2Fcatalog.data.gov/lists"}