{"id":36647795,"url":"https://github.com/containerbuildsystem/cachito","last_synced_at":"2026-01-12T10:02:29.506Z","repository":{"id":37676499,"uuid":"176813085","full_name":"containerbuildsystem/cachito","owner":"containerbuildsystem","description":"Caching service for source code and external dependencies","archived":false,"fork":false,"pushed_at":"2023-12-13T18:35:38.000Z","size":2863,"stargazers_count":50,"open_issues_count":10,"forks_count":46,"subscribers_count":7,"default_branch":"master","last_synced_at":"2023-12-15T14:48:14.866Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/containerbuildsystem.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-03-20T20:37:13.000Z","updated_at":"2024-02-19T22:43:13.112Z","dependencies_parsed_at":"2023-10-11T22:41:50.485Z","dependency_job_id":"2bd5499b-13bb-4c36-802b-5f3d2dd7881d","html_url":"https://github.com/containerbuildsystem/cachito","commit_stats":null,"previous_names":[],"tags_count":45,"template":null,"template_full_name":null,"purl":"pkg:github/containerbuildsystem/cachito","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/containerbuildsystem%2Fcachito","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/containerbuildsystem%2Fcachito/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/containerbuildsystem%2Fcachito/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/containerbuildsystem%2Fcachito/manifests","owner_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/containerbuildsystem","download_url":"https://codeload.github.com/containerbuildsystem/cachito/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/containerbuildsystem%2Fcachito/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28337870,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T06:09:07.588Z","status":"ssl_error","status_checked_at":"2026-01-12T06:05:18.301Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-12T10:02:29.257Z","updated_at":"2026-01-12T10:02:29.492Z","avatar_url":"https://github.com/containerbuildsystem.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Coverage Status](https://coveralls.io/repos/github/containerbuildsystem/cachito/badge.svg?branch=master)](https://coveralls.io/github/containerbuildsystem/cachito?branch=master)\n\n# Cachito\n\nCachito is a service to store (and serve) source code for applications. Upon a request, Cachito\nwill fetch a specific revision of a given repository from the Internet and store it permanently in\nits internal storage. 
Namely, it stores the source code for a specific Git commit from a given Git\nrepository, which could be from a forge such as [GitHub](https://github.com) or\n[GitLab](https://gitlab.com). This way, even if that repository (or that revision) is deleted, it\nis still possible to track the pristine source code for the original sources. In fact, if the\nsources have already been previously fetched, Cachito will simply serve the stored copy.\n\nCachito also supports identifying and permanently storing dependencies for certain package managers\nand making them available for building the application. Like it does for source code, future\nrequests that utilize these same dependencies will be taken from Cachito's internal storage rather\nthan be fetched from the Internet. See the [Package Manager Feature Support](#feature-support)\nsection for the package managers that Cachito currently supports.\n\nCachito will produce bundles as the output artifact of a request. The bundle is a tarball that\ncontains the source code of the application and all the sources of its dependencies. For some\npackage managers, these dependencies can be used directly for building the application. Other\npackage managers will provide an alternative mechanism for this (e.g. a custom npm registry with\nthe declared npm dependencies). 
Regardless of whether the dependencies in the bundle are used for\nbuilding the application, they are always present so that the source of these dependencies\ncan be published alongside the application for license compliance.\n\n## Table of Contents\n\n* [More Documentation](#more-documentation)\n* [Coding Standards](#coding-standards)\n* [Quick Start](#quick-start)\n* [Pre-built Container Images](#pre-built-container-images)\n* [Prerequisites](#prerequisites)\n* [Development](#development)\n* [Database Migrations](#database-migrations)\n* [API Documentation](#api-documentation)\n* [Configuring Workers](#configuring-workers)\n* [Configuring the API](#configuring-the-api)\n* [Flags](#flags)\n* [Nexus](#nexus)\n* [Package Managers](#package-managers)\n\n## More Documentation\n\nDocuments that outgrew this README can be found in the `docs/` directory.\n\n* [docs/](./docs)\n  * [dependency_confusion.md](./docs/dependency_confusion.md) is a short analysis of a supply chain\n    attack and its impact on Cachito users\n  * [metadata.md](./docs/metadata.md) describes Cachito request metadata\n  * [pip.md](./docs/pip.md) is a guide for using pip with Cachito\n  * [using_requests_locally.md](./docs/using_requests_locally.md) explains how to use Cachito\n    requests to run builds on your PC\n  * [tracing.md](./docs/tracing.md) documents Cachito's support for OpenTelemetry tracing\n\n## Coding Standards\n\nThe codebase conforms to the style enforced by `flake8` with the following exceptions:\n\n* The maximum line length allowed is 100 characters instead of 80 characters\n\nIn addition to `flake8`, docstrings are also enforced by the plugin `flake8-docstrings` with\nthe following exceptions:\n\n* D100: Missing docstring in public module\n* D104: Missing docstring in public package\n* D105: Missing docstring in magic method\n\nThe format of the docstrings should be in the\n[reStructuredText](https://docs.python-guide.org/writing/documentation/#restructuredtext-ref) style\nsuch 
as:\n\n```python\nSet the state of the request using the Cachito API.\n\n:param int request_id: the ID of the Cachito request\n:param str state: the state to set the Cachito request to\n:param str state_reason: the state reason to set the Cachito request to\n:return: the updated request\n:rtype: dict\n:raise CachitoError: if the request to the Cachito API fails\n```\n\nAdditionally, `black` is used to enforce other coding standards.\n\nTo verify that your code meets these standards, you may run `tox -e black,flake8`.\n\n## Quick Start\n\nRun the application locally (requires [docker compose](https://docs.docker.com/compose/)):\n\n```bash\nmake run\n```\n\nNote: while running Cachito locally requires docker compose, that does not mean you\nhave to use Docker! Podman 3.0 or greater can serve as a replacement; see\nhttps://www.redhat.com/sysadmin/podman-docker-compose.\n\nAlternatively, you could also run the application with\n[podman-compose](https://github.com/containers/podman-compose) by setting the\n`CACHITO_COMPOSE_ENGINE` variable to the path of the `podman-compose` script.\nUnfortunately, the latest release of `podman-compose` contains various bugs making\nit unusable for running Cachito locally. Use the script from the `devel` branch instead.\nTo facilitate this, set `CACHITO_COMPOSE_ENGINE` to the special value `podman-compose-auto`,\nwhich will instruct the Makefile to download and use the correct version of `podman-compose`.\nBe sure to pre-install the dependencies required by `podman-compose`, currently `PyYAML`.\nThe script is available in `./tmp/podman_compose.py`. 
You may use this script to interact with\nthe local deployment.\n\n```bash\nmake run CACHITO_COMPOSE_ENGINE=podman-compose-auto\n```\n\nVerify in the browser at [http://localhost:8080/](http://localhost:8080/)\n\nUse curl to make requests:\n\n```bash\n# List all requests\ncurl http://localhost:8080/api/v1/requests\n\n# Create a new request\ncurl -X POST -H \"Content-Type: application/json\" http://localhost:8080/api/v1/requests -d \\\n    '{\n        \"repo\": \"https://github.com/release-engineering/retrodep.git\",\n        \"ref\": \"e1be527f39ec31323f0454f7d1422c6260b00580\",\n        \"pkg_managers\": [\"gomod\"]\n      }'\n\n# Check the status of a request\ncurl http://localhost:8080/api/v1/requests/1\n\n# Download the source archive for a completed request\ncurl http://localhost:8080/api/v1/requests/1/download -o source.tar.gz\n```\n\n## Pre-built Container Images\n\nCachito container images are automatically built when changes are merged. There are two images:\nan httpd based image with the Cachito API and a Celery worker image with the Cachito worker code.\n\n[![cachito-api](https://quay.io/repository/containerbuildsystem/cachito-api/status)](https://quay.io/repository/containerbuildsystem/cachito-api)\n  `quay.io/containerbuildsystem/cachito-api:latest`\n\n[![cachito-workers](https://quay.io/repository/containerbuildsystem/cachito-workers/status)](https://quay.io/repository/containerbuildsystem/cachito-workers)\n  `quay.io/containerbuildsystem/cachito-workers:latest`\n\n## Prerequisites\n\nThis is built to be used with Python 3.\n\nSome Flask dependencies are compiled during installation, so `gcc` and Python header files need to be present.\nFor example, on Fedora:\n\n```bash\ndnf install gcc python3-devel\n```\n\n## Development\n\n### Virtualenv\n\nYou may create a virtualenv with Cachito and its dependencies installed with the following command:\n\n```bash\nmake venv\n```\n\nThis installs Cachito in\n[develop 
mode](http://setuptools.readthedocs.io/en/latest/setuptools.html#development-mode) which\nallows modifying the source code directly without needing to reinstall Cachito. This is really\nuseful for syntax highlighting in your IDE; however, it's not practical to use as a development\nenvironment since Cachito has dependencies on other services.\n\n*NOTE:* you may need to ensure that you have some packages installed. In Fedora, you will need\n\n```\nyum install python3.11 python3-devel python3-virtualenv gcc krb5-devel\n```\n\nwhere `python3.11` is the version of Python required based on `tox.ini`.\n\n### Run a Containerized Development Environment\n\nYou may create and run the containerized development environment with\n[docker compose (v2)](https://docs.docker.com/compose/) with the following command:\n\n```bash\nmake run-start\n```\n\nThis will automatically create and run the following containers:\n\n* **athens** - the [Athens](https://docs.gomods.io/) instance responsible for permanently storing\n  dependencies for the `gomod` package manager.\n* **cachito-api** - the Cachito REST API. This is accessible at\n  [http://localhost:8080](http://localhost:8080).\n* **cachito-worker** - the Cachito Celery worker. This container is also responsible for configuring\n  Nexus at startup.\n* **db** - the PostgreSQL database used by the Cachito REST API.\n* **nexus** - the [Sonatype Nexus Repository Manager](https://www.sonatype.com/nexus-repository-oss)\n  instance that is responsible for permanently storing dependencies for the `npm` package manager.\n  The management UI is accessible at [http://localhost:8082](http://localhost:8082). The username is\n  `admin` and the password is `admin`.\n* **rabbitmq** - the RabbitMQ instance for communicating between the API and the worker. The\n  management UI is accessible at [http://localhost:8081](http://localhost:8081). 
The username is\n  `cachito` and the password is `cachito`.\n\nAfter the development environment is running, you can submit jobs to it with `curl` requests\n\n```bash\ncurl -X POST -H \"Content-Type: application/json\" http://localhost:8080/api/v1/requests -d \\\n'{\n\"repo\": \"https://github.com/athos-ribeiro/cachito-sample-pip-package.git\",\n\"ref\": \"51ffb9c2412d50953ed9732c67267e5d2ff9aa68\",\n\"pkg_managers\": [\"pip\"],\n\"packages\": {\"pip\": [{\"path\": \".\"}, {\"path\": \"subpackage\"}]}\n}'\n```\n\nThe REST API and the worker will restart if the source code is modified. Please note that the REST\nAPI may stop restarting if there is a syntax error.\n\n#### Rebuilding Images\n\nIf you suspect that the images used for [`docker compose`](#run-a-containerized-development-environment) are out of date, you can\nrun the containerized development environment while forcing a rebuild of the images with the following\ncommand:\n\n```bash\nmake run-build-start\n```\n\nIf you just want to force a rebuild of the images without running them, you can use\n\n```bash\nmake run-build\n```\n\n### Unit Tests\n\nTo run the unit tests with [tox](https://tox.readthedocs.io/en/latest/), you may run the following\ncommand:\n\n```bash\nmake test-unit\n```\n\n### Integration Tests\n\nTo run the integration tests with [tox](https://tox.readthedocs.io/en/latest/), you may run the\nfollowing command:\n\n```bash\nmake test-integration\n```\n\nBy default, some tests will require custom configuration and will run against your local development\nenvironment. 
Read the [integration tests readme](tests/integration/README.md) for more information.\n\n**NOTE:** The [containerized development environment](#run-a-containerized-development-environment)\nneeds to be running before the integration tests can pass.\n\n### Running Specific Tests\n\nInstead of running the entire unit/integration test suite, you can also run a specific set of tests.\n\n```bash\nmake test-suite TOX_ARGS=\u003ctest-suite-identifier\u003e\n```\n\nThe `test-suite-identifier` can be pulled from the test result in the `tox` output or constructed from\nthe file path and test function. For example, if you want to run\n[`test_fetch_gomod_source`](https://github.com/release-engineering/cachito/blob/983349b2c45326def8e20f36cbe2a1fee7dabf0e/tests/test_workers/test_tasks/test_gomod.py#L24),\nyou would call:\n\n```bash\nmake test-suite TOX_ARGS=tests/test_workers/test_tasks/test_gomod.py::test_fetch_gomod_source\n```\n\nOmitting the `TOX_ARGS` will run all tests without performing `black`/`flake8` validation.\n\nIn addition to running specific tests, parameters can be passed into `tox` with `TOX_ARGS` and the\nenvironment can be configured with `TOX_ENVLIST`.\n\n```bash\nmake test-suite TOX_ARGS=\"-x --no-cov tests/test_workers/test_tasks/test_gomod.py\"\n```\n\nBy default, `TOX_ENVLIST` is set to `python3.11` indicating that it should run on that version.\nIf adding environment parameters to `tox`, ensure that you are setting the Python version if needed.\n\n### Clean Up\n\nTo remove the virtualenv, built distributions, and the local development environment, you may run\nthe following command:\n\n```bash\nmake clean\n```\n\nIf you are using podman, do not forget to set the `CACHITO_COMPOSE_ENGINE` variable:\n\n```bash\nmake clean CACHITO_COMPOSE_ENGINE=podman-compose\n```\n\n### Adding Dependencies\n\nTo add more Python dependencies, add them to the following files:\n\n* [setup.py](setup.py)\n* [requirements.in](requirements.in)\n* 
[requirements-web.in](requirements-web.in)\n\nIf you're wondering why you need to add dependencies to both files (setup.py and one of the\nrequirements files), see\n[install_requires vs requirements files](https://packaging.python.org/discussions/install-requires-vs-requirements/).\n\nAfterwards, pip-compile the dependencies via `make pip-compile` (you may need to run `make venv`\nfirst, unless the venv already exists).\n\nAdditionally, if any of the newly added dependencies in the generated `requirements*.txt` files\nneed to be compiled from C code, please install any missing C libraries in the corresponding\nDockerfile(s): requirements.txt is used in both, requirements-web.txt only in api.\n\n* [Dockerfile-api](docker/Dockerfile-api)\n* [Dockerfile-workers](docker/Dockerfile-workers)\n\n### Accessing Private Repositories\n\nIf your Cachito worker needs to access private repositories in your development environment, you\nmay mount a\n[.netrc](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html) file\nby adding the volume mount `- /path/to/.netrc:/root/.netrc:ro,z` in your `docker-compose.yml`\nfile under the `cachito-worker` container.\n\n### Using Cachito Requests Locally\n\nMore details [here](docs/using_requests_locally.md).\n\nThis is how you would use the example request [above](#run-a-containerized-development-environment)\nlocally (assuming it is request #1).\n\n```shell\nbin/cachito-download.sh localhost:8080/api/v1/requests/1 /tmp/cachito-test\n\ncd /tmp/cachito-test/remote-source/\n# sed will sometimes be needed for requests from the dev environment\nsed 's/nexus:8081/localhost:8082/g' --in-place cachito.env app/requirements.txt\n\n# you don't *have* to use a container but having a clean environment is usually desirable\npodman run --net=host --rm -ti -v \"$PWD:/remote-source:z\" -w \"/remote-source\" fedora:33\n# \u003cinside the container\u003e\ndnf -y install python3-pip\nsource cachito.env\ncd app\npip install -r 
requirements.txt\npython3 setup.py install\n```\n\nYou need to have [jq](https://stedolan.github.io/jq/) installed for the script to work.\n\n## Database Migrations\n\nFollow the steps below for database data and/or schema migrations:\n\n* Checkout the master branch and ensure no schema changes are present in `cachito/web/models.py`\n* Set `SQLALCHEMY_DATABASE_URI` to `sqlite:///cachito-migration.db` in `cachito/web/config.py`\n  under the `Config` class\n* Run `cachito db upgrade` which will create an empty database in the root of your Git repository\n  called `cachito-migration.db` with the current schema applied\n* Checkout a new branch where the changes are to be made\n* In case of schema changes,\n  * Apply any schema changes to `cachito/web/models.py`\n  * Run `cachito db migrate` which will autogenerate a migration script in\n    `cachito/web/migrations/versions`\n* In case of no schema changes,\n  * Run `cachito db revision` to create an empty migration script file\n* Rename the migration script so that the suffix has a description of the change\n* Modify the docstring of the migration script\n* For data migrations, define the schema of any tables you will be modifying. 
This is so that it\n  captures the schema of the time of the migration and not necessarily what is in models.py since\n  that reflects the latest schema.\n* Modify the `upgrade` function to make the adjustments as necessary\n* Modify the `downgrade` function to reverse the changes that were made in the `upgrade` function\n* Make any adjustments to the migration script as necessary\n* To test the migration script,\n  * Populate the database with some dummy data as per the requirement\n  * Run `cachito db upgrade` (see upgrade optional data below)\n  * Also test the downgrade by running `cachito db downgrade \u003cprevious revision\u003e`\n    (where previous revision is the revision ID of the previous migration script)\n* Remove the configuration of `SQLALCHEMY_DATABASE_URI` that you set earlier\n* Remove `cachito-migration.db`\n* Commit your changes\n* Check \"615c19a1cee1_add_npm.py\" as an example that does a schema change and a data migration\n\n### Database migration optional data\n\nThe following optional arguments can be passed when upgrading the Cachito database:\n\n* `delete_data=True` - an argument to delete unused tables from the database\n  (usage: `cachito db upgrade -x delete_data=True`).\n\nRun `cachito db upgrade --help` to get more info about additional arguments consumed by custom env.py scripts.\n\n\n## API Documentation\n\nThe documentation is generated from the [API specification](cachito/web/static/api_v1.yaml)\nwritten in the OpenAPI 3.0 format.\n\nIt is available on Cachito's root URL.\n\n## Configuring Workers\n\nTo configure a Cachito Celery worker, create a Python file at `/etc/cachito/celery.py`. Any\nvariables set in this file will be applied to the Celery worker when running in production mode\n(default).\n\nCustom configuration options for the Celery workers are listed below:\n\n* `broker_url` - the URL of the RabbitMQ instance to connect to. 
See the\n  [broker_url](https://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-broker_url)\n  configuration documentation.\n* `cachito_api_url` - the URL to the Cachito API (e.g. `https://cachito-api.domain.local/api/v1/`).\n* `cachito_api_timeout` - the timeout when making a Cachito API request. The default is `60`\n  seconds.\n* `cachito_athens_url` - the URL to the Athens instance to use for caching gomod dependencies. This\n  is only necessary for workers that process gomod requests.\n* `cachito_auth_cert` - the SSL certificate to be used for authentication. See\n  https://requests.readthedocs.io/en/master/user/advanced/#client-side-certificates for reference on\n  how to provide this certificate.\n* `cachito_auth_type` - the authentication type to use when accessing protected Cachito API\n  endpoints. If this value is `None`, authentication will not be used. This defaults to `kerberos`\n  in production. The `cert` value is also valid and would use an SSL certificate for authentication.\n  This requires `cachito_auth_cert` to be provided.\n* `cachito_bundles_dir` - the directory for storing bundle archives which include the source archive\n  and dependencies. This configuration is required, and the directory must already exist and be\n  writeable.\n* `cachito_default_environment_variables` - a dictionary where the keys are names of package\n  managers. The values are dictionaries where the keys are default environment variables to\n  set for that package manager and the values are dictionaries with the keys `value` and `kind`. The\n  `value` must be a string which specifies the value of the environment variable. The `kind` must\n  also be a string which specifies the type of value, either `\"path\"` or `\"literal\"`. Check\n  `cachito/workers/config.py::Config` for the default value of this configuration.\n* `cachito_gomod_download_max_tries` - how many times to try `go mod` subprocess calls used for\n  downloading dependencies. 
Cachito will retry the entire operation for any non-zero return code.\n* `cachito_gomod_ignore_missing_gomod_file` - if `True` and the request specifies the `gomod`\n  package manager but there is no `go.mod` file present in the repository, Cachito will skip\n  the `gomod` package manager for the request. If `False`, the request will fail if the `go.mod`\n  file is missing. This is only supported if a single path is provided to the `gomod` package manager.\n  This defaults to `False`.\n* `cachito_gomod_strict_vendor` - the bool to disable/enable the strict vendor mode. This defaults\n  to `False`. For a repo that has gomod dependencies, if the `vendor` directory exists and this config\n  option is set to `True`, Cachito will fail the request.\n* `cachito_js_concurrency_limit` - the maximum number of concurrent download tasks in javascript\n  requests. Upon reaching this limit, a task must end for another to be created. This defaults to `5`.\n* `cachito_log_level` - the log level to configure the workers with (e.g. `DEBUG`, `INFO`, etc.).\n* `cachito_nexus_ca_cert` - the CA certificate that signed the SSL certificate used by the Nexus\n  instance. This defaults to `/etc/cachito/nexus_ca.pem`. If this file does not exist, Cachito will\n  not provide the CA certificate in the package manager configuration.\n* `cachito_nexus_hoster_password` - the password of the Nexus service account used by Cachito for\n  the Nexus instance that has the hosted repositories. This is used instead of\n  `cachito_nexus_password` for uploading content if you are using the two Nexus instance approach as\n  described in the \"Nexus Common Configuration\" section. If this is set, `cachito_nexus_hoster_username` must\n  also be set.\n* `cachito_nexus_hoster_url` - the URL to the Nexus instance that has the hosted repositories. 
This\n  is used instead of `cachito_nexus_url` for uploading content if you are using the two Nexus\n  instance approach as described in the \"Nexus Common Configuration\" section.\n* `cachito_nexus_hoster_username` - the username of the Nexus service account used by Cachito for\n  the Nexus instance that has the hosted repositories. This is used instead of\n  `cachito_nexus_username` for uploading content if you are using the two Nexus instance approach as\n  described in the \"Nexus Common Configuration\" section. If this is set, `cachito_nexus_hoster_password` must\n  also be set.\n* `cachito_nexus_js_hosted_repo_name` - the name of the Nexus hosted repository for JavaScript\n  package managers. This defaults to `cachito-js-hosted`.\n* `cachito_nexus_max_search_attempts` - the number of times Cachito will retry searching for non\n  PyPI assets in the raw pip repositories to retrieve a URL to append to the requirements file.\n* `cachito_nexus_npm_proxy_url` - the URL to the `cachito-js` repository which is a Nexus group\n  that points to the `cachito-js-hosted` hosted repository and the `cachito-js-proxy` proxy\n  repository. This defaults to `http://localhost:8081/repository/cachito-js/`. This only needs to\n  change if you are using the two Nexus instance approach as described in the \"Nexus For Java Script\"\n  section or you use a different name for the repository.\n* `cachito_nexus_password` - the password of the Nexus service account used by Cachito.\n* `cachito_nexus_pip_raw_repo_name` - the name of the Nexus raw repository for the `pip` package\n  manager. This defaults to `cachito-pip-raw`.\n* `cachito_nexus_pypi_proxy_url` - the URL of the Nexus PyPI proxy repository for the `pip` package\n  manager. Configured using a full URL rather than just a repo name because we need the additional\n  flexibility.\n* `cachito_nexus_rubygems_proxy_url`- the URL of the Nexus RubyGems proxy repository for the \n  `rubygems` package manager. 
Configured using a full URL rather than just a repo name because \n  we need the additional flexibility.\n* `cachito_nexus_rubygems_raw_repo_name` - the name of the Nexus raw repository for the `rubygems` \n  package manager. This defaults to `cachito-rubygems-raw`.\n* `cachito_nexus_proxy_password` - the password of the unprivileged user that has read access\n  to the main Cachito repositories (e.g. `cachito-js`). This is needed if the Nexus instance that\n  hosts the main Cachito repositories has anonymous access disabled. This is the case if Cachito\n  utilizes just a single Nexus instance.\n* `cachito_nexus_proxy_username` - the username of the unprivileged user that has read access\n  to the main Cachito repositories (e.g. `cachito-js`). This is needed if the Nexus instance that\n  hosts the main Cachito repositories has anonymous access disabled. This is the case if Cachito\n  utilizes just a single Nexus instance.\n* `cachito_nexus_request_repo_prefix` - the prefix of Nexus proxy repositories made for each\n  request for applicable package managers (e.g. `cachito-npm-1`). This defaults to `cachito-`.\n* `cachito_nexus_timeout` - the timeout when making a Nexus API request. The default is `60`\n  seconds.\n* `cachito_nexus_url` - the base URL to the Nexus Repository Manager 3 instance used by Cachito.\n* `cachito_nexus_username` - the username of the Nexus service account used by Cachito. The\n  following privileges are required: `nx-repository-admin-*-*-*`, `nx-repository-view-npm-*-*`,\n  `nx-roles-all`, `nx-script-*-*`, `nx-users-all` and `nx-userschangepw`. This defaults to\n  `cachito`.\n* `cachito_npm_file_deps_allowlist` - the npm \"file\" dependencies that are allowed in the lock file\n  for the \"npm\" package manager. This configuration is a dictionary with the keys as package names\n  and the values as lists of dependency names. 
This defaults to `{}`.\n* `cachito_yarn_file_deps_allowlist` - the yarn \"file\" dependencies that are allowed in the lock file\n  for the \"yarn\" package manager. See `cachito_npm_file_deps_allowlist`.\n* `cachito_gomod_file_deps_allowlist` - the gomod dependencies that Cachito will allow to be replaced\n  by local paths, e.g. `replace github.com/org/some-module =\u003e ./staging/src/some-module`. This is a\n  dictionary where keys are module names and values are lists of packages that the corresponding module\n  is allowed to replace. The packages may contain wildcards supported by Python's `fnmatch`, e.g.\n  `github.com/org/*` (this will allow all packages starting with `github.com/org/`). A submodule is allowed\n  to be replaced by a local module by default (e.g. `\u003cthis-module\u003e/submodule =\u003e ./local-module`), where a\n  submodule is an internal module (placed in a non-root directory) in a multi-module hierarchy (read more about\n  [multi-module repositories](https://github.com/golang/go/wiki/Modules#faqs--multi-module-repositories)).\n* `cachito_workers_rubygems_file_deps_allowlist` - for each package, it contains a list of\n  RubyGems PATH dependencies that are allowed to be present in `Gemfile.lock`. This configuration\n  is a dictionary with the keys as package names and the values as lists of dependency names.\n  This defaults to `{}`.\n* `cachito_request_file_logs_dir` - the directory to write the request specific log files. If `None`, per\n  request log files are not created. This defaults to `None`.\n* `cachito_request_file_logs_format` - the format for the log messages of the request specific log files.\n  This defaults to `\"[%(asctime)s %(name)s %(levelname)s %(module)s.%(funcName)s] %(message)s\"`.\n* `cachito_request_file_logs_level` - the log level for the request specific log files. This defaults to\n  `DEBUG`.\n* `cachito_request_file_logs_perm` - the log file permission for the request specific log files. 
This\n  defaults to `0o660`.\n* `cachito_request_lifetime` - the number of days before a request that is in the `complete` state\n  or that is stuck in the `in_progress` state will be marked as stale by the `cachito-cleanup`\n  script. This defaults to `1`.\n* `cachito_request_lifetime_failed` - the number of days before a request that is in the `failed` state\n  will be marked as stale by the `cachito-cleanup` script. This defaults to `7`.\n* `cachito_sources_dir` - the directory for long-term storage of app source archives. This\n  configuration is required, and the directory must already exist and be writeable.\n* `cachito_task_log_format` - the log format that Celery displays when a task is executing. This\n  defaults to\n  `\"[%(asctime)s #%(request_id)s %(name)s %(levelname)s %(module)s.%(funcName)s] %(message)s\"`.\n* `cachito_subprocess_timeout` - a number (in seconds) to set a timeout for commands executed by\n  the `subprocess` module. Default is 3600 seconds. A timeout is always required, and there is no\n  way provided by Cachito to disable it. Set a larger number to give the subprocess execution more time.\n* `cachito_otlp_exporter_endpoint` - a valid URL, with a port number as necessary, to an OTLP/http-compatible\n  endpoint to receive OpenTelemetry trace data.\n\nTo configure the workers to use a Kerberos keytab for authentication, set the `KRB5_CLIENT_KTNAME`\nenvironment variable to the path of the keytab. Additional Kerberos configuration can be made in\n`/etc/krb5.conf`.\n\n## Configuring the API\n\nCustom configuration for the API:\n\n* `CACHITO_BUNDLES_DIR` - the root of the bundles directory that is also accessible by the\n  workers. This is used to download the bundle archives created by the workers.\n* `CACHITO_DEFAULT_PACKAGE_MANAGERS` - the default package managers to use when no package managers\n  are specified on a request. 
This defaults to `[\"gomod\"]`.\n* `CACHITO_MAX_PER_PAGE` - the maximum number of items in a page for paginated results.\n* `CACHITO_MUTUALLY_EXCLUSIVE_PACKAGE_MANAGERS` - the list of pairs of mutually exclusive package\n  managers (e.g. `[(\"npm\", \"yarn\"), (\"gomod\", \"git-submodule\")]`). If two package managers are\n  configured as mutually exclusive, then Cachito will validate that they do not process the same\n  package in a request.\n* `CACHITO_PACKAGE_MANAGERS` - the list of enabled package managers. This defaults to `[\"gomod\"]`.\n* `CACHITO_REQUEST_FILE_LOGS_DIR` - the directory from which to load the request-specific log files. If `None`,\n  per-request log file information will not appear in the API response. This defaults to `None`.\n* `CACHITO_USER_REPRESENTATIVES` - the list of usernames that are allowed to submit requests on\n  behalf of other users.\n* `CACHITO_WORKER_USERNAMES` - the list of usernames that are allowed to use the `/requests/\u003cid\u003e`\n  PATCH endpoint.\n* `LOGIN_DISABLED` - disables authentication requirements.\n* `CACHITO_OTLP_EXPORTER_ENDPOINT` - a valid URL (with a port number, if necessary) of an OTLP/http-compatible\n  endpoint that receives OpenTelemetry trace data.\n\nAdditionally, to configure the communication with the Cachito Celery workers, create a Python file\nat `/etc/cachito/celery.py`, and set the\n[broker_url](https://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-broker_url)\nconfiguration to point to your RabbitMQ instance.\n\nIf you are planning to deploy Cachito with authentication enabled, you'll need to use\na web server that supplies the `REMOTE_USER` environment variable when the user is\nproperly authenticated. A common deployment option is using httpd (Apache web server)\nwith the `mod_auth_gssapi` module.\n\n## Flags\n\n* `gomod-vendor` - the flag to indicate the vendoring requirement for gomod dependencies. 
If present in the\n  Cachito request, Cachito will run `go mod vendor` instead of `go mod download` to gather dependencies.\n  See [gomod vendoring](#gomod-vendoring) for more details.\n\n* `gomod-vendor-check` - like `gomod-vendor`, but if the `vendor/` directory is already present,\n  Cachito will refuse to make changes in your repository. Should be preferred over `gomod-vendor`.\n\n* `force-gomod-tidy` - when used, Cachito will unconditionally run `go mod tidy` even when dependency\n  replacements are not present.\n\n* `include-git-dir` - when used, `.git` file objects are not removed from the source bundle created\n  by Cachito. This is useful when the git history is important to the build process.\n\n* `cgo-disable` - use this flag to make Cachito set `CGO_ENABLED=0` while processing gomod packages.\n  This environment variable will only be used internally by Cachito; it will *not* be set in the\n  environment variables for the completed request. Typically, you will only want to use this if your\n  package *does* use C files, and the Cachito request is failing.\n\n* `remove-unsafe-symlinks` - the flag forces Cachito to remove all symlinks that point to a\n  location outside of the cloned repository. If the flag isn't set, Cachito will raise\n  a validation error right after cloning if such symlinks are present in the source.\n\n## Nexus\n\n### Nexus For JavaScript\n\nThe JavaScript (JS) package managers (npm, yarn) functionality relies on\n[Nexus Repository Manager 3][nexus-docs] to store JS dependencies. The Nexus instance will have a\nJS group repository (e.g. `cachito-js`) which points to a JS hosted repository (e.g.\n`cachito-js-hosted`) and a JS proxy repository\n(e.g. `cachito-js-proxy`) that points to the npm/yarn registry (registry.npmjs.org and\nregistry.yarnpkg.com, which point to the same registry server). 
The hosted repository will contain\nall non-registry dependencies and the proxy repository will contain all dependencies from the\nJS registry. The union of these two repositories gives the set of all the JS dependencies ever\nencountered by Cachito.\n\nOn each request, Cachito will create a proxy repository to the JS group repository\n(e.g. `cachito-js`). Cachito will populate this proxy repository to contain the subset of\ndependencies declared in the repository's lock file. Once populated, Cachito will block the\nrepository from getting additional content. This prevents the consumer of the repository from\ninstalling something that was not declared in the lock file. This is further enforced by locking\ndown the repository to a single user created for the request, which the consumer will use. Please\nkeep in mind that for this to function properly, anonymous access needs to be disabled on the Nexus\ninstance or at least not set to have read access on all repositories.\n\nThese repositories and users created per request are deleted when the request is marked as stale\nor the request fails.\n\n### Nexus For pip\n\nThe pip package manager functionality relies on [Nexus Repository Manager 3][nexus-docs] to store\npip dependencies. The Nexus instance will have a PyPI proxy repository (e.g. `cachito-pip-proxy`)\nthat points to pypi.org and a raw repository (e.g. `cachito-pip-raw`) which will be used to store\nexternal dependencies. The PyPI proxy repository will cache all PyPI packages that Cachito downloads\nthrough it and the raw repository will hold tarballs or zip archives of external dependencies that\nCachito will upload after fetching them from the original locations.\n\nOn each request, Cachito will create a PyPI hosted repository and a raw repository, e.g.\n`cachito-pip-hosted-1` and `cachito-pip-raw-1`. 
Cachito will upload all dependencies for the request\nto these repositories (dependencies from PyPI to the hosted repository, external dependencies to the\nraw one). Cachito will provide environment variables and configuration files that, when applied\nto the user's environment, will allow them to install their dependencies from the above-mentioned\nrepositories. When installing dependencies from the Cachito-provided repositories, the user is\ninherently blocked from installing anything that they did not declare as a dependency, because the\nrepositories will only contain content that Cachito has made available.\n\nThese repositories are created per request and deleted when the request is marked as stale or the\nrequest fails.\n\n### Nexus for RubyGems\n\nThe RubyGems package manager functionality relies on [Nexus Repository Manager 3][nexus-docs] to\nstore RubyGems dependencies. The Nexus instance contains two repositories that act as\nlong-term storage: a RubyGems proxy repository (e.g. `cachito-rubygems-proxy`) that points to\n`rubygems.org` and a raw repository (e.g. `cachito-rubygems-raw`) used for storing Git dependencies.\nThe RubyGems proxy repository caches all RubyGems packages that Cachito downloads through it and the\nraw repository holds tarballs of Git dependencies that Cachito uploads there after fetching them\nfrom the original locations.\n\nOn each request, Cachito creates a RubyGems hosted repository (e.g. `cachito-rubygems-hosted-1`)\nand uploads all GEM dependencies for the request to it. This repository is created per request and\ndeleted when the request is marked as stale or the request fails. Redirecting Bundler to use this\nrepository instead of a default RubyGems server is done by providing a configuration file. 
Note that\nthere's no request specific repository for external dependencies as other package managers do, instead,\ndependencies are installed from the downloaded bundle (see [Package Managers section](#rubygems-bundler) \nfor more details).\n\nWhen installing dependencies from the Cachito-provided repositories, the user is inherently blocked\nfrom installing anything that they did not declare as a dependency, because the repositories will\nonly contain content that Cachito has made available.\n\n### Nexus Common Configuration\n\nRefer to the \"Configuring Workers\" section to see how to configure Cachito to use Nexus. Please\nnote that you may choose to use two Nexus instances. One for hosting the permanent content and the\nother for the ephemeral repositories created per request. This is useful if your organization\nalready has a shared Nexus instance but doesn't want Cachito to have near admin level access on it.\nIn this case, you will need to configure the following additional settings that point to the\nNexus instance that hosts the permanent content: `cachito_nexus_hoster_username`,\n`cachito_nexus_hoster_password`, and `cachito_nexus_hoster_url`.\n\n## Package Managers\n\n### Feature Support\n\nThe table below shows the supported package managers and their support level in Cachito.\n\n| Feature                 | gomod | npm | pip | yarn | rubygems |\n|-------------------------|-------|-----|-----|------|----------|\n| Baseline                | ✓     | ✓   | ✓   | ✓    | ✓        |\n| Content Manifest        | ✓     | ✓   | ✓   | ✓    | ✓        |\n| Dependency Replacements | ✓     | x   | x   | x    | x        |\n| Dev Dependencies        | ✓     | ✓   | ✓   | ✓    | x        |\n| External Dependencies   | N/A   | ✓   | ✓   | ✓    | ✓        |\n| Multiple Paths          | ✓     | ✓   | ✓   | ✓    | ✓        |\n| Nested Dependencies     | ✓     | ✓   | x   | ✓    | ✓        |\n| Offline Installations   | ✓     | x   | x   | x    | x        |\n\n#### Feature 
Definitions\n\n* **Baseline** - The basic requirements are all met and this is ready for production use. This means\n  that all dependencies from official sources declared in a lock file will be properly identified\n  and shown in the REST API. The dependencies will be permanently stored by Cachito and be reused\n  when a future request declares the same dependency. Additionally, Cachito will provide a mechanism\n  for the application to be built using just the declared dependencies from Cachito. The dependency\n  sources are also included in the bundle generated by Cachito for convenience so that the sources\n  can be published alongside of the application for licensing requirements.\n* **Content Manifest** - The `/api/\u003cversion\u003e/requests/\u003cid\u003e/content-manifest` returns a Content\n  Manifest JSON document that describes the application's dependencies and sources.\n* **Dependency Replacements** - Dependency replacements can be specified when creating a Cachito\n  request. This is a convenient feature to allow dependencies to be swapped without making changes\n  in the source repository. Dependency replacement is only supported if a single package is referenced\n  in the repository.\n* **Dev Dependencies** - Cachito can distinguish between dependencies used for running the\n  application and building/testing the application. For example, for the `npm` package manager, the\n  application may require `webpack` to minify their JavaScript and CSS files but that is not\n  used at runtime.\n* **External Dependencies** - External dependencies are supported such as those not from the default\n  registry/package index. 
For example, for the `npm` package manager, the `package-lock.json` file\n  may have a dependency installed directly from GitHub and not from the npm registry.\n* **Multiple Paths** - Cachito supports a source repository with multiple applications within it.\n  The paths within the source repository are provided by the user when creating the request.\n* **Nested Dependencies** - Dependencies that are stored directly in the source Git repository.\n  For example, `npm` allows `file` dependencies with the `cachito_npm_file_deps_allowlist`\n  configuration. `gomod` allows this through the `go.mod` replace directive.\n* **Offline Installations** - The dependencies can be installed solely with the contents of the\n  bundle. This is true for the `gomod` package manager, however, the `npm` and `pip` package\n  managers rely on Nexus to be online and properly configured by Cachito. If users were so inclined,\n  they could find ways to do an offline install for any package manager, but only `gomod` supports\n  this out of the box (i.e. the user does not need to change their workflow).\n\n### Current Tool Versions\n\nTool     | Version |\n---      |---------|\nGo*      | 1.20.7, 1.23.0 (no workspace vendoring support) |\nNpm      | 9.5.0   |\nNode     | 18.16.1 |\nPip      | 22.3.1  |\nPython   | 3.11.4  |\nGit      | 2.41.0  |\nYarn*    | 1.x     |\nBundler* | 2.x     |\n\n* Cachito does not use the Yarn runtime. The processing of yarn.lock files is handled by\n  [PYarn](https://github.com/containerbuildsystem/pyarn), which is compatible with any 1.x file.\n* Cachito does not use the Ruby runtime (no ruby is interpreted from `Gemfile`s). 
\n  The processing of Gemfile.lock files is handled by\n  [gemlock-parser](https://github.com/containerbuildsystem/gemlock-parser).\n* Starting with Go 1.21 Go changed the meaning of the `go` directive in `go.mod` file slightly and\n  made the constraint stricter in that the line now denotes the **minimum required** version of Go\n  instead of a suggested version of Go. If a project recommending an older version of Go is\n  processed with Go \u003e=1.21 it might happen (based on other dependencies) that its own required\n  version of Go will be bumped to 1.21+, hence dirtying the git repo - to prevent this cachito\n  uses two releases of Go SDK concurrently.\n\n### gomod\n\nThe gomod package manager works by parsing the `go.mod` file present in the source repository to\ndetermine which dependencies are required to build the application. By default, the top level module\nis discovered, but optional `path`s can be provided to point Cachito to the module(s) to discover.\n\nCachito then downloads the dependencies through [Athens](https://docs.gomods.io/) so that they\nare permanently stored and at the same time create a Go module cache to be stored in the request's\nbundle.\n\nCachito will produce a bundle that is downloadable at `/api/v1/requests/\u003cid\u003e/download`. This\nbundle will contain the application source code in the `app` directory and Go module cache of all\nthe dependencies in the `deps/gomod` directory.\n\nCachito will provide environment variables in the REST API to set for the Go tooling to use this\ncache when building the application.\n\n#### gomod vendoring\n\nWhen the user enables vendoring mode via the `gomod-vendor[-check]` [flag](#flags), Cachito will\nnot build the module cache. The `deps/gomod` directory will be empty. Instead, the vendored modules\nwill be present in the main module's `vendor` directory. 
Check the official documentation about\n[vendoring](https://golang.org/ref/mod#vendoring) for more details.\n\nOne important thing to note is that only a subset of the module dependency graph will be vendored.\nAs explained in the docs, only modules containing packages needed for building and testing the main\nmodule will be present. Commands that expect the entire dependency graph to be available may not\nwork as expected, if at all. Notably, `go mod tidy` and other `go mod` commands ignore the vendor\ndirectory and instead try to download the modules or access the module cache (which is empty).\n\n#### Go package level dependencies and the go-package Cachito package type\n\nWhen reporting Go sources, Cachito differentiates between modules and packages. To simplify a bit,\nany directory that contains a `go.mod` file is a *module* and any directory that contains `.go`\nfiles is a *package*. A directory that contains both `go.mod` and `.go` files is both a module and\na package. In Cachito, all packages *should* have parent modules (or be modules themselves).\n\nIn the JSON response at the `/api/v1/requests/\u003cid\u003e` endpoint, Go modules use the `gomod` type and Go\npackages use `go-package`. Packages can be matched to their parent modules based on name; package\nnames always start with the module name. In the `dependencies` section of a Go package, Cachito\nwill list only the packages that were imported by that package (a.k.a. package level deps). In the\n`dependencies` section of a Go module, Cachito will list all the modules specified as dependencies\nin `go.mod`. Submodules are allowed to be replaced by a local module by default; no entry is required\nin the `cachito_gomod_file_deps_allowlist` config variable.\n\nIn the Content Manifests shipped at the `/api/v1/requests/\u003cid\u003e/content-manifest` API endpoint, all\ntop-level purls and the purls of all `dependencies` refer to Go packages. 
The purls for the parent\nGo modules of those dependencies are present in `sources`.\n\n### npm\n\nThe npm package manager works by parsing the `npm-shrinkwrap.json` or `package-lock.json` file\npresent in the source repository to determine what dependencies are required to build the\napplication.\n\nCachito then creates an npm registry in an instance of Nexus it manages that contains just\nthe dependencies discovered in the lock file. The registry is locked down so that no other\ndependencies can be added. The connection information is stored in an\n[.npmrc](https://docs.npmjs.com/configuring-npm/npmrc.html) file accessible at the\n`/api/v1/requests/\u003cid\u003e/configuration-files` API endpoint.\n\nCachito will produce a bundle that is downloadable at `/api/v1/requests/\u003cid\u003e/download`. This\nbundle will contain the application source code in the `app` directory and individual tarballs\nof all the dependencies in the `deps/npm` directory. These tarballs are not meant to be used to\nbuild the application. They are there for convenience so that the dependency sources can be\npublished alongside your application sources. In addition, they can be used to populate a local npm\nregistry in the event that the application needs to be built without Cachito and the Nexus instance\nit manages.\n\nCachito can also handle dependencies that are not from the npm registry such as those directly\nfrom GitHub, a Git repository, or an HTTP(S) URL. Please note that if the dependency is from a\nprivate repository, set the\n[.netrc](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html) and\n`known_hosts` files for the Cachito workers. If the dependency location is not supported, Cachito\nwill fail the request. 
When Cachito encounters a supported location, it will download the\ndependency, modify the version in the [package.json](https://docs.npmjs.com/files/package.json) to\nbe unique, upload it to Nexus, modify the top level project's\n[package.json](https://docs.npmjs.com/files/package.json) and lock files to use the dependency from\nNexus instead. The modified files will be accessible at the\n`/api/v1/requests/\u003cid\u003e/configuration-files` API endpoint. If Cachito encounters this same dependency\nagain in a future request, it will use it directly from Nexus rather than downloading it and\nuploading it again. This guarantees that any dependency used for a Cachito request can be used again\nin a future Cachito request.\n\n### pip\n\nThe pip package manager works by parsing the `requirements.txt` and `requirements-build.txt` files\npresent in the source repository to determine what dependencies are required to build the\napplication. It is possible to specify different file path(s) for the requirements files as long\nas the files use the expected format.\n\nCachito then creates two repositories in an instance of Nexus it manages that contain just the\ndependencies discovered in the requirements files. PyPI dependencies are uploaded to a PyPI hosted\nrepository, external dependencies are uploaded to a raw repository. Connection information for the\nhosted repository is provided as the `PIP_INDEX_URL` environment variable accessible at the\n`/api/v1/requests/\u003cid\u003e/environment-variables` endpoint. To make external dependencies available,\nCachito modifies the requirements files for the request by replacing relevant entries with their\ncorresponding URLs from the raw repository. The modified requirements files are accessible at the\n`/api/v1/requests/\u003cid\u003e/configuration-files` endpoint.\n\nNote that the `PIP_INDEX_URL` variable exposes the username and password of the temporary user\ncreated for your request. 
This should not be a security concern: the user only has read access to\nthe repositories, and anonymous read access is disallowed only because of a technical\nlimitation in Nexus.\n\nCachito will produce a bundle that is downloadable at `/api/v1/requests/\u003cid\u003e/download`. This\nbundle will contain the application source code in the `app` directory and individual source\narchives of all the dependencies in the `deps/pip` directory. These archives are not meant to be\nused to build the application. They are there for convenience so that the dependency sources can be\npublished alongside your application sources. In addition, they can be used to install packages\ndirectly from the filesystem with `pip install --no-index --no-deps \u003cpath/to/archive\u003e` (for each\nindividual source archive) in the event that the application needs to be built without Cachito and\nthe Nexus instance it manages.\n\nAs mentioned above, Cachito can also handle dependencies that are not from PyPI, such as those from\na Git repository or an HTTP(S) URL. After downloading such a dependency, Cachito will upload it to\nthe Nexus instance used for hosting permanent content. If Cachito encounters this same dependency\nagain in a future request, it will use it directly from Nexus rather than downloading it and\nuploading it again. This guarantees that any dependency used for a Cachito request can be used again\nin a future Cachito request.\n\nCompared to gomod and npm, Cachito support for pip has restrictions and limitations that users may\nnot expect. For more details, see the [Cachito pip documentation](docs/pip.md).\n\n[nexus-docs]: https://help.sonatype.com/repomanager3\n\n### git-submodule\n\nWith git-submodule as a package manager, Cachito is able to fetch git submodules within the\nrequested repo and make them available in the Cachito API request response. 
The git submodules are\nfetched before any other package managers are processed.\n\nCachito will produce a bundle that is downloadable at `/api/v1/requests/\u003cid\u003e/download`. This\nbundle will contain the application source code in the `app` directory. When `git-submodule`\nis passed as a `pkg_managers` argument for any Cachito request, the available git submodules\nwithin the requested repo will also become available as part of the downloadable bundle. If the\nrepo contains multiple submodules, Cachito will fetch them all. Although, recursion is not supported\nand hence only one level of submodules will be fetched.\n\nThe git submodules information will be included in the Cachito API request response at the\n`/api/v1/requests/\u003cid\u003e` endpoint as packages with the `git-submodule` type.\n\nFinally, the packages information will be used to compose Content Manifests shipped at the\n`/api/v1/requests/\u003cid\u003e/content-manifest` API endpoint.\n\nExamples:\n\n```bash\ncurl -X POST -H \"Content-Type: application/json\" http://localhost:8080/api/v1/requests \\\n-d '{\n      \"repo\": \"https://github.com/nirzari/retrodep.git\",\n      \"ref\": \"18002daac67f82f1a0f3b1f41beb3469f23116ea\",\n      \"pkg_managers\": [\"gomod\", \"git-submodule\"]\n    }'\n```\n\nIn the above case, submodules `tour` and `go-github` within specified `retrodep` repo are fetched\nas part of the downloadable bundle. They would also be available as packages for Cachito API request\nresponse. 
Further, they become part of the Content Manifest.\n\nIf paths to specific git submodules are provided as part of the `packages` configuration,\nCachito would fetch the submodules and then process them as regular packages.\n\n```bash\ncurl \"localhost:8080/api/v1/requests\" \\\n    -X POST \\\n    -H 'content-type: application/json' \\\n    -d '{\n          \"repo\": \"https://github.com/chmeliik/cachito-sample-pip-package/\",\n          \"ref\": \"1ca07be3001450dbc4f0224e0f763c60353d0f01\",\n          \"pkg_managers\": [\"git-submodule\", \"pip\", \"npm\"],\n          \"packages\": {\n            \"pip\": [\n              {\"path\": \"cachito-pip-with-deps\"}\n            ],\n            \"npm\": [\n              {\"path\": \"cachito-npm-test\"}\n            ]\n          }\n        }'\n```\n\nIn the above case, Cachito would fetch the submodules `cachito-pip-with-deps`, `cachito-npm-test` and\nthen process them as a regular pip and npm package respectively.\n\n### yarn\n\nCachito handles the yarn package manager in much the same way as the [npm](#npm) package manager.\nThe yarn package manager works by parsing the [`yarn.lock`](https://classic.yarnpkg.com/en/docs/yarn-lock/)\nfile present in the source repository to determine what dependencies are required to build the application.\n\nAll requests for the yarn package manager with `package-lock.json`, `npm-shrinkwrap.json` files in\nthe root directory will fail because those files are dedicated for [npm](#npm).\n\nAfter parsing, Cachito creates a yarn registry in an instance of Nexus it manages that contains just\nthe dependencies discovered in the lock file. The registry is locked down so that no other\ndependencies can be added. The connection information is stored in an\n[.npmrc](https://docs.npmjs.com/configuring-npm/npmrc.html) file accessible at the\n`/api/v1/requests/\u003cid\u003e/configuration-files` API endpoint. 
Cachito also generates a\n[.yarnrc](https://classic.yarnpkg.com/en/docs/yarnrc) file in the same directory as the\n[.npmrc](https://docs.npmjs.com/configuring-npm/npmrc.html) file, overwriting any existing\n.yarnrc files if they exist.\n\nCachito will produce a bundle that is downloadable at `/api/v1/requests/\u003cid\u003e/download`. This\nbundle will contain the application source code in the `app` directory and individual tarballs\nof all the dependencies in the `deps/yarn` directory. These tarballs are not meant to be used to\nbuild the application. They are there for convenience so that the dependency sources can be\npublished alongside your application sources. In addition, they can be used to populate a local yarn\nregistry in the event that the application needs to be built without Cachito and the Nexus instance\nit manages.\n\nCachito can also handle dependencies that are not from the yarn registry such as those directly\nfrom GitHub, a Git repository, or an HTTP(S) URL. Please note that if the dependency is from a\nprivate repository, set the\n[.netrc](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html) and\n`known_hosts` files for the Cachito workers. If the dependency location is not supported, Cachito\nwill fail the request. When Cachito encounters a supported location, it will download the\ndependency, modify the version in the [package.json](https://docs.npmjs.com/files/package.json) to\nbe unique, upload it to Nexus, modify the top level project's\n[package.json](https://docs.npmjs.com/files/package.json) and\n[yarn.lock](https://classic.yarnpkg.com/en/docs/yarn-lock/) to use the dependency from\nNexus instead. The modified files will be accessible at the\n`/api/v1/requests/\u003cid\u003e/configuration-files` API endpoint. If Cachito encounters this same dependency\nagain in a future request, it will use it directly from Nexus rather than downloading it and\nuploading it again. 
This guarantees that any dependency used for a Cachito request can be used again\nin a future Cachito request.\n\n### RubyGems (Bundler)\n\nThe Bundler package manager works by parsing the `Gemfile.lock` file present in the source\nrepository to determine what dependencies are required to build the application.\n\nCachito then creates a RubyGems repository in an instance of Nexus it manages that contains just the\nGEM dependencies discovered in the lock file. Also, Cachito produces a bundle downloadable at\n`/api/v1/requests/\u003cid\u003e/download` containing `app/` directory with the application source code \n(including PATH dependencies) and `/deps/rubygems` directory with all GEM and GIT dependencies.\n\nSince multiple packages in a single repo are supported, for each of these packages a configuration \nfile is provided at `/api/v1/requests/\u003cid\u003e/configuration-files` endpoint. This file redirects \nBundler to use Nexus proxy for downloading GEM dependencies and contains an entry for every Git \ndependency to be overridden by the corresponding dependency from `deps/rubygems` (instead of \ndownloading it from the internet, see [local Git \nrepos](https://bundler.io/man/bundle-config.1.html#LOCAL-GIT-REPOS) for more details). \nIf a GIT dependency is specified with `branch:` in the Gemfile, this branch is checked out so that \nlocal GIT repo redirection works.\n\nNote that configuration files expose the username and password of the temporary user created for \nyour request. 
This should not be a security concern, the user only has read access for \nthe repositories and the only reason why we do not allow anonymous read access is due to a technical\nlimitation in Nexus.\n\n#### Requirements for RubyGems repos\n\nThere are several constraints on RubyGems packages that are enforced by Cachito and not meeting them\nraises an exception sooner or later:\n\n- To prevent Cachito from downloading native content (binaries), `Gemfile.lock` has to contain only\n  one platform in its `PLATFORMS` section, and it has to be `ruby`.\n- All PATH dependencies listed in `Gemfile.lock` have to be explicitly allowed in Cachito's\n  config file. For example, a package which is located at the subpath `first_pkg/` from the root of a repository at URL\n  `github.com/cachito-testing/cachito-rubygems-multiple` which has PATH dependency `pathgem` will be processed properly \n  only if Cachito's config contains the following entry\n\n``` \ncachito_rubygems_file_deps_allowlist = {\n    \"cachito-rubygems-multiple/first_pkg\": [\"pathgem\"] \n}\n```\n\nNote that the name of the package (the key in the dictionary) is the last component of its repo URL.\nIf the package isn't located in the root of the repo, then its `/subpath` is appended to the name\n(`/first_pkg` in the example above). The value in the dictionary is an array of all PATH dependencies \nof the given package, where the names are parsed from their `.gemspec` files (= names which are \nlisted in `Gemfile.lock`).\n\n- Git dependencies must use `https://` and specify the exact commit hash in the `Gemfile.lock` (it's\n  done automatically by Bundler).\n- As mentioned above, Cachito provides config files so that user can simply unpack the bundle and\n  run `bundle install` from the `app` directory. This config uses local Git repos redirection, but\n  not all dependencies have `.gemspec` file supporting this. 
To prevent failure during `bundle\n  install` execution, check `.gemspec` files of all GIT dependencies listed in `Gemfile.lock` and\n  make sure that if there are any `require` statements, these statements are working relative to the\n  .gemspec file of that dependency, ideally by using `require_relative` keyword as suggested in\n  [this RubyGems guide](https://guides.rubygems.org/patterns/#loading-code).\n\n## Using Cachito Without Package Managers\n\nCachito can be used without specifying a package manager in a request. In that case, only the source code\npresent in the specified commit in a repository will be downloaded and cached.\n\nEven if there are package manager definitions in the source code (such as a `package.json` or a\n`requirements.txt` file), they'll be ignored using this approach. Besides not being cached, the dependencies\nwill also be absent from the content manifest.\n\nThis approach can be useful in case there's need to cache and use only the actual source code for that commit,\nwhich will then be present in the tarball served by Cachito. 
Here's how to create a request without package\nmanagers:\n\n```bash\ncurl \"localhost:8080/api/v1/requests\" \\\n    -X POST \\\n    -H 'content-type: application/json' \\\n    -d '{\n          \"repo\": \"https://github.com/cachito-testing/cachito-pip-with-deps/\",\n          \"ref\": \"56efa5f7eb4ff1b7ea1409dbad76f5bb378291e6\",\n          \"pkg_managers\": []\n        }'\n```\n\nIt is important to use an empty array in the `pkg_managers` key, since omitting it will make Cachito fall back\nto the default package managers.\n\nBy default, the Git history is omitted from the tarball, but it can be included if the `include-git-dir`\n[flag](#flags) is used.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcontainerbuildsystem%2Fcachito","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcontainerbuildsystem%2Fcachito","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcontainerbuildsystem%2Fcachito/lists"}