{"id":20577829,"url":"https://github.com/toolforge/quarry","last_synced_at":"2025-04-14T19:02:56.594Z","repository":{"id":90070400,"uuid":"532047838","full_name":"toolforge/quarry","owner":"toolforge","description":"Quarry is a web service that allows to perform SQL queries against Wikipedia and sister projects databases.","archived":false,"fork":false,"pushed_at":"2025-04-08T11:17:26.000Z","size":2033,"stargazers_count":16,"open_issues_count":6,"forks_count":11,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-04-08T12:25:52.396Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://quarry.wmcloud.org","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/toolforge.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-09-02T19:04:42.000Z","updated_at":"2025-04-08T11:17:04.000Z","dependencies_parsed_at":"2024-01-02T18:23:28.566Z","dependency_job_id":"149239f5-e035-40a0-b97e-57c6d7ee3095","html_url":"https://github.com/toolforge/quarry","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/toolforge%2Fquarry","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/toolforge%2Fquarry/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/toolforge%2Fquarry/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/toolforge%2Fquarry/manifests","owner_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/owners/toolforge","download_url":"https://codeload.github.com/toolforge/quarry/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248943425,"owners_count":21186957,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-16T06:08:47.648Z","updated_at":"2025-04-14T19:02:56.576Z","avatar_url":"https://github.com/toolforge.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Quarry\n[Quarry](https://quarry.wmcloud.org/) is a web service that allows you to run SQL \nqueries against Wikipedia and sister project databases.\n\n## Setting up a local dev environment ##\n\n# docker-compose\nQuarry uses [Docker](https://docs.docker.com/engine/install/) to set up a local\nenvironment. You can set it up as follows:\n\n1. [Download](https://docs.docker.com/engine/install/) and install Docker and\n   [docker-compose](https://docs.docker.com/compose/) (already ships with Docker on Windows and Mac)\n2. Clone the [Quarry repository](https://github.com/wikimedia/analytics-quarry-web)\n3. Run `docker-compose up`\n\nA web server will be set up, available at http://localhost:5000. Changes to Python\nfiles will trigger an automatic reload of the server, and your modifications\nwill immediately be taken into account.\nA worker node is also created to execute your queries in the background (it uses the\nsame image). Finally, Redis and two database instances are also started.\n\nTo stop, run `docker-compose stop` or hit CTRL-C in the terminal your docker-compose\nis running in. 
After that, to start again with code changes, you'll want to run `docker-compose down`\nto clean up. Also, the setup creates a Docker volume that holds the SQLite versions of query\nresults. It will not be cleaned up unless you run `docker-compose down -v`.\n\n# minikube\nIt is possible to run a Quarry system inside [minikube](https://minikube.sigs.k8s.io/docs/)!\nAt this time, you most likely need to set it up with a cluster version before 1.22.\n\nFirst build the containers:\n```\neval $(minikube docker-env)\ndocker build . -t quarry:01\ncd docker-replica/\ndocker build . -t mywiki:01\n```\n\nYou will need to install minikube (tested on minikube 1.23), [helm](https://helm.sh), and kubectl on your system. When you are confident those are working, start minikube with:\n - `minikube start --kubernetes-version=v1.23.15`\n - `minikube addons enable ingress`\n - `kubectl create namespace quarry`\n - `helm -n quarry install quarry helm-quarry -f helm-quarry/dev-env.yaml`\n\nThe rest of the setup instructions will be displayed on screen as long as the install is successful.\n\n# local databases\nBoth local setups will create two databases.\n\nOne database is a wikireplica-like database\nnamed `mywiki`. This (or `mywiki_p`) is the correct value to enter in the\ndatabase field for all local test queries.\n\nThe other database is the Quarry internal db. In your local environment, you can query the Quarry internal db itself. 
In that case, use\n\"quarry\" as the database name.\n\n### Updating existing containers ###\n\nIf you have already run a dev environment (that is, ran `docker-compose up`), you might want to update\nthe containers with the new dependencies by running `docker-compose build` before running\n`docker-compose up` again.\n\n## Useful commands ##\n\nTo pre-compile nunjucks templates:\n`nunjucks-precompile quarry/web/static/templates/ \u003e quarry/web/static/templates/compiled.js`\n\nSee also the commands listed in the maintainers' documentation:\nhttps://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Quarry\n\n## Comment to Phabricator ##\n\nTo have a PR post comments to an associated Phabricator ticket, make the last line of the commit message look like:\n\nBug: \u003cticket number\u003e\n\nFor example:\nBug: T317566\n\n## git-crypt ##\n\n[git-crypt](https://github.com/AGWA/git-crypt) is used to encrypt the config.yaml file.\nWe're using the \"symmetric key mode\" of git-crypt, with a single secret key.\n\nTo decrypt, ask a maintainer for the secret key and run:\n```\ngit clone https://github.com/toolforge/quarry.git\ncd quarry\ngit-crypt unlock \u003cpath to secret key\u003e\n```\n\nA copy of the decryption key is stored in [Pwstore](https://wikitech.wikimedia.org/wiki/Pwstore).\n\n## Deploying to production ##\n\nFrom `quarry-bastion.quarry.eqiad1.wikimedia.cloud`:\n\n```\ngit clone https://github.com/toolforge/quarry.git\ncd quarry\ngit checkout \u003cbranch\u003e # If not deploying main\ngit-crypt unlock \u003cpath to encryption key\u003e\nbash deploy.sh\n```\n\n### Testing and deploying a Pull Request ###\n\nAfter a PR has been reviewed, and if the CI runs successfully, the current\nprocedure is to deploy it to production _before_ merging the PR, so that you can\nverify that it is working correctly.\n\nThis procedure makes sure that we never merge non-working code to `main`, but\nyou have to be careful because if you deploy a branch that is stale, you might\nundo some recent changes deployed 
by somebody else.\n\n* Make sure your PR is not stale:\n  * GitHub should note that the PR has updates or conflicts and offer to fix them.\n  * Alternatively:\n    * `git checkout main`\n    * `git pull`\n    * `git checkout \u003cbranch\u003e`\n    * `git fetch`\n    * `git rebase origin/main`\n    * `git push --force`\n* Wait for the CI to build the image\n* Deploy with `bash deploy.sh`\n* If everything works, merge the PR. No need to redeploy after merging.\n* If something breaks, revert your change:\n  * `git checkout main`\n  * `git pull`\n  * Re-deploy with `bash deploy.sh`\n\n### Fresh deploy ###\nFor a completely fresh deploy, an NFS server will need to be set up. Add its hostname to helm-quarry/prod-env.yaml.\nAn object store named \"tofu-state\" will also need to be created for the tofu state file.\nThen set up MySQL:\n`mysql -uquarry -h \u003ctrove hostname created by tofu\u003e -p \u003c schema.sql`\n\nAfter a fresh deploy, go to Horizon and point the web proxy at the new cluster.\n\n## Troubleshooting ##\nIf ansible doesn't detect a change for quarry helm, the following can be run:\n`helm -n quarry upgrade --install quarry helm-quarry -f helm-quarry/prod-env.yaml`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftoolforge%2Fquarry","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftoolforge%2Fquarry","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftoolforge%2Fquarry/lists"}