{"id":24698657,"url":"https://github.com/faradayio/falconeri","last_synced_at":"2025-07-02T22:33:02.056Z","repository":{"id":33816756,"uuid":"139976019","full_name":"faradayio/falconeri","owner":"faradayio","description":"Transform lots of data using a Kubernetes cluster","archived":false,"fork":false,"pushed_at":"2023-06-30T11:26:26.000Z","size":22781,"stargazers_count":7,"open_issues_count":17,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-02T21:37:00.575Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://github.com/faradayio/falconeri/blob/master/guide/src/SUMMARY.md","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/faradayio.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-07-06T11:25:30.000Z","updated_at":"2023-03-18T21:51:33.000Z","dependencies_parsed_at":"2025-07-02T22:31:38.761Z","dependency_job_id":null,"html_url":"https://github.com/faradayio/falconeri","commit_stats":null,"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"purl":"pkg:github/faradayio/falconeri","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faradayio%2Ffalconeri","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faradayio%2Ffalconeri/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faradayio%2Ffalconeri/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faradayio%2Ffalconeri/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/faradayio","download_url":"https://codeload.github.com/faradayio/falconeri/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faradayio%2Ffalconeri/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263226069,"owners_count":23433630,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-27T04:29:41.939Z","updated_at":"2025-07-02T22:33:01.990Z","avatar_url":"https://github.com/faradayio.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# `falconeri`: Run batch data-processing jobs on Kubernetes\n\nFalconeri runs on a pre-existing Kubernetes cluster, and it allows you to use Docker images to transform large data files stored in cloud buckets.\n\nFor detailed instructions, see the [Falconeri guide][guide].\n\nSetup is simple:\n\n```sh\nfalconeri deploy\nfalconeri proxy\nfalconeri migrate\n```\n\nRunning is similarly simple:\n\n```sh\nfalconeri job run my-job.json\n```\n\n[guide]: https://github.com/faradayio/falconeri/blob/master/guide/src/SUMMARY.md\n\n## REST API\n\nNote that `falconerid` has a complete REST API, and you don't actually need to use the `falconeri` command-line tool during normal operations. 
This is used internally at Faraday, and it should be fairly self-explanatory, but it isn't documented.\n\n## Contributing to `falconeri`\n\nFirst, you'll need to set up some development tools:\n\n```sh\ncargo install just\ncargo install cargo-deny\ncargo install cargo-edit\n\n# If you want to change the SQL schema, you'll also need the `diesel` CLI. This\n# may also require installing some C development libraries.\ncargo install diesel_cli\n```\n\nNext, check out the available tasks in the `justfile`:\n\n```sh\njust --list\n```\n\nFor local development, you'll want to install [`minikube`](https://minikube.sigs.k8s.io/docs/start/). Start it as follows, and point your local Docker at it:\n\n```sh\nminikube start\neval $(minikube docker-env)\n```\n\nThen build an image. **You must have `docker-env` set up as above** if you want to test this image.\n\n```sh\njust image\n```\n\nNow you can deploy a development version of `falconeri` to `minikube`:\n\n```sh\ncargo run -p falconeri -- deploy --development\n```\n\nCheck to see if your cluster comes up:\n\n```sh\nkubectl get all\n\n# Or if you have `watch`, try:\nwatch -n 5 kubectl get all\n```\n\n### Running the example program\n\nRunning the example program is necessary to make sure `falconeri` works. First, run:\n\n```sh\ncd examples/word-frequencies\n```\n\nNext, you'll need to set up an S3 bucket. If you're **at Faraday,** run:\n\n```sh\n# Faraday only!\njust secret\n```\n\nIf you're **not at Faraday**, create an S3 bucket, and place a `*.txt` file in `$MY_BUCKET/texts/`. Then, set up an AWS access key with read/write access to the bucket, and save the key pair in files named `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. 
Then run:\n\n```sh\n# Not for Faraday!\nkubectl create secret generic s3 \\\n    --from-file=AWS_ACCESS_KEY_ID \\\n    --from-file=AWS_SECRET_ACCESS_KEY\n```\n\nThen edit `word-frequencies.json` to point at your bucket.\n\nNow you can build the worker image using:\n\n```sh\n# This assumes you previously ran `just image` in the top-level directory.\njust image\n```\n\nIn another terminal, start a `falconeri proxy` command:\n\n```sh\njust proxy\n```\n\nIn the original terminal, start the job:\n\n```sh\njust run\n```\n\nFrom here, you can use `falconeri job describe $ID` and `kubectl` normally. See the [guide][] for more details.\n\n### Releasing a new `falconeri`\n\nFor now, this process should only be done by Eric, because there are some semver issues that we haven't fully thought out yet.\n\nFirst, edit the `CHANGELOG.md` file to describe the release. Next, bump the version:\n\n```sh\njust set-version $MY_NEW_VERSION\n```\n\nCommit your changes with a subject like:\n\n```sh\n$MY_NEW_VERSION: Short description\n```\n\nYou should be able to make a release by running:\n\n```sh\njust MODE=release release\n```\n\nOnce the binaries have built, you can find them at https://github.com/faradayio/falconeri/releases. The `CHANGELOG.md` entry should be automatically converted to release notes.\n\n### Changing the database schema\n\nWe use [`diesel`][diesel] as our ORM. This has complex tradeoffs, and we've been considering whether to move to `sqlx` or `tokio-postgres` in the future. See above for instructions on installing `diesel_cli`.\n\n[diesel]: https://diesel.rs/\n\nTo create a new migration, run:\n\n```sh\ncd falconeri_common\ndiesel migration generate add_some_table_or_columns\n```\n\nThis will generate a new `up.sql` and `down.sql` file which you can edit as needed. These work like Rails migrations: `up.sql` makes the necessary changes to the database, and `down.sql` reverts those changes. 
But in this case, migrations are written using SQL.\n\nYou can show a list of migrations using:\n\n```sh\ndiesel migration list\n```\n\nTo apply pending migrations, run:\n\n```sh\ndiesel migration run\n\n# Test the `down.sql` file as well.\ndiesel migration revert\ndiesel migration run\n```\n\nAfter doing this, edit `falconeri_common/src/schema.rs` and revert any changes which break the schema, and any which introduce warnings. You will probably also need to update any corresponding files in `falconeri_common/src/models/`.\n\nMigrations will be compiled into the server and run on deploys, as well.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffaradayio%2Ffalconeri","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffaradayio%2Ffalconeri","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffaradayio%2Ffalconeri/lists"}