{"id":50111391,"url":"https://github.com/Netflix/metaflow-service","last_synced_at":"2026-06-09T04:00:48.035Z","repository":{"id":39020889,"uuid":"224924055","full_name":"Netflix/metaflow-service","owner":"Netflix","description":":rocket: Metadata tracking and UI service for Metaflow!","archived":false,"fork":false,"pushed_at":"2026-04-29T22:13:29.000Z","size":1390,"stargazers_count":227,"open_issues_count":55,"forks_count":108,"subscribers_count":186,"default_branch":"master","last_synced_at":"2026-04-30T00:24:59.391Z","etag":null,"topics":["ai","data-science","machine-learning","metaflow","ml","ml-infrastructure","ml-platform","productivity","ui"],"latest_commit_sha":null,"homepage":"http://www.metaflow.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Netflix.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2019-11-29T21:25:51.000Z","updated_at":"2026-04-29T22:13:37.000Z","dependencies_parsed_at":"2023-02-10T15:01:31.742Z","dependency_job_id":"2f281e91-bccd-4f49-a606-c707ee45a2d7","html_url":"https://github.com/Netflix/metaflow-service","commit_stats":null,"previous_names":[],"tags_count":48,"template":false,"template_full_name":null,"purl":"pkg:github/Netflix/metaflow-service","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fmetaflow-service","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fmetaflow-service/tags","releases_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fmetaflow-service/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fmetaflow-service/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Netflix","download_url":"https://codeload.github.com/Netflix/metaflow-service/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Netflix%2Fmetaflow-service/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34090751,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-09T02:00:06.510Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","data-science","machine-learning","metaflow","ml","ml-infrastructure","ml-platform","productivity","ui"],"created_at":"2026-05-23T12:32:22.512Z","updated_at":"2026-06-09T04:00:48.022Z","avatar_url":"https://github.com/Netflix.png","language":"Python","funding_links":[],"categories":["Core Infrastructure"],"sub_categories":[],"readme":"# Metaflow Service\n\nMetadata service implementation for [Metaflow](https://github.com/Netflix/metaflow).\n\nThis provides a thin wrapper around a database and keeps track of metadata associated with\nmetaflow entities such as Flows, Runs, Steps, Tasks, and Artifacts.\n\nFor more information, see [Metaflow's admin 
docs](https://docs.outerbounds.com/engineering/service-architecture/#metadata)\n\n## Getting Started\n\nThe service expects the following environment variables to be set:\n\n- MF_METADATA_DB_HOST [defaults to localhost]\n- MF_METADATA_DB_PORT [defaults to 5432]\n- MF_METADATA_DB_USER [defaults to postgres]\n- MF_METADATA_DB_PSWD [defaults to postgres]\n- MF_METADATA_DB_NAME [defaults to postgres]\n\nOptionally, you can also override the host and port the service runs on:\n\n- MF_METADATA_PORT [defaults to 8080]\n- MF_MIGRATION_PORT [defaults to 8082]\n- MF_METADATA_HOST [defaults to 0.0.0.0]\n\nCreate triggers to broadcast any database changes via `pg_notify` on channel `NOTIFY`:\n\n- `DB_TRIGGER_CREATE`\n  - [`metadata_service` defaults to 0]\n  - [`ui_backend_service` defaults to 1]\n\n\u003e ```sh\n\u003e pip3 install ./\n\u003e python3 -m services.metadata_service.server\n\u003e ```\n\nSwagger UI: http://localhost:8080/api/doc\n\n#### Using docker-compose\n\nThe easiest way to run this project is with `docker-compose`. There are two options:\n\n- `docker-compose.yml`\n  - Assumes that the Docker images are pre-built; local changes are not included automatically\n  - See the `docker build` section on how to pre-build the Docker images\n- `docker-compose.development.yml`\n  - Development version\n  - Includes automatic Dockerfile builds and mounts the local `./services` folder inside the container\n\nRunning `docker-compose.yml`:\n\n\u003e ```sh\n\u003e docker-compose up -d\n\u003e ```\n\nRunning `docker-compose.development.yml` (recommended during development):\n\n\u003e ```sh\n\u003e docker-compose -f docker-compose.development.yml up\n\u003e ```\n\n- Metadata service is available at port `:8080`.\n- Migration service is available at port `:8082`.\n- UI service is available at port `:8083`.\n\nTo access the container, run:\n\n\u003e ```sh\n\u003e docker exec -it metadata_service /bin/bash\n\u003e ```\n\nWithin the container, curl the service directly:\n\n\u003e 
```sh\n\u003e curl localhost:8080/ping\n\u003e ```\n\n#### Using published image on DockerHub\n\nThe latest release of the image is available on [dockerhub](https://hub.docker.com/repository/docker/netflixoss/metaflow_metadata_service).\n\n\u003e ```sh\n\u003e docker pull netflixoss/metaflow_metadata_service\n\u003e ```\n\nBe sure to set the proper environment variables when running the image:\n\n\u003e ```sh\n\u003e docker run -e MF_METADATA_DB_HOST='\u003cinstance_name\u003e.us-east-1.rds.amazonaws.com' \\\n\u003e -e MF_METADATA_DB_PORT=5432 \\\n\u003e -e MF_METADATA_DB_USER='postgres' \\\n\u003e -e MF_METADATA_DB_PSWD='postgres' \\\n\u003e -e MF_METADATA_DB_NAME='metaflow' \\\n\u003e -it -p 8082:8082 -p 8080:8080 metaflow_metadata_service\n\u003e ```\n\n### Running tests\n\nTests are run using [Tox](https://tox.readthedocs.io) and [pytest](https://docs.pytest.org).\n\nRun the following command to execute the tests in a Dockerized environment:\n\n\u003e ```sh\n\u003e docker-compose -f docker-compose.test.yml up -V --abort-on-container-exit\n\u003e ```\n\nThe above command makes sure a PostgreSQL database is available.\n\nUsage without Docker:\n\nThe test suite requires a PostgreSQL database, along with the following environment variables for connecting the tested services to the DB.\n\n- MF_METADATA_DB_HOST=db_test\n- MF_METADATA_DB_PORT=5432\n- MF_METADATA_DB_USER=test\n- MF_METADATA_DB_PSWD=test\n- MF_METADATA_DB_NAME=test\n\n\u003e ```sh\n\u003e # Run all tests\n\u003e tox\n\u003e\n\u003e # Run unit tests only\n\u003e tox -e unit\n\u003e\n\u003e # Run integration tests only\n\u003e tox -e integration\n\u003e\n\u003e # Run both unit \u0026 integration tests in parallel\n\u003e tox -e unit,integration -p\n\u003e ```\n\n### Executing flows against a local Metadata service\n\nWith the metadata service up and running at `http://localhost:8080`, you can use it as the service when executing Flows with the Metaflow client locally 
via\n\n```sh\nMETAFLOW_SERVICE_URL=http://localhost:8080 METAFLOW_DEFAULT_METADATA=\"service\" python3 basicflow.py run\n```\n\nAlternatively, you can configure a default profile with the service URL for the Metaflow client to use. See [Configuring metaflow](https://docs.outerbounds.com/engineering/operations/configure-metaflow/) for instructions.\n\n## Migration Service\n\nThe Migration service is a tool to help users manage underlying DB migrations and launch\nthe most recent compatible version of the metadata service.\n\nNote that it is possible to run the two services independently and a Dockerfile is\nsupplied for each service. However, the default Dockerfile combines the two services.\n\nAlso note that at runtime the migration service and the metadata service are completely disjoint and\ndo not communicate with each other.\n\n### Migrating to the latest db schema\n\nNote: you may need to do a rolling restart to get the latest version of the image if you don't have it already.\n\nYou can manage the migration either via the API provided or with the utility CLI provided with `migration_tools.py`:\n\n- check status and note the version you are on\n  - Api: `/db_schema_status`\n  - cli: `python3 migration_tools.py db-status`\n- see if there are migrations to be run\n  - if there are any migrations to be run, `is_up_to_date` should be false and a list of migrations to be applied\n    will be shown under `unapplied_migrations`\n- take a backup of the db\n  - in case anything goes wrong, it is a good idea to take a backup of the db\n- migrations may cause downtime depending on what is being run as part of the migration\n- Note: concurrent updates are not supported. 
It may be advisable to reduce your cluster size to a single node\n- upgrade db schema\n  - Api: `/upgrade`\n  - cli: `python3 migration_tools.py upgrade`\n- check status again to verify you are on the up-to-date version\n  - Api: `/db_schema_status`\n  - cli: `python3 migration_tools.py db-status`\n  - Note that `is_up_to_date` should be set to True and `migration_in_progress` should be set to False\n- do a rolling restart of the metadata service cluster\n  - In order for the migration to be effective, a full restart of the containers is required\n- the latest available version of the service should be ready\n  - cli: `python3 migration_tools.py metadata-service-version`\n- If you had previously scaled down your cluster, it should be safe to return it to the desired number of containers\n\n### Under the Hood: What is going on in the Docker Container\n\nWithin the published metaflow_metadata_service image, the migration service is packaged along with\nthe latest version of the metadata service compatible with each version of the db. This means that multiple versions\nof the metadata service come bundled with the image, each installed under a different virtualenv.\n\nWhen the container spins up, the migration service is launched first and determines which virtualenv to activate\ndepending on the schema version of the DB. This determines which version of the metadata service will run.\n\n## Release\n\nSee the [release docs](RELEASE.md)\n\n## Get in Touch\n\nThere are several ways to get in touch with us:\n\n- Open an issue at: https://github.com/Netflix/metaflow-service\n- Email us at: help@metaflow.org\n- Chat with us on: http://chat.metaflow.org\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNetflix%2Fmetaflow-service","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FNetflix%2Fmetaflow-service","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNetflix%2Fmetaflow-service/lists"}