{"id":26671173,"url":"https://github.com/scaleapi/nucleus-python-client","last_synced_at":"2026-03-06T17:08:04.272Z","repository":{"id":37459257,"uuid":"296474594","full_name":"scaleapi/nucleus-python-client","owner":"scaleapi","description":"The official Python SDK for Nucleus, part of Scale API, the Data Platform for AI","archived":false,"fork":false,"pushed_at":"2026-03-03T16:22:05.000Z","size":3527,"stargazers_count":26,"open_issues_count":21,"forks_count":11,"subscribers_count":43,"default_branch":"master","last_synced_at":"2026-03-03T20:41:31.442Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://dashboard.scale.com/nucleus","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scaleapi.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-09-18T00:41:44.000Z","updated_at":"2026-03-03T16:22:10.000Z","dependencies_parsed_at":"2024-01-01T23:19:40.190Z","dependency_job_id":"d236fc71-eb3a-4161-aa56-de4bc8ebcd75","html_url":"https://github.com/scaleapi/nucleus-python-client","commit_stats":{"total_commits":718,"total_committers":40,"mean_commits":17.95,"dds":0.8119777158774373,"last_synced_commit":"18696ebd267a5af51fd667bf433c33bd3d4fcc48"},"previous_names":[],"tags_count":138,"template":false,"template_full_name":null,"purl":"pkg:github/scaleapi/nucleus-python-client","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scaleapi%2Fnucleus-python-client","tags_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scaleapi%2Fnucleus-python-client/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scaleapi%2Fnucleus-python-client/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scaleapi%2Fnucleus-python-client/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scaleapi","download_url":"https://codeload.github.com/scaleapi/nucleus-python-client/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scaleapi%2Fnucleus-python-client/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30186781,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-06T14:42:24.748Z","status":"ssl_error","status_checked_at":"2026-03-06T14:42:14.925Z","response_time":250,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-25T23:44:05.280Z","updated_at":"2026-03-06T17:08:04.182Z","avatar_url":"https://github.com/scaleapi.png","language":"Python","readme":"# Nucleus\n\nhttps://dashboard.scale.com/nucleus\n\nAggregate metrics in ML are not good enough. 
To improve production ML, you need to understand your models' qualitative failure modes, fix them by gathering more data, and curate diverse scenarios.\n\nScale Nucleus helps you:\n\n- Visualize your data\n- Curate interesting slices within your dataset\n- Review and manage annotations\n- Measure and debug your model performance\n\nNucleus is a new way, the right way, to develop ML models, helping us move away from the concept of one dataset and towards a paradigm of collections of scenarios.\n\n## Installation\n\n`$ pip install scale-nucleus`\n\n## CLI installation\n\nWe recommend installing the CLI via `pipx` (https://pypa.github.io/pipx/installation/). This makes sure that\nthe CLI does not interfere with your system packages and is accessible from your favorite terminal.\n\nFor macOS:\n\n```bash\nbrew install pipx\npipx ensurepath\npipx install scale-nucleus\n# Optional installation of shell completion (for bash, zsh or fish)\nnu install-completions\n```\n\nOtherwise, install via pip (requires pip 19.0 or later):\n\n```bash\npython3 -m pip install --user pipx\npython3 -m pipx ensurepath\npython3 -m pipx install scale-nucleus\n# Optional installation of shell completion (for bash, zsh or fish)\nnu install-completions\n```\n\n## Common issues/FAQ\n\n### Outdated Client\n\nNucleus is iterating rapidly and, as a result, we do not always perfectly preserve backwards compatibility with older versions of the client. 
If you run into any unexpected error, it's a good idea to upgrade your version of the client by running:\n\n```\npip install --upgrade scale-nucleus\n```\n\n## Usage\n\nFor the most up-to-date documentation, see: https://dashboard.scale.com/nucleus/docs/api?language=python.\n\n## For Developers\n\nClone from GitHub and install as editable:\n\n```\ngit clone git@github.com:scaleapi/nucleus-python-client.git\ncd nucleus-python-client\npip3 install poetry\npoetry install\n```\n\nPlease install the pre-commit hooks by running the following command:\n\n```bash\npoetry run pre-commit install\n```\n\nWhen releasing a new version, please add release notes to the changelog in `CHANGELOG.md`.\n\n**Best practices for testing:**\n(1) Please run pytest from the root directory of the repo, e.g.:\n\n```\npoetry run pytest tests/test_dataset.py\n```\n\n(2) To skip slow integration tests that have to wait for an async job to start, run:\n\n```\npoetry run pytest -n auto -m \"not integration\"\n```\nNote: `-n auto` enables pytest-xdist parallelization.\n\n## Pydantic Models\n\nPrefer using [Pydantic](https://pydantic-docs.helpmanual.io/usage/models/) models rather than creating raw dictionaries\nor dataclasses to send or receive over the wire as JSON. Pydantic is designed with data validation in mind and provides very clear error\nmessages when it encounters a problem with the payload.\n\nThe Pydantic model(s) should mirror the payload to send. 
Consider a JSON payload that looks like this:\n\n```json\n{\n  \"example_json_with_info\": {\n    \"metadata\": {\n      \"frame\": 0\n    },\n    \"reference_id\": \"frame0\",\n    \"url\": \"s3://example/scale_nucleus/2021/lidar/0038711321865000.json\",\n    \"type\": \"pointcloud\"\n  },\n  \"example_image_with_info\": {\n    \"metadata\": {\n      \"author\": \"Picasso\"\n    },\n    \"reference_id\": \"frame0\",\n    \"url\": \"s3://bucket/0038711321865000.jpg\",\n    \"type\": \"image\"\n  }\n}\n```\n\nThis could be represented as the following structure. Note that the field names map to the JSON keys, and note the use of field\nvalidators (`@validator`).\n\n```python\nimport os.path\nfrom pydantic import BaseModel, validator\nfrom typing import Literal\n\n\nclass JsonWithInfo(BaseModel):\n    metadata: dict  # any dict is valid\n    reference_id: str\n    url: str\n    type: Literal[\"pointcloud\", \"recipe\"]\n\n    @validator(\"url\")\n    def has_json_extension(cls, v):\n        if not v.endswith(\".json\"):\n            raise ValueError(f\"Expected '.json' extension got {v}\")\n        return v\n\n\nclass ImageWithInfo(BaseModel):\n    metadata: dict  # any dict is valid\n    reference_id: str\n    url: str\n    type: Literal[\"image\", \"mask\"]\n\n    @validator(\"url\")\n    def has_valid_extension(cls, v):\n        valid_extensions = {\".jpg\", \".jpeg\", \".png\", \".tiff\"}\n        _, extension = os.path.splitext(v)\n        if extension not in valid_extensions:\n            raise ValueError(f\"Expected extension in {valid_extensions} got {v}\")\n        return v\n\n\nclass ExampleNestedModel(BaseModel):\n    example_json_with_info: JsonWithInfo\n    example_image_with_info: ImageWithInfo\n\n# Usage (the URLs below are placeholders; substitute your real endpoints):\nimport requests\npayload = requests.get(\"https://example.com/example\")\nparsed_model = ExampleNestedModel.parse_obj(payload.json())\nrequests.post(\"https://example.com/post_to\", json=parsed_model.dict())\n```\n\n### Migrating to Pydantic\n\n- When migrating an interface from a 
dictionary, use `nucleus.pydantic_base.DictCompatibleModel`. That allows you to get\n  the benefits of Pydantic but maintains backwards compatibility with a Python dictionary by delegating `__getitem__` to\n  fields.\n- When migrating a frozen dataclass, use `nucleus.pydantic_base.ImmutableModel`. That is a base class set up to be\n  immutable after initialization.\n\n**Updating documentation:**\nWe use [Sphinx](https://www.sphinx-doc.org/en/master/) to autogenerate our API Reference from docstrings.\n\nTo test your local docstring changes, run the following commands from the repository's root directory:\n\n```\npoetry shell\ncd docs\nsphinx-autobuild . ./_build/html --watch ../nucleus\n```\n\n`sphinx-autobuild` will spin up a server on localhost (port 8000 by default) that watches for local docstring changes and automatically rebuilds the API reference.\n\n## Custom Metrics using Shapely in scale-validate\n\nCertain metrics use `Shapely` and `rasterio`, which are included as optional dependencies.\n\n```bash\npip install scale-nucleus[metrics]\n```\n\nNote that you might need to install a local GEOS package since Shapely doesn't provide binaries bundled with GEOS for every platform.\n\n```bash\n# macOS\nbrew install geos\n# Ubuntu/Debian flavors\napt-get install libgeos-dev\n```\n\nTo develop locally with these extras, use:\n\n`poetry install --extras metrics`\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscaleapi%2Fnucleus-python-client","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscaleapi%2Fnucleus-python-client","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscaleapi%2Fnucleus-python-client/lists"}