{"id":16828431,"url":"https://github.com/jameslamb/lightgbm-dask-testing","last_synced_at":"2025-09-06T13:32:59.991Z","repository":{"id":41826765,"uuid":"329827482","full_name":"jameslamb/lightgbm-dask-testing","owner":"jameslamb","description":"Test LightGBM's Dask integration on different cluster types","archived":false,"fork":false,"pushed_at":"2024-12-31T01:29:52.000Z","size":120,"stargazers_count":12,"open_issues_count":4,"forks_count":5,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-12-31T02:27:13.310Z","etag":null,"topics":["aws","dask","dask-distributed","docker","lightgbm","machine-learning"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jameslamb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-15T06:25:12.000Z","updated_at":"2024-12-31T01:29:55.000Z","dependencies_parsed_at":"2024-10-28T12:45:57.410Z","dependency_job_id":null,"html_url":"https://github.com/jameslamb/lightgbm-dask-testing","commit_stats":{"total_commits":58,"total_committers":1,"mean_commits":58.0,"dds":0.0,"last_synced_commit":"696752816b85bd25656ace60c1aef0093f4ddcd9"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jameslamb%2Flightgbm-dask-testing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jameslamb%2Flightgbm-dask-testing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jameslam
b%2Flightgbm-dask-testing/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jameslamb%2Flightgbm-dask-testing/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jameslamb","download_url":"https://codeload.github.com/jameslamb/lightgbm-dask-testing/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232126480,"owners_count":18476238,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","dask","dask-distributed","docker","lightgbm","machine-learning"],"created_at":"2024-10-13T11:26:35.359Z","updated_at":"2025-09-06T13:32:59.969Z","avatar_url":"https://github.com/jameslamb.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Testing `lightgbm.dask`\n\n[![GitHub Actions](https://github.com/jameslamb/lightgbm-dask-testing/actions/workflows/main.yml/badge.svg?branch=main)](https://github.com/jameslamb/lightgbm-dask-testing/actions/workflows/main.yml)\n\nThis repository can be used to test and develop changes to LightGBM's Dask integration.\nIt contains the following useful features:\n\n* `make` recipes for building a local development image with `lightgbm` installed from a local copy, and Jupyter Lab running for interactive development\n* Jupyter notebooks for testing `lightgbm.dask` against a `LocalCluster` (multi-worker, single-machine) and a `dask_cloudprovider.aws.FargateCluster` (multi-worker, multi-machine)\n* `make` recipes for publishing a custom container image to ECR Public repository, for use with AWS 
Fargate\n\n\u003chr\u003e\n\n**Contents**\n\n- [Getting Started](#getting-started)\n- [Develop in Jupyter](#develop-in-jupyter)\n- [Test with a LocalCluster](#test-with-a-localcluster)\n- [Test with a FargateCluster](#test-with-a-fargatecluster)\n- [Run LightGBM unit tests](#run-lightgbm-unit-tests)\n- [Profile LightGBM code](#profiling)\n    - [runtime profiling](#runtime-profiling)\n\n## Getting Started\n\nTo begin, clone a copy of LightGBM to a folder `LightGBM` at the root of this repo.\nYou can do this however you want, for example:\n\n```shell\ngit clone \\\n    --recursive \\\n    git@github.com:microsoft/LightGBM.git \\\n    ./LightGBM\n```\n\nIf you're developing a reproducible example for [an issue](https://github.com/microsoft/LightGBM/issues) or you're testing a potential [pull request](https://github.com/microsoft/LightGBM/pulls), you probably want to clone LightGBM from your fork, instead of the main repo.\n\n\u003chr\u003e\n\n## Develop in Jupyter\n\nThis section describes how to test a version of LightGBM in Jupyter.\n\n#### 1. Build the notebook image\n\nRun the following to build an image that includes `lightgbm`, all its dependencies, and a JupyterLab setup.\n\n```shell\nmake notebook-image\n```\n\nThe first time you run this, it will take a few minutes as this project needs to build a base image with LightGBM's dependencies and needs to compile the LightGBM C++ library.\n\nEvery time after that, `make notebook-image` should run very quickly.\n\n#### 2. 
Run a notebook locally\n\nStart up Jupyter Lab!\nThis command will run Jupyter Lab in a container using the image you built with `make notebook-image`.\n\n```shell\nmake start-notebook\n```\n\nNavigate to `http://127.0.0.1:8888/lab` in your web browser.\n\nThe command `make start-notebook` mounts your current working directory into the running container.\nThat means that even though Jupyter Lab is running inside the container, changes that you make in it will be saved on your local filesystem even after you shut the container down.\nSo you can edit and create notebooks and other code in there with confidence!\n\nWhen you're done with the notebook, stop the container by running the following from another shell:\n\n```shell\nmake stop-notebook\n```\n\n\u003chr\u003e\n\n## Test with a `LocalCluster`\n\nTo test `lightgbm.dask` on a `LocalCluster`, run the steps in [\"Develop in Jupyter\"](#develop-in-jupyter), then try out [`local.ipynb`](./notebooks/local-cluster.ipynb) or your own notebooks.\n\n\u003chr\u003e\n\n## Test with a `FargateCluster`\n\nThere are some problems with Dask code which only arise in a truly distributed, multi-machine setup.\nTo test for these sorts of issues, I like to use [`dask-cloudprovider`](https://github.com/dask/dask-cloudprovider).\n\nThe steps below describe how to test a local copy of LightGBM on a `FargateCluster` from `dask-cloudprovider`.\n\n#### 1. Build the cluster image\n\nBuild an image that can be used for the scheduler and workers in the Dask cluster you'll create on AWS Fargate.\nThis image will have your local copy of LightGBM installed in it.\n\n```shell\nmake cluster-image\n```\n\n#### 2. 
Install and configure the AWS CLI\n\nFor the rest of the steps in this section, you'll need access to AWS resources.\nTo begin, install the AWS CLI if you don't already have it.\n\n```shell\npip install --upgrade awscli\n```\n\nNext, configure your shell to make authenticated requests to AWS.\nIf you've never done this, you can see [the AWS CLI docs](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).\n\nThe rest of this section assumes that the shell variables `AWS_SECRET_ACCESS_KEY` and `AWS_ACCESS_KEY_ID` have been set.\n\nI like to set these by keeping them in a file\n\n```text\n# file: aws.env\nAWS_SECRET_ACCESS_KEY=your-key-here\nAWS_ACCESS_KEY_ID=your-access-key-id-here\n```\n\nand then sourcing that file\n\n```shell\nset -o allexport\nsource aws.env\nset +o allexport\n```\n\n#### 3. Push the cluster image to ECR\n\nTo use the cluster image in the containers you spin up on Fargate, it has to be available in a container registry.\nThis project uses the free AWS Elastic Container Registry (ECR) Public.\nFor more information on ECR Public, see [the AWS docs](https://docs.amazonaws.cn/en_us/AmazonECR/latest/public/docker-push-ecr-image.html).\n\nThe command below will create a new repository on ECR Public, store the details of that repository in a file `ecr-details.json`, and push the cluster image to it.\nThe cluster image will not contain your credentials, notebooks, or other local files.\n\n```shell\nmake push-image\n```\n\nThis may take a few minutes to complete.\n\n#### 4. Run the AWS notebook\n\nFollow the steps in [\"Develop in Jupyter\"](#develop-in-jupyter) to get a local Jupyter Lab running.\nOpen [`aws.ipynb`](./notebooks/fargate-cluster.ipynb).\nThat notebook contains sample code that uses `dask-cloudprovider` to provision a Dask cluster on AWS Fargate.\n\nYou can view the cluster's current state and its logs by navigating to the Elastic Container Service (ECS) section of the AWS console.\n\n#### 5. 
Clean Up\n\nAs you work on whatever experiment you're doing, you'll probably find yourself wanting to repeat these steps multiple times.\n\nTo remove the image you pushed to ECR Public and the repository you created there, run the following\n\n```shell\nmake delete-repo\n```\n\nThen, repeat the steps above to rebuild your images and test again.\n\n\u003chr\u003e\n\n## Run LightGBM unit tests\n\nThis repo makes it easy to run `lightgbm`'s Dask unit tests in a containerized setup.\n\n```shell\nmake lightgbm-unit-tests\n```\n\nPass variable `DASK_VERSION` to use a different version of `dask` / `distributed`.\n\n```shell\nmake lightgbm-unit-tests \\\n    -e DASK_VERSION=2024.12.0\n```\n\n## Profile LightGBM code \u003ca name=\"profiling\"\u003e\u003c/a\u003e\n\n### runtime profiling\n\nTo try to identify expensive parts of the code path for `lightgbm`, you can run its examples under `cProfile` ([link](https://docs.python.org/3/library/profile.html)) and then visualize those profiling results with `snakeviz` ([link](https://jiffyclub.github.io/snakeviz/)).\n\n```shell\nmake profile\n```\n\nThen navigate to `http://0.0.0.0:8080/snakeviz/%2Fprofiling-output` in your web browser.\n\n### memory profiling\n\nTo summarize memory allocations in typical uses of LightGBM, and to attribute those memory allocations to particular codepaths, you can run its examples under `memray` ([link](https://github.com/bloomberg/memray)).\n\n```shell\nmake profile-memory-usage\n```\n\nThat will generate a bunch of HTML files.\nView them in your browser by running the following, then navigating to `localhost:1234`.\n\n```shell\npython -m http.server \\\n    --directory ./profiling-output/memory-usage \\\n    1234\n```\n\n## Useful Links\n\n* https://github.com/microsoft/LightGBM/pull/3515\n* https://docs.aws.amazon.com/cli/latest/reference/ecr-public/\n* https://docs.amazonaws.cn/en_us/AmazonECR/latest/public/docker-push-ecr-image.html\n* https://github.com/dask/dask-docker\n* 
https://docs.aws.amazon.com/AmazonECR/latest/public/public-registries.html\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjameslamb%2Flightgbm-dask-testing","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjameslamb%2Flightgbm-dask-testing","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjameslamb%2Flightgbm-dask-testing/lists"}