{"id":13671429,"url":"https://github.com/balsa-project/balsa","last_synced_at":"2025-04-14T19:44:12.113Z","repository":{"id":43922386,"uuid":"462052660","full_name":"balsa-project/balsa","owner":"balsa-project","description":"Balsa is a learned SQL query optimizer. It tailor optimizes your SQL queries to find the best execution plans for your hardware and engine.","archived":false,"fork":false,"pushed_at":"2022-06-13T17:24:51.000Z","size":30509,"stargazers_count":115,"open_issues_count":5,"forks_count":23,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-05-22T13:31:15.410Z","etag":null,"topics":["databases","deep-reinforcement-learning","learned-database","learned-query-optimization","postgresql","query-optimization"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/balsa-project.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-02-21T22:31:02.000Z","updated_at":"2024-05-13T01:25:05.000Z","dependencies_parsed_at":"2022-07-16T15:47:05.930Z","dependency_job_id":null,"html_url":"https://github.com/balsa-project/balsa","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balsa-project%2Fbalsa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balsa-project%2Fbalsa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balsa-project%2Fbalsa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balsa-project%2Fbalsa/manifests","owner_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/balsa-project","download_url":"https://codeload.github.com/balsa-project/balsa/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248949895,"owners_count":21188168,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["databases","deep-reinforcement-learning","learned-database","learned-query-optimization","postgresql","query-optimization"],"created_at":"2024-08-02T09:01:09.547Z","updated_at":"2025-04-14T19:44:12.091Z","avatar_url":"https://github.com/balsa-project.png","language":"Python","readme":"# Balsa\n\n\u003cp\u003e\n    \u003ca href=\"http://arxiv.org/abs/2201.01441\"\u003e\n        \u003cimg alt=\"arXiv\" src=\"https://img.shields.io/badge/arXiv-2201.01441-blue\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/balsa-project/balsa/blob/master/LICENSE\"\u003e\n        \u003cimg alt=\"LICENSE\" src=\"https://img.shields.io/github/license/balsa-project/balsa.svg?color=brightgreen\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n**Balsa is a learned query optimizer**. 
It learns to optimize SQL queries by trial-and-error using deep reinforcement learning and sim-to-real learning.\n\nNotably, Balsa is the first end-to-end learned optimizer that does not rely on learning from an existing expert optimizer's plans, while being able to surpass the performance of expert plans, sometimes by a sizable margin.\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"assets/balsa-overview.png\" width=\"485\"/\u003e\n\u003c/p\u003e\n\nFor technical details, see the SIGMOD 2022 paper, [Balsa: Learning a Query Optimizer Without Expert Demonstrations](https://zongheng.me/pubs/balsa-sigmod2022.pdf) [[bibtex](#citation)].\n\n[**Setup**](#setup)\n| [**Quickstart**](#quickstart)\n| [**Experiment configs**](#experiment-configs)\n| [**Metrics and artifacts**](#metrics-and-artifacts)\n| [**Cluster mode**](#cluster-mode)\n| [**Q\u0026A**](#qa)\n| [**Citation**](#citation)\n\n## Setup\n\nTo quickly get started, run the following on one machine, which will run both\nthe agent neural network and query execution.\n\n1. Clone and install Balsa.\n\n   \u003cdetails\u003e\n   \u003csummary\u003eDetails\u003c/summary\u003e\n\n    \u003cbr\u003e\n\n    ```bash\n    git clone https://github.com/balsa-project/balsa.git ~/balsa\n    cd ~/balsa\n    # Recommended: run inside a Conda environment.\n    # All commands that follow are run under this conda env.\n    conda create -n balsa python=3.7 -y\n    conda activate balsa\n\n    pip install -r requirements.txt\n    pip install -e .\n    pip install -e pg_executor\n    ```\n   \u003c/details\u003e\n\n\n2. Install Postgres v12.5.\n\n   \u003cdetails\u003e\n   \u003csummary\u003eDetails\u003c/summary\u003e\n\n    \u003cbr\u003eThis can be done in several ways.  
For example, installing from source:\n\n    ```bash\n    cd ~/\n    wget https://ftp.postgresql.org/pub/source/v12.5/postgresql-12.5.tar.gz\n    tar xzvf postgresql-12.5.tar.gz\n    cd postgresql-12.5\n    ./configure --prefix=/data/postgresql-12.5 --without-readline\n    sudo make -j\n    sudo make install\n\n    echo 'export PATH=/data/postgresql-12.5/bin:$PATH' \u003e\u003e ~/.bashrc\n    source ~/.bashrc\n    ```\n   \u003c/details\u003e\n\n3. Install the [`pg_hint_plan`](https://github.com/ossc-db/pg_hint_plan) extension v1.3.7.\n\n   \u003cdetails\u003e\n   \u003csummary\u003eDetails\u003c/summary\u003e\n    \u003cbr\u003e\n\n    ```bash\n    cd ~/\n    git clone https://github.com/ossc-db/pg_hint_plan.git -b REL12_1_3_7\n    cd pg_hint_plan\n    # Modify Makefile: change line\n    #   PG_CONFIG = pg_config\n    # to\n    #   PG_CONFIG = /data/postgresql-12.5/bin/pg_config\n    vim Makefile\n    make\n    sudo make install\n    ```\n   \u003c/details\u003e\n\n\n4. Load data into Postgres \u0026 start it with the correct [configuration](./conf/balsa-postgresql.conf).\n\n   \u003cdetails\u003e\n   \u003csummary\u003eDetails\u003c/summary\u003e\n    \u003cbr\u003e\n    For example, load the Join Order Benchmark (JOB) tables:\n\n    ```bash\n    cd ~/\n    mkdir -p datasets/job \u0026\u0026 pushd datasets/job\n    wget -c http://homepages.cwi.nl/~boncz/job/imdb.tgz \u0026\u0026 tar -xvzf imdb.tgz \u0026\u0026 popd\n    # Prepend headers to CSV files\n    python3 ~/balsa/scripts/prepend_imdb_headers.py\n\n    # Create and start the DB\n    pg_ctl -D ~/imdb initdb\n\n    # Copy custom PostgreSQL configuration.\n    cp ~/balsa/conf/balsa-postgresql.conf ~/imdb/postgresql.conf\n\n    # Start the server\n    pg_ctl -D ~/imdb start -l logfile\n\n    # Load data + run analyze (can take several minutes)\n    cd ~/balsa\n    bash load-postgres/load_job_postgres.sh ~/datasets/job\n    ```\n\n    Perform basic checks:\n    ```sql\n    psql imdbload\n    # Check that both 
primary and foreign key indexes are built:\n    imdbload=# \\d title\n    [...]\n\n    # Check that data count is correct:\n    imdbload=# select count(*) from title;\n      count\n    ---------\n    2528312\n    (1 row)\n    ```\n   \u003c/details\u003e\n\n**NOTE**: Using one machine is only for quickly trying out Balsa. To cleanly reproduce results, use [**Cluster mode**](#cluster-mode), which automates the above setup on a cloud and separates the training machine from the query execution machines.\n\n## Quickstart\n**First, run the baseline PostgreSQL plans** (the expert) on the Join Order Benchmark:\n```bash\npython run.py --run Baseline --local\n```\nThis will prompt you to log into [Weights \u0026 Biases](https://wandb.ai/home) to track experiments and visualize metrics easily, which we highly recommend. (You can disable it by prepending the env var `WANDB_MODE=disabled` or `offline`.)\n\nThe first run may take a while due to warming up.  After it finishes, you will see messages like:\n```\nlatency_expert/workload (seconds): 156.62 (113 queries)\n...\nwandb: latency_expert/workload 156.61727\n```\n\n**Next, to launch a Balsa experiment:**\n```bash\npython run.py --run Balsa_JOBRandSplit --local\n```\nThe first time this is run, _simulation data is collected_ for all training queries, which will finish in ~5 minutes (cached for future runs):\n```\n...\nI0323 04:15:48.212239 140382943663744 sim.py:738] Collection done, stats:\nI0323 04:15:48.212319 140382943663744 sim.py:742]   num_queries=94 num_collected_queries=77 num_points=516379 latency_s=309.3\nI0323 04:16:39.296590 140382943663744 sim.py:666] Saved simulation data (len 516379) to: data/sim-data-88bd801a.pkl\n```\nBalsa's simulation-to-reality approach requires first training an agent in simulation, then in real execution.  
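As noted above, the collected simulation data is cached for future runs; if you ever need to force re-collection (e.g., after changing the query split), simply remove the cached file. A quick check, assuming you are in the Balsa repo root:

```shell
# List any cached simulation data; the hash suffix varies per config.
ls data/sim-data-*.pkl 2>/dev/null || echo 'no cached simulation data yet'
# To force re-collection on the next run (slow!), remove the matching file, e.g.:
# rm data/sim-data-88bd801a.pkl
```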
To speed up the simulation phase, we have provided pretrained checkpoints for the simulation agent:\n```\n...\nI0323 04:18:26.856739 140382943663744 sim.py:985] Loaded pretrained checkpoint: checkpoints/sim-MinCardCost-rand52split-680secs.ckpt\n```\nThen, the agent will start the first iteration of real-execution learning: planning all training queries, sending them off for execution, and waiting for these plans to finish. Periodically, test queries are planned and executed for logging.\n\nHandy commands:\n- To kill the experiment(s): `pkill -f run.py`\n- To monitor a machine: `dstat -mcdn`\n\n## Experiment configs\n\nAll experiments and their hyperparameters are declared in [**`experiments.py`**](./experiments.py).\nTo run an experiment with the local Postgres execution engine:\n```bash\n# \u003cname\u003e is a config registered in experiments.py.\npython run.py --run \u003cname\u003e --local\n```\n\nMain Balsa agent:\n\n| Benchmark             | Config                            |\n|-----------------------|-----------------------------------|\n| JOB (Random Split)    | `Balsa_JOBRandSplit` |\n| JOB Slow (Slow Split) | `Balsa_JOBSlowSplit`        |\n\n\nAblation: impact of the simulator (Figure 10):\n\n| Variant    | Config                                         |\n|------------|------------------------------------------------|\n| Balsa Sim  | (main agent) `Balsa_JOBRandSplit` |\n| Expert Sim | `JOBRandSplit_PostgresSim`     |\n| No Sim     | `JOBRandSplit_NoSim`        |\n\n\u003e **_NOTE:_**  Running `JOBRandSplit_PostgresSim` for the first time will be slow (1.1 hours) due to simulation data being collected from `EXPLAIN`. 
This data is cached in `data/` for future runs.\n\nAblation: impact of the timeout mechanism (Figure 11):\n\n| Variant                | Config                                          |\n|------------------------|-------------------------------------------------|\n| Balsa (safe execution) | (main agent)  `Balsa_JOBRandSplit` |\n| no timeout             | `JOBRandSplit_NoTimeout`     |\n\n\nAblation: impact of exploration schemes (Figure 12):\n\n| Variant                  | Config                                            |\n|--------------------------|---------------------------------------------------|\n| Balsa (safe exploration) | (main agent)  `Balsa_JOBRandSplit`   |\n| epsilon-greedy           | `JOBRandSplit_EpsGreedy` |\n| no exploration           | `JOBRandSplit_NoExplore`        |\n\n\nAblation: impact of training schemes (Figure 13):\n\n| Variant           | Config                                          |\n|-------------------|-------------------------------------------------|\n| Balsa (on-policy) | (main agent)  `Balsa_JOBRandSplit` |\n| retrain           | `JOBRandSplit_RetrainScheme`    |\n\n\nComparison with learning from expert demonstrations (Neo-impl) (Figure 15):\n\n| Variant           | Config                                          |\n|-------------------|-------------------------------------------------|\n| Balsa  | (main agent)  `Balsa_JOBRandSplit` |\n| Neo-impl           | `NeoImpl_JOBRandSplit`    |\n\n\nDiversified experiences (Figure 16):\n\n| Variant           | Config                                          |\n|-------------------|-------------------------------------------------|\n| Balsa   |  (Main agents) JOB `Balsa_JOBRandSplit`; JOB Slow `Balsa_JOBSlowSplit` |\n| Balsa-8x (uses 8 main agents' data)        | JOB `Balsa_JOBRandSplitReplay`; JOB Slow `Balsa_JOBSlowSplitReplay` |\n\nGeneralizing to highly distinct join templates, Ext-JOB (Figure 17):\n\n| Variant           | Config                                          
|\n|-------------------|-------------------------------------------------|\n| Balsa, data collection agent   |  `Balsa_TrainJOB_TestExtJOB` |\n| Balsa-1x (uses 1 base agent's data)        |  `Balsa1x_TrainJOB_TestExtJOB`    |\n| Balsa-8x (uses 8 base agents' data)        |  `Balsa8x_TrainJOB_TestExtJOB`    |\n\n\u003e **_NOTE:_**  When running an Ext-JOB config for the first time, you may see the error `Missing nodes in init_experience`. This means `data/initial_policy_data.pkl` contains the expert latencies of all 113 JOB queries for printing (assuming you ran the previous configs first) but lacks the new Ext-JOB queries.  To fix, rename the previous `.pkl` file and rerun the new Ext-JOB config, which will automatically run the expert plans of the new query set to regenerate this file (as well as gather the simulation data).\n\nTo specify a new experiment, subclass an existing config (give the subclass a descriptive name), change the values of some hyperparameters, and register the new subclass.\n\n## Metrics and artifacts\nEach run's metrics and artifacts are logged to **its log dir** (path of the form `./wandb/run-20220323_051830-1dq64dx5/`), managed by [W\u0026B](https://wandb.ai/home). 
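For instance, you can locate the newest run's log dir from a shell (a sketch, assuming runs were launched from the repo root so that run dirs live under `./wandb/`):

```shell
# Pick the most recently modified run directory under ./wandb/.
latest=$(ls -dt wandb/run-*/ 2>/dev/null | head -n 1)
echo ${latest:-'no runs yet'}
# Metrics and artifacts live under its files/ subdirectory
# (e.g. params.txt, checkpoint.pt, best_plans/).
```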
We recommend creating an account and running `wandb login`, so that these are automatically logged to their UI for visualization.\n\n### Key metrics to look at\n  - Expert performance\n    - `latency_expert/workload`: total time of expert (PostgreSQL optimizer) plans on training queries, in seconds\n    - `latency_expert_test/workload`: for test queries\n  - Agent performance\n    - `latency/workload`: total time of Balsa's plans on training queries, in seconds\n    - `latency_test/workload`: for test queries\n  - Agent progress\n    - `curr_value_iter`: which iteration the agent is in\n    - `num_query_execs`: total number of unique query plans executed, for training queries\n    - `curr_iter_skipped_queries`: for the current iteration, number of training plans executed before and thus cached\n    - Wallclock duration is an x-axis choice in W\u0026B: `Relative Time (Wall)`\n    - `curr_timeout`: current iteration's timeout in seconds for any training plan; for Balsa's safe execution\n  - Learning efficiency: plot `latency/workload` vs. wallclock duration\n  - Data efficiency: plot `latency/workload` vs. `num_query_execs`\n\nOther (and less important) metrics are also available, such as `latency(_expert)(_test)/q\u003cquery number\u003e` for per-query latencies.\n\nExample visualization grouping by config `cls`:\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"assets/wandb-example.png\" width=\"885\"/\u003e\n\u003c/p\u003e\n\n\n### Artifacts\n\nTracking metrics is sufficient for running and monitoring experiments.  
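If you prefer the terminal to the W\u0026B UI, the headline metrics can also be grepped out of a run's console log (a sketch; the line format below matches the Quickstart output, and `Balsa_JOBRandSplit-1.log` is the per-run log written by `scripts/launch.sh` in cluster mode):

```shell
# Demo log in the same line format as the Quickstart output (values illustrative):
printf '%s\n' \
  'wandb: latency_expert/workload 156.61727' \
  'wandb: latency/workload 142.30100' \
  'wandb: curr_value_iter 12' | tee /tmp/balsa_demo.log
# Keep only the workload-level latency metrics:
grep -E 'latency(_expert)?(_test)?/workload' /tmp/balsa_demo.log
```

On a real run, point the same grep at the run's log, e.g. `grep -E 'latency(_expert)?(_test)?/workload' Balsa_JOBRandSplit-1.log`.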
For advanced use cases, we provide a few artifacts (saved locally and automatically uploaded to W\u0026B).\n\n**Agent checkpoints / the experience buffer so far** are periodically (every 5 iters) saved to:\n```\nSaved iter=124 checkpoint to: \u003cBalsa dir\u003e/wandb/run-20220323_051830-1dq64dx5/files/checkpoint.pt\nSaved Experience to: \u003cBalsa dir\u003e/data/replay-Balsa_JOBSlowSplit-9409execs-11844nodes-37s-124iters-1dq64dx5.pkl\n```\nThe experience buffers are useful for [training on diversified experiences](#qa) (see paper for details).\n\n**Best training plans so far**: under `\u003clogdir\u003e/files/best_plans/`\n- `*.sql`: a hinted version of each training query, where the hint is the best plan found so far; this can be piped into `psql` for re-execution\n  - `all.sql`: concatenates all hinted training queries\n- `latencies.txt`: best latency so far of each training query; the last row, \"all\", sums over all training queries\n\nThis means **Balsa can be used to find the best possible plans for a set of high-value queries**.\n\n**Hyperparameters of each run**: `\u003clogdir\u003e/files/params.txt`.  Hparams are also (1) printed out at the beginning of each run; and (2) captured in W\u0026B's Config table.  Typically, referring to each hparam config by its class name, such as `Balsa_JOBSlowSplit`, is sufficient, but if any change is made in `run.py#Main()`, the above would capture it.\n\n## Cluster mode\nWe recommend launching a multi-node Postgres cluster to support multiple agent runs in parallel.\nThis is optional, but highly recommended: Agent runs can have some variance and thus it's valuable to measure the median and variance of agent performance across several runs of the same config.\n\nInstructions below use [Ray](https://ray.io/) to launch a cluster on AWS; refer to the [Ray documentation](https://docs.ray.io/en/latest/cluster/cloud.html) for credentials setup. 
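On AWS, credentials setup typically just means having credentials readable by the standard boto3 chain; a minimal sketch (the key values are placeholders, not real credentials):

```shell
# Ray's AWS autoscaler uses the standard boto3 credential chain.
# Skip this if you already have credentials configured; values are placeholders.
mkdir -p ~/.aws
if test ! -f ~/.aws/credentials; then
  printf '[default]\naws_access_key_id = YOUR_ACCESS_KEY_ID\naws_secret_access_key = YOUR_SECRET_ACCESS_KEY\n' | tee ~/.aws/credentials
fi
# The AWS region is set in the cluster config (typically the provider section
# of balsa/cluster/cluster.yml) or in ~/.aws/config.
```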
See the same docs for launching on other clouds or an on-premise cluster.\n\n\u003e **_NOTE:_**  The AWS instance types used below may not have exactly the same performance as the Azure cluster reported in our paper. The overall magnitude/trends, however, should be the same. See [Cluster hardware used in paper](#cluster-hardware-used-in-paper).\n\n### Launching a cluster\nLaunch the cluster from your local machine and Ray will handle installing all necessary dependencies and configuring both the GPU driver node (cluster head) and the Postgres execution nodes (cluster workers).\n\nOn your laptop, edit the cluster configuration file such that the number of nodes and file mounts are correctly configured; see all `NOTE` comments. By default it uses a `g4dn.12xlarge` head node (48 vCPUs, 192GB RAM, 4 T4 GPUs) and 20 `r5a.2xlarge` worker nodes (8 vCPUs, 64GB RAM, 256GB SSD per node).\n```bash\ncd ~/balsa\nvim balsa/cluster/cluster.yml\n```\n\nStart the cluster.\n```bash\nray up balsa/cluster/cluster.yml\n```\n\nLog in to the head node.\n```bash\nray attach balsa/cluster/cluster.yml\n```\n\nOn the head node, check if the workers are ready. It can take 10-20 minutes while the workers are getting set up.\n```bash\n# Run this on the head node\nconda activate balsa\ncd ~/balsa/balsa/cluster/\npython check_cluster.py\n```\n\u003e **_NOTE:_**  To monitor the detailed status of worker launching and setup, run on your laptop: `ray exec cluster.yml 'tail -n 100 -f /tmp/ray/session_latest/logs/monitor*'`.\n\nBefore warming up the Postgres worker nodes, you'd need to log onto AWS to add one rule in the security group.\n```\n1. Click any worker node's \"Instance ID\" on AWS EC2 web portal\n2. Click \"Security\"\n3. Click the security group \"sg-xxxx (ray-autoscaler-\u003ccluster name\u003e)\"\n4. Click \"Edit inbound rules\"\n5. Add rule -\u003e \"All TCP\" and \"Anywhere-IPv4\"\n6. 
Click \"Save rules\"\n```\n\nWarm up the Postgres worker nodes:\n```bash\n# Run this on the head node\ncd ~/balsa/balsa/cluster\npython warmup_pg_servers.py\n```\n\n### Running experiments on the cluster\nLog in to the head node and run:\n```bash\nray attach balsa/cluster/cluster.yml\nwandb login\n```\n\n**Before launching parallel runs**, launch a single run first:\n```bash\npython run.py --run Baseline\npython run.py --run \u003cname\u003e\n```\nThis ensures that the appropriate simulation data is generated and cached and that `data/initial_policy_data.pkl` is correct for the query split. After you've observed it run successfully to the first iteration of the real-execution stage (`Planning took ...`), safely kill it with `pkill -f run.py`. Then, launch multiple runs:\n```bash\n# Run this on the head node\n# Usage: bash scripts/launch.sh \u003cname\u003e \u003cN\u003e\nbash scripts/launch.sh Balsa_JOBRandSplit 8\n\n# To monitor:\n#   (1) W\u0026B\n#   (2) tail -f Balsa_JOBRandSplit-1.log\n```\n\n### Editing code\nYou can edit Balsa code on your laptop under the working directory path specified under `file_mounts` in the `balsa/cluster/cluster.yml` file. Push updated code to the cluster with the following (this will interrupt ongoing runs):\n```bash\nray up balsa/cluster/cluster.yml --restart-only\n```\nNow you can log in to the head node and launch new experiments with the updated code.\n\n### Cluster hardware used in paper\nFor experiments reported in our paper, we ran 8 parallel agent runs on the head node sending queries to 20 Postgres nodes (i.e., 2.5 concurrent queries per agent run). 
We used Ubuntu 18.04 virtual machines on Azure:\n  - head node (training/inference): a `Standard_NV24_Promo` VM\n    - 24 vCPUs, 224GB memory, 4 M60 GPUs; each agent run uses 0.5 GPU\n  - query execution nodes (the target hardware environment to optimize for): 20 `Standard_E8as_v4` VMs\n    - 8 vCPUs, 64GB memory, 128GB SSD (Premium SSD LRS) per VM\n\nThis is different from what is included in the cluster template above (AWS). To exactly reproduce, change the cluster template to launch the above or manually use the Azure portal to launch a VMSS.\n\nIf your cluster config differs in size or hardware, the numbers obtained may be different and the provided [Postgres config](./conf/balsa-postgresql.conf) may need to be adjusted. The overall trends and magnitudes should be similar.\n\n## Q\u0026A\n\n**How to train a Balsa agent with diversified experiences**\n\nOnce 8 main agent runs have finished, their experience buffers will be saved to, say, `./data/replay-Balsa_JOBRandSplit-*-499iters-*.pkl`.\nThen, launch `Balsa_JOBRandSplitReplay` to load all experience buffers that match `replay-Balsa_JOBRandSplit-*` (specified by the glob pattern `p.prev_replay_buffers_glob` in `experiments.py`) to train a Balsa-8x agent.\n```bash\npython run.py --run Balsa_JOBRandSplitReplay\n# Or:\nbash scripts/launch.sh Balsa_JOBRandSplitReplay 8\n```\nIf you have more buffers that match the glob pattern, be sure to change `p.prev_replay_buffers_glob` to specify the desired buffers.\n\nWe also recommend using one experience buffer as a hold-out validation set by specifying `p.prev_replay_buffers_glob_val` for more stable test performance.\nNote that in this case the buffers used for training and validation need to be put in separate directories and the glob patterns also need to be changed accordingly.\n\nOther diversified experience configs are: `Balsa_JOBSlowSplitReplay`, `Balsa1x_TrainJOB_TestExtJOB`, and `Balsa8x_TrainJOB_TestExtJOB`.\n\n**My experiment failed with a \"Hint not 
respected\" error**\n\nThis occasionally happens when Postgres fails to respect the query plan hint generated by Balsa. This is a fundamental limitation of the `pg_hint_plan` extension. We suggest launching new runs to replace the failed ones.\n\n\n## Citation\n```bibtex\n@inproceedings{balsa,\n  author =       {Yang, Zongheng and Chiang, Wei-Lin and Luan, Sifei and Mittal,\n                  Gautam and Luo, Michael and Stoica, Ion},\n  title =        {Balsa: Learning a Query Optimizer Without Expert\n                  Demonstrations},\n  booktitle =    {Proceedings of the 2022 International Conference on Management\n                  of Data},\n  year =         2022,\n  pages =        {931--944},\n  doi =          {10.1145/3514221.3517885},\n  url =          {https://doi.org/10.1145/3514221.3517885},\n  address =      {New York, NY, USA},\n  isbn =         9781450392495,\n  keywords =     {machine learning for systems, learned query optimization},\n  location =     {Philadelphia, PA, USA},\n  numpages =     14,\n  publisher =    {Association for Computing Machinery},\n  series =       {SIGMOD/PODS '22},\n}\n\n@misc{balsa_github,\n  howpublished = {\\url{https://github.com/balsa-project/balsa}},\n  title =        {Balsa source code},\n  year =         2022,\n}\n```\n","funding_links":[],"categories":["Python","Models and Projects"],"sub_categories":["Ray + Database"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbalsa-project%2Fbalsa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbalsa-project%2Fbalsa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbalsa-project%2Fbalsa/lists"}