{"id":20676820,"url":"https://github.com/jfeser/imputedb","last_synced_at":"2026-05-17T00:33:26.980Z","repository":{"id":74018277,"uuid":"70736433","full_name":"jfeser/ImputeDB","owner":"jfeser","description":"A database with automatic imputation of missing values.","archived":false,"fork":false,"pushed_at":"2017-08-26T18:07:41.000Z","size":97497,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-01-17T14:56:50.192Z","etag":null,"topics":["database","imputation"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jfeser.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-10-12T19:58:27.000Z","updated_at":"2017-06-29T21:50:18.000Z","dependencies_parsed_at":"2023-06-01T10:30:49.463Z","dependency_job_id":null,"html_url":"https://github.com/jfeser/ImputeDB","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jfeser%2FImputeDB","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jfeser%2FImputeDB/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jfeser%2FImputeDB/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jfeser%2FImputeDB/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jfeser","download_url":"https://codeload.github.com/jfeser/ImputeDB/tar.gz/refs/heads/master","h
ost":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242906817,"owners_count":20204902,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","imputation"],"created_at":"2024-11-16T21:13:45.642Z","updated_at":"2025-10-05T19:19:52.804Z","avatar_url":"https://github.com/jfeser.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ImputeDB [![Build Status](https://travis-ci.org/jfeser/ImputeDB.svg?branch=master)](https://travis-ci.org/jfeser/ImputeDB)\n\nImputeDB is a SQL database which automatically imputes missing data on-the-fly.\nUsers can issue SQL queries over data with NULL values and ImputeDB will use a regression model to fill in the missing values during the execution of the query.\nDesigned to enable exploratory analysis of survey data, ImputeDB removes the cost of performing imputation manually, allowing users to get a quick and accurate view of their data.\n\n# Building and Running #\n\nTo build ImputeDB, run:\n\n``` shell\ncd simpledb; ant\n```\n\nTo create a database from the demo collection of CSV files, run:\n\n``` shell\n./imputedb load --db demo.db demo_data/*\n```\n\nThis creates three tables:\n\n* `demo`: demographics data from the CDC\n* `labs`: laboratory data from the CDC\n* `exams`: physical examination data from the CDC\n\nand places their serialized representations in the `demo.db` folder,\nalong with a catalog describing the table schemas.\n\nThen, to query the database, run:\n\n``` shell\n./imputedb query --db demo.db\n```\n\nThis launches the ImputeDB interpreter with an 
`--alpha 0.0` parameter\nas a default, which\nmeans ImputeDB will optimize for data quality. You can modify this\nby calling the interpreter with the `--alpha \u003cdouble\u003e` option.\n\nFor example,\n\n``` shell\n./imputedb query --db demo.db --alpha 1.0\n```\n\nlaunches an interpreter that optimizes for query execution speed.\n\n# Experiments #\n\n1. Build the Docker container for the experiments.\n\n``` shell\ncd simpledb/test/experiments\nmake build\n```\n\n2. Run the experiments.\n\nTODO.\n\n# Publications #\n\n**Query Optimization for Dynamic Imputation**. José Cambronero\\*, John K. Feser\\*, Micah J. Smith\\*, Samuel Madden. VLDB. (2017) To appear. [\u003ca href=\"http://people.csail.mit.edu/feser/imputedb.pdf\"\u003epdf\u003c/a\u003e]\n\n*Authors contributed equally to this paper.\n\n# License #\n\n[MIT](https://opensource.org/licenses/MIT)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjfeser%2Fimputedb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjfeser%2Fimputedb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjfeser%2Fimputedb/lists"}