{"id":42894684,"url":"https://github.com/mitdbg/aurum-datadiscovery","last_synced_at":"2026-01-30T15:03:57.445Z","repository":{"id":12264875,"uuid":"57212663","full_name":"mitdbg/aurum-datadiscovery","owner":"mitdbg","description":null,"archived":false,"fork":false,"pushed_at":"2023-03-06T05:16:32.000Z","size":45632,"stargazers_count":75,"open_issues_count":69,"forks_count":49,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-02-21T18:35:06.963Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mitdbg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-04-27T12:45:00.000Z","updated_at":"2024-12-20T15:44:29.000Z","dependencies_parsed_at":"2023-01-11T20:17:47.472Z","dependency_job_id":"cab151b0-c8cd-412d-9cba-795d41023f05","html_url":"https://github.com/mitdbg/aurum-datadiscovery","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/mitdbg/aurum-datadiscovery","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitdbg%2Faurum-datadiscovery","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitdbg%2Faurum-datadiscovery/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitdbg%2Faurum-datadiscovery/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitdbg%2Faurum-datadiscovery/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mitdbg","download_url":"https://codeload.github.com/
mitdbg/aurum-datadiscovery/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitdbg%2Faurum-datadiscovery/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28914896,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-30T12:13:43.263Z","status":"ssl_error","status_checked_at":"2026-01-30T12:13:22.389Z","response_time":66,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-30T15:03:57.344Z","updated_at":"2026-01-30T15:03:57.430Z","avatar_url":"https://github.com/mitdbg.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Aurum: Discovering Data in Lakes, Clouds and Databases\n\nWebpage version of this documentation: [http://mitdbg.github.io/aurum-datadiscovery/](http://mitdbg.github.io/aurum-datadiscovery/)\n\nAurum helps users identify relevant content among multiple data\nsources that may consist of tabular files, such as CSV, and relational tables.\nThese may be stored in relational database management systems (RDBMS), file\nsystems, and they may live in cloud services, data lakes or other on-premise\nrepositories.\n\nAurum helps you find data through different interfaces. The most flexible one is\nan API of primitives that can be composed to build queries that describe the\ndata of interest. 
For example, you can write a query that says \"find tables that\ncontain a column with name 'ID' and have at least one column that looks like\nan input column\". You can also query with very simple primitives, such as \"find\ncolumns that contain the keyword 'caffeine'\". You can also do more complex\nqueries, such as figuring out what tables join with a table of interest. The\nidea is that the API is flexible enough to allow a wide range of use cases, and\nthat it works over all data you feed to the system, regardless of where it lives.\n\n* [**Why do I need Aurum?**](docs/why_aurum.md) We show you various scenarios in which Aurum has proven useful.\n\n* [**Design Rationale**](docs/design_rationale.md) A brief explanation of the system architecture and \ndesign rationale.\n\n* [**Quick Start**](docs/quick_start.md) A guide to set up Aurum and start running some discovery queries.\n\n* [**Tutorial**](docs/tutorial.md) A tutorial that walks you through the different aspects of Aurum, from how \nto write queries using the discovery API, to how to create new connectors to read data from different \ndata sources, to how to store data in different stores.\n\n* [**FAQ**](docs/faq.md) A collection of frequently asked questions.\n\nAurum is a work in progress; we expect to release its first open-source version in the 4th quarter of 2018.\nWe are happy to accept contributions from the community. 
If you are interested in contributing, take a look at\nthe [CONTRIBUTING](../CONTRIBUTING.md) guide and feel free to email raulcf@csail.mit.edu.\nWe also have a code of conduct:\n\n### Code of Conduct\n\nCheck the code of conduct for Aurum here: \n\nhttps://github.com/mitdbg/aurum-datadiscovery/blob/master/CODE_OF_CONDUCT.md\n\nPlease report violations of the code of conduct by sending an email to\nraulcf@csail.mit.edu","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmitdbg%2Faurum-datadiscovery","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmitdbg%2Faurum-datadiscovery","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmitdbg%2Faurum-datadiscovery/lists"}