{"id":20337327,"url":"https://github.com/expediadotcom/haystack-tables","last_synced_at":"2026-05-11T09:54:07.530Z","repository":{"id":70574273,"uuid":"171510040","full_name":"ExpediaDotCom/haystack-tables","owner":"ExpediaDotCom","description":"This is an EXPERIMENTAL project - not ready for production use.","archived":false,"fork":false,"pushed_at":"2019-03-03T10:49:47.000Z","size":133,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-04T13:47:21.268Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ExpediaDotCom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-19T16:34:29.000Z","updated_at":"2019-12-17T07:43:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"93613eae-051a-459c-8282-14b53961815f","html_url":"https://github.com/ExpediaDotCom/haystack-tables","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ExpediaDotCom/haystack-tables","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ExpediaDotCom%2Fhaystack-tables","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ExpediaDotCom%2Fhaystack-tables/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ExpediaDotCom%2Fhaystack-tables/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ExpediaDotCom%2Fhaystack-tables/manifests","owner_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ExpediaDotCom","download_url":"https://codeload.github.com/ExpediaDotCom/haystack-tables/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ExpediaDotCom%2Fhaystack-tables/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32889971,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-10T13:40:02.631Z","status":"online","status_checked_at":"2026-05-11T02:00:05.975Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T21:08:39.138Z","updated_at":"2026-05-11T09:54:07.499Z","avatar_url":"https://github.com/ExpediaDotCom.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Architecture\n\n![Architecture](./docs/images/Haystack_Tables.png)\n\n\n### Getting Started?\nLaunch the table-allocator dropwizard app that exposes endpoint for creating and listing the views. \nThe allocator uses kubernetes for running parquet-writers by default. 
If you are using minikube, make sure it is running and that the current k8s context points to it.\n\n##### Create a new view:\n\n```\ncurl -XPOST -H \"Content-Type: application/json\" -d '\n{\n  \"view\": \"oms\",\n  \"select\": [\n    \"tags[errorcode]\",\n    \"operationname\"\n  ],\n  \"where\": {\n    \"servicename\": \"oms\"\n  }\n}' \"http://localhost:8080/view\"\n```\n\n\n##### List all views:\n\n```\ncurl \"http://localhost:8080/views\"\n\nResponse:\n\n[\n  {\n    \"createTimestamp\": \"2019-03-03T10:17:50.000Z\",\n    \"lastUpdatedTimestamp\": \"2019-03-03T10:17:50.866Z\",\n    \"query\": {\n      \"view\": \"oms-test\",\n      \"select\": [\n        \"tags[errorcode]\",\n        \"operationname\"\n      ],\n      \"where\": {\n        \"servicename\": \"oms\"\n      }\n    },\n    \"running\": true\n  }\n]\n```\n\n##### Delete a view:\n\n```\ncurl -XDELETE \"http://localhost:8080/view/oms\"\n```\n\n### S3 Data\nA parquet writer runs independently for each requested view. Each writer puts its parquet data under a configured bucket name with the following partitioning strategy:\n\n`s3://bucket-name/views/{view-name}/year=2019/month=02/day=03/hour=12/..`\n\nEach parquet file is named with the Kafka offset of the last record it contains.\n\n### Athena Tables\nThe allocator provides an endpoint `/athena/refresh` that takes the following actions for all running views:\n* Create a partitioned table in Athena under the haystack_tables database\n* Repair an already existing table to add new S3 partitions\n\nWe run a cron job that hits this endpoint every few minutes to keep the tables up to date.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexpediadotcom%2Fhaystack-tables","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fexpediadotcom%2Fhaystack-tables","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexpediadotcom%2Fhaystack-tables/lists"}