{"id":33057189,"url":"https://github.com/pixelsdb/pixels","last_synced_at":"2026-05-03T04:04:01.703Z","repository":{"id":37693108,"uuid":"193060499","full_name":"pixelsdb/pixels","owner":"pixelsdb","description":"An efficient storage and compute engine for both on-prem and cloud-native data analytics.","archived":false,"fork":false,"pushed_at":"2026-04-22T05:32:36.000Z","size":141105,"stargazers_count":908,"open_issues_count":32,"forks_count":157,"subscribers_count":14,"default_branch":"master","last_synced_at":"2026-04-22T07:38:49.151Z","etag":null,"topics":["cloud-database","column-store","data-lake","data-warehouse","database","olap"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pixelsdb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2019-06-21T08:24:25.000Z","updated_at":"2026-04-20T15:48:59.000Z","dependencies_parsed_at":"2024-09-17T11:54:31.384Z","dependency_job_id":"94a23b85-fe26-4121-8b8c-064941ddf371","html_url":"https://github.com/pixelsdb/pixels","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/pixelsdb/pixels","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelsdb%2Fpixels","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelsdb%2Fpixels/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelsdb
%2Fpixels/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelsdb%2Fpixels/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pixelsdb","download_url":"https://codeload.github.com/pixelsdb/pixels/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelsdb%2Fpixels/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32365519,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-27T20:07:02.737Z","status":"online","status_checked_at":"2026-04-28T02:00:07.250Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloud-database","column-store","data-lake","data-warehouse","database","olap"],"created_at":"2025-11-14T04:02:43.206Z","updated_at":"2026-04-28T04:01:16.596Z","avatar_url":"https://github.com/pixelsdb.png","language":"Java","funding_links":[],"categories":["大数据"],"sub_categories":[],"readme":"Pixels\n=======\n[![Pixels Daily Build](https://github.com/pixelsdb/pixels/actions/workflows/daily-build.yml/badge.svg)](https://github.com/pixelsdb/pixels/releases/tag/daily-latest)\n![GitHub commits](https://img.shields.io/github/commit-activity/m/pixelsdb/pixels/master)\n[![GitHub License](https://img.shields.io/github/license/pixelsdb/pixels)](https://github.com/pixelsdb/pixels/blob/master/LICENSE)\n[![Ask 
DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/pixelsdb/pixels)\n\nThe core of Pixels is a columnar storage engine designed for data lakes and warehouses.\nIt is optimized for analytical tables stored in on-premises and cloud-native storage systems,\nincluding S3, GCS, HDFS, Redis, HTTP (Netty), and local file systems.\nPixels outperforms Parquet, which is the most widely used columnar format in today's lakehouses, by up to two orders of magnitude.\n\nWe have integrated Pixels with popular query engines including DuckDB (1.3.0), Trino (405 and 466), StarRocks (3.3.5), PrestoDB (0.279), and Hive (2.3+).\n\nThe DuckDB integration and the C++ implementation of Pixels are in the [cpp](cpp) folder.\nThe other integrations are open-sourced in separate repositories:\n* [Pixels Connector for Trino](https://github.com/pixelsdb/pixels-trino)\n* [Pixels Connector for PrestoDB](https://github.com/pixelsdb/pixels-presto)\n* [StarRocks with Pixels Integration](https://github.com/pixelsdb/starrocks)\n* [Pixels SerDe for Hive](https://github.com/pixelsdb/pixels-hive)\n\nPixels also has its own query engine [Pixels-Turbo](pixels-turbo).\nIt prioritizes processing queries in an autoscaling MPP cluster (currently based on Trino) and exploits serverless functions \n(e.g., [AWS Lambda](https://aws.amazon.com/lambda/), [vHive / Knative](https://github.com/vhive-serverless/vHive), and [Spike](https://github.com/pixelsdb/pixels-spike)) \nto accelerate the processing of workload spikes. With `Pixels-Turbo`, we can achieve better performance and cost-efficiency \nfor continuous workloads while not compromising elasticity for workload spikes.\n\nBased on Pixels-Turbo, we implement [Pixels-Rover](https://github.com/pixelsdb/pixels-rover), a web-based query interface\nthat provides users with a complete experience of serverless query processing, natural-language-to-SQL translation, and flexible\nservice levels in query urgency. 
It allows users to select whether to execute the query immediately, within a grace period, or eventually.\nPixels-Turbo can apply different resource scheduling and query execution policies for different levels of query urgency, which\nwill result in different monetary costs on resources.\n\nFurthermore, Pixels has a real-time data synchronization framework, namely [Pixels-Retina](pixels-retina).\nIt replays data-change operations from log-based CDC sources as mirror transactions on the columnar table data,\nusing a lightweight MVCC mechanism to support concurrent analytical queries with 10-ms-level data freshness, significantly\noutperforming the batch-granular merge-on-read approach used by existing lakehouses such as Apache Iceberg and Paimon.\n\n## Build Pixels\n\nPixels is implemented mainly in Java (with some JNI hooks of system calls and C/C++ libs) and C++.\nThe [C++ document](cpp/README.md) provides the instructions to build and run the C++ codebase. Here we explain how to build and use the Java components.\n\nJDK 8 (or above) and Maven 3.8 (or above) are required to build Pixels.\nEarlier Maven versions may work but are not tested.\nAfter installing these prerequisites, enter any `SRC_BASE` directory, clone the Pixels codebase and build it as follows:\n```bash\ngit clone https://github.com/pixelsdb/pixels.git\ncd pixels\n# ensure PIXELS_HOME environment variable is set to the installation directory of pixels (not SRC_BASE).\nexport PIXELS_HOME=[pixels-install-dir]\nmvn clean install\n```\n\nIt may take a couple of minutes to complete. 
After that, the library jars of Pixels have been installed to the local Maven repository.\nPlease also find the executable jar files of Pixels:\n* `pixels-daemon-*-full.jar` in `pixels-daemon/target`, the jar used to run Pixels daemons.\n* `pixels-cli-*-full.jar` in `pixels-cli/target`, the jar of the Pixels command line tool.\n\nThey will be used in the installation of Pixels.\n\n\u003e Note: Some JUnit tests in Pixels access some low-level packages in the JDK, such as sun.nio and java.nio.\n\u003e Compiling and running such test cases requires a lower-version JDK (e.g., 1.8). However, these tests are not necessary for the aforementioned build process.\n\nPixels is compatible with different query engines, such as Trino, Presto, and Hive.\nThe query engine integrations can also be built using Maven.\nFor example, to build the Trino integration for Pixels, just git clone [pixels-trino](https://github.com/pixelsdb/pixels-trino), \nand build it using `mvn package` in the local git repository.\n\n\u003e Pixels by itself is compatible with Java 8+ and Maven 3.8+. However, third-party query engines such as Trino may require\n\u003e a later JDK (e.g., Trino 405/466 requires JDK 17.0.3+/23.0.0+) and Maven.\n\u003e It is fine to build the query engine integration (e.g., `pixels-trino`) with the same or higher versions of JDK and Maven than Pixels.\n\n### Daily Build Release\n\nIf you want to try the latest daily build of Pixels without building from source, you can download it from the automated [daily releases](https://github.com/pixelsdb/pixels/releases/tag/daily-latest).\n\n- The daily build includes the latest changes from the repository.  \n- Suitable for testing and early feedback, not recommended for production use.  
\n- Contains pre-built `pixels-daemon` and `pixels-cli` jar files, ready to use.\n\n\n## Develop Pixels in IntelliJ\n\nIf you want to develop Pixels in IntelliJ, open `SRC_BASE/pixels` as a Maven project.\nWhen the project is fully indexed and the dependencies are successfully downloaded, \nyou can build Pixels using the Maven plugin (as an alternative to `mvn package`), run and debug unit tests, and debug Pixels by\nsetting up a *Remote JVM Debug*. \nEnsure the environment variable `PIXELS_HOME` is set to the installation directory of Pixels for the Maven plugin and the run/debug targets in IntelliJ.\n\nIn some versions of IntelliJ, the default `idea.max.intellisense.filesize` may not be large enough for the source files generated by ProtoBuf.\nHence, a large generated source file is treated as a plain text file in the user interface.\nTo solve this problem, set this property to `4096` (i.e., 4MB) or larger in `Help` -\u003e `Edit Custom Properties...` and restart IntelliJ.\n\n\u003e To use the Maven plugin, run/debug the unit tests, or run/debug the main classes of Pixels in IntelliJ, set the `PIXELS_HOME` environment\n\u003e variable for `Maven`, `JUnit`, or `Application` in `Run` -\u003e `Edit Configurations` -\u003e `Edit Configuration Templates`.\n\u003e Ensure that the `PIXELS_HOME` directory exists and follow the instructions in [Install Pixels](docs/INSTALL.md#install-pixels) to put\n\u003e the `pixels.properties` into `PIXELS_HOME/etc` and create the `logs` directory where the log files will be\n\u003e written.\n\n\n## Deploy and Evaluate Pixels\n\nYou can follow the [Installation](docs/INSTALL.md) instructions to deploy Pixels in a cluster,\nand learn how to use Pixels and evaluate its performance following [TPC-H Evaluation](docs/TPC-H.md) or [ClickBench Evaluation](docs/CLICKBENCH.md).\n\n\n## Contributing\n\nWe welcome contributions to Pixels and its subprojects. 
If you are interested in contributing to Pixels, \nplease read our [Git Workflow](https://github.com/pixelsdb/pixels/wiki/Git-Workflow).\n\n\n## Publications\n\nPixels is an academic system that aims to provide production-grade quality. It supports all the functionalities required by TPC-H and\nis compatible with the mainstream data analytic ecosystems.\nThe key ideas and insights in Pixels are elaborated in the following publications.\n\n\u003e `ICDE'25` [PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices](https://arxiv.org/abs/2405.19784)\\\n\u003e Haoqiong Bian, Dongyang Geng, Haoyang Li, Yunpeng Chai, Anastasia Ailamaki\n\n\u003e `arXiv'24` [Serverless Query Processing with Flexible Performance SLAs and Prices](https://arxiv.org/abs/2409.01388)\\\n\u003e Haoqiong Bian, Dongyang Geng, Yunpeng Chai, Anastasia Ailamaki\n\n\u003e `SIGMOD'23` [Using Cloud Functions as Accelerator for Elastic Data Analytics](https://doi.org/10.1145/3589306)\\\n\u003e Haoqiong Bian, Tiannan Sha, Anastasia Ailamaki\n\n\u003e `EDBT'22` [Columnar Storage Optimization and Caching for Data Lakes (short)](https://doi.org/10.48786/edbt.2022.33)\\\n\u003e Guodong Jin, Haoqiong Bian, Yueguo Chen, Xiaoyong Du\n\n\u003e `ICDE'22` [Pixels: An Efficient Column Store for Cloud Data Lakes](https://doi.org/10.1109/ICDE53745.2022.00276)\\\n\u003e Haoqiong Bian, Anastasia Ailamaki\n\n\u003e `CIDR'20` [Pixels: Multiversion Wide Table Store for Data Lakes (abstract)](https://www.cidrdb.org/cidr2020/gongshow2020/gongshow/abstracts/cidr2020_abstract74.pdf)\\\n\u003e Haoqiong Bian\n\n\u003e `ICDE'18` [Rainbow: Adaptive Layout Optimization for Wide Tables (demo)](https://doi.org/10.1109/ICDE.2018.00200)\\\n\u003e Haoqiong Bian, Youxian Tao, Guodong Jin, Yueguo Chen, Xiongpai Qin, Xiaoyong Du\n\n\u003e `SIGMOD'17` [Wide Table Layout Optimization by Column Ordering and Duplication](https://doi.org/10.1145/3035918.3035930)\\\n\u003e Haoqiong Bian, Ying Yan, Wenbo Tao, Liang 
Jeff Chen, Yueguo Chen, Xiaoyong Du, Thomas Moscibroda\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpixelsdb%2Fpixels","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpixelsdb%2Fpixels","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpixelsdb%2Fpixels/lists"}