{"id":13417899,"url":"https://github.com/heavyai/heavydb","last_synced_at":"2025-12-12T00:34:05.460Z","repository":{"id":37630398,"uuid":"90541149","full_name":"heavyai/heavydb","owner":"heavyai","description":"HeavyDB (formerly OmniSciDB)","archived":false,"fork":false,"pushed_at":"2024-09-04T23:43:20.000Z","size":455270,"stargazers_count":2994,"open_issues_count":292,"forks_count":452,"subscribers_count":133,"default_branch":"master","last_synced_at":"2025-05-10T01:02:04.396Z","etag":null,"topics":["cuda","database","gpu","heavyai","interactive","llvm","machine-learning","mapd","olap","omnisci","real-time","sql","visualization"],"latest_commit_sha":null,"homepage":"https://heavy.ai","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/heavyai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-05-07T15:07:56.000Z","updated_at":"2025-05-08T12:59:18.000Z","dependencies_parsed_at":"2024-11-16T17:18:14.305Z","dependency_job_id":null,"html_url":"https://github.com/heavyai/heavydb","commit_stats":{"total_commits":10334,"total_committers":150,"mean_commits":68.89333333333333,"dds":0.732533384942907,"last_synced_commit":"a5dc49c757739d87f12baf8038ccfe4d1ece88ea"},"previous_names":["mapd/mapd-core","omnisci/omniscidb","omnisci/mapd-core"],"tags_count":67,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/heavyai%2Fheavydb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/heavyai%2Fheavydb/tags","releases_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/heavyai%2Fheavydb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/heavyai%2Fheavydb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/heavyai","download_url":"https://codeload.github.com/heavyai/heavydb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254040333,"owners_count":22004509,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","database","gpu","heavyai","interactive","llvm","machine-learning","mapd","olap","omnisci","real-time","sql","visualization"],"created_at":"2024-07-30T22:00:55.024Z","updated_at":"2025-12-12T00:34:05.399Z","avatar_url":"https://github.com/heavyai.png","language":"C++","funding_links":[],"categories":["TODO scan for Android support in followings","C++","\u003ca name=\"cpp\"\u003e\u003c/a\u003eC++","Table of Contents"],"sub_categories":["Mathematics and Science"],"readme":"HeavyDB (formerly OmniSciDB)\n==============================\n\nHeavyDB is an open source SQL-based, relational, columnar database engine that leverages the full performance and parallelism of modern hardware (both CPUs and GPUs) to enable querying of multi-billion row datasets in milliseconds, without the need for indexing, pre-aggregation, or downsampling.  HeavyDB can be run on hybrid CPU/GPU systems (Nvidia GPUs are currently supported), as well as on CPU-only systems featuring X86, Power, and ARM (experimental support) architectures. 
To achieve maximum performance, HeavyDB features multi-tiered caching of data between storage, CPU memory, and GPU memory, and an innovative Just-In-Time (JIT) query compilation framework.\n\nFor usage info, see the [product documentation](https://docs.heavy.ai/), and for more details about the system's internal architecture, check out the [developer documentation](https://heavyai.github.io/heavydb/). Further technical discussion can be found on the [HEAVY.AI Community Forum](https://community.heavy.ai).\n\nThe repository includes a number of third party packages provided under separate licenses. Details about these packages and their respective licenses are at [ThirdParty/licenses/index.md](ThirdParty/licenses/index.md).\n\n# Downloads and Installation Instructions\n\nHEAVY.AI provides pre-built binaries for Linux for stable releases of the project:\n\n| Distro | Package type | CPU/GPU | Repository | Docs |\n| --- | --- | --- | --- | --- |\n| CentOS | RPM | CPU | https://releases.heavy.ai/os/yum/stable/cpu |  https://docs.heavy.ai/installation-and-configuration/installation/installing-on-centos/centos-yum-gpu-ee |\n| CentOS | RPM | GPU | https://releases.heavy.ai/os/yum/stable/cuda | https://docs.heavy.ai/installation-and-configuration/installation/installing-on-centos/centos-yum-gpu-ee |\n| Ubuntu | DEB | CPU | https://releases.heavy.ai/os/apt/dists/stable/cpu | https://docs.heavy.ai/installation-and-configuration/installation/installing-on-ubuntu/centos-yum-gpu-ee |\n| Ubuntu | DEB | GPU | https://releases.heavy.ai/os/apt/dists/stable/cuda | https://docs.heavy.ai/installation-and-configuration/installation/installing-on-ubuntu/centos-yum-gpu-ee |\n| * | tarball | CPU | https://releases.heavy.ai/os/tar/heavyai-os-latest-Linux-x86_64-cpu.tar.gz |  |\n| * | tarball | GPU | https://releases.heavy.ai/os/tar/heavyai-os-latest-Linux-x86_64.tar.gz |  |\n***\n\n# Developing HeavyDB: Table of Contents\n\n- [Links](#links)\n- [License](#license)\n- 
[Contributing](#contributing)\n- [Building](#building)\n- [Testing](#testing)\n- [Using](#using)\n- [Code Style](#code-style)\n- [Dependencies](#dependencies)\n- [Roadmap](ROADMAP.md)\n\n# Links\n\n- [Developer Documentation](https://heavyai.github.io/heavydb/)\n- [Doxygen-generated Documentation](http://doxygen.mapd.com/)\n- [Product Documentation](https://docs.heavy.ai/)\n- [Release Notes](https://docs.heavy.ai/overview/release-notes)\n- [Community Forum](https://community.heavy.ai)\n- [HEAVY.AI Homepage](https://www.heavy.ai)\n- [HEAVY.AI Blog](https://www.heavy.ai/blog/)\n- [HEAVY.AI Downloads](https://www.heavy.ai/platform/downloads/)\n\n# License\n\nThis project is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).\n\nThe repository includes a number of third party packages provided under separate licenses. Details about these packages and their respective licenses are at [ThirdParty/licenses/index.md](ThirdParty/licenses/index.md).\n\n# Contributing\n\nIn order to clarify the intellectual property license granted with Contributions from any person or entity, HEAVY.AI must have a Contributor License Agreement (\"CLA\") on file that has been signed by each Contributor, indicating agreement to the [Contributor License Agreement](CLA.txt). After you open a pull request, a bot will notify you if a signed CLA is required and provide instructions for how to sign it. 
Please read the agreement carefully before signing and keep a copy for your records.\n\n# Building\n\nIf this is your first time building HeavyDB, install the dependencies mentioned in the [Dependencies](#dependencies) section below.\n\nHeavyDB uses CMake for its build system.\n\n    mkdir build\n    cd build\n    cmake -DCMAKE_BUILD_TYPE=debug ..\n    make -j 4\n\nThe following `cmake`/`ccmake` options can enable/disable different features:\n\n- `-DCMAKE_BUILD_TYPE=release` - Build type and compiler options to use.\n                                 Options are `Debug`, `Release`, `RelWithDebInfo`, `MinSizeRel`, and unset.\n- `-DENABLE_ASAN=off` - Enable address sanitizer. Default is `off`.\n- `-DENABLE_AWS_S3=on` - Enable AWS S3 support, if available. Default is `on`.\n- `-DENABLE_CUDA=off` - Disable CUDA. Default is `on`.\n- `-DENABLE_CUDA_KERNEL_DEBUG=off` - Enable debugging symbols for CUDA kernels. Will dramatically reduce kernel performance. Default is `off`.\n- `-DENABLE_DECODERS_BOUNDS_CHECKING=off` - Enable bounds checking for column decoding. Default is `off`.\n- `-DENABLE_FOLLY=on` - Use Folly. Default is `on`.\n- `-DENABLE_IWYU=off` - Enable include-what-you-use. Default is `off`.\n- `-DENABLE_JIT_DEBUG=off` - Enable debugging symbols for the JIT. Default is `off`.\n- `-DENABLE_ONLY_ONE_ARCH=off` - Compile GPU code only for the host machine's architecture, speeding up compilation. Default is `off`.\n- `-DENABLE_PROFILER=off` - Enable google perftools. Default is `off`.\n- `-DENABLE_STANDALONE_CALCITE=off` - Require standalone Calcite server. Default is `off`.\n- `-DENABLE_TESTS=on` - Build unit tests. Default is `on`.\n- `-DENABLE_TSAN=off` - Enable thread sanitizer. Default is `off`.\n- `-DENABLE_CODE_COVERAGE=off` - Enable code coverage symbols (clang only). Default is `off`.\n- `-DPREFER_STATIC_LIBS=off` - Static link dependencies, if available. Default is `off`. 
Only works on CentOS.\n\n# Testing\n\nHeavyDB uses [Google Test](https://github.com/google/googletest) as its main testing framework. Tests reside under the [Tests](Tests) directory.\n\nThe `sanity_tests` target runs the most common tests. If using Makefiles to build, the tests may be run using:\n\n    make sanity_tests\n\n## AddressSanitizer\n\n[AddressSanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) can be activated by setting the `ENABLE_ASAN` CMake flag in a fresh build directory. At this time CUDA must also be disabled. In an empty build directory run CMake and compile:\n\n    mkdir build \u0026\u0026 cd build\n    cmake -DENABLE_ASAN=on -DENABLE_CUDA=off ..\n    make -j 4\n\nFinally run the tests:\n\n    export ASAN_OPTIONS=alloc_dealloc_mismatch=0:handle_segv=0\n    make sanity_tests\n\n## ThreadSanitizer\n\n[ThreadSanitizer](https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual) can be activated by setting the `ENABLE_TSAN` CMake flag in a fresh build directory. At this time CUDA must also be disabled. In an empty build directory run CMake and compile:\n\n    mkdir build \u0026\u0026 cd build\n    cmake -DENABLE_TSAN=on -DENABLE_CUDA=off ..\n    make -j 4\n\nWe use a TSAN suppressions file to ignore warnings in third party libraries. Source the suppressions file by adding it to your `TSAN_OPTIONS` env:\n\n    export TSAN_OPTIONS=\"suppressions=/path/to/heavydb/config/tsan.suppressions\"\n\nFinally run the tests:\n\n    make sanity_tests\n\n# Generating Packages\n\nHeavyDB uses [CPack](https://cmake.org/cmake/help/latest/manual/cpack.1.html) to generate packages for distribution. 
Packages generated on CentOS with static linking enabled can be used on most other recent Linux distributions.\n\nTo generate packages on CentOS (assuming you start from the top level of the heavydb repository):\n\n    mkdir build-package \u0026\u0026 cd build-package\n    cmake -DPREFER_STATIC_LIBS=on -DCMAKE_BUILD_TYPE=release ..\n    make -j 4\n    cpack -G TGZ\n\nThe first command creates a fresh build directory, to ensure there is nothing left over from a previous build.\n\nThe second command configures the build to prefer linking to the dependencies' static libraries instead of the (default) shared libraries, and to build using CMake's `release` configuration (enables compiler optimizations). Linking to the static versions of the libraries reduces the number of dependencies that must be installed on target systems.\n\nThe last command generates a `.tar.gz` package. The `TGZ` can be replaced with, for example, `RPM` or `DEB` to generate a `.rpm` or `.deb`, respectively.\n\n# Using\n\nThe [`startheavy`](startheavy) wrapper script may be used to start HeavyDB in a testing environment. This script performs the following tasks:\n\n- initializes the `data` storage directory via `initdb`, if required\n- starts the main HeavyDB server, `heavydb`\n- offers to download and import a sample dataset, using the `insert_sample_data` script\n\nAssuming you are in the `build` directory, and it is a subdirectory of the `heavydb` repository, `startheavy` may be run by:\n\n    ../startheavy\n\n## Starting Manually\n\nIt is assumed that the following commands are run from inside the `build` directory.\n\nInitialize the `data` storage directory. This command only needs to be run once.\n\n    mkdir data \u0026\u0026 ./bin/initdb data\n\nStart the HeavyDB server:\n\n    ./bin/heavydb\n\nIf desired, insert a sample dataset by running the `insert_sample_data` script in a new terminal:\n\n    ../insert_sample_data\n\nYou can now start using the database. 
The `heavysql` utility may be used to interact with the database from the command line:\n\n    ./bin/heavysql -p HyperInteractive\n\nwhere `HyperInteractive` is the default password. The default user `admin` is assumed if not provided.\n\n# Code Style\n\nContributed code should compile without generating warnings by recent compilers on most Linux distributions. Changes to the code should follow the [C++ Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).\n\n## clang-format\n\nA [`.clang-format`](https://clang.llvm.org/docs/ClangFormat.html) style configuration, based on the Chromium style guide, is provided at the top level of the repository. Please format your code using a recent version (8.0+ preferred) of ClangFormat before submitting.\n\nTo use:\n\n    clang-format -i File.cpp\n\n## clang-tidy\n\nA [`.clang-tidy`](https://clang.llvm.org/extra/clang-tidy/) configuration is provided at the top level of the repository. Please lint your code using a recent version (6.0+ preferred) of clang-tidy before submitting.\n\n`clang-tidy` requires all generated files to exist before running. The easiest way to accomplish this is to simply run a full build before running `clang-tidy`. A build target which runs `clang-tidy` is provided. To use:\n\n    make run-clang-tidy\n\nNote: `clang-tidy` may make invalid or overly verbose changes to the source code. It is recommended to first commit your changes, then run `clang-tidy` and review its recommended changes before amending them to your commit.\n\nNote: the `clang-tidy` target uses the `run-clang-tidy.py` script provided with LLVM, which may depend on `PyYAML`. 
The target also depends on `jq`, which is used to filter portions of the `compile_commands.json` file.\n\n# Dependencies\n\nHeavyDB has the following dependencies:\n\n| Package | Min Version | Required |\n| ------- | ----------- | -------- |\n| [CMake](https://cmake.org/) | 3.16 | yes |\n| [LLVM](http://llvm.org/) | 9.0 | yes |\n| [GCC](http://gcc.gnu.org/) | 8.4.0 | no, if building with clang |\n| [Go](https://golang.org/) | 1.12 | yes |\n| [Boost](http://www.boost.org/) | 1.72.0 | yes |\n| [OpenJDK](http://openjdk.java.net/) | 1.7 | yes |\n| [CUDA](http://nvidia.com/cuda) | 11.0 | yes, if compiling with GPU support |\n| [gperftools](https://github.com/gperftools/gperftools) | | yes |\n| [gdal](http://gdal.org/) | 2.4.2 | yes |\n| [Arrow](https://arrow.apache.org/) | 3.0.0 | yes |\n\n## CentOS 7\n\nHeavyDB requires a number of dependencies which are not provided in the common CentOS/RHEL package repositories. A prebuilt package containing all these dependencies is provided for CentOS 7 (x86_64).\n\nUse the [scripts/mapd-deps-prebuilt.sh](scripts/mapd-deps-prebuilt.sh) build script to install prebuilt dependencies.\n\nThese dependencies will be installed to a directory under `/usr/local/mapd-deps`. The `mapd-deps-prebuilt.sh` script also installs [Environment Modules](http://modules.sf.net) in order to simplify managing the required environment variables. Log out and log back in after running the `mapd-deps-prebuilt.sh` script in order to activate the Environment Modules command, `module`.\n\nThe `mapd-deps` environment module is disabled by default. To activate for your current session, run:\n\n    module load mapd-deps\n\nTo disable the `mapd-deps` module:\n\n    module unload mapd-deps\n\nWARNING: The `mapd-deps` package contains newer versions of packages such as GCC and ncurses which might not be compatible with the rest of your environment. 
Make sure to disable the `mapd-deps` module before compiling other packages.\n\nInstructions for installing CUDA are below.\n\n### CUDA\n\nIt is preferred, but not necessary, to install CUDA and the NVIDIA drivers from the .rpm packages, following the [instructions provided by NVIDIA](https://developer.nvidia.com/cuda-downloads). The `rpm (network)` method (preferred) will ensure you always have the latest stable drivers, while the `rpm (local)` method does not require Internet access.\n\nThe .rpm method requires DKMS to be installed, which is available from the [Extra Packages for Enterprise Linux](https://fedoraproject.org/wiki/EPEL) repository:\n\n    sudo yum install epel-release\n\nBe sure to reboot after installing in order to activate the NVIDIA drivers.\n\n### Environment Variables\n\nThe `mapd-deps-prebuilt.sh` script includes two files with the appropriate environment variables: `mapd-deps-\u003cdate\u003e.sh` (for sourcing from your shell config) and `mapd-deps-\u003cdate\u003e.modulefile` (for use with [Environment Modules](http://modules.sf.net), yum package `environment-modules`). These files are placed in the mapd-deps install directory, usually `/usr/local/mapd-deps/\u003cdate\u003e`. Either of these may be used to configure your environment: the `.sh` may be sourced in your shell config; the `.modulefile` needs to be moved to the modulespath.\n\n### Building Dependencies\n\nThe [scripts/mapd-deps-centos.sh](scripts/mapd-deps-centos.sh) script is used to build the dependencies. Modify this script and run if you would like to change dependency versions or to build on alternative CPU architectures.\n\n    cd scripts\n    module unload mapd-deps\n    ./mapd-deps-centos.sh --compress\n\n## macOS\n\n[scripts/mapd-deps-osx.sh](scripts/mapd-deps-osx.sh) is provided that will automatically install and/or update [Homebrew](http://brew.sh/) and use that to install all dependencies. 
Please make sure macOS is completely up to date and Xcode is installed before running. Xcode can be installed from the App Store.\n\n### CUDA\n\n`mapd-deps-osx.sh` will automatically install CUDA via Homebrew and add the correct environment variables to `~/.bash_profile`.\n\n### Java\n\n`mapd-deps-osx.sh` will automatically install Java and Maven via Homebrew and add the correct environment variables to `~/.bash_profile`.\n\n## Ubuntu\n\nMost build dependencies required by HeavyDB are available via APT. Certain dependencies such as Thrift, Blosc, and Folly must be built as they either do not exist in the default repositories or have outdated versions. A prebuilt package containing all these dependencies is provided for Ubuntu 18.04 (x86_64). The dependencies will be installed to `/usr/local/mapd-deps/` by default; see the Environment Variables section below for how to add these dependencies to your environment.\n\n### Ubuntu 16.04\n\nHeavyDB requires a newer version of Boost than the version which is provided by Ubuntu 16.04. The [scripts/mapd-deps-ubuntu1604.sh](scripts/mapd-deps-ubuntu1604.sh) build script will compile and install a newer version of Boost into the `/usr/local/mapd-deps/` directory.\n\n### Ubuntu 18.04\n\nUse the [scripts/mapd-deps-prebuilt.sh](scripts/mapd-deps-prebuilt.sh) build script to install prebuilt dependencies.\n\nThese dependencies will be installed to a directory under `/usr/local/mapd-deps`. The `mapd-deps-prebuilt.sh` script above will generate a script named `mapd-deps.sh` containing the environment variables which need to be set. Simply source this file in your current session (or symlink it to `/etc/profile.d/mapd-deps.sh`) in order to activate it:\n\n    source /usr/local/mapd-deps/mapd-deps.sh\n\n### Environment Variables\n\nThe CUDA and mapd-deps `lib` directories need to be added to `LD_LIBRARY_PATH`; the CUDA and mapd-deps `bin` directories need to be added to `PATH`. 
The `mapd-deps-ubuntu.sh` and `mapd-deps-prebuilt.sh` scripts will generate a script named `mapd-deps.sh` containing the environment variables which need to be set. Simply source this file in your current session (or symlink it to `/etc/profile.d/mapd-deps.sh`) in order to activate it:\n\n    source /usr/local/mapd-deps/mapd-deps.sh\n\n### CUDA\n\nRecent versions of Ubuntu provide the NVIDIA CUDA Toolkit and drivers in the standard repositories. To install:\n\n    sudo apt install -y \\\n        nvidia-cuda-toolkit\n\nBe sure to reboot after installing in order to activate the NVIDIA drivers.\n\n### Building Dependencies\n\nThe [scripts/mapd-deps-ubuntu.sh](scripts/mapd-deps-ubuntu.sh) and [scripts/mapd-deps-ubuntu1604.sh](scripts/mapd-deps-ubuntu1604.sh) scripts are used to build the dependencies for Ubuntu 18.04 and 16.04, respectively. The scripts will install all required dependencies (except CUDA) and build the dependencies which require it. Modify this script and run if you would like to change dependency versions or to build on alternative CPU architectures.\n\n    cd scripts\n    ./mapd-deps-ubuntu.sh --compress\n\n## Arch\n\n[scripts/mapd-deps-arch.sh](scripts/mapd-deps-arch.sh) is provided that will use [yay](https://aur.archlinux.org/packages/yay/) to install packages from the [Arch User Repository](https://wiki.archlinux.org/index.php/Arch_User_Repository) and custom PKGBUILD scripts for a few packages listed below. If you don't have `yay` yet, install it first: https://github.com/Jguer/yay#installation\n\n### Package Version Requirements:\n\n### CUDA\n\nCUDA and the NVIDIA drivers may be installed using the following.\n\n    yay -S \\\n        linux-headers \\\n        cuda \\\n        nvidia\n\nBe sure to reboot after installing in order to activate the NVIDIA drivers.\n\n### Environment Variables\n\nThe `cuda` package should set up the environment variables required to use CUDA. 
If you receive errors saying `nvcc` is not found, then CUDA `bin` directories need to be added to `PATH`: the easiest way to do so is by creating a new file named `/etc/profile.d/mapd-deps.sh` containing the following:\n\n    PATH=/opt/cuda/bin:$PATH\n    export PATH\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fheavyai%2Fheavydb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fheavyai%2Fheavydb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fheavyai%2Fheavydb/lists"}