Building Paimon C++#

System setup#

Paimon uses CMake as a build configuration system. We recommend building out-of-source. For example, you could create paimon-cpp/build-release and invoke cmake $CMAKE_ARGS .. from this directory.

Building requires:

  • A C++17-enabled compiler. On Linux, gcc 9 and higher should be sufficient. Windows and macOS are not currently supported.

  • At least 2GB of RAM for a minimal build, 8GB for a minimal debug build with tests and 16GB for a full build.

On Ubuntu/Debian you can install the requirements with:

sudo apt-get install \
     build-essential \
     cmake

We also provide a Docker template to help you get started quickly. See the .devcontainer folder for details.

Building#

All the instructions below assume that you have cloned the paimon-cpp git repository:

$ git clone https://github.com/alibaba/paimon-cpp.git
$ cd paimon-cpp

Manual configuration#

The build system uses CMAKE_BUILD_TYPE=Release by default, so if this argument is omitted then a release build will be produced.

Two build types are possible:

  • Debug: doesn’t apply any compiler optimizations and adds debugging information in the binary.

  • Release: applies compiler optimizations and removes debug information from the binary.

Note

You can also add the flag -DPAIMON_EXTRA_ERROR_CONTEXT=ON to a default build to include more context in error messages.

Minimal release build (2GB or more of RAM recommended for building):

$ mkdir build-release
$ cd build-release
$ cmake ..
$ make -j8       # if you have 8 CPU cores, otherwise adjust
$ make install

Minimal debug build with unit tests (4GB or more of RAM recommended for building):

$ mkdir build-debug
$ cd build-debug
$ cmake -DCMAKE_BUILD_TYPE=Debug -DPAIMON_BUILD_TESTS=ON ..
$ make -j8       # if you have 8 CPU cores, otherwise adjust
$ make unittest  # to run the tests
$ make install

The unit tests are not built by default; pass -DPAIMON_BUILD_TESTS=ON to enable them. After building, you can also run the unit tests with the ctest tool provided by CMake.
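As a rough sketch of the ctest route, the helper below runs the tests only when the build directory has been configured for testing. The function name run_unit_tests and the build-debug directory are illustrative choices, not part of the Paimon build system. The check relies on CMake writing a CTestTestfile.cmake file into build trees that have testing enabled.

```shell
# Hypothetical helper: run ctest in a build directory if tests were configured.
run_unit_tests() {
    build_dir="$1"
    if [ -f "$build_dir/CTestTestfile.cmake" ]; then
        # --output-on-failure prints the log of failing tests only;
        # -j8 runs up to 8 tests in parallel, adjust to your core count.
        (cd "$build_dir" && ctest --output-on-failure -j8)
    else
        echo "no CTestTestfile.cmake in $build_dir; configure with -DPAIMON_BUILD_TESTS=ON first"
        return 1
    fi
}

# Assumes the debug build directory from the example above.
run_unit_tests build-debug || true
```

Running ctest directly is mostly useful when you want to filter tests or rerun failures without going through make.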

Faster builds with Ninja#

Many contributors use the Ninja build system to get faster builds. It especially speeds up incremental builds. To use Ninja, pass -GNinja when calling cmake and then use the ninja command instead of make.
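One way to script the generator choice is sketched below: it prefers Ninja when the ninja binary is on the PATH and otherwise falls back to CMake's "Unix Makefiles" generator. The variable names GENERATOR and BUILD_TOOL are illustrative. The commands are printed rather than executed, so the snippet can be tried outside a source tree.

```shell
# Sketch: prefer Ninja when it is installed, otherwise fall back to make.
if command -v ninja >/dev/null 2>&1; then
    GENERATOR="Ninja"
    BUILD_TOOL="ninja"
else
    GENERATOR="Unix Makefiles"
    BUILD_TOOL="make"
fi

# Print the commands instead of running them; run them yourself from
# inside the build directory.
echo "cmake -G \"$GENERATOR\" .."
echo "$BUILD_TOOL"
```

The generator is fixed when a build directory is first configured, so switching between make and Ninja generally means starting from a fresh build directory.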

Optional Components#

By default, the C++ build system creates a fairly minimal build. We have several optional system components which you can opt into building by passing boolean flags to cmake.

  • -DPAIMON_ENABLE_ORC=ON: Paimon integration with Apache ORC

  • -DPAIMON_ENABLE_LANCE=ON: Paimon integration with Lance

  • -DPAIMON_ENABLE_AVRO=ON: Apache Avro libraries and Paimon integration

  • -DPAIMON_ENABLE_JINDO=ON: Support for Alibaba Jindo filesystems

  • -DPAIMON_ENABLE_LUMINA=ON: Support for Lumina vector index
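The flags above can be combined on a single cmake command line. The sketch below builds the argument string from a list of component names; the COMPONENTS selection (ORC and Avro) is only an example, and the command is printed rather than run.

```shell
# Example selection: enable ORC and Avro, leave the other components off.
COMPONENTS="ORC AVRO"

CMAKE_ARGS=""
for c in $COMPONENTS; do
    CMAKE_ARGS="$CMAKE_ARGS -DPAIMON_ENABLE_${c}=ON"
done
CMAKE_ARGS="${CMAKE_ARGS# }"   # drop the leading space

# Prints: cmake -DPAIMON_ENABLE_ORC=ON -DPAIMON_ENABLE_AVRO=ON ..
echo "cmake $CMAKE_ARGS .."
```

Run the printed command from inside your build directory, as in the build examples earlier on this page.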

Optional Targets#

For development builds, you will often want to enable additional targets in order to exercise your changes, using the following cmake options.

  • -DPAIMON_BUILD_TESTS=ON: Build executable unit tests.

Optional Checks#

The following special checks are available as well. They instrument the generated code in various ways so as to detect select classes of problems at runtime (for example when executing unit tests).

  • -DPAIMON_USE_ASAN=ON: Enable Address Sanitizer to check for memory leaks, buffer overflows or other kinds of memory management issues.

  • -DPAIMON_USE_TSAN=ON: Enable Thread Sanitizer to check for races in multi-threaded code.

  • -DPAIMON_USE_UBSAN=ON: Enable Undefined Behavior Sanitizer to check for situations which trigger C++ undefined behavior.

Some of these options are mutually incompatible, so you may have to build several times with different options if you want to exercise all of them.
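A simple way to keep such builds apart is one build directory per sanitizer. The sketch below prints a configure command for each; the directory names (build-asan and so on) and the function name print_sanitizer_builds are illustrative. It assumes you run the printed commands from the repository root and uses the -S/-B options available in the supported CMake versions.

```shell
# Sketch: print one configure command per sanitizer, each with its own
# build directory so the instrumented builds do not overwrite each other.
print_sanitizer_builds() {
    for san in ASAN TSAN UBSAN; do
        dir="build-$(printf '%s' "$san" | tr '[:upper:]' '[:lower:]')"
        echo "cmake -S . -B $dir -DCMAKE_BUILD_TYPE=Debug -DPAIMON_BUILD_TESTS=ON -DPAIMON_USE_${san}=ON"
    done
}

print_sanitizer_builds
```

Each directory can then be built and tested independently, for example with make followed by make unittest inside it.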

CMake version requirements#

We support CMake 3.16 and higher.
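To check an installed CMake against this minimum, something like the following can be used. It is a sketch that assumes GNU sort (for the -V version ordering) and the usual "cmake version X.Y.Z" first line of cmake --version; version_ge is a hypothetical helper name.

```shell
# version_ge A B: succeeds when version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

REQUIRED_CMAKE="3.16"
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` looks like "cmake version 3.22.1".
    FOUND_CMAKE="$(cmake --version | head -n1 | awk '{print $3}')"
    if version_ge "$FOUND_CMAKE" "$REQUIRED_CMAKE"; then
        echo "cmake $FOUND_CMAKE is new enough"
    else
        echo "cmake $FOUND_CMAKE is older than $REQUIRED_CMAKE"
    fi
else
    echo "cmake not found on PATH"
fi
```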

LLVM and Clang Tools#

We currently use LLVM for library builds and for developer tools such as code formatting with clang-format. LLVM can be installed via most modern package managers (apt, yum, etc.).