Data Types#

Interface#

class Blob#

Represents a binary large object (BLOB) that can be read from various sources.

The Blob class provides a unified interface for handling binary data from different sources, including file paths and blob descriptors. It supports reading data through input streams and provides descriptor-based serialization.

Public Functions

~Blob()#
PAIMON_UNIQUE_PTR<Bytes> ToDescriptor(const std::shared_ptr<MemoryPool> &pool) const#

Converts the blob to a blob descriptor.

Parameters:

pool – The memory pool to use for allocation.

Returns:

The blob descriptor bytes representing the blob.

const std::string &Uri() const#

Gets the URI of the blob.

Result<std::unique_ptr<InputStream>> NewInputStream(const std::shared_ptr<FileSystem> &fs) const#

Creates an input stream for reading the blob data.

Parameters:

fs – The file system to use for reading.

Returns:

A result containing the input stream or an error.

Result<PAIMON_UNIQUE_PTR<Bytes>> ToData(const std::shared_ptr<FileSystem> &fs, const std::shared_ptr<MemoryPool> &pool) const#

Reads the blob data to bytes.

Parameters:
  • fs – The file system to use for reading.

  • pool – The memory pool to use for allocation.

Returns:

A result containing the blob data bytes or an error.

Public Static Functions

static Result<std::unique_ptr<Blob>> FromPath(const std::string &path)#

Creates a Blob from a file path.

Parameters:

path – The file path to create the blob from.

Returns:

A result containing the created blob or an error.

static Result<std::unique_ptr<Blob>> FromPath(const std::string &path, int64_t offset, int64_t length)#

Creates a Blob from a file path with specified offset and length.

Parameters:
  • path – The file path to create the blob from.

  • offset – The starting offset within the file.

  • length – The length of data to read from the file.

Returns:

A result containing the created blob or an error.
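The offset and length overload describes a byte range inside a file. The sketch below illustrates only that byte-range idea, using the standard library; `ReadRange` is a hypothetical helper and not part of the Paimon API, and it makes no claim about how Blob reads data internally.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not a Paimon function): reads `length` bytes starting
// at `offset` from the file at `path` into `out`.
// Returns false if the file cannot be opened or the range is not fully readable.
inline bool ReadRange(const std::string& path, int64_t offset, int64_t length,
                      std::vector<char>* out) {
  if (offset < 0 || length < 0) return false;
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;
  in.seekg(offset);
  if (!in) return false;
  out->resize(static_cast<std::size_t>(length));
  in.read(out->data(), length);
  // gcount() reports how many bytes the last read actually delivered.
  return in.gcount() == length;
}
```

A range that extends past the end of the file is reported as a failure here; whether the library treats that case as an error is not stated in this reference.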

static Result<std::unique_ptr<Blob>> FromDescriptor(const char *buffer, uint64_t length)#

Creates a Blob from a blob descriptor.

Parameters:
  • buffer – The buffer of the blob descriptor.

  • length – The length of the buffer.

Returns:

A result containing the created blob or an error.

static Result<std::unique_ptr<::ArrowSchema>> ArrowField(const std::string &field_name, bool nullable = false, std::unordered_map<std::string, std::string> metadata = {})#

Creates an Arrow field definition for the Blob type.

This function constructs an Arrow Field (internally using arrow::large_binary()) and exports it to the C data interface structure ArrowSchema. It automatically injects Paimon-specific metadata to identify the field as a BLOB.

Parameters:
  • field_name – The name of the Arrow field.

  • nullable – Whether the field can contain null values (defaults to false).

  • metadata – A map of key-value metadata to be attached to the field.

Returns:

A result containing a unique pointer to the generated ArrowSchema or an error.

class Decimal#

A data structure representing a decimal value.

It might be stored in a compact representation (as a long value) if values are small enough.

Public Types

using int128_t = __int128_t#
using uint128_t = __uint128_t#

Public Functions

inline Decimal(int32_t precision, int32_t scale, int128_t value)#
inline int32_t Precision() const#

Get the precision of this decimal.

The precision is the number of digits in the unscaled value.

inline int32_t Scale() const#

Get the scale of this decimal.

inline int128_t Value() const#

Get the underlying int128_t value of this decimal.

inline uint64_t LowBits() const#

Get the low 64 bits of the decimal value.

inline uint64_t HighBits() const#

Get the high 64 bits of the decimal value.
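As an illustration of how a 128-bit value splits into two 64-bit halves, here is a minimal sketch. `LowBitsOf`, `HighBitsOf`, and `Combine` are hypothetical names, and the sketch assumes the halves are the plain two's-complement low and high words, which this reference does not state explicitly. It relies on the GCC/Clang `__int128_t` extension used by the class.

```cpp
#include <cstdint>

using int128_t = __int128_t;
using uint128_t = __uint128_t;

// Low 64 bits: truncating conversion keeps the least significant word.
inline uint64_t LowBitsOf(int128_t v) { return static_cast<uint64_t>(v); }

// High 64 bits: shift the unsigned view right by 64 to avoid sign issues.
inline uint64_t HighBitsOf(int128_t v) {
  return static_cast<uint64_t>(static_cast<uint128_t>(v) >> 64);
}

// Recombine the two halves into the original 128-bit value.
inline int128_t Combine(uint64_t high, uint64_t low) {
  return static_cast<int128_t>((static_cast<uint128_t>(high) << 64) | low);
}
```

Under this assumption, a small positive value has a high word of zero, and a negative value has the sign extension in its high word.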

inline bool IsCompact() const#
Returns:

Whether the decimal value is small enough to be stored in a long.

std::string ToString() const#
inline int64_t ToUnscaledLong() const#
Returns:

A long describing the unscaled value of this decimal.

std::vector<char> ToUnscaledBytes() const#
Returns:

A byte array describing the unscaled value of this decimal.

inline bool operator==(const Decimal &other) const#
inline bool operator<(const Decimal &other) const#
inline bool operator>(const Decimal &other) const#
int32_t CompareTo(const Decimal &other) const#

Public Static Functions

static inline bool IsCompact(int32_t precision)#
Returns:

Whether a decimal of the given precision is small enough to be stored in a long.
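The reasoning behind a precision-based compactness check can be sketched as follows. The threshold of 18 digits is an assumption (it is the largest digit count whose every value fits in a signed 64-bit integer); this reference does not state the threshold the library uses. `kAssumedMaxCompactPrecision`, `IsCompactPrecision`, and `MaxUnscaled` are hypothetical names.

```cpp
#include <cstdint>

// Assumption: 18 is the largest precision at which every unscaled value
// fits in an int64_t. Not taken from the Paimon headers.
constexpr int32_t kAssumedMaxCompactPrecision = 18;

constexpr bool IsCompactPrecision(int32_t precision) {
  return precision <= kAssumedMaxCompactPrecision;
}

// Largest unscaled magnitude with `precision` decimal digits: 10^precision - 1.
constexpr __int128_t MaxUnscaled(int32_t precision) {
  __int128_t v = 1;
  for (int32_t i = 0; i < precision; ++i) v *= 10;
  return v - 1;
}

// 18 nines fit in a signed 64-bit integer, 19 nines do not.
static_assert(MaxUnscaled(18) <= INT64_MAX, "18 digits fit in int64_t");
static_assert(MaxUnscaled(19) > INT64_MAX, "19 digits overflow int64_t");
```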

static inline Decimal FromUnscaledLong(int64_t unscaled_long, int32_t precision, int32_t scale)#

Creates an instance of Decimal from an unscaled long value and the given precision and scale.

static Decimal FromUnscaledBytes(int32_t precision, int32_t scale, Bytes *bytes)#

Creates an instance of Decimal from an unscaled byte array value and the given precision and scale.
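The byte layout of the unscaled value is not specified in this reference. As a sketch only, the following assumes a big-endian two's-complement encoding and a fixed width of 16 bytes when encoding; `ToBigEndianBytes` and `FromBigEndianBytes` are hypothetical helpers, not library functions.

```cpp
#include <cstdint>
#include <vector>

// Encode a 128-bit value as 16 big-endian two's-complement bytes (assumed layout).
inline std::vector<char> ToBigEndianBytes(__int128_t value) {
  std::vector<char> out(16);
  __uint128_t u = static_cast<__uint128_t>(value);
  for (int i = 15; i >= 0; --i) {
    out[i] = static_cast<char>(u & 0xFF);  // least significant byte goes last
    u >>= 8;
  }
  return out;
}

// Decode big-endian two's-complement bytes (1 to 16 bytes), sign-extending
// from the most significant bit of the first byte.
inline __int128_t FromBigEndianBytes(const std::vector<char>& bytes) {
  bool negative =
      !bytes.empty() && (static_cast<unsigned char>(bytes[0]) & 0x80) != 0;
  __uint128_t u = negative ? ~static_cast<__uint128_t>(0) : 0;
  for (char b : bytes) {
    u = (u << 8) | static_cast<unsigned char>(b);
  }
  return static_cast<__int128_t>(u);
}
```

The decoder accepts shorter inputs so that a minimal-length encoding, if that is what a producer emits, still round-trips.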

Public Static Attributes

static constexpr int32_t DEFAULT_PRECISION = 10#
static constexpr int32_t DEFAULT_SCALE = 0#
static constexpr int32_t MIN_PRECISION = 1#
static constexpr int32_t MAX_PRECISION = 38#

class Timestamp#

A data structure representing a timestamp without time zone.

This data structure is immutable and consists of a number of milliseconds since 1970-01-01 00:00:00 and a nanos-of-millisecond component. It might be stored in a compact representation (as a long value) if values are small enough. Timestamps range from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999.

Public Functions

inline Timestamp()#
inline Timestamp(int64_t millisecond, int32_t nano_of_millisecond)#
inline int64_t GetMillisecond() const#

Get the number of milliseconds since 1970-01-01 00:00:00.

inline int32_t GetNanoOfMillisecond() const#

Get the number of nanoseconds within the millisecond.

The value range is from 0 to 999,999.

inline Timestamp ToMillisTimestamp() const#

Converts this Timestamp object to a millisecond-precision Timestamp object (the nanos-of-millisecond part is dropped).

int64_t ToMicrosecond() const#

Converts this Timestamp object to microseconds.

inline int64_t ToNanosecond() const#

Converts this Timestamp object to nanoseconds.
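The arithmetic behind these conversions can be sketched from the two stored fields. `ToMicros` and `ToNanos` are hypothetical free functions; the sketch assumes the microsecond conversion truncates sub-microsecond digits and does not check for overflow, neither of which this reference specifies.

```cpp
#include <cstdint>

// millisecond: milliseconds since 1970-01-01 00:00:00 (may be negative).
// nano_of_millisecond: 0 to 999,999.

// Microseconds: 1 ms = 1,000 us; sub-microsecond digits are truncated (assumption).
constexpr int64_t ToMicros(int64_t millisecond, int32_t nano_of_millisecond) {
  return millisecond * 1000 + nano_of_millisecond / 1000;
}

// Nanoseconds: 1 ms = 1,000,000 ns; overflow is not checked in this sketch.
constexpr int64_t ToNanos(int64_t millisecond, int32_t nano_of_millisecond) {
  return millisecond * 1000000 + nano_of_millisecond;
}
```

Because the nanos-of-millisecond field is never negative, the same formulas hold for timestamps before the epoch.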

inline bool operator==(const Timestamp &other) const#
inline bool operator<(const Timestamp &other) const#
std::string ToString() const#

Converts the Timestamp object to a string representation in UTC (GMT).

The format of the returned string is “YYYY-MM-DD HH:MM:SS.nnnnnnnnn”, where the date and time are in UTC (GMT), and the nanoseconds are derived from the millisecond and nanosecond parts of the timestamp.

Note

This method uses UTC (GMT) time zone when formatting the time. This is different from the Java Paimon implementation, which may convert the timestamp to the local time zone of the machine running the Java process.
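A minimal sketch of producing the documented format from the two fields is shown below. `FormatUtc` is a hypothetical function and uses the POSIX `gmtime_r` call; it is not the library's implementation and assumes all nine fractional digits are always printed.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <time.h>

// Formats (millisecond, nano_of_millisecond) as "YYYY-MM-DD HH:MM:SS.nnnnnnnnn" in UTC.
inline std::string FormatUtc(int64_t millisecond, int32_t nano_of_millisecond) {
  // Floor division so that pre-epoch values keep a non-negative remainder.
  int64_t seconds = millisecond / 1000;
  int64_t ms_rem = millisecond % 1000;
  if (ms_rem < 0) {
    ms_rem += 1000;
    seconds -= 1;
  }
  time_t t = static_cast<time_t>(seconds);
  struct tm tm_utc {};
  gmtime_r(&t, &tm_utc);  // POSIX; always converts to UTC, never local time
  long long nanos = static_cast<long long>(ms_rem) * 1000000 + nano_of_millisecond;
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%04d-%02d-%02d %02d:%02d:%02d.%09lld",
                tm_utc.tm_year + 1900, tm_utc.tm_mon + 1, tm_utc.tm_mday,
                tm_utc.tm_hour, tm_utc.tm_min, tm_utc.tm_sec, nanos);
  return std::string(buf);
}
```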

Public Static Functions

static inline Timestamp FromEpochMillis(int64_t milliseconds)#

Creates an instance of Timestamp from milliseconds.

The nanos-of-millisecond field will be set to zero.

Parameters:

milliseconds – The number of milliseconds since 1970-01-01 00:00:00; a negative number is the number of milliseconds before 1970-01-01 00:00:00.

static inline Timestamp FromEpochMillis(int64_t milliseconds, int32_t nanos_of_millisecond)#

Creates an instance of Timestamp from milliseconds and a nanos-of-millisecond.

Parameters:
  • milliseconds – The number of milliseconds since 1970-01-01 00:00:00; a negative number is the number of milliseconds before 1970-01-01 00:00:00.

  • nanos_of_millisecond – The nanoseconds within the millisecond, from 0 to 999,999.
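When the source value is a single count of nanoseconds since the epoch, it has to be split into the two arguments expected here. The sketch below shows one way to do that with floor division so that the nanos-of-millisecond part stays within 0 to 999,999 for negative inputs; `MillisAndNanos` and `SplitEpochNanos` are hypothetical names, not part of the library.

```cpp
#include <cstdint>

// Hypothetical pair matching the two arguments of FromEpochMillis.
struct MillisAndNanos {
  int64_t millisecond;
  int32_t nano_of_millisecond;
};

// Splits nanoseconds since 1970-01-01 00:00:00 into whole milliseconds and
// the remaining nanoseconds within that millisecond.
inline MillisAndNanos SplitEpochNanos(int64_t epoch_nanos) {
  int64_t ms = epoch_nanos / 1000000;
  int64_t rem = epoch_nanos % 1000000;
  if (rem < 0) {  // C++ division truncates toward zero; adjust to floor
    rem += 1000000;
    ms -= 1;
  }
  return MillisAndNanos{ms, static_cast<int32_t>(rem)};
}
```

For example, one nanosecond before the epoch becomes millisecond -1 with a nanos-of-millisecond value of 999,999.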

static inline bool IsCompact(int32_t precision)#
Returns:

Whether the timestamp data is small enough to be stored in a long of milliseconds.
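The idea behind this check can be sketched as follows: at a precision of 3 fractional digits or fewer, nothing below the millisecond is kept, so the millisecond count alone carries the value. The threshold of 3 is an assumption derived from that reasoning; the actual value of MAX_COMPACT_PRECISION is not given in this reference. `kAssumedMillisPrecision`, `IsCompactTimestamp`, and `TruncateNanoOfMilli` are hypothetical names.

```cpp
#include <cstdint>

// Assumption: millisecond resolution corresponds to precision 3.
constexpr int32_t kAssumedMillisPrecision = 3;

constexpr bool IsCompactTimestamp(int32_t precision) {
  return precision <= kAssumedMillisPrecision;
}

// Keeps only the fractional digits allowed by `precision` (0 to 9) in the
// nanos-of-millisecond field. At precision <= 3 the result is always zero.
constexpr int32_t TruncateNanoOfMilli(int32_t nano_of_millisecond, int32_t precision) {
  if (precision <= kAssumedMillisPrecision) return 0;
  int32_t unit = 1;
  for (int32_t p = precision; p < 9; ++p) unit *= 10;
  return nano_of_millisecond / unit * unit;
}
```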

Public Static Attributes

static const int32_t DEFAULT_PRECISION#
static const int32_t MILLIS_PRECISION#
static const int32_t MIN_PRECISION#
static const int32_t MAX_PRECISION#
static const int32_t MAX_COMPACT_PRECISION#