Fluss

Streaming Storage for Real-Time Analytics

What is Fluss?

Fluss is a streaming storage system built for real-time analytics that can serve as the real-time data layer for Lakehouse architectures. With its columnar stream and real-time update capabilities, Fluss integrates seamlessly with Apache Flink to enable high-throughput, low-latency, cost-effective streaming data warehouses tailored for real-time applications.

Key Features

Sub-Second Latency
Fluss supports low-latency streaming reads and writes, similar to Apache Kafka. Combined with Apache Flink, Fluss enables the creation of high-throughput, low-latency streaming data warehouses, optimized for real-time applications.
Columnar Stream
Fluss stores streaming data in a columnar format, delivering up to a 10x improvement in streaming read performance. Projection pushdown significantly reduces networking costs, because only the columns a query needs are transferred.
Streaming & Lakehouse Unification
Fluss unifies data streaming and the data Lakehouse by serving streaming data on top of the Lakehouse. This brings low latency to the Lakehouse and powerful analytics to data streams.
Real-Time Updates
The PrimaryKey Table supports real-time streaming updates for large-scale data. It also enables cost-efficient partial updates, making it ideal for enriching wide tables without expensive join operations.
Changelog Generation & Tracking
Updates generate complete changelogs that can be directly consumed by streaming processors in real time. This streamlines streaming analytics workflows and reduces operational costs.
Lookup Queries
Fluss supports ultra-high QPS for primary key point lookups, making it an ideal solution for serving dimension tables. When combined with Apache Flink, it enables high-throughput lookup joins with exceptional efficiency.
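
To make the ideas of partial updates, changelog generation, and primary key lookups more concrete, here is a minimal in-memory sketch in Python. It is not Fluss code and does not use any Fluss API: the class `ToyPrimaryKeyTable`, its methods, and the column names are all hypothetical, and the change markers (`+I`, `-U`, `+U`, `-D`) are only loosely modeled on the row kinds used in Flink changelog streams. It shows how merging a partial row into the stored row by primary key can produce a complete before/after changelog, and how the same keyed state can answer point lookups.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical change markers, loosely modeled on Flink changelog row kinds.
INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE = "+I", "-U", "+U", "-D"


class ToyPrimaryKeyTable:
    """Illustrative in-memory table keyed by one column (an assumption, not Fluss)."""

    def __init__(self, key_column: str) -> None:
        self.key_column = key_column
        self.rows: Dict[Any, Dict[str, Any]] = {}
        self.changelog: List[Tuple[str, Dict[str, Any]]] = []

    def upsert(self, partial_row: Dict[str, Any]) -> None:
        """Merge the given columns into the row that shares the primary key."""
        key = partial_row[self.key_column]
        old = self.rows.get(key)
        if old is None:
            new = dict(partial_row)
            self.changelog.append((INSERT, dict(new)))
        else:
            # Partial update: columns absent from partial_row keep their old values.
            new = {**old, **partial_row}
            if new == old:
                return  # nothing changed, so no changelog entry is emitted
            self.changelog.append((UPDATE_BEFORE, dict(old)))
            self.changelog.append((UPDATE_AFTER, dict(new)))
        self.rows[key] = new

    def delete(self, key: Any) -> None:
        """Remove a row and record its last full image in the changelog."""
        old = self.rows.pop(key, None)
        if old is not None:
            self.changelog.append((DELETE, dict(old)))

    def lookup(self, key: Any) -> Optional[Dict[str, Any]]:
        """Point lookup by primary key; returns a copy, or None if absent."""
        row = self.rows.get(key)
        return dict(row) if row is not None else None
```

In this sketch, two writers can each upsert a different subset of columns for the same key (for example, one writes `amount` and another writes `status`), and the stored row ends up wide without a join. A downstream consumer reading `changelog` sees the full row image before and after each update.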