Overview
Below, we provide an overview of the key components of a Fluss cluster, detailing their functionalities and implementations. Additionally, we will introduce the various deployment methods available for Fluss.
Overview and Reference Architecture
The figure below shows the building blocks of Fluss clusters:
When deploying Fluss, there are often multiple options available for each building block. We have listed them in the table below the figure.
Component | Purpose | Implementations |
---|---|---|
Fluss Client | The Fluss Client is the entry point for users to interact with the Fluss cluster. It is responsible for managing the Fluss cluster (for example, creating databases and tables) and for reading and writing data. | |
CoordinatorServer | The CoordinatorServer is the central work coordination component of Fluss. It is responsible for coordination tasks such as managing TabletServers, maintaining cluster metadata, and coordinating data distribution across the cluster. | |
TabletServer | TabletServers are the nodes that actually manage and store data. | |
External Components | | |
ZooKeeper | Warning: ZooKeeper will be removed in the near future to simplify deployment; for more details, please check out the Roadmap. Fluss leverages ZooKeeper for distributed coordination between all running CoordinatorServer instances and for metadata management. | |
Remote Storage (optional) | Fluss uses file systems as remote storage to store snapshots of Primary-Key Tables and tiered log segments of Log Tables. | |
Lakehouse Storage (optional) | Fluss's DataLake Tiering Service continuously compacts Fluss's Arrow files into Parquet/ORC files in open lake formats. The data in Lakehouse storage can be read by Fluss clients in a Union Read manner and accessed directly by query engines such as Flink, Spark, StarRocks, and Trino. | |
Metrics Storage (optional) | CoordinatorServers and TabletServers report internal metrics, and Fluss clients (e.g., the connector in Flink jobs) can report additional, client-specific metrics as well. | |
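As a concrete illustration of the Fluss Client acting as the entry point, the sketch below connects to a cluster and obtains an admin handle for cluster-management operations. It is a minimal sketch only: the package and class names (`ConnectionFactory`, `Connection`, `Admin`), the `bootstrap.servers` key, the coordinator address `localhost:9123`, and the `listDatabases()` call are assumptions based on the Fluss Java client and may differ between releases.

```java
import com.alibaba.fluss.client.Connection;
import com.alibaba.fluss.client.ConnectionFactory;
import com.alibaba.fluss.client.admin.Admin;
import com.alibaba.fluss.config.Configuration;

public class FlussClientSketch {
    public static void main(String[] args) throws Exception {
        // Address of the cluster entry point (CoordinatorServer); host and port are assumptions.
        Configuration conf = new Configuration();
        conf.setString("bootstrap.servers", "localhost:9123");

        // The Connection is the client-side handle to the Fluss cluster.
        try (Connection connection = ConnectionFactory.createConnection(conf)) {
            // Admin exposes cluster-management operations such as listing databases
            // and creating tables (method names are assumptions).
            Admin admin = connection.getAdmin();
            System.out.println("Databases: " + admin.listDatabases().get());
        }
    }
}
```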
How to deploy Fluss
Fluss can be deployed in three different ways:
NOTE:
- Local Cluster is for testing purposes only.