Configuration
The configuration of Celeborn is divided into static and dynamic categories, with details provided in the Configuration Guide.
Static Configuration
Static configuration, referred to as CelebornConf, loads configurations from the default file located at $CELEBORN_HOME/conf/celeborn-defaults.conf.
Dynamic Configuration
Dynamic configuration allows changes to be applied at runtime and takes precedence over the corresponding static configuration in the Celeborn Master and Worker. Whether a configuration key is dynamic is indicated by its isDynamic property, as listed in All Configurations.
Configurations tagged as dynamic can therefore be updated and refreshed while Celeborn is running.
Config Level
At present, dynamic configuration supports the following config levels:
SYSTEM: The system configurations.
TENANT: The dynamic configurations of a tenant id.
TENANT_USER: The dynamic configurations of a tenant id and username.
When applying dynamic configuration, the order of precedence for configuration levels is as follows:
SYSTEM level configuration takes precedence over the static configuration and the default CelebornConf. If the system-level configuration is absent, it falls back to the static configuration defined in CelebornConf.
TENANT level configuration supersedes the SYSTEM level, meaning that configurations specific to a tenant id override those set at the system level. If the tenant-level configuration is absent, it falls back to the system-level dynamic configuration.
TENANT_USER level configuration takes precedence over the TENANT level. Configurations specific to both a tenant id and a username override those set at the tenant level. If the tenant-user-level configuration is absent, it falls back to the tenant-level dynamic configuration.
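The fallback chain above can be sketched as a lookup over layered maps. This is a minimal illustration and not Celeborn's actual implementation: the function name resolve and the plain-dict representation of each level are assumptions made for the example.

```python
from typing import Mapping, Optional


def resolve(key: str,
            static_conf: Mapping[str, str],
            system: Optional[Mapping[str, str]] = None,
            tenant: Optional[Mapping[str, str]] = None,
            tenant_user: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the effective value for `key`, checking the most specific level first.

    Hypothetical helper: each level is modelled as a plain mapping.
    """
    # Precedence as described above:
    # TENANT_USER > TENANT > SYSTEM > static CelebornConf.
    for layer in (tenant_user, tenant, system, static_conf):
        if layer is not None and key in layer:
            return layer[key]
    return None


KEY = "celeborn.worker.flusher.buffer.size"
static_conf = {KEY: "256K"}   # illustrative static value
system = {KEY: "512K"}        # illustrative SYSTEM-level override
tenant = {KEY: "128K"}        # illustrative TENANT-level override

# No tenant-user value is set, so the tenant-level value is used.
effective_value = resolve(KEY, static_conf, system, tenant, tenant_user={})
```

Because a missing level simply falls through to the next one, removing the tenant entry in this sketch would make the same call return the system-level value.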
Config Service
The config service provides configuration management with a local cache for both static and dynamic configurations. ConfigService is a pluggable service interface whose implementation can vary with the storage backend. The storage backend for ConfigService is specified by the configuration key celeborn.dynamicConfig.store.backend, and filesystem (FS) and database (DB) are currently supported as storage backends by default.
Additionally, users can provide their own implementation by extending the ConfigService interface and using the fully qualified class name of the implementation as the storage backend. If no storage backend is specified, the config service is disabled.
FileSystem Config Service
The filesystem config service enables the use of dynamic configuration files, the location of which is set by the configuration key celeborn.dynamicConfig.store.fs.path.
The template for the dynamic configuration is as follows:
# SYSTEM level configuration
- level: SYSTEM
  config:
    [config_key]: [config_val]
    ...
# TENANT level configuration
- tenantId: [tenant_id]
  level: TENANT
  config:
    [config_key]: [config_val]
    ...
  users:
    # TENANT_USER level configuration
    - name: [name]
      config:
        [config_key]: [config_val]
        ...
For example, a Celeborn worker celeborn-worker has 10 storage directories or disks, and the buffer size is set to 256 KiB. A tenant tenantId1 only uses half of the storage and sets the buffer size to 128 KiB. Meanwhile, a user user1 needs to change the buffer size to 96 KiB at runtime. The example configurations are as follows:
# SYSTEM level configuration
- level: SYSTEM
  config:
    celeborn.worker.flusher.buffer.size: 256K # sets buffer size of worker to 256 KiB
# TENANT level configuration
- tenantId: tenantId1
  level: TENANT
  config:
    celeborn.worker.flusher.buffer.size: 128K # sets buffer size of tenantId1 to 128 KiB
  users:
    # TENANT_USER level configuration
    - name: user1
      config:
        celeborn.worker.flusher.buffer.size: 96K # sets buffer size of tenantId1 and user1 to 96 KiB
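To make the effect of this file concrete, the sketch below walks a Python structure equivalent to the parsed example and reports which buffer size applies to a given tenant and user. It is illustrative only: DYNAMIC_CONFIG mirrors the example by hand rather than being read with a YAML parser, and index_levels and effective are hypothetical helper names, not part of Celeborn.

```python
from typing import Any, Dict, List, Optional, Tuple

KEY = "celeborn.worker.flusher.buffer.size"

# Hand-written equivalent of the example file after parsing (assumed shape).
DYNAMIC_CONFIG: List[Dict[str, Any]] = [
    {"level": "SYSTEM", "config": {KEY: "256K"}},
    {
        "tenantId": "tenantId1",
        "level": "TENANT",
        "config": {KEY: "128K"},
        "users": [{"name": "user1", "config": {KEY: "96K"}}],
    },
]


def index_levels(entries: List[Dict[str, Any]]):
    """Split the nested entries into one map per config level."""
    system: Dict[str, str] = {}
    tenants: Dict[str, Dict[str, str]] = {}
    users: Dict[Tuple[str, str], Dict[str, str]] = {}
    for entry in entries:
        if entry.get("level") == "SYSTEM":
            system.update(entry.get("config") or {})
        elif entry.get("level") == "TENANT":
            tenant_id = entry["tenantId"]
            tenants.setdefault(tenant_id, {}).update(entry.get("config") or {})
            # TENANT_USER entries are nested under their tenant.
            for user in entry.get("users") or []:
                users.setdefault((tenant_id, user["name"]), {}).update(
                    user.get("config") or {})
    return system, tenants, users


def effective(key: str, tenant_id: str, user: str,
              entries: List[Dict[str, Any]] = DYNAMIC_CONFIG) -> Optional[str]:
    """Look up `key` at TENANT_USER, then TENANT, then SYSTEM level."""
    system, tenants, users = index_levels(entries)
    layers = (users.get((tenant_id, user), {}), tenants.get(tenant_id, {}), system)
    for layer in layers:
        if key in layer:
            return layer[key]
    return None


user1_size = effective(KEY, "tenantId1", "user1")
other_user_size = effective(KEY, "tenantId1", "user2")
other_tenant_size = effective(KEY, "tenantId2", "user1")
```

In this sketch user1 of tenantId1 resolves to 96K, any other user of tenantId1 to 128K, and users of any other tenant to the system-level 256K.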
Database Config Service
The database config service reads and refreshes dynamic configurations stored in a database through JDBC. Configuration settings for the database storage backend are defined by the celeborn.dynamicConfig.store.db.* series of configuration keys. To use the database as a config store backend, it is necessary to create tables for dynamic configurations at the various configuration levels. The SQL script for the MySQL configuration tables is located under the $CELEBORN_HOME/db-scripts directory.
After the configuration tables are created, the dynamic configuration for a config level is specified by inserting a configuration record into the corresponding config level table.
The example dynamic configurations above can be expressed with the following SQL:
CREATE TABLE IF NOT EXISTS celeborn_cluster_info (
id int NOT NULL AUTO_INCREMENT,
name varchar(255) NOT NULL COMMENT 'celeborn cluster name',
namespace varchar(255) DEFAULT NULL COMMENT 'celeborn cluster namespace',
endpoint varchar(255) DEFAULT NULL COMMENT 'celeborn cluster endpoint',
gmt_create timestamp NOT NULL,
gmt_modify timestamp NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY `index_cluster_unique_name` (`name`)
);
# SYSTEM level configuration
CREATE TABLE IF NOT EXISTS celeborn_cluster_system_config (
id int NOT NULL AUTO_INCREMENT,
cluster_id int NOT NULL,
config_key varchar(255) NOT NULL,
config_value varchar(255) NOT NULL,
type varchar(255) DEFAULT NULL COMMENT 'conf categories, such as quota',
gmt_create timestamp NOT NULL,
gmt_modify timestamp NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY `index_unique_system_config_key` (`cluster_id`, `config_key`)
);
# TENANT/TENANT_USER level configuration
CREATE TABLE IF NOT EXISTS celeborn_cluster_tenant_config (
id int NOT NULL AUTO_INCREMENT,
cluster_id int NOT NULL,
tenant_id varchar(255) NOT NULL,
level varchar(255) NOT NULL COMMENT 'config level, valid level is TENANT,TENANT_USER',
name varchar(255) DEFAULT NULL COMMENT 'tenant sub user',
config_key varchar(255) NOT NULL,
config_value varchar(255) NOT NULL,
type varchar(255) DEFAULT NULL COMMENT 'conf categories, such as quota',
gmt_create timestamp NOT NULL,
gmt_modify timestamp NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY `index_unique_tenant_config_key` (`cluster_id`, `tenant_id`, `name`, `config_key`)
);
INSERT INTO celeborn_cluster_info ( `id`, `name`, `namespace`, `endpoint`, `gmt_create`, `gmt_modify` )
VALUES
( 1, 'default', 'celeborn-worker', 'celeborn-namespace.endpoint.com', '2024-02-27 22:08:30', '2024-02-27 22:08:30' );
# SYSTEM level configuration
# sets buffer size of celeborn-worker to 256 KiB
INSERT INTO `celeborn_cluster_system_config` ( `id`, `cluster_id`, `config_key`, `config_value`, `type`, `gmt_create`, `gmt_modify` )
VALUES
( 1, 1, 'celeborn.worker.flusher.buffer.size', '256K', 'QUOTA', '2024-02-27 22:08:30', '2024-02-27 22:08:30' );
# TENANT/TENANT_USER level configuration
# TENANT: sets buffer size of tenantId1 to 128 KiB
# TENANT_USER: sets buffer size of tenantId1 and user1 to 96 KiB
INSERT INTO `celeborn_cluster_tenant_config` ( `id`, `cluster_id`, `tenant_id`, `level`, `name`, `config_key`, `config_value`, `type`, `gmt_create`, `gmt_modify` )
VALUES
( 1, 1, 'tenantId1', 'TENANT', '', 'celeborn.worker.flusher.buffer.size', '128K', 'worker', '2024-02-27 22:08:30', '2024-02-27 22:08:30' ),
( 2, 1, 'tenantId1', 'TENANT_USER', 'user1', 'celeborn.worker.flusher.buffer.size', '96K', 'worker', '2024-02-27 22:08:30', '2024-02-27 22:08:30' );
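The way these rows map onto the config levels can be sketched with a small lookup. The example is a simplification under stated assumptions: it uses an in-memory SQLite database from the Python standard library as a stand-in for MySQL, keeps only the columns needed for the lookup, and the function lookup is hypothetical rather than Celeborn's own query logic.

```python
import sqlite3

KEY = "celeborn.worker.flusher.buffer.size"

# In-memory SQLite stand-in for the MySQL tables (reduced column set).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE celeborn_cluster_system_config (
  cluster_id INTEGER NOT NULL,
  config_key TEXT NOT NULL,
  config_value TEXT NOT NULL
);
CREATE TABLE celeborn_cluster_tenant_config (
  cluster_id INTEGER NOT NULL,
  tenant_id TEXT NOT NULL,
  level TEXT NOT NULL,
  name TEXT,
  config_key TEXT NOT NULL,
  config_value TEXT NOT NULL
);
INSERT INTO celeborn_cluster_system_config VALUES
  (1, 'celeborn.worker.flusher.buffer.size', '256K');
INSERT INTO celeborn_cluster_tenant_config VALUES
  (1, 'tenantId1', 'TENANT', '', 'celeborn.worker.flusher.buffer.size', '128K'),
  (1, 'tenantId1', 'TENANT_USER', 'user1', 'celeborn.worker.flusher.buffer.size', '96K');
""")


def lookup(conn, cluster_id, key, tenant_id=None, user=None):
    """Return the most specific value: TENANT_USER, then TENANT, then SYSTEM."""
    queries = []
    if tenant_id is not None and user is not None:
        queries.append((
            "SELECT config_value FROM celeborn_cluster_tenant_config "
            "WHERE cluster_id = ? AND tenant_id = ? AND level = 'TENANT_USER' "
            "AND name = ? AND config_key = ?",
            (cluster_id, tenant_id, user, key)))
    if tenant_id is not None:
        queries.append((
            "SELECT config_value FROM celeborn_cluster_tenant_config "
            "WHERE cluster_id = ? AND tenant_id = ? AND level = 'TENANT' "
            "AND config_key = ?",
            (cluster_id, tenant_id, key)))
    queries.append((
        "SELECT config_value FROM celeborn_cluster_system_config "
        "WHERE cluster_id = ? AND config_key = ?",
        (cluster_id, key)))
    for sql, params in queries:
        row = conn.execute(sql, params).fetchone()
        if row is not None:
            return row[0]
    return None


user1_size = lookup(conn, 1, KEY, "tenantId1", "user1")
tenant_size = lookup(conn, 1, KEY, "tenantId1")
system_size = lookup(conn, 1, KEY)
```

Each query targets one level, and the first level that has a row for the key wins, which matches the fallback order described for the config levels.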
REST API
For viewing configurations, Celeborn also provides REST APIs on both the master and the worker, including:
/conf: List the conf settings of the master and worker.
/listDynamicConfigs: List the dynamic configs of the master and worker.
For the API providers that list configurations, refer to Available API providers.
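As a rough illustration of calling these endpoints, the sketch below builds the request URL and fetches the raw response with the Python standard library. The host name and port are placeholders that must be replaced with the HTTP address of an actual master or worker, the helper names endpoint_url and fetch are hypothetical, and the response is returned as plain text because its format is not specified here.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen


def endpoint_url(host: str, port: int, path: str) -> str:
    """Build an http:// URL for a master or worker REST endpoint."""
    return urlunsplit(("http", f"{host}:{port}", path, "", ""))


def fetch(host: str, port: int, path: str, timeout: float = 5.0) -> str:
    """GET the endpoint and return the response body as text."""
    with urlopen(endpoint_url(host, port, path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Placeholder address: substitute the real host and HTTP port of the service.
conf_url = endpoint_url("celeborn-master.example.com", 9098, "/conf")
dynamic_url = endpoint_url("celeborn-master.example.com", 9098, "/listDynamicConfigs")

# Against a running service one would call, for example:
# print(fetch("celeborn-master.example.com", 9098, "/listDynamicConfigs"))
```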