Cluster Planning
Node Spec
This section gives empirical sizing guidance for Celeborn nodes. The principle is to avoid any single hardware resource (CPU, memory, disk bandwidth/IOPS, network bandwidth/PPS) becoming the bottleneck. The goal is to keep the utilization of all resources roughly balanced; for example, keep disk IOPS/bandwidth usage and network usage at about the same level, so that data is pipelined smoothly and no back-pressure is triggered.
This goal is hard to reach in practice: it depends on workload characteristics, and Celeborn's configs also have some impact. In our experience, a ratio of vCores : memory (GB) : network bandwidth (Gbps) : disk IO (KIOPS) around 2 : 5 : 2 : 1 works well. We have not thoroughly benchmarked different hardware configs (it is hard to do so), so treat this only as a reference.
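As a rough illustration, the sketch below checks a hypothetical node spec against the 2 : 5 : 2 : 1 rule of thumb. All figures (32 vCores, 80 GB memory, 32 Gbps network, 16 KIOPS) are made up for the example and are not recommendations.

```scala
// Sketch: normalize a candidate node spec against the suggested
// vCores : memory(GB) : network(Gbps) : disk(KIOPS) = 2 : 5 : 2 : 1 ratio.
// All numbers below are illustrative assumptions, not measured values.
object NodeSpecCheck {
  val vCores      = 32
  val memoryGb    = 80
  val networkGbps = 32
  val diskKiops   = 16

  def main(args: Array[String]): Unit = {
    // Divide each dimension by its ratio weight; equal quotients mean the
    // node is balanced according to the 2:5:2:1 rule of thumb.
    val normalized = Seq(vCores / 2.0, memoryGb / 5.0, networkGbps / 2.0, diskKiops / 1.0)
    println(s"Normalized capacities: $normalized")
    if (normalized.distinct.size == 1)
      println("Node spec matches the 2:5:2:1 ratio.")
    else
      println("Node spec is unbalanced; the smallest dimension will bottleneck first.")
  }
}
```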
Worker Scale
Estimate your cluster's maximum concurrent shuffle size (ES) and the total usable disk space of a single node (NS). The Worker count can then be estimated as ES * 2 / NS.
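As a minimal sketch, assuming for illustration ES = 100 TB and NS = 4 TB per node (both numbers are made up, not measurements), the Worker count works out as follows:

```scala
// Sketch: estimate Worker count from ES (max concurrent shuffle size) and
// NS (usable disk space per node), using the ES * 2 / NS rule above.
object WorkerScaleEstimate {
  def main(args: Array[String]): Unit = {
    val esTb = 100.0 // max concurrent shuffle size, in TB (assumed)
    val nsTb = 4.0   // usable disk space per node, in TB (assumed)
    val workers = math.ceil(esTb * 2 / nsTb).toInt
    println(s"Estimated Worker count: $workers") // 50 for these example figures
  }
}
```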