Apache Celeborn™

Celeborn is an intermediate data service for Big Data compute engines (e.g., ETL, OLAP, and streaming engines) that boosts performance, stability, and flexibility. Intermediate data typically includes shuffle data and spilled data.


Integrate Celeborn with Apache Spark™

Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.

Integrate Celeborn with Apache Flink®

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.

Integrate Celeborn with Apache Hadoop MapReduce®

Apache Hadoop MapReduce is a software framework for easily writing applications that process vast amounts of data (multi-terabyte datasets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.