Apache Celeborn™ 0.5.4 Release Notes
Highlight
- Support retry when sending RPC to LifecycleManager
- Support custom implementation of EventExecutorChooser to avoid deadlock when calling await in EventLoop thread
- Interrupt spark task should not report fetch failure
- Fix flink client memory leak of TransportResponseHandler#outstandingRpcs for handling addCredit response
Improvement
- [CELEBORN-1757] Add retry when sending RPC to LifecycleManager
- [CELEBORN-1841] Support custom implementation of EventExecutorChooser to avoid deadlock when calling await in EventLoop thread
- [CELEBORN-1859] DfsPartitionReader and LocalPartitionReader should reuse pbStreamHandlers get from BatchOpenStream request
- [CELEBORN-1882] Support configuring the SSL handshake timeout for SSLHandler
- [CELEBORN-1897] Avoid calling toString for too long messages
Stability and Bug Fix
- [CELEBORN-1818] Fix incorrect timeout exception when waiting on no pending writes
- [CELEBORN-1838] Interrupt spark task should not report fetch failure
- [CELEBORN-1846] Fix the StreamHandler usage in fetching chunk when task attempt is odd
- [CELEBORN-1850] Setup worker endpoint after initalizing controller
- [CELEBORN-1865] Update master endpointRef when master leader is abnormal
- [CELEBORN-1867] Fix flink client memory leak of TransportResponseHandler#outstandingRpcs for handling addCredit response
- [CELEBORN-1883] Replace HashSet with ConcurrentHashMap.newKeySet for ShuffleFileGroups
- [CELEBORN-1885] Fix nullptr exceptions in FetchChunk after worker restart
Credits
Thanks to the following contributors who helped to review and commit to Apache Celeborn 0.5.4 version:
Contributors | |||||
---|---|---|---|---|---|
Aidar Bariev | Ethan Feng | Minchu Yang | Nan Zhu | Sanskar Modi | Xinyu Wang |
Xu Hang | Yihe Li | Zaynt Shuai | Ziyi Wu |