architecture
Spark Architecture
Apache Spark is a unified, open-source, distributed data processing engine for big data. In this article, we will discuss the Spark architecture, its distributed nature, and how it processes high volumes of data.
Posted August 4, 2022 by Rohith ‐ 7 min read
⌖ apache spark bigdata architecture transformations distributed-system actions rdd
Spark Memory Management
The main feature of Apache Spark is its ability to run computations in memory, so memory management plays a very important role in the whole system. In this article, we will dive into Spark memory management.
Posted August 9, 2022 by Rohith ‐ 11 min read
⌖ apache spark bigdata architecture memory jvm yarn heap off-heap distributed-system gc