rdd
Spark Architecture
Apache Spark is a unified, open-source, distributed data processing engine for big data. In this article, we discuss the Spark architecture, its distributed nature, and how it processes high-volume data.
Posted August 4, 2022 by Rohith ‐ 7 min read
⌖ apache spark bigdata architecture transformations distributed-system actions rdd
RDD in Spark
An RDD (Resilient Distributed Dataset) is the fundamental data structure of Apache Spark and the primary data abstraction in Spark Core.
Posted August 31, 2022 by Rohith ‐ 5 min read
⌖ apache spark bigdata distributed-system spark-fundamentals rdd