Sign in

This article illustrates the fundamental operations in the MapReduce programming paradigm. Modern big data frameworks like Spark, Flink, Pig, and Hive, and most functional programming languages, all provide interfaces for these common operations in some form or another, so understanding them will allow you to learn these tools more easily. The basic operations are map, partition, group, sort, reduce, and combine.

MapReduce is derived from functional programming concepts. Each of these operations are higher order functions, which in layman’s terms, means each operator’s behavior can be customized by passing it a function you define. …

Angela Ding