Python offers several classic functional-style programming tools, such as map and reduce. These provide powerful means to express functional patterns and are often useful in data processing pipelines (among many other applications). The Python reduce function can be difficult to understand without a background in functional programming. In this article, we’ll cover the basic concepts of reduce and walk through a simple example.
Quick Intro: Reading the Manual
As of Python 3.0, the reduce function lives in the functools module. Its functionality, as described by the official documentation:
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned.
The documentation also provides the following example function, stating that reduce is roughly equivalent to it:
    def reduce(function, iterable, initializer=None):
        it = iter(iterable)
        if initializer is None:
            value = next(it)
        else:
            value = initializer
        for element in it:
            value = function(value, element)
        return value
Figure 1: Python.org’s example of the reduce function’s basic operation.
There are two key functionalities here to note:
- The initialization of value, using either the initializer or falling back to the first item in iterable.
- The continuous update of value in a looping fashion.
The first is less essential to the nature of reduce, but essential to understanding starting conditions. The continuous updating of the return value is the crux of how reduce works: it accumulates a value based on previous calculations. Let’s break this down using a simple example.
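Before that, the starting conditions can be checked directly. The following is a minimal sketch, assuming a hypothetical helper named add (not part of the article's later example), that exercises the single-item and empty-iterable cases described in the documentation:

```python
import functools

def add(a, b):
    # Hypothetical helper used only for this sketch.
    return a + b

# One item, no initializer: the item is returned and add is never called.
single = functools.reduce(add, [42])

# Empty iterable with an initializer: the initializer is the result.
empty_with_init = functools.reduce(add, [], 0)

# Empty iterable without an initializer: there is nothing to start from,
# so functools.reduce raises a TypeError.
try:
    functools.reduce(add, [])
    raised = False
except TypeError:
    raised = True

print(single, empty_with_init, raised)  # 42 0 True
```

One difference worth noting: the function in Figure 1 is only a rough equivalent. Given an empty iterable and no initializer, its call to next(it) would raise StopIteration rather than the TypeError raised by functools.reduce.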
Basic Example: Using Reduce on Integer Sequence
Python’s reduce function takes two arguments: a function and an iterable. Optionally, one can also provide an initializer that will be used as the starting value, as shown in Figure 1 above. In our example, the function simply adds two numbers, and the iterable is a list of the integers from 1 to 5 (inclusive) from which the arguments are pulled. Let’s run this to see the output:
    import functools

    # defines the iterable
    nums = [1, 2, 3, 4, 5]

    def two_sum(a: int, b: int) -> int:
        """Adds two arguments together."""
        return a + b

    # uses the iterable and function as arguments for reduce
    output = functools.reduce(two_sum, nums)
    print(output)  # 15
The function two_sum is used here to be as explicit as possible. Conventionally, it would be replaced with a lambda function as follows:
    # use lambda instead of explicit function def.
    functools.reduce(lambda a, b: a + b, nums)  # 15
The answer is the same, and the syntax is much reduced. Now that we’ve seen the syntax for applying the reduce function, let’s break the process down. Essentially, the reduce function maintains a state for two values at any given step:
- accumulator: a running value from previous applications of the function argument.
- current: the current value is taken as the next item in the iterable argument.
For the example above, using the simple a + b function and the nums iterable, the process can be recorded as such:
- Initial step:
  - Accumulator (a) = 1 (first element of the list, by default)
  - Current value (b) = 2 (second element of the list)
  - Operation: a + b = 1 + 2 = 3
- Second step:
  - Accumulator (a) = 3 (result of the previous step)
  - Current value (b) = 3 (third element of the list)
  - Operation: a + b = 3 + 3 = 6
- Third step:
  - Accumulator (a) = 6 (result of the previous step)
  - Current value (b) = 4 (fourth element of the list)
  - Operation: a + b = 6 + 4 = 10
- Fourth step:
  - Accumulator (a) = 10 (result of the previous step)
  - Current value (b) = 5 (fifth and last element of the list)
  - Operation: a + b = 10 + 5 = 15
Final result: 15
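The steps listed above can be reproduced by recording each call that reduce makes. This is a small sketch rather than anything built into functools: traced_add and steps are hypothetical names introduced here purely to log the accumulator, the current value, and the result of each step.

```python
import functools

nums = [1, 2, 3, 4, 5]
steps = []  # one (a, b, result) tuple per call made by reduce

def traced_add(a, b):
    # Hypothetical wrapper: adds the two values and logs the step.
    result = a + b
    steps.append((a, b, result))
    return result

total = functools.reduce(traced_add, nums)

for number, (a, b, result) in enumerate(steps, start=1):
    print(f"step {number}: {a} + {b} = {result}")
print("final result:", total)
# step 1: 1 + 2 = 3
# step 2: 3 + 3 = 6
# step 3: 6 + 4 = 10
# step 4: 10 + 5 = 15
# final result: 15
```

The log holds four entries for five items, because the first element is consumed as the starting accumulator rather than in a step of its own.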
In general, the final step is the (n - 1)th step, where n is the total number of items in the iterable. For example, if our list of numbers extended to 15, the above would have 14 total steps (15 - 1, since the first element serves as the starting value). Let’s break it down visually for an even better understanding:
In this illustration, we see the current value being referenced as the next item in the nums iterable. At each step, that value is used as the b argument to the function argument, with the previous value as the a argument. Semantically, it can be thought of as: at each step, apply the function using the previous value and the next value as arguments.
Note: In all examples, there was no initializer value used. This optional argument is essentially like adding an extra value to the start of the iterable. It is particularly useful for passing the output of one function as input to another while still using a common iterable (nums in our case).
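To make the note concrete, here is a brief sketch of the initializer in use. The starting value 10, the helper add, and the multiplication lambda are assumptions chosen for illustration; they do not appear in the examples above.

```python
import functools

nums = [1, 2, 3, 4, 5]

def add(a, b):
    # Hypothetical helper used only for this sketch.
    return a + b

# With an initializer of 10, the calculation is (((((10+1)+2)+3)+4)+5).
with_init = functools.reduce(add, nums, 10)

# Same result as placing 10 at the start of the iterable instead.
prepended = functools.reduce(add, [10] + nums)

# Passing the output of one reduce call as the starting value of another,
# while reusing the same iterable: 15 * 1 * 2 * 3 * 4 * 5.
first_pass = functools.reduce(add, nums)
second_pass = functools.reduce(lambda a, b: a * b, nums, first_pass)

print(with_init, prepended, second_pass)  # 25 25 1800
```

With an initializer present, every item in the iterable gets its own step, so the five-item list now produces five calls to the function instead of four.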
Final Thoughts
Functional programming is a powerful tool often used in applications where the primary state change is in application data. These tools allow developers to use arguments and existing functions with ease to create stateless flows capable of powerful results. For those interested in more complex applications, check out Google’s MapReduce paper for a detailed discussion of how functional paradigms like map and reduce can be applied to complex distributed systems.