## lunes, 8 de agosto de 2016

### Reading R Code. The function Reduce in R (part II)

[ This post is the third one in a series about `Reduce`. To understand it you also need to read the first and second posts in this series.]

### Loops

Our implementation of `Reduce` directly translates the recursive procedures capturing the `fold-left` and `fold-right` computations. I think that this kind of recursive thinking is not only more high-level but also easier to understand, and that was the reason for beginning with them. However, in languages that don't provide mechanisms for optimizing recursion, iteration is always expressed via loops or another non-recursive procedure.

We have to do so in R if we hope to handle sequences of certain length, otherwise we'll get something like this:

``````> source("my_reduce.R")
> my_reduce("+", 1:100000)
Error: evaluation nested too deeply ...``````

Let's get started with converting the higher-level recursion used so far into a lower-level kind of loop for our first simple implementation in order to see how this can be easily done for the rest of the function.

That first implementation was for `fold-left` with a given initial value:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)

iter <- function(acc, x) {
len <- length(x)
if (!len) {
acc
}
else {
first <- x[[1L]]
rest <- tail(x, -1L)

iter(f(acc, first), rest)
}
}

iter(init, x)
}``````

Using a `for` loop instead of recursion we get this self-explanatory version:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)
acc <- init

for (e in x) {
acc <- f(acc, e)
}
acc
}``````

Accessing elements by its index is another way to get the same thing. The refactored function based on indices would be:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)
acc <- init
len <- length(x)
ind <- seq_len(len)

for (i in ind) {
acc <- f(acc, x[[i]])
}
acc
}``````

`seq_len` is an R function that takes a number and produces an integer vector from 1 to that number (included). So, `seq_len(4)` produces the sequence `1, 2, 3, 4`. `ind` in the code above just refers to the numerical indices of `x` to be used in the loop.

A cosmetic and optional detail. Since the local variable `acc` takes the value passed as `init`, and the initial `init` value won't be used any more, we can stick with `init` as the name for the accumulated value so far, and remove `acc`:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)
len <- length(x)
ind <- seq_len(len)

for (i in ind) {
init <- f(init, x[[i]])
}
init
}``````

### `Reduce`, at last!

That's actually the R implementation for left folding with `init` given:

``````function(f, x, init) {
len <- length(x)
f <- match.fun(f)
ind <- seq_len(len)

for (i in ind) init <- forceAndCall(2, f, init, x[[i]])
init
}``````

The only subtle difference lies in how `f` is called. Instead of a direct application, it uses the `forceAndCall` R function. The purpose of this is to avoid delayed evaluation of arguments. That means that we force R to evaluate the first 2 arguments (note the `2` as first argument) of `f` immediately before executing the `f` body, rather than leaving R to evaluate them after. This is rather technical at this point and we can go ahead without fully understanding. Just note that this construct is like calling the function but in a better way when that function is to be the value passed to a higher-order function like `Reduce`. We might go deeper into this in the future.

Once we have transformed our initial version into a loop-based version the first half of `Reduce` is now easy to understand.

`````` 1 Reduce <-
2 function(f, x, init, right = FALSE, accumulate = FALSE)
3 {
4   mis <- missing(init)
5   len <- length(x)
6
7   if(len == 0L) return(if(mis) NULL else init)
8
9   f <- match.fun(f)
10
11
12   if(!is.vector(x) || is.object(x))
13     x <- as.list(x)
14
15   ind <- seq_len(len)
16
17   if(mis) {
18     if(right) {
19       init <- x[[len]]
20       ind <- ind[-len]
21     }
22     else {
23       init <- x[[1L]]
24       ind <- ind[-1L]
25     }
26   }
27
28   if(!accumulate) {
29     if(right) {
30       for(i in rev(ind))
31         init <- forceAndCall(2, f, x[[i]], init)
33     }
34     else {
35       for(i in ind)
36         init <- forceAndCall(2, f, init, x[[i]])
37     }
38     init
39   }
40   else {
..     # to be explained later
..   }
.. }``````

Apart from pretty minor differences, it is the same code as in our implementation but using indexing and looping instead of recursion.

Let's see:

• Line 7 is just the code that handles what happens if the given object is empty, the empty list or an empty vector, typically. It returns `NULL` if no initial value is given; otherwise, returns the initial value.

• Lines 12-13 introduce one new thing only present in the official R version. Our version assumes a list as input. In R, a vector is expected, These lines ensure that if the value is not a vector or is an object in general it is converted to a type that the following code can deal with, a `list`.

• Lines 17-26 initialize variables `init` and `ind` accordingly to the kind of fold (left or right) requested if no initial value were provided. `init` is set exactly as in our version. Regarding `ind`, instead of working on the rest of the list directly as in our recursive implementation, indices are initialized appropriately, `ind[-1L]` produces the next indices to the index of the first element in order to process the rest of list for left folding. On the other hand `ind[-len]` produces the indices for right folding: those of all elements save the last one (that has been used as `init`). These indices will be reverted in line 30 to process the remaining list from right to left as explained in the previous post.

• Finally [lines 29-37], the function `f` is iteratively called with the proper
order of parameters for each kind of fold, and the result of folding returned in line 38.

The rest of `Reduce` is the code handling the case were the caller asks for a sequence of partial results rather than the final reduction. We'll look at it in the next post.