## martes, 23 de agosto de 2016

### Reading R Code. Factors in R

R factors are a common source of confusion for R beginners, even in their most basic use. In this post I try to shed some light on R factors. But rather than only rely on external sources of information or on the R documentation, I will make some experiments and have a look at the implementation to understand what a factor is and how it behaves in the most simple use cases.

First of all, R factors are intended for representing categorical variables. A categorical variable is a variable that can take values of a limited set of so-called categories. For instance, to encode the gender of a population we would use a categorical variable with two possible categories, male and female. Or to encode the educational level of a population under 18 we can use a categorical variable with three categories: elementary, middle and high.

In addition, categories of a categorical variable may or may not have some internal ordering. We cannot ascribe any sound ordering to the first example of categorical variable, but we should in the second case (elementary `<` middle `<` high)

In R, the term `levels` stands for categories of the categorical variable (the factor). `labels` in turn suggests something like custom names for those categories.

This distinction might be blurry, and it is. Think of some data you are going to analyze, say, a record of the gender of 6 persons:

``> x <- c("F", "M", "F", "F", "F", "M")``

If we want to convert this into a factor, which are the levels? Maybe, "male" and "female"?

``````> factor(x, c("male", "female"))
 <NA> <NA> <NA> <NA> <NA> <NA>
Levels: male female``````

Yikes! This doesn't work. Interestingly, no error message has been triggered, and levels are stored as we intended. But what are those `<NA>`s there?

Let's try without supplying levels and see what happens:

``````> factor(x)
>  F M F F F M
> Levels: F M``````

This looks better, at least nasty `<NA>`s have disappeared. So, when no levels are supplied, levels end up being the distinct elements of the original `x`. It seems that we could pass them instead of our `"male"` and `"female"` before:

``````> factor(x, levels = c("M", "F"))
>  F M F F F M
> Levels: M F``````

Fine! This works. Only that levels appear now in a different order, the one we have supplied.

Still we would prefer the more human-friendly "male" and "female". It might be that these words are rather labels. Let's try:

``````> factor(x, labels = c("male", "female"))
>  male   female male   male   male   female
> Levels: male female``````

Much better, but wait, our initial vector was another one. This totally wrecks havoc with `x`!

Looks like the order in which we pass the labels makes a serious effect. And it certainly does:

``````> factor(x, labels = c("female", "male"))
>  female male   female female female male
> Levels: female male``````

Great! If for some hidden reason (maybe, say, because we expect that in a plot of this factor the legend shows male scores before female ones) we still want the other ordering of levels, we may try to combine both parameters:

``````> factor(x, levels = c("M", "F"), labels = c("male", "female"))
>  female male   female female female male
> Levels: male female``````

That's it! Particularly, note how each label corresponds to each level in the values supplied. To verify, something like the following (where there is a mismatch in the ordering of levels and labels)

``````> factor(x, levels = c("F", "M"), labels = c("male", "female"))
>  male   female male   male   male   female
> Levels: male female``````

ruins `x` again.

From these experiments we can draw some intuitive conclusions:

1. levels are the distinct elements in the object to be converted to factor.
2. If levels are not supplied they are constructed from those unique elements sorted in alphabetical order.
3. labels are the "names" we want to give to our levels.
4. If no supplied, labels are the same as the levels.
5. The order of labels passed must match the order of the levels, either of the default levels or of the levels we supply.
6. The less risky way to create factors with custom labels is to supply both parameters.

Looking at the implementation makes crystal clear what we have found out.

``````  1 function (x = character(), levels, labels = levels, exclude = NA,
2     ordered = is.ordered(x), nmax = NA)
3 {
4     if (is.null(x))
5         x <- character()
6     nx <- names(x)
7     if (missing(levels)) {
8         y <- unique(x, nmax = nmax)
9         ind <- sort.list(y)
10         y <- as.character(y)
11         levels <- unique(y[ind])
12     }
13     force(ordered)
14     exclude <- as.vector(exclude, typeof(x))
15     x <- as.character(x)
16     levels <- levels[is.na(match(levels, exclude))]
17     f <- match(x, levels)
18     if (!is.null(nx))
19         names(f) <- nx
20     nl <- length(labels)
21     nL <- length(levels)
22     if (!any(nl == c(1L, nL)))
23         stop(gettextf("invalid 'labels'; length %d should be 1 or %d",
24             nl, nL), domain = NA)
25     levels(f) <- if (nl == nL)
26         as.character(labels)
27     else paste0(labels, seq_along(levels))
28     class(f) <- c(if (ordered) "ordered", "factor")
29     f
30 }``````

Let us consider the second point above. If levels are not supplied they will be the distinct elements of the given vector:

``````if (missing(levels)) {
y <- unique(x, nmax = nmax)
ind <- sort.list(y)
y <- as.character(y)
levels <- unique(y[ind])
}``````

If levels are missing, we create a vector `y` of unique elements of `x`:

``````> y <- unique(x)
> y
 "F" "M"``````

Note that `nmax` is set by default to `NA` in the function header:

``````function (x = character(), levels, labels = levels, exclude = NA,
ordered = is.ordered(x), nmax = NA) ``````

and `unique(x, nmax = NA)` is equivalent to the call just above.

Also, note that if the order of elements in `x` were different, say, `c("M", "F", "M")`, `unique` would produce the distinct elements just by removing duplicates in the given vector:

``````> unique(c("M", "F", "M"))
 "M" "F"``````

So `unique` doesn't sort the result.

To sort the result we need a sorting function, here `sort.list`. This function produces the indices suitable for subsetting `y`:

``````> sort.list(y)
 1 2

> y[sort.list(y)]
 "F" "M"``````

Note also that `y` holds a character vector always. So levels produced are always a character vector sorted in alphabetical order.

If levels are supplied it is expected that they match the unique elements of `x`. Pay attention to this crucial line:

``f <- match(x, levels)``

`match` is a nice but maybe tricky function to understand, it produces the positions (= indices) of matches of `x` in `levels`.

Let's try some examples. Recall that `x` is:

``````> x
 "F" "M" "F" "F" "F" "M"

> match(x, c("M", "F"))
 2 1 2 2 2 1``````

Indeed, the first element of `x` matches the `2`nd element of `levels`; the second element of `x` matches the `1`st element of `levels`, and so on.

If there is no match, a `NA` is produced by default:

``````> match(x, c("M", "O"))
 NA  1 NA NA NA  1``````

Now we see where the `NA`s in our initial wrong attempt with "male" and "female" as levels came from.

If labels are not supplied they are just the levels as the function header states (`levels` is the default value for `labels`).

If they are supplied, these lines give labels to levels:

``````levels(f) <- if (nl == nL)
as.character(labels)
else paste0(labels, seq_along(levels))``````

Actually, the line that is executed for our previous examples with `labels` is just:

``levels(f) <- as.character(labels)``

since in our examples the number of levels are labels are the same (`nl == nL`, where `nl <- length(labels)` and `nL <- length(levels)`)

Note again that levels with those new labels are still always a character object.

The last line in the snippet above deals with the case where a single label is supplied. We haven't tried an example for that, it is time to do it now:

``````> factor(x, labels = "gender")
 gender1 gender2 gender1 gender1 gender1 gender2
Levels: gender1 gender2``````

This is a valid way to assign labels to levels, though not so good for this example. The code responsible of producing this vector is:

``paste0(labels, seq_along(levels))``

`seq_along(levels)` produces a sequence `1, 2, ... n` where `n` is the length of `levels`. `paste0` just concatenates the single label with those digits in the sequence via recycling.

However, the most remarkable fact we may notice from looking at the implementation is that a `factor` in R is just an integer vector whose attributes "levels" (as we have seen), "names" (as names of `x` [see line 6 and lines 18-19 in the implementation]) and, above all, "class" are set appropriately.

As for the latter the last but one line in the implementation sets the `"class"` attribute to `"factor"` always. Besides, if `x` is an ordered vector (the default behavior), or if we pass `TRUE` as value to the `ordered` argument, `"class"` is set to `c("ordered", "factor")`:

``class(f) <- c(if (ordered) "ordered", "factor") ``

Note the order of elements in the last vector for "class". The order here matters, and means that in such a case `f` is like an instance of the class `"ordered"`, that in turn is somewhat like a subclass of "`factor`". [See `?class` for more details about this kind of inheritance mechanism.]

A nice way of finishing this post is to demonstrate what we have discovered, that a factor is, from the R's internal point of view, just an integer vector with certain attributes set appropriately.

``````> f <- c(1L, 2L, 1L, 1L, 1L, 2L)
> levels(f) <- c("female", "male")
> class(f) <- "factor"
> str(f)
Factor w/ 2 levels "female","male": 1 2 1 1 1 2
> f
 female male   female female female male
Levels: female male

> f2 <- c(1L, 3L, 3L, 2L, 1L, 2L)
> levels(f2) <- c("elem", "middle", "high")
> class(f2) <- c("ordered", "factor")
> str(f2)
Ord.factor w/ 3 levels "elem"<"middle"<..: 1 3 3 2 1 2
> f2
 elem   high   high   middle elem   middle
Levels: elem < middle < high``````

## lunes, 8 de agosto de 2016

### Reading R Code. The function Reduce in R (part II)

[ This post is the third one in a series about `Reduce`. To understand it you also need to read the first and second posts in this series.]

### Loops

Our implementation of `Reduce` directly translates the recursive procedures capturing the `fold-left` and `fold-right` computations. I think that this kind of recursive thinking is not only more high-level but also easier to understand, and that was the reason for beginning with them. However, in languages that don't provide mechanisms for optimizing recursion, iteration is always expressed via loops or another non-recursive procedure.

We have to do so in R if we hope to handle sequences of certain length, otherwise we'll get something like this:

``````> source("my_reduce.R")
> my_reduce("+", 1:100000)
Error: evaluation nested too deeply ...``````

Let's get started with converting the higher-level recursion used so far into a lower-level kind of loop for our first simple implementation in order to see how this can be easily done for the rest of the function.

That first implementation was for `fold-left` with a given initial value:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)

iter <- function(acc, x) {
len <- length(x)
if (!len) {
acc
}
else {
first <- x[[1L]]
rest <- tail(x, -1L)

iter(f(acc, first), rest)
}
}

iter(init, x)
}``````

Using a `for` loop instead of recursion we get this self-explanatory version:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)
acc <- init

for (e in x) {
acc <- f(acc, e)
}
acc
}``````

Accessing elements by its index is another way to get the same thing. The refactored function based on indices would be:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)
acc <- init
len <- length(x)
ind <- seq_len(len)

for (i in ind) {
acc <- f(acc, x[[i]])
}
acc
}``````

`seq_len` is an R function that takes a number and produces an integer vector from 1 to that number (included). So, `seq_len(4)` produces the sequence `1, 2, 3, 4`. `ind` in the code above just refers to the numerical indices of `x` to be used in the loop.

A cosmetic and optional detail. Since the local variable `acc` takes the value passed as `init`, and the initial `init` value won't be used any more, we can stick with `init` as the name for the accumulated value so far, and remove `acc`:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)
len <- length(x)
ind <- seq_len(len)

for (i in ind) {
init <- f(init, x[[i]])
}
init
}``````

### `Reduce`, at last!

That's actually the R implementation for left folding with `init` given:

``````function(f, x, init) {
len <- length(x)
f <- match.fun(f)
ind <- seq_len(len)

for (i in ind) init <- forceAndCall(2, f, init, x[[i]])
init
}``````

The only subtle difference lies in how `f` is called. Instead of a direct application, it uses the `forceAndCall` R function. The purpose of this is to avoid delayed evaluation of arguments. That means that we force R to evaluate the first 2 arguments (note the `2` as first argument) of `f` immediately before executing the `f` body, rather than leaving R to evaluate them after. This is rather technical at this point and we can go ahead without fully understanding. Just note that this construct is like calling the function but in a better way when that function is to be the value passed to a higher-order function like `Reduce`. We might go deeper into this in the future.

Once we have transformed our initial version into a loop-based version the first half of `Reduce` is now easy to understand.

`````` 1 Reduce <-
2 function(f, x, init, right = FALSE, accumulate = FALSE)
3 {
4   mis <- missing(init)
5   len <- length(x)
6
7   if(len == 0L) return(if(mis) NULL else init)
8
9   f <- match.fun(f)
10
11
12   if(!is.vector(x) || is.object(x))
13     x <- as.list(x)
14
15   ind <- seq_len(len)
16
17   if(mis) {
18     if(right) {
19       init <- x[[len]]
20       ind <- ind[-len]
21     }
22     else {
23       init <- x[[1L]]
24       ind <- ind[-1L]
25     }
26   }
27
28   if(!accumulate) {
29     if(right) {
30       for(i in rev(ind))
31         init <- forceAndCall(2, f, x[[i]], init)
33     }
34     else {
35       for(i in ind)
36         init <- forceAndCall(2, f, init, x[[i]])
37     }
38     init
39   }
40   else {
..     # to be explained later
..   }
.. }``````

Apart from pretty minor differences, it is the same code as in our implementation but using indexing and looping instead of recursion.

Let's see:

• Line 7 is just the code that handles what happens if the given object is empty, the empty list or an empty vector, typically. It returns `NULL` if no initial value is given; otherwise, returns the initial value.

• Lines 12-13 introduce one new thing only present in the official R version. Our version assumes a list as input. In R, a vector is expected, These lines ensure that if the value is not a vector or is an object in general it is converted to a type that the following code can deal with, a `list`.

• Lines 17-26 initialize variables `init` and `ind` accordingly to the kind of fold (left or right) requested if no initial value were provided. `init` is set exactly as in our version. Regarding `ind`, instead of working on the rest of the list directly as in our recursive implementation, indices are initialized appropriately, `ind[-1L]` produces the next indices to the index of the first element in order to process the rest of list for left folding. On the other hand `ind[-len]` produces the indices for right folding: those of all elements save the last one (that has been used as `init`). These indices will be reverted in line 30 to process the remaining list from right to left as explained in the previous post.

• Finally [lines 29-37], the function `f` is iteratively called with the proper
order of parameters for each kind of fold, and the result of folding returned in line 38.

The rest of `Reduce` is the code handling the case were the caller asks for a sequence of partial results rather than the final reduction. We'll look at it in the next post.

## sábado, 6 de agosto de 2016

### Reading R Code. The function Reduce in R (part I)

[This post is the second one about the function `Reduce`. It assumes everything discussed in the first post about Reduce]

The function `Reduce` in R is based on the Lisp function `reduce`. It also adds an option borrowed from Haskell's `scan` functions.

Let's see this by exploring a few simple examples and observing their behavior in R as well as in Lisp and Haskell.

### What does `Reduce` do?

``````## Example 1.
# R
> Reduce(`+`, 1:4)
 10

# Lisp
> (reduce #'+ '(1 2 3 4))
10

Prelude> foldl (+) 1 [2, 3, 4]
10``````

Look at the Haskell function. I have used `foldl`, that is the `fold-left` function in Haskell, because R and Lisp do `fold-left` by default. Note also that while in Haskell the initial value has to be specified, in R and in Lisp we don't need to do so. When not specified the initial value for left-folding is the first element in the given list.

The evaluation of all these functions (`fold-left`) expands as follows for the given example:

``````(((1 + 2) + 3) + 4)

## Example 2.
# R
> Reduce(`-`, 1:4)
 -8

# Lisp
> (reduce #'- '(1 2 3 4))
-8

Prelude> foldl (-) 1 [2, 3, 4]
-8``````

The expansion is the same:

``````(((1 - 2) - 3) - 4)

## Example 3.
# R
> Reduce(`-`, 1:4, right = TRUE)
 -2

# Lisp
> (reduce #'- '(1 2 3 4) :from-end t)
-2

Prelude> foldr (-) 4 [1, 2, 3]
-2``````

The evaluation expands to this now:

``(1 - (2 - (3 - 4)))``

This is `fold-right`. In R and Lisp we have to set an option for this fold, and at this point it should be clear why these two functions are described as "left" and "right" folds. Note also that, for `fold-right` the initial value, if given (as in Haskell) become the right-most item in the expansion. In fact, in R and Lisp if the initial value is not given, it is the last element of the given list.

R provides an extra feature borrowed from Haskell for producing a sequence of results of accumulated values rather than the final reduction.

In Haskell, there is a special set of functions for this purpose, `scan` functions, `scanr` corresponds to `foldr` and `scanl` to `foldl`:

``````Prelude> scanl (-) 1 [2, 3, 4]
[1,-1,-4,-8]``````

Observe the partial results in the computations above:

``````initial value:        =  1
1st result   :  1 - 2 = -1
2nd result   : -1 - 3 = -4
3rd result   : -4 - 4 = -8``````

Or, for a right fold:

``````Prelude> scanr (-) 4 [1, 2, 3]
[-2,3,-1,4]``````

where partial results from right to left and in reverse order are:

``````initial value:          =  4
1st result   : 3 - 4    = -1
2nd result   : 2 - (-1) =  3
3rd result   : 1 - 3    = -2``````

The R equivalents need the `accumulate` parameter set to `TRUE`:

``````Reduce("-", 1:4, accumulate = TRUE)
>   1 -1 -4 -8``````

and

``````Reduce("-", 1:4, right = TRUE, accumulate = TRUE)
>  -2  3 -1  4``````

These functions are just variants of the tail-recursive pattern where a second accumulator is added to keep track of computed values so far.

So for instance `scanl` can be implemented as follows:

`````` (define (scan-left f i lox0)
(local [(define (f-for-lox-acc acc rsf lox)
(cond [(empty? lox) (reverse (cons acc rsf))]
[else
(f-for-lox-acc (f acc (first lox))
(cons acc rsf)
(rest lox))]))]
(f-for-lox-acc i empty lox0)))``````

Only and extra accumulator `rsf` has been added. It maintains a list of resulting values so far. Each time a new computed value is produced is inserted into the `rsf` accumulator. The result is finally reversed. It is possible to avoid the final `reverse` if computed values are appended to the end of `rsf`. Anyway it is clear the idea behind this.

### R implementation of `Reduce`

As we have seen, `Reduce` in R is a multipurpose function. Other languages as Haskell prefer independent elementary functions for each task (right/left fold and returning accumulated values or not). I find the latter more elegant, easier to read and easier to test. R, though, is fond of functions with lots of parameters, and full of conditionals to select the code to execute for a specific purpose.

Even so, and with the goal of firstly implementing our own version for better understanding the R official version, it is reasonable to begin with basic use cases and add more after those have been implemented.

One of the simplest use case is to execute `Reduce` for left fold with a given initial value. As usual, we need a very basic set of tests. (more should be added for good coverage). The following are hopefully enough at this stage:

``````test_that("test reduce: fold-left, init given", {
expect_equal(my_reduce("+", integer(0), init = 1), 1)
expect_equal(my_reduce("-", 2:4, init = 1), -8)
expect_equal(my_reduce("/", c(2, 9, 13), init = 7), 7/234)
expect_equal(my_reduce(list, 2:4, init = 1), (list(list(list(1, 2), 3), 4)))
})``````

The most natural implementation at this point is the translation into R of the `fold-left` tail recursive procedure (first variant), as this:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)

iter <- function(acc, x) {
len <- length(x)
if (!len) {
acc
}
else {
first <- x[[1L]]
rest <- tail(x, -1L)

iter(f(acc, first), rest)
}
}

iter(init, x)
}``````

Assuming that test cases are saved into the file `test_my_reduce.R`, running tests with:

``````> library(testthat)
> test_file("test_my_reduce.R")``````

confirms that this implementation works.

A few points about this code. The first line applies `match.fun` as customary in R code to check the validity of the argument passed as `f`. I have used `1L` instead of just `1` for indexing. This is a common practice in R base code: when an integer is required the number as a literal integer (the number followed by `L`) is passed. Finally, I have intentionally named the auxiliary helper as `iter` for reasons that will be clear. Any name would have worked though.

The next step is handling the case where no initial value is passed. Some test cases for this:

``````test_that("test reduce: fold-left, init missing", {
expect_equal(my_reduce("+", integer(0)), NULL)
expect_equal(my_reduce("-", 1:4), -8)
expect_equal(my_reduce("/", c(7, 2, 9, 13)), 7/234)
expect_equal(my_reduce(list, 1:4), (list(list(list(1, 2), 3), 4)))
})``````

The first test case is for the empty vector. As no initial value is given, we cannot return it as before. By convention we return `NULL` in this case.

And this is an obvious implementation that passes all tests above [only the first lines are added, the `iter` function remains the same]:

``````my_reduce <-
function(f, x, init) {
f <- match.fun(f)
mis <- missing(init)
len <- length(x)

if (mis && !len) {
return(NULL)
}

if (mis) {
init <- x[[1L]]
x <- tail(x, -1L)
}

iter <- function(acc, x) {
len <- length(x)
if (!len) {
acc
}
else {
first <- x[[1L]]
rest <- tail(x, -1L)

iter(f(acc, first), rest)
}
}

iter(init, x)
}``````

What about the `fold-right` operation?

Let's write some tests for fold-right with and without the initial value given:

``````test_that("test reduce: fold-right, init given", {
expect_equal(my_reduce("+", integer(0), init = 1, right = TRUE), 1)
expect_equal(my_reduce("-", 1:3, init = 4, right = TRUE), -2)
expect_equal(my_reduce("/", c(7, 2, 9), init = 13, right = TRUE), 63/26)
expect_equal(my_reduce(list, 1:3, init = 4, right = TRUE),
list(1, list(2, list(3, 4))))
})

test_that("test reduce: fold-right, init missing", {
expect_equal(my_reduce("+", integer(0), right = TRUE), NULL)
expect_equal(my_reduce("-", 1:4, right = TRUE), -2)
expect_equal(my_reduce("/", c(7, 2, 9, 13), right = TRUE), 63/26)
expect_equal(my_reduce(list, 1:4, right = TRUE),
list(1, list(2, list(3, 4))))
})``````

As for the implementation it looks like it would be a very different piece of code if we were to translate the natural recursive procedure examined in the post referred above. However, when designing functions that handle several possibilities we should seek after maximizing the amount of code shared by each of them. The best possible scenario for the function at hand would be one in which the tail-recursive pattern employed for `fold-left` is shared by the `fold-right` operation. It turns out that we already have a promising candidate, the second variant of `fold-left`. Remember that it has the same signature as `fold-right`, and it shouldn't be difficult to adapt it so that it serves as a tail-recursive version for `fold-right`. A closer examination into the expansions of both procedures gives the answer. Let's recall them once again:

``````fold-right   (-) 1 '(2 3 4) : 2 - (3 - (4 - 1))
fold-left-2  (-) 1 '(2 3 4) : 4 - (3 - (2 - 1)) ``````

The only difference lies in the order of elements in the given list. The function we are searching for (a tail-recursive `fold-right`) should have the same expansion as `fold-right`:

``fold-right-2 (-) ??         : 2 - (3 - (4 - 1))``

Which arguments should it receive? Obviously:

``fold-right-2 (-) 1 '(4 3 2)``

Hence, `fold-right-2` is just `fold-left-2` with the input list reversed, and, as you surely remember, the only remaining difference between our `fold-left`, the one employed in R, `fold-left-1` and `fold-left-2` is the order of arguments passed to `f`.

All of these points are captured by the following implementation [number of lines included]:

``````1  my_reduce <-
2  function(f, x, init, right = FALSE) {
3    iter <- function(acc, x) {
4      len <- length(x)
5      if (!len) {
6        acc
7      }
8      else {
9        first <- x[[1L]]
10       rest <- tail(x, -1L)
11
12       if (right) iter(f(first, acc), rest) else iter(f(acc, first), rest)
13     }
14   }
15
16   f <- match.fun(f)
17   mis <- missing(init)
18   len <- length(x)
19
20   if (mis && !len) {
21     return(NULL)
22   }
23
24   if (mis) {
25     if (right) {
26       init <- x[[len]]
28     }
29     else {
30       init <- x[[1L]]
31       x <- tail(x, -1L)
32     }
33   }
34   else {
35     if (right) {
36       x <- rev(x)
37     }
38   }
39
40   iter(init, x)
41 }``````

Changes with respect to the previous implementation are:

1. The nested function `iter` has been modified so that the recursive call passes parameters to `f` in the order suitable to the requested kind of fold [line 12].
2. The handling of options for the arguments `init` and `right` has been appropriately extended [lines 24-38]. Note, in particular, that when right is `TRUE` we need to reverse the sequence, as explained. The R function `rev` does the job [lines 27, 36].

This implementation passes our current test set. Nice!

Do you think that this function is getting long? If so you are with me. As I said before I prefer shorter functions with as few parameters as possible and independent helpers. But R base code is plenty of this kind of monolithic functions rich in parameters and conditional branches.

[To be continued]