Should we still be using for-loops?

June 14, 2016

Yes, I used a provocative title, and of course the answer is “yes”. The reason for the title is that we should reflect on what place the venerable for-loop should have in modern day programming. A for-loop is one of the first things a novice programmer is taught. But is that right? Should we treat the for-loop as a go-to tool for everyday programming tasks?

For-loops are a relic from the dawn of computers when programmers had to work with arrays that were nothing more than consecutive memory locations holding values of the same type. Those days are over. While arrays still have their place in systems programming, higher level languages ship with safer alternatives.

Different languages have different for-loops. This article is about the classic for-loop as seen in the C language and adopted by numerous other programming languages, most notably C# and Java. That is, for-loops that require the programmer to specify an initialisation, increment and a continuation step. Other languages, such as Scala, do get “for-loops” right (Scala’s for keyword is nothing like that seen in C, C# and Java; it is more comparable to Haskell’s do-keyword).

Problems with for-loops

The venerable for-loop is a versatile tool, but that versatility comes at a price.

For-loops are hard to read

The purpose of a for-loop is not clear from its head, i.e. the line(s) containing the three steps. The head only shows how iteration is done, not for what purpose. A thorough read of the for-loop’s body is required to determine that. Consider the following example:

This problem becomes worse as the complexity of the body increases and loops are nested.

For-loops often violate the single-responsibility principle

Take a look at MyFunkyFunction, the for-loop has two responsibilities: to filter out odd elements, and to square any remaining entries. Worse, it even cannot do these two tasks on its own, it relies on surrounding code to initialise and return the result.

For-loops bring unnecessary state

Iterator variables and intermediate results clutter scopes. Again, this problem becomes worse as loops are nested.

For-loops are mostly boilerplate

This traditional for-loop pattern is very common in code:

The only thing that differs from loop to loop is the function f(.). The rest is boilerplate.

For-loops are fragile to implementation changes

It is hard to write general functions using for loops as the following, simple task will illustrate.

Suppose you’re asked to write a function that takes a collection of integers and returns a collection of their squares. A naive solution would be

Sure this works for arrays, but it only works for arrays. The question was for a solution that generalizes to any collection. So let’s try again:

This code is better and will work for a larger class of collections, but introduces two new issues. The first is that we have to decide on a concrete collection to return from the function. The second is that the complexity of this code is also implementation dependent: If the list uses an array or some sort of hash map, the (amortized) running cost will be \mathcal{O}(n). However, if xs is implemented using a linked list, the running cost will be \mathcal{O}(n^2).

The latter can be solved by working on collections that implement the iterator pattern. However, the burden of choosing a suitable concrete result type is still placed on the programmer. This final version of the Squares function is the most general of the three:

I hope this short example illustrates how many choices a programmer has to make to implement something as simple as the Squares function. This is both unnecessary, as the collection’s implementation is much better suited to determine the best way to solve this. It is also dangerous as each decision introduces room for mistakes. As for-loops tend to proliferate through code, the risk of something going wrong increases.

The issue of relying on Squares to pick a suitable concrete return type cannot be underestimated. Suppose a programmer decided that a linked list is exactly right for his use case. That programmer has no guarantee that Squares returns something just as appropriate.

For-loops should be for low-level functions

The first problem, poor readability, stems from the fact that for-loops are used, among other purposes, to create, filter, transform and fold collections and any combination thereof. But what a particular for-loop is used for, is unclear. A better solution would be to abstract each use case into a reusable function.

And indeed, these functions are offered by modern programming languages and libraries, C++’s STL, Java’s Stream Library, C#’s LINQ Extension Methods and Scala’s Collections.

Using C#’s Select extension method, the Squares function may be written as:

This completely removes the for-loop and associated problems. No longer does the programmer of client code have to worry about choosing a concrete return type.

And how about using this strategy to implement MyFunkyFunction?

In this new version, each function call fulfills a well-defined atomic task. By delegating common tasks such filtering and mapping to libraries, our code becomes robust to implementation changes, more focused, more readable, and better to maintain. Also the surrounding code is gone, as intermediate variables are no longer necessary.

While it is fine to use the occasional for-loop to implement these lower-level library functions, they should not permeate application code. Consider this: while the virtual machine running your web application’s MSIL or Java bytecode is implemented in C, you wouldn’t write the web application itself in C. And just as low-level functions are implemented using for-loops, you shouldn’t be using them to write your web application.