Getting more out of the range-based for statement in C++11

Among the most useful features added to C++11 is the range-based for statement. It is defined to be equivalent to the usual iterator-based loop from begin() to end() and makes standard iteration look way more appealing.
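
As a rough illustration of that equivalence, the two functions below compute the same sum. The names sum_range_for, sum_expanded and expansion_demo are made up for this sketch, and the second function only approximates the rewrite the C++11 standard describes (the real one uses unnameable internal variables).

```cpp
#include <cassert>
#include <vector>

// Range-based form.
inline int sum_range_for(const std::vector<int>& v) {
    int total = 0;
    for (int x : v) total += x;
    return total;
}

// Approximately what the loop above is defined to mean in C++11.
inline int sum_expanded(const std::vector<int>& v) {
    int total = 0;
    {
        auto&& r = v;  // the range expression is bound once
        for (auto it = r.begin(), last = r.end(); it != last; ++it) {
            int x = *it;  // the loop variable is initialised from *it
            total += x;
        }
    }
    return total;
}

// Checked usage: both forms agree.
inline void expansion_demo() {
    std::vector<int> v(4, 2);  // {2, 2, 2, 2}
    assert(sum_range_for(v) == 8);
    assert(sum_expanded(v) == 8);
}
```

Note that the loop condition is `it != last`, which is why anything fed to the range-based for statement only needs operator!=, operator++ and operator* on its iterators.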

Keeping the noise out of a large fraction of iterator-based loops is great, but other common for loops are missing out! They have not received a convenient shorthand. Luckily, you can make some yourself by feeding the right containers to the range-based for statement.

For example, the regular counting loops

for (auto i = begin, e = end; i < e; ++i)
for (size_t i = 0, e = container.size(); i < e; ++i)

can be written as

for (const auto i : range(begin, end))
for (const auto i : indices(container))

if you define indices() and range() the right way.
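
What "the right way" might look like, as a minimal sketch rather than the published header: counting_iterator and counting_range are hypothetical helper names, and only the three operations the range-based for statement needs are provided.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: an "iterator" that is just a counter.
template <typename T>
class counting_iterator {
public:
    explicit counting_iterator(T value) : value_(value) {}
    T operator*() const { return value_; }  // returns a copy, not a reference
    counting_iterator& operator++() { ++value_; return *this; }
    bool operator!=(const counting_iterator& other) const {
        return value_ != other.value_;
    }
private:
    T value_;
};

// Hypothetical helper: the half-open interval [first, last).
template <typename T>
class counting_range {
public:
    counting_range(T first, T last) : first_(first), last_(last) {}
    counting_iterator<T> begin() const { return counting_iterator<T>(first_); }
    counting_iterator<T> end() const { return counting_iterator<T>(last_); }
private:
    T first_, last_;
};

template <typename T>
counting_range<T> range(T first, T last) {
    return counting_range<T>(first, last);
}

template <typename T>
counting_range<T> range(T last) {
    return counting_range<T>(T(), last);  // T() is zero for arithmetic types
}

template <typename C>
counting_range<typename C::size_type> indices(const C& c) {
    return counting_range<typename C::size_type>(0, c.size());
}

// Checked usage of the sketch.
inline void range_demo() {
    int sum = 0;
    for (const auto i : range(2, 5)) sum += i;  // visits 2, 3, 4
    assert(sum == 9);

    std::vector<char> letters(3, 'x');
    std::size_t count = 0;
    for (const auto i : indices(letters)) {
        letters[i] = 'y';
        ++count;
    }
    assert(count == 3 && letters[2] == 'y');
}
```

One caveat of this sketch: because the loop stops on `!=` rather than `<`, calling range(first, last) with first greater than last does not produce an empty loop.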

Using the range-based version has no drawbacks: it is easier to read, quicker to write and compiles to equivalent assembly (calls to operator!= and operator++ get optimized away even at g++ -O1).

You can do a lot with the range-based for statement. What I have found useful so far is:

  • range(begin, end): for (auto i = begin, e = end; i != e; ++i)
  • range(end): same as range(0, end)
  • indices(container): same as range(container.size())
  • keys(container_of_pairs): shorthand for getting only value.first
  • values(container_of_pairs): shorthand for getting only value.second
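
The keys() and values() entries could be sketched as thin iterator adaptors. This is an assumption about one possible design, not the code from the header; key_iterator, value_iterator and iterator_pair are hypothetical names. Both adaptors return references, so `auto&` in the loop works as it does for standard containers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical adaptor: dereferences to the .first member of each element.
template <typename Iter>
struct key_iterator {
    Iter it;
    auto operator*() const -> decltype((std::declval<const Iter&>()->first)) {
        return it->first;
    }
    key_iterator& operator++() { ++it; return *this; }
    bool operator!=(const key_iterator& other) const { return it != other.it; }
};

// Hypothetical adaptor: dereferences to the .second member of each element.
template <typename Iter>
struct value_iterator {
    Iter it;
    auto operator*() const -> decltype((std::declval<const Iter&>()->second)) {
        return it->second;
    }
    value_iterator& operator++() { ++it; return *this; }
    bool operator!=(const value_iterator& other) const { return it != other.it; }
};

// Hypothetical helper: a begin/end pair the for statement can consume.
template <typename Iterator>
struct iterator_pair {
    Iterator first, last;
    Iterator begin() const { return first; }
    Iterator end() const { return last; }
};

// Taking C& (not const C&) keeps temporaries out, which avoids dangling iterators.
template <typename C>
auto keys(C& c) -> iterator_pair<key_iterator<decltype(c.begin())> > {
    typedef key_iterator<decltype(c.begin())> iter;
    return {iter{c.begin()}, iter{c.end()}};
}

template <typename C>
auto values(C& c) -> iterator_pair<value_iterator<decltype(c.begin())> > {
    typedef value_iterator<decltype(c.begin())> iter;
    return {iter{c.begin()}, iter{c.end()}};
}

// Checked usage of the sketch.
inline void keys_values_demo() {
    std::map<std::string, int> ages;
    ages["ada"] = 36;
    ages["bob"] = 41;

    std::string names;
    for (const auto& k : keys(ages)) names += k;  // keys come out in map order
    assert(names == "adabob");

    for (auto& v : values(ages)) v += 1;          // modifies the map in place
    assert(ages["ada"] == 37 && ages["bob"] == 42);
}
```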

The code is available on GitHub under the Boost Software License. It has no external dependencies, and using it should be as simple as putting the header file into your include path. It is also fairly minimal, implementing only what is needed to make it work. A lot could be added, like rbegin(), rend(), generic reverse(), correct iterators, …

The approach has its limits, and one of them is exposed when you implement something like enumerate():

for (auto e : enumerate(c)) {
    // e.index is the current index
    // e.value is the current value
}

For standard containers (as well as the keys() and values() ranges above), the type in the for statement controls whether you get a copy of or a reference to what you are iterating over, and whether it is const:

for (const auto e : c) // e is a const copy
for (auto & e : c) // e is a mutable reference

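That can only work as long as dereferencing the iterator yields an lvalue (i.e. *c.begin() refers to an object that already exists, such as an element of the container). It breaks down for enumerate(), which has to assemble its index/value pair on the fly and therefore cannot return a reference without extra overhead.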
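
To make the constraint concrete, here is a small compile-time check. The trait name yields_lvalue is invented for this sketch; it asks whether `*r.begin()` is an lvalue reference, which is what `for (auto & e : r)` needs in order to bind. std::vector<bool> serves as a familiar standard-library case where it is not, because its iterators hand out proxy objects by value.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: true when dereferencing R's iterator gives an lvalue
// reference, i.e. when `for (auto & e : r)` has something to bind to.
template <typename R>
struct yields_lvalue
    : std::is_lvalue_reference<decltype(*std::declval<R&>().begin())> {};

// Ordinary containers hand out references to their elements.
static_assert(yields_lvalue<std::vector<int> >::value,
              "vector<int> elements are lvalues");

// vector<bool> builds a temporary proxy on each dereference, so there is no
// lvalue to refer to. An enumerate() that returns a freshly built
// index/value struct is in the same position.
static_assert(!yields_lvalue<std::vector<bool> >::value,
              "vector<bool> yields a temporary proxy");
```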

That is why I decided to split enumerate() into two functions. enumerate_byref() always grabs ‘value’ by reference and enumerate_byval() makes a copy. Usage then becomes:

for (const auto e : enumerate_byref(container))
    e.value = true; // ok, modifies container!
for (auto e : enumerate_byval(container))
    e.value = true; // ok, changes copy

This is far from the ideal syntax, but at least it is explicit about what is going on.
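
For illustration, the split could be implemented roughly as follows. This is a sketch under assumptions, not the header's actual code: indexed_ref, indexed_val, enumerate_iterator and enumerate_range are hypothetical names, enumerate_byref() here only accepts non-const containers whose iterators yield real references (so not std::vector<bool>), and enumerate_byval() must not be given a temporary container.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element types handed to the loop body.
template <typename T>
struct indexed_ref {
    std::size_t index;
    T& value;  // refers to the element inside the container
};

template <typename T>
struct indexed_val {
    std::size_t index;
    T value;   // independent copy of the element
};

// Hypothetical iterator: advances the wrapped iterator and a counter together.
template <typename Iter, typename Item>
class enumerate_iterator {
public:
    enumerate_iterator(Iter it, std::size_t index) : it_(it), index_(index) {}
    Item operator*() const { return Item{index_, *it_}; }  // built on the fly
    enumerate_iterator& operator++() { ++it_; ++index_; return *this; }
    bool operator!=(const enumerate_iterator& other) const {
        return it_ != other.it_;  // the counter plays no part in the comparison
    }
private:
    Iter it_;
    std::size_t index_;
};

template <typename Iter, typename Item>
struct enumerate_range {
    Iter first, last;
    enumerate_iterator<Iter, Item> begin() const {
        return enumerate_iterator<Iter, Item>(first, 0);
    }
    enumerate_iterator<Iter, Item> end() const {
        return enumerate_iterator<Iter, Item>(last, 0);
    }
};

template <typename C>
enumerate_range<typename C::iterator, indexed_ref<typename C::value_type> >
enumerate_byref(C& c) {
    return {c.begin(), c.end()};
}

template <typename C>
enumerate_range<typename C::const_iterator, indexed_val<typename C::value_type> >
enumerate_byval(const C& c) {
    return {c.begin(), c.end()};
}

// Checked usage of the sketch.
inline void enumerate_demo() {
    std::vector<int> v(3, 0);

    for (const auto e : enumerate_byref(v))
        e.value = static_cast<int>(e.index) * 10;  // writes through the reference
    assert(v[0] == 0 && v[1] == 10 && v[2] == 20);

    for (auto e : enumerate_byval(v))
        e.value = -1;                              // only the copy changes
    assert(v[1] == 10);
}
```

In this sketch the const on `const auto e` does not protect the container: e itself is a const copy of the small struct, but its value member is still a reference to a non-const element. The same holds for plain `auto e`, so the function name, not the loop declaration, decides whether the container can change.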

Thanks to my employer, CeleraOne GmbH, for permitting me to publish this.

Comments 3

  1. Leonid Volnitsky wrote:

    There is a somewhat similar project at http://volnitsky.com/project/ro/

    Posted 24 Mar 2013 at 13:34
  2. Leo Goodstadt wrote:

    Nice code. Could you explain a little more of the rationale for splitting up enumerate_byref and enumerate_byval?

    More to the point, enumerate_byval seems to provide little benefit over enumerate_byref. Benchmarking seems to suggest that there is almost no overhead in passing by reference instead of by value, e.g. for integers.

    The only other point is if the user is going to modify the value / counter inside the loop, and he/she wants to discard such changes. This seems to be an odd and confusing paradigm…

    Admittedly, the usual distinctions between using const T&, T& and T “don’t work” when using an enumerating range. But that is true for enumerate_byref as well.

    Thus with:

    for (auto e : enumerate_byref(container))

    even with the “byref” in the name, are we going to know for sure that “e.value” here is a reference, rather than a copy?

    Have I missed anything?

    Posted 14 Jun 2013 at 20:07
  3. Christian wrote:

    The main reason for the split in enumerate() was that the code

    for (auto e : enumerate(container)) e.value = 12;

    looked surprising. Does that change the container or not?

    First I made value const so this would just not compile. But in practice, you often do want to change the value. Once I decided to change the name to enumerate_byref I felt I also had to provide enumerate_byval.

    Currently I’m leaning towards just not having enumerate – the code may be more verbose, but it’s also more readable.

    Posted 06 Jul 2013 at 21:02
