Engineering Blog

                            

Ignoring the fact that elements are copied in range loops

A range loop is a form of the for loop that iterates over a data structure such as a slice or a map. We may forget or be unaware of how a range loop assigns values, which leads to common mistakes. First, let’s remind ourselves how to use a range loop; then we’ll look at how values are assigned.

Concepts

A range loop allows iterating over different data structures:

  • String
  • Array
  • Pointer to an array
  • Slice
  • Map
  • Receiving channel

Compared to a classic for loop, a range loop is a convenient way to iterate over all the elements of one of these data structures. It’s also less error-prone because we don’t have to handle the condition expression and iteration variable manually, which may avoid mistakes such as off-by-one errors. Here is an example with an iteration over a slice of strings:

s := []string{"a", "b", "c"}
for i, v := range s {
    fmt.Printf("index=%d, value=%s\n", i, v)
}

In some cases, we may be interested only in the element value and not the index. We can then discard the index with the blank identifier:

s := []string{"a", "b", "c"}
for _, v := range s {
    fmt.Printf("value=%s\n", v)
}

And if we don’t need the value and are interested only in the index, we can omit the second variable:

for i := range s {}
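The same forms apply to the other data structures listed above. The following is a minimal sketch, not taken from the original text, that ranges over a string, a map, and a channel; the helper names runeCount, sumValues, and drain are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// runeCount counts the iterations of a range loop over a string.
// Ranging over a string yields runes (Unicode code points), not bytes.
func runeCount(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

// sumValues adds up the values of a map.
// The iteration order over a map is not specified.
func sumValues(m map[string]int) int {
	total := 0
	for _, v := range m {
		total += v
	}
	return total
}

// drain receives from ch until it is closed and returns what it received.
// Ranging over a channel yields a single value per iteration.
func drain(ch <-chan int) []int {
	var out []int
	for v := range ch {
		out = append(out, v)
	}
	return out
}

func main() {
	// "h\u00e9llo" is 6 bytes long but contains 5 runes.
	fmt.Println(runeCount("h\u00e9llo"))

	fmt.Println(sumValues(map[string]int{"a": 1, "b": 2}))

	ch := make(chan int, 3)
	ch <- 1
	ch <- 2
	ch <- 3
	close(ch) // without close, the range loop in drain would block forever
	fmt.Println(drain(ch))
}
```

Note that the loop over the channel ends only once the channel is closed, which is why the sketch closes it before draining.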

A common mistake when updating values in a range loop

Understanding how the value is handled during each iteration is critical for using a range loop effectively. Let’s see how it works with a concrete example. We create an account struct containing a single balance field:

type account struct {
    balance float32
}

Next, we create a slice of account structs and iterate over each element using a range loop. During each iteration, we increment the balance of each account:

accounts := []account{
    {balance: 100.},
    {balance: 200.},
    {balance: 300.},
}
for _, a := range accounts {
    a.balance += 1000
}

After this code runs, which of the following two choices shows the slice’s content?
[{100} {200} {300}] or [{1100} {1200} {1300}]

The answer is [{100} {200} {300}]. In this example, the range loop does not affect the slice’s content. Let’s see why. In Go, everything we assign is a copy:

  • If we assign the result of a function returning a struct, it performs a copy of that struct.
  • If we assign the result of a function returning a pointer, it performs a copy of the memory address (an address is 64 bits long on a 64-bit architecture).
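The two cases above can be illustrated with a small sketch. The functions byValue and byPointer are hypothetical helpers introduced for this example only; the account type is the one defined earlier.

```go
package main

import "fmt"

type account struct {
	balance float32
}

// byValue returns the account itself, so the caller receives a copy of the struct.
func byValue(src *account) account { return *src }

// byPointer returns the address, so the caller receives a copy of the pointer
// that still refers to the same account.
func byPointer(src *account) *account { return src }

func main() {
	original := account{balance: 100}

	v := byValue(&original)
	v.balance += 1000 // changes only the copy held in v

	p := byPointer(&original)
	p.balance += 1 // changes original through the copied address

	fmt.Println(original.balance, v.balance) // 101 1100
}
```

Only the update made through the pointer reaches the original struct; the update made on v stays local to the copy.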

It’s crucial to keep this in mind to avoid common mistakes, including those related to range loops. Indeed, when a range loop iterates over a data structure, it performs a copy of each element to the value variable (the second item).

Coming back to our example, iterating over each account element results in a struct copy being assigned to the value variable a. Therefore, incrementing the balance with a.balance += 1000 mutates only the value variable (a), not an element in the slice.
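For readers who want to check this behavior, here is a runnable sketch of the example. The wrapper function creditCopies is a hypothetical name added so that the loop can be called and verified; its body is the same loop as in the example.

```go
package main

import "fmt"

type account struct {
	balance float32
}

// creditCopies mirrors the loop from the example: each iteration updates only
// the loop variable a, which holds a copy of the current element.
func creditCopies(accounts []account) {
	for _, a := range accounts {
		a.balance += 1000
	}
}

func main() {
	accounts := []account{
		{balance: 100},
		{balance: 200},
		{balance: 300},
	}
	creditCopies(accounts)
	fmt.Println(accounts) // [{100} {200} {300}]
}
```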

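So, what if we want to update the slice elements themselves? We can access each element through its index. This can be achieved with either a classic for loop or a range loop using the index instead of the value variable: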

// Uses the index variable to access the element of the slice
for i := range accounts {
    accounts[i].balance += 1000
}

// Uses the traditional for loop
for i := 0; i < len(accounts); i++ {
    accounts[i].balance += 1000
}

Both iterations have the same effect: updating the elements in the accounts slice. Which one should we favor? It depends on the context. If we want to go over each element, the first loop is shorter to write and read. But if we need to control which element we want to update (such as one out of two), we should instead use the second loop.
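As an illustration of the "one out of two" case, the sketch below uses a classic for loop with a step of 2. The function name creditEveryOther and its amount parameter are assumptions made for this example.

```go
package main

import "fmt"

type account struct {
	balance float32
}

// creditEveryOther adds amount to the elements at even indices (0, 2, 4, ...).
// The classic for loop gives direct control over the step of the index.
func creditEveryOther(accounts []account, amount float32) {
	for i := 0; i < len(accounts); i += 2 {
		accounts[i].balance += amount
	}
}

func main() {
	accounts := []account{
		{balance: 100},
		{balance: 200},
		{balance: 300},
	}
	creditEveryOther(accounts, 1000)
	fmt.Println(accounts) // [{1100} {200} {1300}]
}
```

A range loop could do the same with a check on the index inside the body, but the classic loop states the stepping rule directly in the loop header.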

Updating slice elements: A third option

Another option is to keep using the range loop and access the value but modify the slice type to a slice of account pointers:

accounts := []*account{
    {balance: 100.},
    {balance: 200.},
    {balance: 300.},
}
for _, a := range accounts {
    a.balance += 1000
}

However, this option has two main downsides. First, it requires updating the slice type, which may not always be possible. Second, if performance is important, we should note that iterating over a slice of pointers may be less efficient for a CPU because of the lack of predictability (we will discuss this point in the mistake “Not understanding CPU caches”).

In general, we should remember that the value element in a range loop is a copy. Therefore, if the value is a struct we need to mutate, we will only update the copy, not the element itself, unless the value or field we modify is a pointer. The favored options are to access the element via the index using a range loop or a classic for loop. In the next section, we keep working with range loops and see how the provided expression is evaluated.
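The exception mentioned above, where the field we modify is a pointer, can be sketched as follows. The customer type and the creditAll function are hypothetical and are not part of the original example; they only show that a copied struct still shares whatever its pointer fields refer to.

```go
package main

import "fmt"

type account struct {
	balance float32
}

// customer is a hypothetical type with one plain field and one pointer field.
type customer struct {
	name    string
	account *account
}

// creditAll ranges over the customers by value. The loop variable c is a copy,
// but c.account holds the same address as the corresponding slice element.
func creditAll(customers []customer, amount float32) {
	for _, c := range customers {
		c.name = "updated"          // changes only the copy
		c.account.balance += amount // changes the shared account
	}
}

func main() {
	customers := []customer{
		{name: "alice", account: &account{balance: 100}},
	}
	creditAll(customers, 1000)
	fmt.Println(customers[0].name, customers[0].account.balance) // alice 1100
}
```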
