Engineering Blog

Inefficient Slice Initialization

A slice is a flexible and extensible data structure to implement and manage collections of data. Following are the different ways of slice initialization.

var varWay []string
literalWay := []string{}
newWay := new([]string)
makeWay := make([]string,0)
  • First way, It is the idiomatic way to declare an empty slice. The slice is actually nil, which means it will be null when marshalled to JSON and will succeed nil checks.
  • Second way, It should probably only be used when the literal is going to start with values in it, as in literalWay := []string{“Nepal”, “Canada”,”Japan”,”India”}. Otherwise prefer make().
  • Third way, returns a pointer to the slice. Same as ptrStyle := &[]string{}. Only use if we need want a pointer.
  • Fourth way, It is the same as the literal style, but is preferred for idiomatic reasons when the slice doesn’t need non-zero starting values. make() allows the slice to be initialized with a starting length and capacity, which can have good performance implication in some scenario.

While initializing a slice using make, we saw that we have to provide a length and an optional capacity. Forgetting to pass an appropriate value for both of these parameters when it makes sense is a widespread. Let’s see precisely when this is considered appropriate.

Suppose we want to implement a convert function that maps a slice of Euro into a slice of Doller and both slices will have the same number of elements. Here is a first implementation.

func convertEmptySlice(euros []Euro) []Doller {
	//creates the output slice
	dollers := make([]Doller, 0)

	for _, euro := range euros {
		//convert euro to doller and append to the output slice
		dollers = append(dollers, euroToDoller(euro))
	}
	return dollers
}

At first, we initialize an empty slice of Doller elements using make([]Doller,0). Then, we use append to add the Doller elements. At first, dollers is empty, so adding the first element allocates a backing array size 1. Every time the backing array is full, Go creates another array by doubling its capacity.

The logic of creating another array because the current one is full is repeated multiple times when we add a third element, a fourth, a fifth and so on. Assuming input slice has 1,000 elements this algorithm requires allocating 10 backing arrays and copying more than 1,000 elements in total from one array to another. This leads to additional effort for the GC to clean all these temporary backing arrays.

Performance-wise, there is no goof reason not to give the Go runtime a helping hand.There are two different options for this. The first option is to reuse the same code but allocate the slice with a given capacity.

func convertGivenCapacity(euros []Euro) []Doller {
	n := len(euros)
	//intialize with 0 length and given capacity
	dollers := make([]Doller, 0, n)

	for _, euro := range euros {
		//convert euro to doller and append to the output slice
		dollers = append(dollers, euroToDoller(euro))
	}
	return dollers
}

The only changes is to initialize dollers with a capacity equal to n, the length of euros. Internally, Go pre-allocates an array of n elements. Therefore, adding up to n elements means reusing the same backing array and hence reducing the number of allocations drastically.

The second option is to allocate dollers with a given length.

func convertGivenLength(euros []Euro) []Doller {
	n := len(euros)
	//intialize with given length
	// if capacity is not passed then it is same as its length
	dollers := make([]Doller, n)
	for i, euro := range euros {
		//set element i of the slice
		dollers[i] = euroToDoller(euro)
	}
	return dollers
}

Because we initialize the slice with a length, n elements are already allocated and initialized to the zero values of Doller. Hence, to set elements, we have to use, not append but dollers[i].

Which option is best ? Let’s run a benchmark with the three solutions and an input slice of 1 millions elements

https://github.com/Sugaml/golang-best-practice/tree/main/slice-init

benchmark test output

Converting one slice type into another is a frequent operation for GO developers. As we have seen, if the length of the future slice is already known, there is no good reason to allocate an empty slice first. Out options are to allocate a slice with either a given capacity or a given length. Of these two solutions, we have seen that the second tends to be slightly faster.But using a given capacity and append can be easier to implement and read in some contexts.

References

  • 100 Go Mistakes and how to avoid them, Teiva Harsanyi, Manning Publications Co
  • https://blog.logrocket.com/benchmarking-golang-improve-function-performance
  • https://www.golangprograms.com/go-language/slices-in-golang-programming.html
Previous Post
Next Post