List Is it really as effective as you think?

ListIs it really as efficient as you think

A .NET List is a list you can use as an array. Isn’t it? Yeah, I’m sure it is. And I can use it anytime, can’t I? Well, “No”. And “No”. Sometimes it’s not a good idea at all.

Introduction

What is a List<T>? Well, according to MSDN[^]:

List<T> Class

Represents a strongly typed list of objects that can be accessed by index. Provides methods to search, sort, and manipulate lists.

So, a List<T> is a list of items that you can treat as an array. Excellent. We’re done here.

Let’s explore a little bit more on this:

So, a List<T> is a List of elements of type T that you can treat as an Array. Simple – we’re done. The source code for the List<T> class. And let’s have a quick look at the fields:

public class List<t> : IList<t>, System.Collections.IList, IReadOnlyList<t>
{
private const int _defaultCapacity = 4;

private T[] _items;
[ContractPublicPropertyName(“Count”)]
private int _size;</t></t></t>

If you look at this you would notice items is an array of Generic type. Does it mean what I think it really does? Let’s have a closer look on this:

static readonly T[] _emptyArray = new T[0];

// Constructs a List. The list is initially empty and has a capacity
// of zero. Upon adding the first element to the list the capacity is
// increased to 16, and then increased in multiples of two as required.
#if !FEATURE_CORECLR
[TargetedPatchingOptOut(“Performance critical to inline across NGen image boundaries”)]
#endif
public List() {
_items = _emptyArray;
}

There you go. List as you can see for yourself, is an array which you can treat as List, not the other way around.

That’s a big difference: it implies some operations do not work as the class name might imply:

  • A List is an Array, so when you try to add an element and there is no room, and the bigger array is allocated and all existing elements are copied into it (If you are familiar with a StringBuilder, that is exactly what it does as well).
  • When you insert an element, the data for all elements with a higher index has to be copied into a new location.
  • When you delete an element, the data for all elements with a higher index has to be copied to a new location.
  • If you add an element and the internal array is full, a new, larger internal array is allocated and all existing data is copied to the new list before the new element can be added at the end.

And that’s important because it defines what the various operations you can do on a List<T> can cost in terms of time.

The first thing to note is that as an array, a List<T> is very efficient to access individual elements via the index. But the other operations which make a List<T> more useful than an Array all come with a cost – and that can be quite a large cost.

Examples

Adding Items

When you use the Add method to add an item to a List<T>, it checks to see if there is room in the existing array and if not it:

  • Creates a new array, twice the size of the old one.
  • Copies each entry from the old array to the new one,
  • Sets the “last” entry in the new array to the new value, and moves the “last” indicator on by one.

Inserting Items

When you use the Insert method, it’s worse. A lot worse.

Start by doing the Add checks, and allocate and copy if there isn’t enough room.

Then, move every element of the array with an index greater than or equal to the value of the insert point index one place to the right.

So if you insert a single element at the beginning of a List<T> you potentially will copy the entire array twice in effect: once to allocate enough space, and then again to make room at the front.

Delete Items

Delete is better than Insert because it doesn’t need the check-the-size-and-copy operations – but it still requires the processor to move every single element of the array with an index greater than or equal to the value of the delete point index one place to the left.

Summary

I’m not saying that you shouldn’t use List<T> – no way, it’s one of the most useful classes in .NET – but you should be aware that using it can have a serious performance impact if you do the wrong thing, or you don’t think carefully about exactly how you are using it.

  • If you know (even roughly) how many elements you will be adding to your list in advance, use the overloaded constructor to allocate a suitably sized array to start with: That will save both memory allocations and time in copying elements from one array to another.
  • If you want a one-dimensional array, you are probably better off using a List<T> instead. You will use the same amount of memory, access time will be about the same, and you will gain flexibility in your code.
  • Avoid inserting elements near the front of a list: each time you do, you will move all subsequent elements in memory. Where ever possible, use Add instead to avoid this.
  • Use InsertRange and RemoveRange whenever possible instead of individual Insert and Remove operations.
  • When deleting elements, think very carefully. If you can delete from the tail forward instead of head back, do – it saves moving elements you are going to delete anyway.
  • If you are going to insert or delete more than a couple of elements that do not have sequential indexes, think carefully: it may be more efficient to construct a new List<T> and do the copying yourself. But this is a difficult call: Array.Copy (used internally by the insert and Remove methods) is a lot faster than individual element copying as it can use “bulk move” codes. It may be worth timing your code to work this out – but it’s going to depend on too many variables to provide a generalized “this is faster” recommendation.
  • Always use List<T>.Clear or create a new List<T> instead of deleting the elements of an existing List<T> one-by-one.

 

That’s it for now. Thanks for dropping by !!! Feel free to comment on this post or you can drop me an email at naik899@gmail.com

Leave a Reply