How come the complexity of fetching a value from an array by it’s index is O(1)?
I thought the algorithm has to go through all the indexes, find the correct index, and then know what value to return. So that means the complexity is O(n/2) on average.
Why is it actually O(1)?
1
The key to understanding why array access is O(1) is understanding that they have a known length. In a language like C, we can guarantee this by allocating a sequential block of memory. Since it’s sequential in memory, we can use pointer arithmetic for direct addressing and thus direct access. An array object would follow a similar principle with its internal organization though the implementation might be more complex. Since all we are doing is some addition, an operation that takes O(1) time, we have an operation that over all takes O(1) time. This addition is also why in C and C++ at least, all items in an array need to be the same type. They all are required to occupy the same number of bytes for this pointer arithmetic to work.
Contrast this to a singly linked list where we have a defined beginning of the list and then we add on to that list by having a pointer or reference address point to the next node in the list. Each node contains two things, a pointer to the next node in the list and a data item. Since you can’t go back, each time you want an item from the list, even if you know exactly where it is, you need to traverse the whole list up to that item. Thus, we wind up with O(n/2) for average case access and O(n) for worst case. We can mitigate this some by doing a doubly linked list with a tail pointer as well as a head pointer but we still are limited to a form of sequential, rather than truly random, access. We set the last pointer to null to essentially tell the computer the same thing we are telling it when we give in the size of an array.
Because the index n
of an array points to the n+1th element in the array (using zero-based indexing). Some simple math allows you to calculate the exact position of the desired element in O(1)
.
Further Reading
Java Arrays
7
Simple answer:
Because you can calculate
453 + 6
in O(1) time too.
Lets write some code and look at it more closely.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
int index = atoi(argv[1]);
int a[10];
for(int i = 0; i < 10; i++) { a[i] = 42 + i; }
printf("foo!n");
int val = a[index];
printf("%dn",val);
return 0;
}
Running this through gcc -S
we can get the assembly for it and look at that more closely.
The reason that foo
is in there, is so that I can quickly spot the spot between the two printf
s.
leaq L_.str(%rip), %rdi
movb $0, %al
callq _printf
leaq L_.str1(%rip), %rdi
movslq -28(%rbp), %rcx
movl -80(%rbp,%rcx,4), %edx
movl %edx, -88(%rbp)
movl -88(%rbp), %esi
movl %eax, -92(%rbp) ## 4-byte Spill
movb $0, %al
callq _printf
Disclaimer: I’m not an assembly guy – its been decades since I worked with MIPS, and have never done more than glance at intel. I may have this wrong. This may be different on your system, because I didn’t actually run gcc, but rather clang.
After the first printf
is done, it starts out by loading the address of the format for str1 ("%dn
). From a bit above that I didn’t include, -28(%rbp)
is the result from the atoi
call. I have this to avoid the system trying to optimize it all out (because it will do that). And it does some other moving around things too.
However, there is no loop there. Its done in a constant number of instructions no matter what index you are looking for. This is the definition of O(1).
Lets look at an earlier part – the population of the array.
movl $0, -84(%rbp)
LBB0_1: ## =>This Inner Loop Header: Depth=1
cmpl $10, -84(%rbp)
jge LBB0_4
## BB#2: ## in Loop: Header=BB0_1 Depth=1
movl -84(%rbp), %eax
addl $42, %eax
movslq -84(%rbp), %rcx
movl %eax, -80(%rbp,%rcx,4)
## BB#3: ## in Loop: Header=BB0_1 Depth=1
movl -84(%rbp), %eax
addl $1, %eax
movl %eax, -84(%rbp)
jmp LBB0_1
LBB0_4:
leaq L_.str(%rip), %rdi
movb $0, %al
callq _printf
And hear you can see the loop which populates the array. There is a compare, a jump if greater than down to a label, stuff, and the jump back to the head after incrementing 1.
The number of times this will run is dependent on the size of the array (in this case 10). There is only one loop which iterates over all of the indexes. That is O(N).
Now, if this was a linked list rather than array, you would have to walk the linked list in order to find the nth element. Then going to the nth element would also be O(N). There are ways to make accessing a particular element of a linked list faster by using other data structures – but its still not O(1).