On lists, cache, algorithms, and microarchitecture
In this test, all elements of the list are stored sequentially in a contiguous block of memory. Since the core does only a minimal amount of work and memory bandwidth is the bottleneck in most cases, we can rather easily infer cache level sizes from these results. The processor has four hardware prefetchers.
These prefetchers cover sequential accesses and large objects spanning multiple cache lines, but there’s nothing that would help with lists of small objects.
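To make the two memory layouts concrete, here is a minimal sketch, not the benchmark behind the results above. It keeps all nodes in one contiguous block and links them either in address order (the sequential case the prefetchers can follow) or in a shuffled order, where each `next` pointer leads to an effectively random address. The names `node`, `make_order`, `link_nodes`, `sum_list`, and `time_traversal` are hypothetical, and the 16-byte node layout is an assumption chosen to stand in for a "small object".

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical small object: a pointer plus a payload, well below one cache line.
struct node {
    node* next = nullptr;
    std::uint64_t value = 0;
};

// Returns the indices 0..count-1, either in ascending or in shuffled order.
std::vector<std::size_t> make_order(std::size_t count, bool shuffled,
                                    std::uint64_t seed = 42) {
    std::vector<std::size_t> order(count);
    std::iota(order.begin(), order.end(), std::size_t{0});
    if (shuffled) {
        std::mt19937_64 rng(seed);
        std::shuffle(order.begin(), order.end(), rng);
    }
    return order;
}

// Links the nodes of `storage` in the sequence given by `order`; returns the head.
node* link_nodes(std::vector<node>& storage, const std::vector<std::size_t>& order) {
    assert(!order.empty() && order.size() == storage.size());
    for (std::size_t i = 0; i + 1 < order.size(); ++i) {
        storage[order[i]].next = &storage[order[i + 1]];
    }
    storage[order.back()].next = nullptr;
    return &storage[order.front()];
}

// The measured loop: almost no work per element, so memory access dominates.
std::uint64_t sum_list(const node* head) {
    std::uint64_t total = 0;
    for (const node* n = head; n != nullptr; n = n->next) {
        total += n->value;
    }
    return total;
}

// Times a single traversal; the sum is returned through `sum_out` so the
// compiler cannot discard the loop.
std::chrono::nanoseconds time_traversal(const node* head, std::uint64_t& sum_out) {
    const auto start = std::chrono::steady_clock::now();
    sum_out = sum_list(head);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Calling `time_traversal` on the same storage, linked once with `make_order(n, false)` and once with `make_order(n, true)`, for a range of `n` gives a rough feel for both effects. A serious measurement would repeat each traversal many times and control for frequency scaling, which this sketch does not attempt.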
Source: pdziepak.github.io