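For example, the list_for_each() macro in <linux/list.h>:

#define list_for_each(pos, head) \
        for (pos = (head)->next; prefetch(pos->next), pos != (head); \
                pos = pos->next)

Measurements showed that this actually hurt performance (short lists, prefetches of NULL pointers); the hardware's own prefetching does no worse.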
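To make the two traversal styles concrete, here is a minimal userspace sketch, not the kernel's code. It assumes GCC or Clang, using `__builtin_prefetch` as a stand-in for the kernel's prefetch(). The names `list_init`, `list_add_tail`, `list_for_each_prefetch`, `list_for_each_plain`, `count_prefetch` and `count_plain` are hypothetical helpers written for this illustration.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link node in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Traversal that issues a software prefetch hint for the next node on
 * every step (assumption: __builtin_prefetch replaces prefetch()). */
#define list_for_each_prefetch(pos, head) \
    for (pos = (head)->next; \
         __builtin_prefetch(pos->next), pos != (head); \
         pos = pos->next)

/* Plain traversal that leaves prefetching to the hardware. */
#define list_for_each_plain(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* Count nodes using the prefetching traversal. */
static size_t count_prefetch(struct list_head *head)
{
    struct list_head *pos;
    size_t n = 0;

    list_for_each_prefetch(pos, head)
        n++;
    return n;
}

/* Count nodes using the plain traversal. */
static size_t count_plain(struct list_head *head)
{
    struct list_head *pos;
    size_t n = 0;

    list_for_each_plain(pos, head)
        n++;
    return n;
}
```

Both variants visit exactly the same nodes; the only difference is the extra hint per iteration. Whether that hint helps or hurts has to be measured on the target machine, since this sketch shows the two forms and says nothing about their timing.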
Other:
- reordering structures so that fields commonly accessed together fall in the same cache line
- linked-list => cache-unfriendly
- singly-linked hlist hash table list
- likely()
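Two of the items above, structure reordering and likely(), can be sketched in a few lines. This is an illustration under stated assumptions: a 64-byte cache line, structures that start on a cache-line boundary, and GCC or Clang for `__builtin_expect`. The structs `conn_scattered` and `conn_grouped`, the helper `same_cache_line` and the function `account_packet` are hypothetical names invented for this example. The `likely`/`unlikely` definitions follow the usual `__builtin_expect` idiom.

```c
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* Branch hints built on the compiler's __builtin_expect. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hot counters interleaved with large, rarely used fields. */
struct conn_scattered {
    unsigned long rx_packets;   /* hot */
    char name[64];              /* cold */
    unsigned long rx_bytes;     /* hot */
    char description[64];       /* cold */
    unsigned long last_seen;    /* hot */
};

/* Same fields, with the hot counters grouped at the front. */
struct conn_grouped {
    unsigned long rx_packets;   /* hot */
    unsigned long rx_bytes;     /* hot */
    unsigned long last_seen;    /* hot */
    char name[64];              /* cold */
    char description[64];       /* cold */
};

/* Return 1 if the byte range from first_off to the end of the field at
 * last_off lies within one cache line, assuming the struct itself
 * starts on a cache-line boundary. */
static int same_cache_line(size_t first_off, size_t last_off, size_t last_size)
{
    return first_off / CACHE_LINE == (last_off + last_size - 1) / CACHE_LINE;
}

/* Update the hot counters; a negative length is treated as the rare
 * error path and marked unlikely(). Returns the new packet count,
 * or -1 on error. */
static long account_packet(struct conn_grouped *c, long len)
{
    if (unlikely(len < 0))
        return -1;

    c->rx_packets++;
    c->rx_bytes += (unsigned long)len;
    return (long)c->rx_packets;
}
```

Passing `offsetof(struct conn_grouped, rx_packets)` and `offsetof(struct conn_grouped, last_seen)` to `same_cache_line` reports that the hot fields share a line, while the same check on `conn_scattered` does not. The branch hint only influences how the compiler lays out the code and does not change what the function computes.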
The problem with prefetch