2008

2007

2006

2005

2004

Adaptive Replacement Cache in Python

▁ jan 01 2008

Today we’re going to talk about ARCs, or Adaptive Replacement Caches.

So what is it? Well, lets talk about some other caching algorithms first.

  • LRU: Last Recently Used. This means that you have a cache with a fixed size. For each “hit” on an entry, you increase the hit count (or the timestamp of access time) on the entry, and you keep the cache sorted sequentially. Once the cache “overflows”, you expunge the least used entry.
  • MRU: Most Recently Used. This is used when your entries are mostly unpredictable. You keep the most recent items in the beginning of the list, and you expunge from the end once you hit the fixed size.

So what’s an ARC? Well, it’s basically a more sophisticated caching algorithm, which sort of, kind of, looks like a MRU/LRU. We’ll try to explain it in further detail here, and also provide a Python implementation.

If we look at the Wikipedia article for an ARC, it states that:

ARC improves the basic LRU strategy by splitting the cache directory into two lists, T1 and T2, for recently and frequently referenced entries. In turn, each of these is extended with a ghost list (B1 or B2), which is attached to the bottom of the two lists. These ghost lists act as scorecards by keeping track of the history of recently evicted cache entries, and the algorithm uses ghost hits to adapt to recent change in resource usage. The combined cache directory is organised in four LRU lists

…which looks like this:

basic_arc.png

There’s an overall strategy to this;

  • When you first cache something, it will end up in T1. T1 will only contain entries that are hit only once.
  • Once a cache hit occurs in T1, it will be moved to T2. T2 will hold anything that is hit more than once.
  • When either of these heaps overflow (T1 & T2), the least hit entry will be evicted. This just means that it will be moved to the ghost-heap.
  • Once you do a cache lookup, and you find the entry in the ghost, you will move the entry back into the main heap, and evict the least hit entry out into the ghost. Basically a swap.
  • When a ghost overflows, the entry will be completely expunged, and will have to be calculated from scratch again.

Lets see what happens when something first gets cached:

first_insert.png

OK, so ‘x’ gets cached, and since it’s never hit, it ends up in T1, with a hit count of 1. When we cache something else (call it ‘y’), that also ends up in T1. The second that something gets hit in the cache, we must move it to T2:

swap_t2.png

Say that T2 is filled to the brim, we must evict the least used entry to the ghost:

evict_t2.png

And then say that B2 gets too full:

b2_expunge.png

‘x’ is now dead. Any requests for x will fail, and it will need to be processed again. That’s the basic premise of ARC caching. It’s not rocket science, but it’s a step up from LRU.

Anyway, I promised a Python implementation, so lets take a look at it.

DISCLAIMER: This was written on behalf of a client, who kindly allowed me to show it to the public. There’s no license on it, so lets just assume that it’s under BSD.

First of, grab a hold of the file. It includes a test in the main method, which does some basic Fibonacci calculations. That’s good, cause it will recursively requests a sequence of numbers until we reach 1 or 0, so we can cache the result of those calculations. Initializing and using the cache is easy;

c = ARC(1000) # Fixed size 1000, must be divisible by 4.
@c.cache # Decorator from `ARC'
def fib(x):
    ...

Lets do a run, and look at the statistics it prints after execution:

jespern@cantor:Work% python memoize-arc.py
[...some output truncated...]
ARC statistics:
-----------------
T1: 0
T2: 100
B1: 0
B2: 0
-----------------
<Heap: 0 key(s), 0 ghosted, max 250>
<Heap: 100 key(s), 0 ghosted, max 250>

Nothing remains in T1, since it was all hit more than once. T2 can hold all of the 100 fibonacci numbers we’ve requested, so nothing’s evicted to the ghosts either.

Go try it! Adjust the maxsize of the cache, and check out the statistics. See what happens. Have fun, comments welcome.

Edit, January 3rd 2008: As ‘bayareaguy’ kindly points out, ARC is patented by IBM. You might want to look at 2Q instead.

← Previous: IM ON UR DESK, MAKIN SUR U CODE PROPAH  //  Next: SQLAlchemy caching (second level)

comments

Jesper Noehr, 11 months ago:

Thanks bayareaguy, I’ve updated the post with your suggestion.

bayareaguy, 11 months ago:

Given that the ARC algorithm is patented. You may want to suggest readers investigate 2Q instead. http://lwn.net/Articles/131554/

René Leonhardt, 11 months ago:

Interesting read, many thanks! Is there a Python implementation of 2Q available?

Paul, 10 months, 2 weeks ago:

Interesting post — I’ve just implemented ARC (as well as the FRC ‘variant’) along with 2Q and MQ caches in Java, and have a couple of comments on your implementation. 1) The actual ARC algorithm doesn’t evict the ‘least used’ down to the ghost lists; it evicts the least RECENTLY used. If you use a heap, sorted by hits, you may well get a better result (I don’t know!) but you suddenly have O(log n) maintenance times on your hits, with is the problem with LFU lists generally (and something ARC avoids) 2) If you make the above change (LFU to LRU lists), per the algorithm, your algorithm may as well not have B1 and B2 lits. Follow it through; the passage of something from t2 -> b2, say, and then back to the head of t2 on a hit (which pushes the tail out into b2 in the ‘swap’) then t2 and b2 may as well just be merged into one list; the behaviour is the same. The point of the ghost lists is that they don’t actually contain data, they just track the ‘key’ of the node. That is, once a node moves from t2 -> b2, you evict it from the cache proper and throw away the data; b1 + b2 merely contain REFERENCES to the old objects. If you have a 1000 element cache, t1 and t2 are (say) 500 elements each, not 250. b1 + b2 are also (say) 500 each, but they don’t ‘count’ in the cache size as they’re only holding keys, not data; in the case of vm paging, say, you might have 8 KB pages but 4-byte pointers so the b1 + b2 lists are statistically insignificant in terms of size overhead. 3) This cache also isn’t (as far as I can tell; I don’t speak Python!) ‘adaptive’ at all (the ‘A’ in ‘ARC’); the point is that the balance between t1 and t2 is dynamic, not fixed. You keep a ‘target size’ for t1, which is how big you’d like it to be (this reflects the expected ratio of ‘recent’ hits, which will likely be in t1, to ‘frequent’ hits, which will likely be in t2). When you get a hit in b1, you adjust this ‘target size’ upwards (you had a ghost hit to something hit recently but once); if you get a hit in b2, you bring t1’s target size down to make t2 bigger (you had a ghost hit to something hit frequently). In this way, the sizes of t1 and t2 tune themselves not only to your usage patterns (saving you to having to tune) but also adjust over time if you have a sudden burst of recurring activity to ‘recent’ nodes (or, conversely, to ‘frequent’ ones) In other words, what you’ve implemented is more like a Fixed Replacement Cache (detailed in one of the same papers as ARC) but a ‘LFU’ variant, rather than LRU.

Jesper Noehr, 10 months, 2 weeks ago:

@Paul, Thank you very much for your comments, you seem to have a more thorough understanding than I do ;-) Is there a chance we may look at your Java implementation? I’d definitely be interested in seeing how you’ve done it.

eric casteleijn, 1 month, 3 weeks ago:

Here’s a working version of the algorithm as I understand it as a python decorator. (I take full responsibility, being Erpian. ;)

http://code.activestate.com/recipes/576532/

powered by