An arena is one block of bytes and a cursor. To allocate, it reads the cursor, hands you that spot, and slides the cursor forward by the size of your request. That's the whole algorithm: no free list, no block sizes, no bookkeeping.
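Here is a minimal sketch of that read-hand-slide step in C. The names Arena, arena_init and arena_alloc are made up for illustration, and the sketch leaves out alignment padding entirely, so treat it as the shape of the idea rather than a finished allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
typedef struct {
    unsigned char *base;   /* start of the one block of bytes */
    size_t         size;   /* total capacity of the block */
    size_t         cursor; /* offset of the next free byte */
} Arena;

static void arena_init(Arena *a, void *block, size_t size) {
    assert(block != NULL);
    a->base = block;
    a->size = size;
    a->cursor = 0;
}

/* Read the cursor, hand back that spot, slide the cursor forward by n.
   Returns NULL when the request does not fit in what is left.
   Assumption: no alignment handling in this sketch. */
static void *arena_alloc(Arena *a, size_t n) {
    if (n > a->size - a->cursor) return NULL;
    void *p = a->base + a->cursor;
    a->cursor += n;
    return p;
}
```

Two calls in a row return adjacent spots: the second pointer is the first pointer plus the first request's size. A real arena would normally round the cursor up to the alignment the caller needs before handing out the address.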
Level 1 filled the block. Level 2 empties it: free_all is a single store — it sets the cursor back to 0. Every allocation is gone at once, and the next one reuses the same address the first one had.
Nothing walks a list of objects to release them one by one. You snap the cursor back to the start and the whole block is free again, instantly, regardless of how many things were in there.
The proof it's the same spot: allocate 32 bytes, and the cursor lands at offset 40 (8 bytes of leading padding, then the 32 you asked for). Call free_all, and the cursor is 0. Allocate 32 bytes again, and the cursor is back at offset 40 and the returned address is exactly the address the first allocation had. The arena didn't ask anyone for memory; it just rewound its own cursor.
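The same experiment can be sketched in C. The names Arena, arena_alloc, arena_free_all and same_spot_after_reset are hypothetical, and this sketch adds no leading padding, so its cursor lands at 32 rather than the 40 described above. The point it shows is only that the address comes back unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal arena: one block, one cursor. */
typedef struct { unsigned char *base; size_t size, cursor; } Arena;

static void *arena_alloc(Arena *a, size_t n) {
    if (n > a->size - a->cursor) return NULL;
    void *p = a->base + a->cursor;
    a->cursor += n;
    return p;
}

/* free_all as a single store. Nothing is walked and nothing is zeroed. */
static void arena_free_all(Arena *a) {
    a->cursor = 0;
}

/* Returns 1 if the first allocation after a reset reuses the address
   of the first allocation before it. */
static int same_spot_after_reset(void) {
    static unsigned char block[256];
    Arena a = { block, sizeof block, 0 };

    void *first = arena_alloc(&a, 32);   /* cursor: 0 -> 32 */
    void *other = arena_alloc(&a, 100);  /* cursor: 32 -> 132 */
    arena_free_all(&a);                  /* cursor: back to 0 */
    void *second = arena_alloc(&a, 32);  /* cursor: 0 -> 32 again */

    return other != NULL && first == second && a.cursor == 32;
}
```

Note that the old bytes are still sitting in the block after the reset. The arena only forgets where they were, which is why any pointer kept from before the reset must not be used afterwards.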
Levels 1 and 2 were the mechanism. Level 3 is when it's the right tool: a pile of allocations that all die at the same moment.
A frame of a game builds path strings, dispatch lists, particle batches — dozens or hundreds of little allocations — and every one of them is garbage the instant the frame ends. A parser builds a tree of nodes, and they all die together when evaluation finishes. The allocations have different sizes and different types but one shared lifetime. That shared lifetime is exactly the thing an arena exploits: it doesn't track them individually because it doesn't need to — it ends them all with one cursor reset.
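To make the shared lifetime concrete, here is a rough C sketch of one frame's worth of scratch data. Everything named here (Arena, arena_alloc, Particle, run_frame, the sizes, the path string) is invented for illustration. Unlike the bare bump step, this version rounds the cursor up to the requested alignment, because the allocations have different types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { unsigned char *base; size_t size, cursor; } Arena;

/* Bump allocation with the address rounded up to `align` (a power of two). */
static void *arena_alloc(Arena *a, size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t addr = (uintptr_t)(a->base + a->cursor);
    size_t pad = (align - (size_t)(addr & (align - 1))) & (align - 1);
    size_t avail = a->size - a->cursor;
    if (pad > avail || n > avail - pad) return NULL;
    void *p = a->base + a->cursor + pad;
    a->cursor += pad + n;
    return p;
}

typedef struct { float x, y, vx, vy; } Particle; /* hypothetical frame data */

/* Builds one frame's scratch data, then ends all of it with one reset.
   Returns how far the cursor got before the reset, or 0 if the arena
   was too small for the frame. */
static size_t run_frame(Arena *frame, int n_particles) {
    char *path = arena_alloc(frame, 32, 1);
    int *dispatch = arena_alloc(frame, 16 * sizeof *dispatch, _Alignof(int));
    Particle *batch = arena_alloc(frame, (size_t)n_particles * sizeof *batch,
                                  _Alignof(Particle));
    if (!path || !dispatch || !batch) { frame->cursor = 0; return 0; }

    strcpy(path, "frame/path");                  /* a short string */
    for (int i = 0; i < 16; i++) dispatch[i] = i; /* a list of ints */
    for (int i = 0; i < n_particles; i++) batch[i] = (Particle){0};

    size_t high_water = frame->cursor;
    frame->cursor = 0; /* end of frame: three sizes, three types, one reset */
    return high_water;
}
```

Calling run_frame twice on the same arena reaches the same high-water mark both times, because the second frame starts from the same cursor position as the first. No allocation inside the frame has a matching free.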
The counterfactual: an arena is the wrong tool when the lifetimes aren't bundled — when some allocations must outlive the others by minutes and you don't know which ones up front, or when you need to hand memory back to the system mid-run. If a value has to survive the reset, you copy it out into a longer-lived allocator first. The arena's whole advantage — no per-object tracking — is precisely what makes it useless when the objects have independent lifetimes.
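The copy-out step might look like the C sketch below. It assumes the longer-lived allocator is plain malloc, and the names Arena, arena_alloc, copy_out and build_and_keep are hypothetical. The one thing that matters is the order: copy first, reset second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned char *base; size_t size, cursor; } Arena;

static void *arena_alloc(Arena *a, size_t n) {
    if (n > a->size - a->cursor) return NULL;
    void *p = a->base + a->cursor;
    a->cursor += n;
    return p;
}

/* Copies a string into heap memory that outlives any arena reset.
   The caller owns the result and must free() it. */
static char *copy_out(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy) memcpy(copy, s, n);
    return copy;
}

/* Builds a value in scratch memory, keeps a copy, then resets the arena.
   Returns the surviving copy, or NULL on failure. */
static char *build_and_keep(Arena *scratch) {
    char *tmp = arena_alloc(scratch, 16);
    if (!tmp) return NULL;
    strcpy(tmp, "result");

    char *kept = copy_out(tmp); /* copy BEFORE the reset */
    scratch->cursor = 0;        /* tmp now points at reusable bytes */
    return kept;
}
```

After the reset, the next arena allocation is free to overwrite the bytes tmp pointed at, while the copy is untouched. The cost is that the copy is back under per-object rules: it needs its own free.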
A general-purpose allocator treats each of those frame allocations as independent: it stores a size next to each block, threads freed blocks onto free lists, splits and coalesces, and on a multithreaded path may take a lock. An allocation can cost tens to hundreds of instructions and a possible cache miss; every allocation needs a matching free or it leaks; a stray double free can corrupt the heap or crash. The arena throws all of that away, because it knows up front that the lifetimes are one. You trade per-object flexibility you don't need for an allocation that's nearly free and a release that's a single store.
Nothing this cheap is free. The arena buys its speed by refusing to do the things a general allocator does, and those refusals are the bill.
The emergent payoff: add up those three refusals and the arena's deal is simple — you own one decision (when to reset) and in exchange the allocator owns nothing else. There are no individual frees to forget, no free-list corruption to chase, no per-object lifetimes to reason about. The discipline collapses from "match every allocation with a free, in the right order" down to "reset the region at its obvious boundary." When the boundary is obvious — end of frame, end of parse, end of request — that's a trade worth taking every time.
That's the arc: L1 one block + a cursor, allocate = bump it forward → L2 free_all snaps the cursor to 0, so everything is reclaimed at once and the next allocation reuses the same address → L3 that's the right shape exactly when many allocations share one lifetime → L4 the bill is what it refuses to do, and trading that for "reset at an obvious boundary" is the whole point — which is what temp_allocator already is.