As long as your puny little working set (2^16 small keys) fits into L2 cache and is perfectly covered by the L1 dTLB, you won't see the cost of touching random pages in a big hash table, meaning one larger than the last-level TLB coverage and the on-chip caches. There won't be any TLB stalls waiting for the page walkers, and you won't miss the spatial locality in the key space that B(+)trees preserve and hashing throws away, because everything is in L2 cache anyway. At the very least it proves that hash tables can be a good fit for point queries on datasets too large for linear search or for sorting + binary search, but not yet large enough to exhaust CPU cache capacity.
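
To put rough numbers on that, here is a back-of-the-envelope sketch. Every hardware figure in it is an assumption picked for illustration (8-byte entries, 50% load factor, 1 MiB L2, 64 L1 dTLB entries regardless of page size), not a measurement of any particular CPU, and `footprint_report` is a made-up helper name. Plug in your own numbers.

```python
import math

def footprint_report(num_keys, entry_bytes=8, load_factor=0.5,
                     l2_bytes=1 << 20, page_bytes=4096, dtlb_entries=64):
    """Estimate whether an open-addressing hash table stays cache/TLB resident.

    All defaults are illustrative assumptions, not real hardware specs.
    """
    slots = math.ceil(num_keys / load_factor)   # table is over-allocated
    size = slots * entry_bytes                  # total bytes touched at random
    pages = math.ceil(size / page_bytes)        # distinct pages the table spans
    return {
        "bytes": size,
        "pages": pages,
        "fits_l2": size <= l2_bytes,
        "tlb_covered": pages <= dtlb_entries,   # no page walks once warmed up
    }

small_4k = footprint_report(2**16)                        # 4 KiB pages
small_2m = footprint_report(2**16, page_bytes=2 << 20)    # 2 MiB huge pages
big_4k = footprint_report(2**24)                          # a "real" big table

for name, r in [("2^16 keys, 4K pages", small_4k),
                ("2^16 keys, 2M pages", small_2m),
                ("2^24 keys, 4K pages", big_4k)]:
    print(name, r)
```

Under these assumed numbers the 2^16-key table is exactly 1 MiB, so it just fits the assumed L2, but it spans 256 4 KiB pages and only counts as dTLB-covered with huge pages. The 2^24-key table is 256 MiB and fails both checks, which is the regime where random probing starts paying for cache misses and page walks.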