Feb 10 2017 cs.OS
As its price per bit drops, SSD is increasingly becoming the default storage medium for cloud application databases. However, it has not become the preferred storage medium for key-value caches, even though SSD offers more than 10x lower price per bit and sufficient performance compared to DRAM. This is because key-value caches need to frequently insert, update and evict small objects. This causes excessive writes and erasures on flash storage, since flash only supports writes and erasures of large chunks of data. These excessive writes and erasures significantly shorten the lifetime of flash, rendering it impractical to use for key-value caches. We present Flashield, a hybrid key-value cache that uses DRAM as a "filter" to minimize writes to SSD. Flashield performs light-weight machine learning profiling to predict which objects are likely to be read frequently before getting updated; these objects, which are prime candidates to be stored on SSD, are written to SSD in large chunks sequentially. In order to efficiently utilize the cache's available memory, we design a novel in-memory index for the variable-sized objects stored on flash that requires only 4 bytes per object in DRAM. We describe Flashield's design and implementation and, we evaluate it on a real-world cache trace. Compared to state-of-the-art systems that suffer a write amplification of 2.5x or more, Flashield maintains a median write amplification of 0.5x without any loss of hit rate or throughput.