Paper Review - TLDRAM

Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture

This work introduces Tiered-Latency DRAM (TL-DRAM), a DRAM architecture that provides both low latency and low cost-per-bit. It presents mechanisms that take advantage of the TL-DRAM substrate by using its low-latency segment as a hardware-managed cache. The authors show that TL-DRAM significantly improves both system performance and energy efficiency across a variety of systems and workloads.
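To make the "low-latency segment as a hardware-managed cache" idea concrete, here is a minimal sketch of my own. It assumes an LRU replacement policy, made-up latencies (`NEAR_NS`, `FAR_NS`), and no cost for moving a row between segments. None of these come from the paper, and `simulate` is a hypothetical helper, not the authors' mechanism.

```python
from collections import OrderedDict

# Hypothetical latencies in nanoseconds, chosen only for illustration.
NEAR_NS = 8    # assumed latency when the row is held in the near segment
FAR_NS = 20    # assumed latency when the row is served from the far segment


def simulate(trace, near_rows):
    """Return (total_latency_ns, hits) for a sequence of row accesses.

    The near segment is modeled as an LRU cache of far-segment rows.
    The cost of copying a row into the near segment is ignored.
    """
    cache = OrderedDict()  # row id -> True, ordered from least to most recent
    total = 0
    hits = 0
    for row in trace:
        if row in cache:
            cache.move_to_end(row)          # mark as most recently used
            total += NEAR_NS
            hits += 1
        else:
            total += FAR_NS                 # served from the far segment
            cache[row] = True               # then cached in the near segment
            if len(cache) > near_rows:
                cache.popitem(last=False)   # evict the least recently used row
    return total, hits


trace = [1, 2, 1, 2, 3, 1, 2, 3]
total_ns, hits = simulate(trace, near_rows=2)
baseline_ns = FAR_NS * len(trace)           # every access pays the far latency
print(total_ns, hits, baseline_ns)
```

With this toy trace and a two-row near segment, only two accesses hit, so the total is 136 ns against a 160 ns all-far baseline. The benefit depends entirely on how much of the working set fits in the near segment.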

Strengths

This work really inspired me. When handling trade-offs, besides balancing the metrics (the result side), we can also split the resources and strike a balance at the very beginning (the source side).
At first I was confused about why the authors ran both single-core and multi-programmed experiments. It dawned on me when I saw the various ways of using the TL-DRAM substrate, especially the OS-related part.
This paper presents really great graphs. The timeline graphs helped me a lot when learning about DRAM latency.

Weaknesses

The segmentation (the placement of the isolation transistors) has to be decided statically at design time, so it cannot adapt to applications with different access patterns.
I did not see much discussion of the manufacturing cost, though I suppose it would be very small.
I do not see why giving up memory capacity for the sake of simplicity is a good thing in Section 5: "While this approach reduces the overall available memory capacity, it keeps both the hardware and the software design simple."
It is still quite hard to recall the names of the latency parameters when they are referred to later (probably because I am not familiar with them).
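The first weakness above (static segmentation) can be illustrated with a toy sweep. This is my own sketch, assuming an LRU-managed near segment and synthetic cyclic workloads; `hit_rate` and `cyclic` are hypothetical helpers. It also ignores the fact that a longer near segment has a higher access latency, which makes the real trade-off harder than shown here.

```python
from collections import OrderedDict


def hit_rate(trace, near_rows):
    """Fraction of accesses served from an LRU-managed near segment."""
    cache = OrderedDict()
    hits = 0
    for row in trace:
        if row in cache:
            cache.move_to_end(row)          # refresh recency on a hit
            hits += 1
        else:
            cache[row] = True
            if len(cache) > near_rows:
                cache.popitem(last=False)   # evict the least recently used row
    return hits / len(trace)


def cyclic(working_set, rounds):
    """Synthetic workload that loops over `working_set` distinct rows."""
    return [row for _ in range(rounds) for row in range(working_set)]


small = cyclic(working_set=4, rounds=10)    # fits in a small near segment
large = cyclic(working_set=16, rounds=10)   # needs a larger near segment

# One fixed near-segment size has to serve both workloads.
for near_rows in (2, 8, 32):
    print(near_rows, hit_rate(small, near_rows), hit_rate(large, near_rows))
```

In this toy setup, a near segment of 8 rows serves the small workload well but gives the large one no hits at all. That is the kind of mismatch a statically chosen segment size cannot avoid.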

Can you do better?

I would explore the relation to NUMA (non-uniform memory access), which seems very similar, and I might apply some newer NUMA techniques to TL-DRAM.
I would run some experiments on TL-DRAM with more tiers (the authors said it performs poorly but did not provide further details).

Takeaways

The name is interesting: "Too Long; Didn't Read" AM, just a guess …
It is really cool to see a DRAM design interact with the operating system:
- Especially the part of considering different kinds of OS-level cache behaviors.