Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology
Ambit is the first work to integrate support for bulk bitwise operations directly into a DRAM memory array, and it achieves high throughput and efficiency.
Strengths
First, Ambit itself has three advantages: (1) because computation happens in memory, it is not limited by memory bandwidth; (2) it requires no extra DRAM commands and no change to address management; (3) it keeps the same DRAM microarchitecture across all technologies.
Second, beyond proposing Ambit, the paper makes contributions mainly around cost and efficiency:
- It shows that Ambit is practical for industry to adopt.
- Ambit can be implemented with only modest changes to the DRAM design.
- The work also rigorously demonstrates the reliability of Ambit.
- It demonstrates better performance in both idealized and real-world settings.
It is not necessary to replace all commodity DRAM; replacing only the DRAM used by data-centric components is enough.
I really admire the detailed explanation of DRAM; it will be my reference when I introduce those primitive DRAM commands to others.
The explanation of how to make TRA (triple-row activation) work is very good. It is rigorous enough to consider the five potential issues.
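To make the TRA idea concrete for myself, here is a minimal sketch of the logic it computes. In the paper, activating three rows at once yields the bitwise majority of the three rows, and fixing the third row to all zeros or all ones turns that majority into AND or OR. The function names, the 8-bit row width, and the use of Python integers as "rows" are my own assumptions for illustration; this models only the logic, not the analog behavior or the five issues the paper handles.

```python
WIDTH = 8                      # assumed row width, for illustration only
ALL_ONES = (1 << WIDTH) - 1    # control row initialized to all ones
ALL_ZEROS = 0                  # control row initialized to all zeros


def tra_majority(a: int, b: int, c: int) -> int:
    """Bitwise majority of three rows: each bit is 1 if at least two inputs are 1."""
    return (a & b) | (b & c) | (c & a)


def ambit_and(a: int, b: int) -> int:
    """AND as a majority with the third row fixed to zeros."""
    return tra_majority(a, b, ALL_ZEROS)


def ambit_or(a: int, b: int) -> int:
    """OR as a majority with the third row fixed to ones."""
    return tra_majority(a, b, ALL_ONES)


a, b = 0b11001010, 0b10100110
assert ambit_and(a, b) == a & b
assert ambit_or(a, b) == a | b
```

NOT is the one operation this sketch leaves out, since the paper obtains it through a separate mechanism rather than through TRA.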
Weaknesses
Measuring the modification cost is really important. What exactly counts as a modest change? The paper would benefit from more discussion of the modifications required relative to commodity DRAM.
The three real-world applications still seem a bit specific. Could we come up with more commonly used cases? In other words, are bitwise operations really used that frequently?
Are there application scenarios in which bitwise operations are used outside explicit "bit-xxx" functions?
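One candidate answer to my own question is set manipulation: if a set over a small universe is stored as a bit vector, then intersection, union, and difference become bulk AND, OR, and AND-NOT, even though the application code never mentions bits. The sketch below is my own illustration, not code from the paper; the helper names and the 16-element universe are assumptions.

```python
def to_bitvector(elements, universe_size):
    """Encode a set of small non-negative integers as one integer bit vector."""
    bits = 0
    for e in elements:
        if not 0 <= e < universe_size:
            raise ValueError(f"element {e} is outside the universe")
        bits |= 1 << e
    return bits


def from_bitvector(bits):
    """Decode a bit vector back into a Python set."""
    return {i for i in range(bits.bit_length()) if (bits >> i) & 1}


UNIVERSE = 16  # assumed universe size
active_monday = to_bitvector({1, 3, 5, 7}, UNIVERSE)
active_tuesday = to_bitvector({3, 4, 5, 6}, UNIVERSE)

both_days = from_bitvector(active_monday & active_tuesday)      # intersection
either_day = from_bitvector(active_monday | active_tuesday)     # union
monday_only = from_bitvector(active_monday & ~active_tuesday)   # difference

assert both_days == {3, 5}
assert either_day == {1, 3, 4, 5, 6, 7}
assert monday_only == {1, 7}
```

Whether workloads like this are common enough to justify the hardware change is still the open question.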
Ambit still needs new ISA support, and consequently new compiler support, which makes it inconvenient to deploy.
The implementation part is not clear (at least to me): how is the DRAM physically modified to implement these functions?
Can you do better?
I would explain why these three Boolean operations (AND, OR, NOT) are used. Is this the optimal combination for achieving complete logic? Why not use XAND-XOR-NOT? I would work through the Boolean algebra and prove it.
I think too many modifications are still needed to support Ambit: rigorous, but annoying. I don't have better ideas, but I would like to try explaining this weakness.
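As a first pass at that proof, a brute-force check is enough: represent every two-input Boolean function by its 4-entry truth table, start from the two inputs, and keep applying the allowed gates until nothing new appears. A gate set is functionally complete exactly when all 16 truth tables are reached. This is my own sketch; in particular, I am reading "XAND" as XNOR, which is an assumption.

```python
from itertools import product

# All (x, y) input combinations, in a fixed order.
INPUTS = list(product((0, 1), repeat=2))
X = tuple(x for x, _ in INPUTS)  # truth table of the function f(x, y) = x
Y = tuple(y for _, y in INPUTS)  # truth table of the function f(x, y) = y

# Gate name -> (arity, function on single bits).
GATES = {
    "AND": (2, lambda a, b: a & b),
    "OR": (2, lambda a, b: a | b),
    "NOT": (1, lambda a: 1 - a),
    "XOR": (2, lambda a, b: a ^ b),
    "XNOR": (2, lambda a, b: 1 - (a ^ b)),  # assumed meaning of "XAND"
    "NAND": (2, lambda a, b: 1 - (a & b)),
}


def reachable(gate_names):
    """Return every 2-input truth table buildable from x and y with the given gates."""
    known = {X, Y}
    while True:
        new = set()
        for name in gate_names:
            arity, fn = GATES[name]
            for args in product(known, repeat=arity):
                # Apply the gate row by row across the argument truth tables.
                table = tuple(fn(*bits) for bits in zip(*args))
                if table not in known:
                    new.add(table)
        if not new:
            return known
        known |= new


def is_complete(gate_names):
    """A gate set is complete iff it reaches all 16 two-input functions."""
    return len(reachable(gate_names)) == 16


assert is_complete(["AND", "OR", "NOT"])
assert is_complete(["AND", "NOT"])            # OR is redundant for completeness
assert not is_complete(["XNOR", "XOR", "NOT"])
```

Under that reading, XNOR-XOR-NOT cannot replace AND-OR-NOT, because those gates only ever produce XOR-style (affine) functions. The check also shows that AND-OR-NOT is not the minimal complete set, so the choice presumably reflects what the hardware computes cheaply rather than logical minimality.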
Other Comments
This paper smoothly presents the motivation, methods, potential issues, and respective solutions for building Ambit. It tells a good story, and reading it felt like a wonderful journey.