There’s an exciting new graphics card memory technology on the horizon that could deliver big gains in one of the most important aspects of GPUs: memory bandwidth. The new GPU SCM-with-DRAM tech can deliver peak gains of up to 12.5x compared to high-bandwidth memory (HBM), while also reducing power consumption.
HBM is a technology that AMD used in some of its earlier best graphics card contenders, most notably the AMD Vega lineup of GPUs such as the AMD Radeon VII. It sought to massively increase memory bandwidth – a feat it very much achieved – by stacking the memory chips on top of each other and housing them much closer to the GPU than conventional GDDR. However, it was costly to produce, limited in overall capacity, and also difficult to cool.
This new research, then, looks to improve upon the general idea of HBM by allowing for much larger amounts of memory, thanks to using a ‘storage class memory’ (SCM) rather than DRAM for the bulk of the storage, with a smaller portion of DRAM then used as a read/write cache for that storage. This arrangement is similar to how many of the best SSD models work, with a small portion of DRAM used as a read/write cache in front of the slower NAND.
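The general cache-in-front-of-slower-storage pattern can be sketched in a few lines of Python. To be clear, this is a toy illustration of the idea, not the paper’s actual design – the class, the direct-mapped write-back policy, and all names and sizes here are invented for illustration:

```python
# Toy model: a small, fast "DRAM" cache fronting a large, slow "SCM" store.
# Direct-mapped, write-back: dirty lines only reach the SCM on eviction.

class HybridMemory:
    def __init__(self, scm_lines=1024, cache_lines=64):
        self.scm = [0] * scm_lines       # large, slow, non-volatile backing store
        self.cache = {}                  # cache index -> (tag, data, dirty)
        self.cache_lines = cache_lines
        self.hits = self.misses = 0

    def _locate(self, addr):
        # Direct mapping: each address has exactly one possible cache slot.
        return addr % self.cache_lines, addr // self.cache_lines

    def read(self, addr):
        idx, tag = self._locate(addr)
        entry = self.cache.get(idx)
        if entry and entry[0] == tag:    # hit: served from the fast DRAM cache
            self.hits += 1
            return entry[1]
        self.misses += 1
        self._evict(idx)                 # write back any dirty victim line first
        data = self.scm[addr]            # slow SCM access only on a miss
        self.cache[idx] = (tag, data, False)
        return data

    def write(self, addr, value):
        idx, tag = self._locate(addr)
        entry = self.cache.get(idx)
        if entry and entry[0] == tag:
            self.hits += 1
        else:
            self.misses += 1
            self._evict(idx)
        self.cache[idx] = (tag, value, True)   # dirty: SCM updated lazily

    def _evict(self, idx):
        entry = self.cache.pop(idx, None)
        if entry and entry[2]:           # only dirty lines cost an SCM write
            old_tag, data, _ = entry
            self.scm[old_tag * self.cache_lines + idx] = data

mem = HybridMemory()
mem.write(5, 42)
assert mem.read(5) == 42   # hit: still resident in the DRAM cache
```

The write-back policy is the key cost-saver in this arrangement: repeated writes to a hot line hit only the fast cache, and the slow SCM is touched just once, on eviction – the same reason the research team focuses on cutting SCM write traffic.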
The type of SCM used here is a variation on Intel’s 3D XPoint memory, which debuted a few years ago as an intriguing middle ground between faster but volatile (loses its data without power) DRAM and slower but non-volatile (retains data without power) NAND. It was briefly sold in both RAM stick and SSD formats, but it never quite found a more permanent role in the PC ecosystem.
New refinements on this style of SCM, though, have led this research team to believe that a version of this type of memory could be viable for use with GPUs. The advantage of this memory is that it’s cheaper to produce than DRAM and runs with much lower power consumption. That means graphics cards could come with huge memory capacities without breaking the bank or melting themselves.
Crucial to making the SCM work, though, is providing the extremely high bandwidth and low latency needed for graphics memory to be useful. That’s where the DRAM cache and clever hardware-based memory tagging system devised by the research team from POSTECH and Soongsil University comes into play.
The intricacies of the research are highly complex, involving data-flow models to predict typical memory access patterns, for instance. The upshot, though, is that the team – using a modified Nvidia A100 GPU – could see significant performance benefits from its new system compared to using conventional HBM memory (which itself already offers much higher bandwidth than standard GDDR configurations). The paper summarises the results by saying:
“Compared to HBM, the HMS (heterogeneous memory stack) improves performance by up to 12.5× (2.9× overall) and reduces power by up to 89.3% (48.1% overall). Compared to prior works, we reduce DRAM cache probe and SCM write traffic by 91-93% and 57-75%, respectively.”
It’s an interesting development, although it comes with a couple of major caveats regarding gaming GPUs. For a start, the research is based on data center workloads, where memory usage patterns and overall system requirements are quite different to gaming workloads. Plus, this research is just that: it’s in the research phase. Even if it were immediately identified as the clear path to next-gen performance, we’d probably be looking at a couple of generations of cards passing before we’d see this new technology integrated into new graphics cards.
Still, it’s always exciting to see figures pointing to a 12.5x (or even 2.9x) performance boost and an 89.3% power reduction, as they hint that there are still plenty of ways the graphics card and wider PC tech market can continue to eke out more performance, despite it being ever harder for chip producers to make smaller and smaller chips.
https://www.pcgamesn.com/nvidia/dram-cache-gpu-graphics-card-tech