- HBF offers ten times the capacity of HBM while remaining slower than DRAM
- GPUs will access larger data sets through tiered HBM-HBF memory
- Writes to HBF are limited, requiring software to focus on reads
The explosion of AI workloads has placed unprecedented pressure on memory systems, forcing companies to rethink how they deliver data to accelerators.
High-bandwidth memory (HBM) has served as a fast cache for GPUs, allowing AI tools to read and process key-value (KV) data efficiently.
However, HBM is expensive, fast, and limited in capacity, while high-bandwidth flash (HBF) offers much larger capacity at slower speeds.
How HBF complements HBM
HBF's design lets GPUs access a wider data set while limiting the number of writes, roughly 100,000 per module, which requires software to prioritize reads over writes.
HBF will sit alongside HBM near AI accelerators, forming a tiered memory architecture.
Professor Kim Joungho of KAIST compares HBM to a bookshelf at home for quick reference, while HBF functions like a library with far more content but slower access.
"For a GPU to perform AI inference, it must read variable data called the KV cache from the HBM. Then, it interprets this and spits out word by word, and I think it will utilize the HBF for this task," said Professor Kim.
"HBM is fast, HBF is slow, but its capacity is about 10 times larger. However, while HBF has no limit on the number of reads, it has a limit on the number of writes, about 100,000. Therefore, when OpenAI or Google write programs, they must structure their software so that it focuses on reads."
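The read-focused software structure Professor Kim describes can be sketched as a toy tiered cache: a small, freely writable fast tier standing in for HBM in front of a large, write-limited slow tier standing in for HBF. All names, capacities, and the eviction policy here are hypothetical illustrations, not a real API.

```python
# Hypothetical sketch of a read-focused tiered KV cache. The "fast" dict
# stands in for HBM (small, freely writable) and the "slow" dict for HBF
# (large, but with a limited write budget). Values are invented.

class TieredKVCache:
    def __init__(self, fast_capacity=4, hbf_write_budget=100_000):
        self.fast = {}                        # "HBM" tier: small, unlimited writes
        self.slow = {}                        # "HBF" tier: large, write-limited
        self.fast_capacity = fast_capacity
        self.hbf_writes_left = hbf_write_budget

    def write(self, key, value):
        # New data always lands in the fast tier; HBF is only written when
        # cold data is evicted, conserving the limited write budget.
        if len(self.fast) >= self.fast_capacity and key not in self.fast:
            self._evict_to_slow()
        self.fast[key] = value

    def read(self, key):
        # Reads are unlimited on both tiers. A hit in the slow tier is
        # returned directly rather than rewritten, avoiding extra HBF writes.
        if key in self.fast:
            return self.fast[key]
        return self.slow.get(key)

    def _evict_to_slow(self):
        # Evict the oldest fast-tier entry (dicts preserve insertion order).
        old_key, old_value = next(iter(self.fast.items()))
        del self.fast[old_key]
        if self.hbf_writes_left > 0:
            self.slow[old_key] = old_value
            self.hbf_writes_left -= 1
```

The point of the sketch is the asymmetry: every eviction costs one unit of the finite HBF write budget, while reads are free, which is why inference software built on such a tier would be structured around reads.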
HBF is expected to debut with HBM6, where multiple HBM stacks interconnect in a network, increasing both bandwidth and capacity.
The concept envisions future iterations such as HBM7 functioning as a "memory factory," where data can be processed directly from HBF without detouring through traditional storage networks.
HBF stacks multiple 3D NAND dies vertically, similar to HBM's stacking of DRAM, and connects them with through-silicon vias (TSVs).
A single HBF unit can reach 512GB of capacity and achieve up to 1.638TBps of bandwidth, far exceeding standard PCIe 4.0 NVMe SSD speeds.
SK Hynix and Sandisk have shown diagrams with upper NAND layers connected through TSVs to a base logic die, forming a functional stack.
Prototype HBF chips require careful fabrication to avoid warping in the lower layers, and additional NAND stacks would further increase the complexity of the TSV connections.
Samsung Electronics and Sandisk plan to attach HBF to Nvidia, AMD, and Google AI products within the next 24 months.
SK Hynix will release a prototype later this month, while the companies are also working on standardization through a consortium.
HBF adoption is expected to accelerate in the HBM6 era, and Kioxia has already prototyped a 5TB HBF module using PCIe Gen 6 x8 at 64Gbps. Professor Kim predicts that the HBF market could surpass HBM by 2038.
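As a rough sanity check on that comparison, the quoted 1.638TBps figure can be set against the theoretical ceiling of a typical PCIe 4.0 x4 NVMe link. Only the 1.638 TB/s number comes from the article; the PCIe parameters below are standard published values, and the x4 lane count is an assumption about a typical consumer SSD.

```python
# Back-of-the-envelope comparison: quoted HBF bandwidth vs. a typical
# PCIe 4.0 x4 NVMe SSD link. PCIe figures are standard spec values.

PCIE4_GT_PER_LANE = 16            # PCIe 4.0 signals 16 GT/s per lane
LANES = 4                         # common NVMe SSD configuration (assumed)
ENCODING = 128 / 130              # PCIe 128b/130b line encoding overhead

pcie4_x4_gbps = PCIE4_GT_PER_LANE * LANES * ENCODING   # ~63.0 Gb/s raw
pcie4_x4_gbs = pcie4_x4_gbps / 8                       # ~7.88 GB/s

hbf_gbs = 1638.0                  # 1.638 TB/s quoted for one HBF unit
speedup = hbf_gbs / pcie4_x4_gbs

print(f"PCIe 4.0 x4 ceiling: {pcie4_x4_gbs:.2f} GB/s")
print(f"HBF vs. PCIe 4.0 x4: ~{speedup:.0f}x")
```

Even against the link's theoretical maximum (real SSDs deliver less after protocol overhead), the quoted HBF figure comes out roughly two orders of magnitude higher.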
Via Sisajournal (originally in Korean)

