- 3D HBM-on-GPU design reaches record compute density for demanding AI workloads
- Peak GPU temperatures exceeded 140°C without thermal mitigation measures
- Halving the GPU clock rate lowered temperatures but slowed AI training by 28%
Imec presented an analysis of a 3D HBM-on-GPU design aimed at increasing compute density for demanding AI workloads at the 2025 IEEE International Electron Devices Meeting (IEDM).
The thermal system-technology co-optimization approach places four high-bandwidth memory stacks directly above a GPU via microbump connections.
Each stack consists of twelve hybrid-bonded DRAM dies, and cooling is applied on top of the HBMs.
Thermal mitigation attempts and performance trade-offs
The study applies power maps derived from industry-relevant workloads to test how the configuration responds under realistic AI training conditions.
This 3D arrangement promises a leap in compute density and memory per GPU.
It also offers higher GPU memory bandwidth compared to 2.5D integration, where HBM stacks sit around the GPU on a silicon interposer.
However, the thermal simulations reveal severe challenges for the 3D HBM-on-GPU design.
Without mitigation, peak GPU temperatures reached 141.7°C, far above operational limits, while the 2.5D baseline peaked at 69.1°C under the same cooling conditions.
Imec explored technology-level measures such as merging HBM stacks and thermal silicon optimization.
System-level measures included double-sided cooling and GPU frequency scaling.
Reducing the GPU clock rate by 50% lowered peak temperatures to below 100°C, but this change slowed AI training workloads.
Despite these limitations, Imec argues that the 3D structure can deliver higher compute density and performance than the 2.5D reference design.
“Halving the GPU core frequency brought the peak temperature from 120°C to below 100°C, achieving a key target for the memory operation. Although this step comes with a 28% workload penalty…” said James Myers, System Technology Program Director at Imec.
“…the overall package outperforms the 2.5D baseline thanks to the higher throughput density offered by the 3D configuration. We are currently using this approach to study other GPU and HBM configurations…”
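As a rough back-of-envelope illustration (not part of Imec's presentation), the quoted figures imply that the 3D package's compute-density advantage would need to exceed roughly 1/(1 − 0.28) ≈ 1.39× for the throttled design to beat the 2.5D baseline in throughput per unit area. The sketch below runs that arithmetic; the density-gain factors are assumed placeholders, not Imec numbers.

```python
# Back-of-envelope check, not Imec's model: how big a compute-density gain from
# 3D stacking is needed to outweigh the 28% workload penalty from halving the clock?

def relative_throughput(density_gain: float, workload_penalty: float) -> float:
    """Throughput per unit area of the throttled 3D package relative to the 2.5D baseline."""
    return density_gain * (1.0 - workload_penalty)

PENALTY = 0.28  # 28% AI-training slowdown quoted by Imec
break_even = 1.0 / (1.0 - PENALTY)  # density gain needed just to match the baseline (~1.39x)
print(f"Break-even compute-density gain: {break_even:.2f}x")

# Assumed placeholder density factors, purely for illustration
for gain in (1.2, 1.5, 2.0):
    print(f"{gain:.1f}x density -> {relative_throughput(gain, PENALTY):.2f}x relative throughput")
```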
The team suggests this approach could support thermally resilient hardware for AI tools in dense data centers.
Imec presents this work as part of a broader effort to link technology choices with system behavior.
This includes the cross-technology co-optimization (XTCO) program, launched in 2025, which combines STCO and DTCO mindsets to align technology roadmaps with system scaling challenges.
Imec said that XTCO enables collaborative problem-solving for critical bottlenecks across the semiconductor ecosystem, including fabless and system companies.
However, such technologies will likely remain confined to specialized facilities with controlled power and thermal budgets.
Via TechPowerUp
