- 3D HBM-on-GPU design achieves record compute density for demanding AI workloads
- Maximum GPU temperatures exceeded 140°C without thermal mitigation strategies
- Halving GPU clock frequency reduced temperatures but slowed AI training by 28%
At the 2025 IEEE International Electron Devices Meeting (IEDM), Imec presented a study of a 3D HBM-on-GPU design aimed at increasing computational density for demanding AI workloads.
The thermal system-technology co-optimization approach places four high-bandwidth memory stacks directly above a GPU, connected through microbumps.
Each stack consists of twelve hybrid-bonded DRAM dies, and cooling is applied on top of the HBM stacks.
Thermal mitigation testing and performance trade-offs
The solution uses power maps derived from industry-relevant workloads to test how the configuration reacts under realistic AI training conditions.
This 3D arrangement promises a leap in computational density and memory per GPU.
It also offers higher GPU memory bandwidth compared to 2.5D integration, where HBM stacks sit around the GPU on a silicon spacer.
However, the thermal simulations reveal serious challenges for the 3D HBM-on-GPU design.
Without mitigation, peak GPU temperatures reached 141.7°C, well above operating limits, while the 2.5D baseline topped out at 69.1°C under the same cooling conditions.
Imec explored technology-level strategies such as HBM stack aggregation and silicon thermal optimization.
System-level strategies included dual-sided cooling and GPU frequency scaling.
Reducing the GPU clock frequency by 50% lowered peak temperatures below 100°C, but slowed the AI training workload by 28%.
Despite these limitations, Imec claims the 3D structure can deliver higher compute density and performance than the 2.5D reference design.
“Halving the GPU core frequency brought the peak temperature from 120°C to below 100°C, achieving a key memory performance target. Although this step comes with a 28% workload penalty…” said James Myers, System Technology Program Director at Imec.
“…the overall package outperforms the 2.5D baseline thanks to a higher throughput density offered by the 3D configuration. We are currently using this approach to study other GPU and HBM configurations…”
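The trade-off Myers describes can be sketched as back-of-envelope arithmetic: a 28% throughput penalty from halving the clock can still win on throughput per unit of package area if stacking HBM on top of the GPU shrinks the footprint enough. The footprint ratio below is a hypothetical assumption for illustration only (the source does not give package areas); only the 28% penalty comes from the article.

```python
def throughput_density(relative_throughput: float, relative_area: float) -> float:
    """Workload throughput per unit package area, in arbitrary units."""
    return relative_throughput / relative_area

# 2.5D baseline: full throughput, but HBM sits beside the GPU on an
# interposer. We ASSUME (hypothetically) this doubles the package area.
baseline = throughput_density(relative_throughput=1.0, relative_area=2.0)

# 3D stack: 28% workload penalty from halving the GPU clock (per Imec),
# but memory sits above the GPU, so we take the footprint as 1x.
stacked = throughput_density(relative_throughput=1.0 - 0.28, relative_area=1.0)

print(f"2.5D baseline density: {baseline:.2f}")  # 0.50
print(f"3D stacked density:    {stacked:.2f}")   # 0.72
print(f"3D advantage:          {stacked / baseline:.2f}x")  # 1.44x
```

Under this assumed 2:1 area ratio, the 3D stack delivers roughly 1.44x the throughput density despite the clock penalty; the real advantage depends on the actual footprint savings, which Imec does not quantify here.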
The organization suggests that this approach could support thermally robust hardware for AI workloads in dense data centers.
Imec presents this work as part of a broader effort to connect technology decisions with system behavior.
This includes the cross-technology co-optimization (XTCO) program, launched in 2025, which combines system-technology co-optimization (STCO) and design-technology co-optimization (DTCO) thinking to align technology roadmaps with system scaling challenges.
Imec said XTCO enables collaborative problem solving for critical bottlenecks across the semiconductor ecosystem, including fabless and system companies.
However, such technologies are likely to remain limited to specialized facilities with controlled power and thermal budgets.
Via TechPowerUp