- AI data centres are overwhelming air cooling with rising power draw and heat
- Liquid cooling becomes essential as server density increases with AI growth
- A new hybrid cooling approach cuts power and water use, but adoption has been hesitant
As AI transforms everything from search engines to logistics, its hidden cost is becoming harder and harder to ignore, especially in the data center. The power needed to run generative AI is pushing infrastructure beyond what traditional air cooling can handle.
To explore the scope of the challenge, I spoke with Daren Shumate, founder of Shumate Engineering, and Stephen Spinazzola, the firm's director of Mission Critical Services.
With decades of experience building large data centers, they are now focused on solving AI's energy and cooling requirements. From failing air systems to the promise of new hybrid cooling, they explained why AI is forcing data centers into a new era.
What are the biggest challenges in cooling a data center?
Stephen Spinazzola: The biggest challenges in cooling data centers are power, water and space. High-density computing, such as the data centers running artificial intelligence, brings tremendous heat that cannot be removed with a conventional air cooling system.
Typical cabinet loads have doubled and tripled with the introduction of AI. An air cooling system simply cannot capture the heat generated by the high kW-per-cabinet loads of AI cabinet clusters.
We have performed computational fluid dynamics (CFD) modeling of several data center halls, and with air cooling it shows temperatures above acceptable levels. The airflows we map with CFD show temperatures above 115 degrees F, which can cause servers to shut down.
Water cooling can be done in a smaller footprint with less power, but it requires an enormous amount of water. A recent study determined that a single hyperscale facility would need 1.5 million liters of water a day for cooling and humidification.
These constraints present major challenges for engineers planning the next generation of data centers to support the unprecedented demand we see for AI.
How is AI changing the norm when it comes to data center heat rejection?
Stephen Spinazzola: With CFD modeling showing that servers could shut down under conventional air cooling within AI cabinet clusters, direct liquid cooling (DLC) becomes necessary. AI is typically deployed in clusters of 20-30 cabinets at or above 40 kW per cabinet. That represents roughly a fourfold increase in kW per cabinet with the implementation of AI. The difference is staggering.
A typical ChatGPT query uses about 10 times more energy than a Google search, and that is just for a basic generative AI function. More advanced queries require significantly more power, relying on an AI cluster to carry out large-scale computing across multiple machines.
It changes the way we think about power. These energy demands are moving the industry toward liquid cooling techniques rather than traditional air cooling.
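To put the 40 kW-per-cabinet figure in perspective, here is a rough back-of-envelope sketch (my illustration, not the interviewees' numbers) of a typical AI cluster's heat load and the airflow an air-cooled system would need per cabinet, assuming a roughly 20°F (11°C) air temperature rise across the servers:

```python
# Back-of-envelope check (not from the interview): rough airflow an air-cooled
# system would need for a 40 kW AI cabinet, using Q = rho * V_dot * cp * dT.
# Assumed values: ~20 F (11 C) supply-to-return temperature rise, standard air
# density 1.2 kg/m^3 and specific heat 1005 J/(kg*K).

RHO_AIR = 1.2        # kg/m^3, approximate density of air
CP_AIR = 1005.0      # J/(kg*K), specific heat of air
M3S_TO_CFM = 2118.9  # cubic metres per second -> cubic feet per minute

def required_airflow_m3s(heat_kw: float, delta_t_c: float) -> float:
    """Volumetric airflow needed to remove heat_kw at a delta_t_c air temperature rise."""
    return heat_kw * 1000.0 / (RHO_AIR * CP_AIR * delta_t_c)

cabinet_kw = 40.0             # per-cabinet load cited for AI clusters
cluster_kw = 25 * cabinet_kw  # mid-point of the 20-30 cabinet cluster size

per_cabinet = required_airflow_m3s(cabinet_kw, delta_t_c=11.0)
print(f"Cluster heat load: ~{cluster_kw:.0f} kW")
print(f"Airflow per 40 kW cabinet: ~{per_cabinet:.1f} m^3/s (~{per_cabinet * M3S_TO_CFM:,.0f} CFM)")
# ~3 m^3/s (~6,400 CFM) per cabinet -- far beyond what typical rack airflow
# paths handle, which is why the CFD models show hot spots above 115 F.
```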
We talk a lot about cooling, but what about actually delivering power?
Daren Shumate: There are two broad new challenges in delivering power to AI computing: how to move power from UPS output boards to high-density racks, and how to create high densities of UPS power from the utility service.
Power to the racks is still delivered either with branch circuits from distribution PDUs to rack PDUs (plug strips), or with plug-in busway above the racks, with the in-rack PDUs connecting to the busway at each rack. The nuance now is what busway ampacity makes sense for each lineup and what is commercially available.
Even with plug-in busway available at ampacities of 1,200 A, the sheer power density forces the deployment of a larger number of separate busway circuits to meet density and striping requirements. Further complicating power distribution are the specific and varying requirements of individual data center end users, from branch circuit monitoring to distribution preferences.
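A rough capacity check illustrates why even 1,200 A busway runs multiply quickly. This is my own sketch, assuming 415 V three-phase distribution, unity power factor and an 80% continuous-load allowance, none of which are figures from the interview:

```python
# Rough capacity check for a plug-in busway feeding AI racks.
# Assumptions (not from the interview): 415 V three-phase distribution,
# unity power factor, 80% continuous-load allowance on the 1,200 A rating,
# and 40 kW per cabinet.
import math

def busway_capacity_kw(amps: float, volts_ll: float = 415.0, continuous: float = 0.8) -> float:
    """Usable three-phase capacity of a busway run in kW."""
    return math.sqrt(3) * volts_ll * amps * continuous / 1000.0

capacity = busway_capacity_kw(1200.0)
cabinets = capacity / 40.0
print(f"Usable capacity: ~{capacity:.0f} kW -> ~{cabinets:.0f} cabinets at 40 kW each")
# ~690 kW, i.e. roughly 17 cabinets per run before striping (A/B feeds) is even
# considered -- hence the larger number of separate busway circuits.
```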
Depending on site constraints, data center power designs may incorporate medium-voltage UPS. Driven by voltage-drop concerns, MV UPS addresses the need for very large feeder duct banks, but it also introduces new medium-voltage/utilization-voltage substations into the program. And when considering medium-voltage UPS, another consideration is the suitability of MV rotary UPS systems versus static solutions.
What are the advantages and disadvantages of the different cooling techniques?
Stephen Spinazzola: There are two types of DLC on the market today: immersion cooling and cold plate. Immersion cooling uses large tanks of a non-conductive liquid, with the servers placed vertically and fully submerged in the liquid.
The heat generated by the servers is transferred to the liquid and then to the building's chilled water system through a closed-loop heat exchanger. Immersion tanks take up less space but require servers configured for this type of cooling.
Cold plate cooling uses a cooling plate attached to the bottom of the chip stack that transfers the energy from the chip stack to a liquid piped throughout the cabinet. The liquid is then routed to an end-of-row cooling distribution unit (CDU) that transfers the energy to the building's chilled water system.
The CDU contains a heat exchanger to transfer the energy, plus 2N pumps on the secondary side of the heat exchanger to ensure continuous fluid flow to the servers. Cold plate cooling is effective for server cooling, but it requires a huge number of fluid piping connections that need leak-stop technology.
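For comparison with the airflow sketch earlier, here is the same heat balance on the liquid side. The coolant properties and the 10°C temperature rise across the cold plates are my illustrative assumptions, not figures from the interview:

```python
# Illustrative heat balance for cold plate cooling: coolant flow needed to carry
# away a 40 kW cabinet load, using Q = m_dot * cp * dT.
# Assumed values: water-like coolant (cp ~ 4186 J/(kg*K), ~1 kg per litre) and
# a 10 C rise between CDU supply and return.

CP_WATER = 4186.0    # J/(kg*K)
LPM_TO_GPM = 0.2642  # litres per minute -> US gallons per minute

def coolant_flow_lpm(heat_kw: float, delta_t_c: float) -> float:
    """Coolant flow (litres/minute) to remove heat_kw at a delta_t_c temperature rise."""
    kg_per_s = heat_kw * 1000.0 / (CP_WATER * delta_t_c)  # mass flow, kg/s
    return kg_per_s * 60.0                                # ~1 L per kg for water

flow = coolant_flow_lpm(40.0, delta_t_c=10.0)
print(f"~{flow:.0f} L/min (~{flow * LPM_TO_GPM:.0f} GPM) per 40 kW cabinet")
# Roughly 57 L/min (~15 GPM) of liquid does the work of ~6,400 CFM of air for
# the same load, which is why DLC handles densities that air cooling cannot.
```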
Air cooling is a proven technique for cooling data centers and has been around for decades; however, it is ineffective for the high-density racks needed in AI data centers. As loads increase, it increasingly fails when tested with CFD modeling.
You are proposing a different cooling approach. How does it work, and what are the current challenges to adoption?
Stephen Spinazzola: Our patent-pending Hybrid Dry/Adiabatic Cooling (HDAC) design solution uniquely provides two coolant temperatures from a single closed loop: a higher-temperature fluid for cooling DLC servers and a lower-temperature fluid for conventional air cooling.
Because HDAC simultaneously uses 90 percent less water than a chiller/cooling tower system and 50 percent less energy than an air-cooled chiller system, we have managed to bring the all-important power usage effectiveness (PUE) figure down to approximately 1.1 on an annualized basis for the type of hyperscale data center needed to process AI. Typical AI data centers operate at a PUE ranging from 1.2 to 1.4.
With the lower PUE, HDAC delivers approximately 12% more usable IT power from the same size utility feed. Both the economic and environmental benefits are significant. With a system that provides both an economic and an environmental advantage, HDAC requires only "a sip of water."
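As a quick check on the usable-IT-power claim (my own arithmetic, assuming a fixed utility feed and an arbitrary example capacity): PUE is total facility power divided by IT power, so for the same utility capacity, usable IT power scales as 1/PUE.

```python
# Back-of-envelope check of how a lower PUE converts to more usable IT power
# from a fixed utility feed. PUE = total facility power / IT power, so
# IT power = utility capacity / PUE. Baseline PUE values come from the
# 1.2-1.4 range quoted above; the utility size is an arbitrary example.

UTILITY_MW = 100.0  # example utility capacity, MW
HDAC_PUE = 1.1

for baseline_pue in (1.2, 1.3, 1.4):
    it_baseline = UTILITY_MW / baseline_pue
    it_hdac = UTILITY_MW / HDAC_PUE
    gain = (it_hdac / it_baseline - 1.0) * 100.0
    print(f"PUE {baseline_pue} -> {HDAC_PUE}: IT power {it_baseline:.1f} MW -> "
          f"{it_hdac:.1f} MW (+{gain:.0f}%)")
# Gains of roughly 9-27% depending on the baseline; the approximately 12%
# figure cited above corresponds to a baseline PUE of about 1.23.
```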
The challenge for adoption is simple: No one will go first.