As those who know me can attest, I have long maintained that waste is abundant in data center power provisioning, because the failover and self-healing capabilities of the IT load are rarely examined or included in the power supply design. Data center floors get power capacity based on the predicted compute needs allocated to each rack, and that capacity is then duplicated (in most cases) for redundancy in case of failure. Redundant power rarely gets used. In fact, most hope it never does! In some cases, the backup power comes from UPS and generators, renewables, fuel cells, or another diversely routed power feed, none of which are cheap. The problem is that facilities design ignores the failover mechanisms built into the software.
If a single software program fails over to multiple servers in multiple locations, do all of those locations need full redundancy of all power systems? If a single application can go down with minimal to no business impact, does it also need to reside on fully redundant servers with dual power, dual network (with dual power), and dual storage (with dual power) connections? The answer is no, but we still design power as if the answer were yes. Many of these power resources remain unused, expensive standby assets. As applications move to the cloud or are retired, how do you reclaim that power? Does every business have a sound commissioning/decommissioning plan and a business strategy to balance consumed energy against risk? Enter software-defined power (SDP).
Balancing act
Today's data center is quite different from the data center of yore. Colocation data centers are built with the intent of supplying dual power to every resource. That approach is simply easier to plan and sell, but it can be immensely wasteful. Hyperscale and cloud providers, of course, throw in redundant everything and design with the assumption that everything will fail. Enterprise private data centers in some cases embrace various layers of risk-based power, but in most cases they just follow the standard accepted practice of dual everything, except for some smaller “run to fail” scenarios. Dual everything is due, in large part, to an engineering community that has been trained to create redundant facilities, since IT is not in its purview.
As data centers reach capacity or shed loads, balancing power and cooling across the floor can be daunting, which has led to data center infrastructure management (DCIM) and literally hundreds of offerings with varied capabilities. Most DCIM solutions still rely heavily on human input and manipulation.
Newer software-defined data centers can move networking, server, storage, WAN, and other resources around on the fly by using orchestration to remove dependencies on physical hardware. This removes much of the human burden of rebalancing the floor load: physical resources get deployed once, and the services that use them are “virtually” moved, mitigating the need to decommission and recommission physical hardware assets after they become part of the data center ecosystem.
The final pillar of the software-defined data center is power. SDP allows power resources to react to and with the IT environment and increases the utilization of power assets. It makes standby power usable, enables loads to move dynamically based on power costs, and fosters dynamic balancing of loads and compute across the white space. When power is actively managed at the software level with artificial intelligence (AI), microgrids, peak shaving, and node capping all become possible through rack- and asset-level intelligence and orchestration. Power is now talking to compute, and the results are changing the data center for the better.
There are multiple ways to provide power consumption and availability intelligence. The best ways combine solution-agnostic hardware and software at the rack level, removing the need for human action, input, and intervention. Where access to the rack is not available (think colocation), soft breakers can be used to control loads.
SDP has value for all data centers. For users with data located across several facilities, the options multiply. For colocation providers, actively managing power improves SLA response times and can increase profit across the white space by dynamically allocating power and reclaiming unused and stranded power capacity. For end users, the possibilities are vast, whether their compute sits in their own data center or in a provider's space. In fact, one study showed a five-year ROI of 1,153%, based mainly on using power resources already purchased and removing the need to expand.
Concepts defined
Peak shaving means reducing the power you draw from the utility during its peak demand periods. In some cases, peak costs per kWh are double non-peak rates. In a policy-driven, software-defined data center with SDP, not only are the workloads monitored, but better decisions are made in real time about when and where the processing occurs. With SDP, compute can use standby power resources when costs are high, or it can “move” to lower-cost power sources. Software-defined decisions can start up the application instance at another site or elsewhere in the data center, or have the facility draw power from the UPS or a microgrid to handle needs during peak periods, thereby shaving peak power costs.
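The kind of policy decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular SDP product works: the PowerState fields, the choose_action function, the tariff figures, and the rule that a battery reserve must always be held back for outages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PowerState:
    price_per_kwh: float   # current utility tariff (hypothetical figure)
    peak_threshold: float  # policy: shave when the tariff exceeds this
    battery_kwh: float     # energy available from UPS or microgrid
    reserve_kwh: float     # energy that must stay untouched for outages


def choose_action(state, load_kw, interval_h=1.0, remote_price_per_kwh=None):
    """Pick where the next interval's load should draw power from."""
    if state.price_per_kwh <= state.peak_threshold:
        return "grid"  # off-peak: no reason to shave
    needed_kwh = load_kw * interval_h
    if state.battery_kwh - needed_kwh >= state.reserve_kwh:
        return "standby"  # run from UPS/microgrid, keeping the outage reserve
    if remote_price_per_kwh is not None and remote_price_per_kwh < state.price_per_kwh:
        return "move"  # start the instance where power is cheaper
    return "grid"  # nothing better available: pay the peak rate


off_peak = choose_action(PowerState(0.12, 0.15, 40.0, 20.0), load_kw=10.0)
on_peak = choose_action(PowerState(0.24, 0.15, 40.0, 20.0), load_kw=10.0)
low_battery = choose_action(PowerState(0.24, 0.15, 25.0, 20.0), load_kw=10.0,
                            remote_price_per_kwh=0.11)
```

In this sketch the off-peak case stays on the grid, the on-peak case runs from standby power, and the low-battery case moves the workload to the cheaper site. A real policy would also weigh latency, data locality, and battery wear.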
Node capping technology is an important part of shifting normally unused standby power to active compute loads. Node capping limits the maximum power draw of a given resource and is built into many hardware platforms. Policy-driven, round-robin startup options further ensure that loads do not exceed the capacity of either the primary or the secondary power delivery path.
Node power capping is made possible by AI that talks to the Intelligent Platform Management Interface (IPMI) and other platforms. IPMI is a set of computer interface specifications for an autonomous computer subsystem that provides management and monitoring capabilities via a network interface for remote control. Integration with SDP results in better management of all stranded capacity: even capacity stranded on the primary power delivery path becomes available for use, and power capacity is regained.
Take, for example, a single 5kW cabinet. The cabinet is provisioned with 5kW of primary and 5kW of secondary power. Technically, there is 10kW of power capacity in the cabinet, but typically only 5kW or less is used, all from the primary side. This leaves 5kW of stranded power capacity on the standby leg, plus any unused capacity on the primary power path. SDP would allow resources to “move” to the cabinet through orchestration to consume that spare capacity.
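The arithmetic behind that example is worth making explicit. The short Python sketch below uses the 5kW primary and 5kW secondary figures from the text; the 3.5kW actual draw and the function name stranded_capacity_kw are assumptions chosen purely for illustration.

```python
def stranded_capacity_kw(primary_kw, secondary_kw, draw_kw):
    """Return provisioned-but-unused capacity for a dual-fed cabinet."""
    if draw_kw < 0 or draw_kw > primary_kw:
        raise ValueError("draw must be between 0 and the primary capacity")
    unused_primary = primary_kw - draw_kw
    return {
        "provisioned_kw": primary_kw + secondary_kw,
        "unused_primary_kw": unused_primary,
        "idle_standby_kw": secondary_kw,
        "stranded_kw": unused_primary + secondary_kw,
    }


# 5 kW per path as in the example; the 3.5 kW draw is an assumed figure.
cabinet = stranded_capacity_kw(primary_kw=5.0, secondary_kw=5.0, draw_kw=3.5)
```

Under these assumptions, the cabinet has 10kW provisioned and 6.5kW stranded: the 5kW idle standby leg plus 1.5kW of unused primary capacity.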
With node capping, the power draw of a physical rack resource (a server, for example) is limited to its allocated primary power within the rack (or secondary power in the event of failure). SDP intelligence allows the secondary power to be used when idle, and it allows loads to shift dynamically off a power resource or be shut down so that the capacity of a single power path is never exceeded. Demand is capped through communication between the resident software and the physical hardware.
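The budgeting side of node capping can be sketched as follows. This is a simplified illustration, not a vendor algorithm: the node_caps function, the node names, and the wattages are hypothetical, the load is assumed to split across whichever feeds are healthy, and caps are scaled proportionally, whereas a real policy might shed or move loads by priority instead. Enforcing the caps on hardware is left to the platform's own management interface.

```python
def node_caps(requested_w, path_capacity_w, healthy_paths=2):
    """Return a per-node power cap (watts) that fits the available budget."""
    if healthy_paths not in (1, 2):
        raise ValueError("this sketch models a dual-fed cabinet only")
    budget_w = path_capacity_w * healthy_paths
    total_w = sum(requested_w.values())
    if total_w <= budget_w:
        return dict(requested_w)  # everything fits: no capping needed
    scale = budget_w / total_w
    # Round down so the capped total can never exceed the budget.
    return {node: int(watts * scale) for node, watts in requested_w.items()}


requested = {"node-a": 2500, "node-b": 2500, "node-c": 2000}  # 7 kW in total
both_feeds = node_caps(requested, path_capacity_w=5000, healthy_paths=2)
one_feed = node_caps(requested, path_capacity_w=5000, healthy_paths=1)
```

With both feeds healthy, the 7kW of load runs uncapped by drawing on the otherwise idle standby leg. If one feed fails, the caps tighten so that the total stays within the 5kW a single path can deliver.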
Tangible benefits
With SDP, loads are spread intelligently across the floor based on demand, power availability, and cooling capacity through artificial intelligence and continuous monitoring. SDP reacts on the fly to changes in the environment based on policy-driven factors. The possibilities achieved through marrying power and IT are virtually endless.
In a perfect data center, power usage effectiveness (PUE) would be as close to 1 as possible. While an actual 1 is never achievable, you can certainly get closer by allowing delivered power to be consumed and by incorporating renewables or other power sources. A natural progression would be to tie in cooling, allowing loads to be moved dynamically on the data center floor should a cooling unit fail or to balance heat in against cooling out (delta T). Adding intelligence between facilities and compute is the missing link to a balanced, utilization-optimized floor.
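For reference, PUE is total facility energy divided by IT equipment energy, which is why it can approach but never reach 1. The Python sketch below shows the calculation with assumed figures. The second case makes the deliberately simplistic assumption that facility overhead stays fixed while previously stranded capacity takes on more IT load; in practice, cooling overhead would rise somewhat with the load.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy over IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_facility_kwh / it_equipment_kwh


overhead_kwh = 500.0  # cooling, distribution losses, lighting (assumed)

before = pue(1000.0 + overhead_kwh, 1000.0)  # 1,000 kWh of IT load
after = pue(1250.0 + overhead_kwh, 1250.0)   # more IT load, same overhead
```

Under these assumptions, PUE moves from 1.5 to 1.4 simply because more of the delivered power does useful IT work.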
In the colocation world, power is often a pass-through cost and therefore of little concern to the provider. However, as consumers demand better PUE performance, the ability to sublet over-procured space, and lower pass-through power costs, the ability to marry resources through AI, orchestration, and automation will become a highly desirable selling point, and an additional revenue stream if providers take on the management and setup of services. From an asset standpoint, UPS battery life improves when batteries are used rather than left sitting idle. As colos build more renewables into their environments, those power paths become active/active.
For those who build data centers for others, building in SDP intelligence from day one increases flexibility and options across the white space by creating power availability zones. For multiple integrated compute sites, the ability to remotely control power and orchestrate workloads will save on initial capital expenditures, ongoing maintenance costs, downtime, and unnecessary truck rolls for repairs. Automating the decisions removes the human element and ensures that money-saving tasks don’t take a back seat to other daily operations, providing maximum savings. It’s about time for facilities and IT to make intelligent decisions together!