U.S. Department of Energy, Energy Efficiency and Renewable Energy
Industrial Technologies Program
Energy Matters: Information and Energy Solutions for Industry

Five Ways to Reduce Data Center Server Power Consumption

From the Fall 2008 issue of Energy Matters

Reprinted from The Green Grid

It is not necessary to invest in large-scale hardware refresh programs or consolidation exercises to improve energy efficiency of data center servers. Identifying energy wasters, employing power saving features, right-sizing, powering down underutilized servers, and decommissioning legacy servers all help to reduce energy use. This article presents five ways to cut server power use by adjusting the way servers are running.

The Green Grid™ is a global consortium dedicated to advancing energy efficiency in data centers and business computing ecosystems. In furtherance of its mission, The Green Grid is focused on:

  • Defining meaningful, user-centric models and metrics
  • Developing standards, measurement methods, processes and new technologies to improve data center performance against the defined metrics
  • Promoting the adoption of energy efficient standards, processes, measurements and technologies.

The Green Grid comprises an interactive body of members who share and improve current best practices around data center efficiency. Its scope includes collaboration with end users and government organizations worldwide to ensure that each organizational goal is aligned with both developers and users of data center technology. All interested parties are encouraged to join and become active participants.

The Green Grid Board of Directors is comprised of the following member companies: AMD, APC, Dell, HP, IBM, Intel, Microsoft, Rackable Systems, Sun Microsystems and VMware.

Reducing energy use at the point of consumption (the server) provides benefits at all other levels by reducing the load on power and cooling facilities, which in turn reduces their own energy use.

Installed servers in data centers today are mainly x86 commodity servers. These servers consume much of the power allocated to IT server equipment, and present the largest opportunity for saving power in the data center. A significant reduction in energy usage can be realized if data center professionals reconsider the mindset that all servers need to be powered on at all times.

The conventional wisdom is that servers must be kept running 24/7/52 because restarting them poses a potential downtime risk. However, research suggests that this perception is false.

In a series of laboratory tests over a 5-month period, 123 servers were restarted several times daily by disconnecting and reconnecting the power using an automated power strip outlet. Out of 18,826 restarts, not a single component failure occurred.

By using scripting and systems management tools with capabilities such as Wake-on-LAN, many organizations can implement key energy-saving processes without impacting operations, capital budgets, or system reliability.

Listed below are five recommendations that will help data center professionals reduce their overall data center energy consumption by making changes at the server level.

1. Identify the Culprits

To understand how applying new practices will affect energy consumption, it is necessary to first identify and document all running servers within the data center, determine their business purpose, and measure their power consumption. Organizations do not typically measure power consumption on a per-server basis; however, it is possible to generate estimates without too much difficulty.

The latest generation of servers features built-in power monitoring via out-of-band management capabilities. The vast majority of currently installed (older) servers do not have this ability, so other measurement methods must be used.

It is possible to instrument the power delivery infrastructure (e.g. 'smart' power strips) which can monitor power usage for each server in real time and provide accurate power usage statistics. Be aware, however, that this will require investment in new hardware, impact operations during installation, and add overhead when implementing and monitoring the solution.

A low-cost, low-disruption method bases power usage calculations on a server's CPU utilization. A study which compared different levels of power consumption based on thousands of servers with differing workloads concluded that power consumption tracks very closely with utilization. Therefore, this single metric can be used as a relatively accurate estimate of power consumption.

Internal disks spin and draw power all the time; the only additional power they draw when being accessed is to move the read/write head. The dynamic differential between idle and fully utilized is only around 30% of the disks' power draw, and as a fraction of overall system power that is negligible.

Memory is constantly being refreshed and draws power regardless of whether it is being read or written. The change in memory power draw with use is also not significant when taken as a fraction of overall system power use.

Most I/O and memory use also comes with some CPU activity, since the CPU is used to manage and monitor the progress of the task, and as such, disk and memory use correlates to CPU utilization.

The CPU varies dramatically in its power draw, since the architecture has been optimized to shut down when in idle states. As such, it is unique in being the only component of the system that has a marked effect on system level power draw based on its utilization. The figure below illustrates a model where server power consumption scales linearly with CPU utilization.

[Figure: line graph of Power Consumption (W), 0 to 350, against CPU Utilization %, 0 to 100. Power rises linearly from about 200 W at 0% utilization to about 300 W at 100%.]

Figure 1. CPU Utilization to Power Consumption

Most servers are already collecting CPU utilization information via systems management software; however, few organizations make use of this data other than for capacity planning. By taking average CPU utilization over a defined period of time, it is possible to calculate an estimate of the power consumed for that period.

Since our model scales linearly from idle to maximum utilization, once we know the power draw of a server at peak usage and at idle it is simple to estimate power usage at any utilization rate.

Until recently, the only power figure published for servers was the rating of the power supply, which is typically much higher than the actual power consumed. However, server manufacturers are now publishing actual power consumption figures for current models at idle and maximum utilization. This is being driven by the adoption of the ASHRAE Thermal Guidelines or similar manufacturers' reports, which provide power ratings for minimum, typical, and full configurations.

Most organizations standardize server specifications, so it is likely that only a limited number of different server models exist at any particular site. Measuring the power consumption of a single server of each type at full load and at idle will therefore not be complicated or time consuming, and will provide sufficient accuracy to make informed decisions.

Once these figures are available, an estimate of power consumption (P) at any specific CPU utilization (n%) can be calculated using the following formula:

P_n = (P_max - P_idle) * n/100 + P_idle

Example:

If a server has a maximum power draw of 300 Watts (W) and an idle power draw of 200W, then at 5% utilization the power draw would approximate to:

Power draw at 5% utilization
= (300 - 200) * 5/100 + 200
= 100 * 0.05 + 200
= 205 W

If the server was running at that average utilization for a 24-hour period, then the energy usage would equate to the following:

205 W * 24 h = 4,920 watt-hours (Wh) = 4.92 kilowatt-hours (kWh)

Through empirical measurement of various servers using a power analyzer, this approximation has proven to be accurate to within ±5% across all CPU utilization rates.
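The formula and worked example above can be expressed as a short Python sketch. The function names are illustrative only and do not come from the article; the 200 W and 300 W figures are the ones used in the example.

```python
def estimate_power_watts(p_idle: float, p_max: float, cpu_util_pct: float) -> float:
    """Linear estimate of server power draw at a given CPU utilization (0-100)."""
    if not 0 <= cpu_util_pct <= 100:
        raise ValueError("cpu_util_pct must be between 0 and 100")
    # P_n = (P_max - P_idle) * n/100 + P_idle
    return (p_max - p_idle) * cpu_util_pct / 100 + p_idle


def energy_kwh(avg_power_watts: float, hours: float) -> float:
    """Energy used at a constant average power over a period, in kWh."""
    return avg_power_watts * hours / 1000


power = estimate_power_watts(p_idle=200, p_max=300, cpu_util_pct=5)
print(power)                  # 205.0
print(energy_kwh(power, 24))  # 4.92
```

Because the model is linear, feeding it the average CPU utilization for a period gives the average power for that period, which is why a single 24-hour average is enough for the energy estimate.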

A baseline of current power usage throughout the data center can be created by adding up the power usage for all the servers in the data center. This data can then inform later decisions regarding which changes will have the most positive impact on overall server power usage.
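As a minimal sketch of building such a baseline, the snippet below sums the linear estimate over a small inventory. The server names, power figures, and utilization values are invented for illustration; in practice they would come from measurements of each server model and from systems management data.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str           # hypothetical host name
    p_idle: float       # measured idle power draw, in watts
    p_max: float        # measured full-load power draw, in watts
    avg_cpu_pct: float  # average CPU utilization over the period, 0-100


def estimated_watts(server: Server) -> float:
    """Linear estimate: (P_max - P_idle) * n/100 + P_idle."""
    return (server.p_max - server.p_idle) * server.avg_cpu_pct / 100 + server.p_idle


def baseline_watts(servers: list[Server]) -> float:
    """Estimated total power draw across all listed servers."""
    return sum(estimated_watts(s) for s in servers)


inventory = [
    Server("web01", p_idle=200, p_max=300, avg_cpu_pct=5),
    Server("db01", p_idle=250, p_max=400, avg_cpu_pct=40),
    Server("build01", p_idle=180, p_max=280, avg_cpu_pct=1),
]

print(baseline_watts(inventory))  # 696.0

# Rank servers by estimated draw to see where changes would matter most.
for s in sorted(inventory, key=estimated_watts, reverse=True):
    print(s.name, estimated_watts(s))
```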

2. Enable Server Processor Power-Saving Features

In recent years, x86 server processors have begun to incorporate the power-saving architectures that are common in both desktop and laptop computers. Enabling these features can result in overall system power savings of up to 20%.

The power savings are achieved by reducing the frequency multiplier (frequency identifier, or FID) and the voltage (voltage identifier, or VID) of the CPU. Intel's version of this technology is known as either Enhanced Intel SpeedStep Technology (EIST) or Demand Based Switching (DBS). AMD's version is Cool'n'Quiet or PowerNow!. The combination of a specific CPU frequency and voltage is known as a performance state (p-state). "P-state control" refers to the capability to reduce frequency and voltage.

Altering the p-state can reduce a server's power consumption when at low utilization but can still provide the same peak level of performance when required. The switch between p-states is dynamically controlled by the operating system and occurs in microseconds, causing no perceptible performance degradation.

[Figure: line graph of Power Consumption (Watts) against CPU Utilization, from idle to 100%. Two lines start together at idle. The upper line rises steadily toward the upper right; the lower line stays flat until about 60% utilization and then rises to meet the upper line at 100%.]

Figure 2. Impact of p-state on Power Consumption (AMD data)

Although a processor may be p-state capable, both the system Basic Input-Output System (BIOS) and the operating system must be capable of enabling the feature to make use of it. Check the BIOS of a representative sample of each server model to find out if the server supports the relevant version of the p-state technology.
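As one way to check the operating system side, the sketch below reads the frequency-scaling status on a Linux host. It assumes the kernel exposes the cpufreq sysfs directory at the path shown; this is an assumption about the host and is not taken from the article or its Appendix A. Where the directory or a file is missing, the corresponding value is simply reported as None.

```python
from pathlib import Path

# Assumed location of the Linux cpufreq interface for the first CPU.
DEFAULT_CPUFREQ_DIR = "/sys/devices/system/cpu/cpu0/cpufreq"


def read_cpufreq(cpu_dir: str = DEFAULT_CPUFREQ_DIR) -> dict:
    """Return the scaling governor and current frequency, or None where unavailable."""
    base = Path(cpu_dir)
    info = {}
    for name in ("scaling_governor", "scaling_available_governors", "scaling_cur_freq"):
        try:
            info[name] = (base / name).read_text().strip()
        except OSError:
            # File missing or unreadable: frequency scaling may be disabled
            # in the BIOS, unsupported, or this may not be a Linux host.
            info[name] = None
    return info


print(read_cpufreq())
```

If every value comes back as None on a server whose processor is p-state capable, the BIOS setting is a reasonable first thing to check.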

Instructions on how to implement p-states on the three main x86 commodity server operating systems can be found in Appendix A of the full article.

3. Right-Size Server Farms

In recent years, Web services have driven the growth of server farms in data centers, which, in many cases, are overprovisioned. The analysis of server farm usage patterns will reveal the potential for 'right-sizing'. Unneeded capacity can be turned off, but the server farm can still provide sufficient resiliency for agreed upon service levels.

Data center owners should perform analysis of server utilization data (CPU, disk and network) across all servers in a server farm. Average utilization across the farm is likely to follow a daily, weekly and/or monthly pattern.

Collection of utilization data over time can inform decisions regarding how many actual servers are required to provide peak service levels plus resilience. It is likely that this number is lower than the actual number of servers in the farm, meaning that the surplus capacity can be powered down to conserve energy.

For example, if a server farm consisting of 10 servers has a maximum utilization (max of CPU, disk or network utilization per server) across the farm of 50%, this is an aggregate utilization of 500%, which equates to five servers running at 100%. To provide sufficient headroom and still allow for resilience, the farm could easily run with seven servers (peak utilization of 500/7 = 71%). Under this scenario, if one server failed, sufficient capacity would still exist (with six servers, peak utilization would be at 83%) and three warm standby servers would still be available to rapidly recover the availability levels should one of the active servers fail. In this example, a power-saving equivalent of up to three servers is possible for this farm.
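The arithmetic in this example can be sketched in Python. The 85% utilization ceiling is an assumed planning threshold chosen so that the result matches the article's seven-server answer; the article itself does not state a ceiling, and the function names are illustrative.

```python
import math


def peak_utilization_pct(total_servers: int, peak_util_pct: float, active_servers: int) -> float:
    """Peak per-server utilization if the farm's load is spread over fewer servers."""
    aggregate = total_servers * peak_util_pct  # e.g. 10 * 50 = 500 (%)
    return aggregate / active_servers


def min_active_servers(total_servers: int, peak_util_pct: float,
                       ceiling_pct: float, tolerated_failures: int = 1) -> int:
    """Smallest active count that stays under the ceiling after the tolerated failures."""
    aggregate = total_servers * peak_util_pct
    needed = math.ceil(aggregate / ceiling_pct)
    return min(total_servers, needed + tolerated_failures)


active = min_active_servers(total_servers=10, peak_util_pct=50, ceiling_pct=85)
print(active)                                         # 7
print(round(peak_utilization_pct(10, 50, active)))      # 71
print(round(peak_utilization_pct(10, 50, active - 1)))  # 83
print(10 - active)                                    # 3 servers can be powered down
```

A stricter ceiling or a higher number of tolerated failures raises the active count, so the savings depend directly on the agreed service levels.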

It is possible to automate the restart of servers by using either built-in out-of-band power management capabilities or Wake-on-LAN tools. Out-of-band management capabilities can be controlled via vendor-specific software or through standard Simple Network Management Protocol (SNMP) methods.
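For illustration, a Wake-on-LAN restart can be scripted in a few lines of Python. A Wake-on-LAN "magic packet" consists of six 0xFF bytes followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. The MAC address, broadcast address, and port 9 below are placeholder assumptions, and the target server's network adapter and BIOS must have Wake-on-LAN enabled for this to have any effect.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hexadecimal digits")
    mac_bytes = bytes.fromhex(hex_digits)
    # Six 0xFF bytes, then the 6-byte MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


print(len(build_magic_packet("00:11:22:33:44:55")))  # 102
```

Calling send_wake_on_lan("00:11:22:33:44:55") from a host on the same network segment would then wake the matching server, which a scheduler or monitoring script could do when load rises or an active server fails.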

4. Power Down Servers When Not in Use

Not all servers need to be operational 24/7/52. Individual servers may be powered down for certain periods of the day. Typical examples are servers found in test and development environments. The test team will know when a test run has finished, and particular test machines could then be powered down until they are needed. In addition, development build systems should be powered down until a build run is required.

Certain types of servers will regularly go unused for lengthy periods of time. These should be targeted for powering down. CPU statistics will show that certain machines have consistently low CPU utilization for long periods of time (an idle server typically runs at less than 1%). Analysis of server utilization will reveal a pattern of when the servers are busy; machines could be scheduled to power down for the periods of time that they are idle and then be powered up in time to perform their useful work.

For example, a server executing backup software which is only busy from 10 PM until 6 AM could be scheduled to power itself down at 8 AM every day and then be powered up by operations management tools or a job scheduling system at 9 PM ready to perform the next night's backups. If the server were required for a restore during the day, the operator could run a script that would power the machine back up, run the restore and then power the machine back down.
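A minimal sketch of finding such idle windows from utilization data follows. It assumes one average CPU reading per hour of the day; the sample profile mimics the backup server above, the 1% threshold reflects the idle figure mentioned earlier, and the four-hour minimum window is an arbitrary illustrative choice.

```python
def idle_windows(hourly_cpu_pct, threshold_pct=1.0, min_hours=4):
    """Return (start_hour, end_hour) ranges, end exclusive, where CPU stayed below the threshold."""
    windows = []
    start = None
    for hour, util in enumerate(hourly_cpu_pct):
        if util < threshold_pct:
            if start is None:
                start = hour  # an idle stretch begins
        else:
            if start is not None and hour - start >= min_hours:
                windows.append((start, hour))
            start = None
    # Close a stretch that runs to the end of the data.
    if start is not None and len(hourly_cpu_pct) - start >= min_hours:
        windows.append((start, len(hourly_cpu_pct)))
    return windows


# Hypothetical profile: busy 10 PM to 6 AM, idle for the rest of the day.
backup_server = [60.0] * 6 + [0.5] * 16 + [60.0] * 2

print(idle_windows(backup_server))  # [(6, 22)]
```

The sketch does not join an idle stretch that wraps past midnight, and a real schedule would leave margins on either side of the window, as the example does by powering down at 8 AM and powering up at 9 PM.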

5. Decommission Old Systems that Provide No Useful Work

Evidence suggests that a significant number of installed servers are not used at all by anyone, such as older servers that have fallen out of use but have not been decommissioned.

Servers of this type will usually have very low utilization rates all the time, with only occasional spikes of utilization when standard housekeeping tasks run (backups, virus scans, etc.). These machines serve no useful purpose while consuming power and heating the data center.

Once a machine has been identified as "unused", confirm this status by analyzing network statistics. This exercise will ensure that all connections to the machine in question are from management systems and not from other business systems or from end users. If end users are indeed linked to the server in question, they should be contacted to determine how the server is providing useful work. It is highly likely that the connections are merely legacy in nature and can be terminated.
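The two checks described here, low utilization and management-only connections, can be combined in a short sketch. The host names, the 1% idle threshold, and the 95% idle fraction are illustrative assumptions; the connection sources would in practice be extracted from network statistics.

```python
def non_management_sources(connection_sources, management_hosts):
    """Hosts that connected to the server and are not known management systems."""
    return sorted(set(connection_sources) - set(management_hosts))


def looks_unused(cpu_samples_pct, connection_sources, management_hosts,
                 idle_threshold_pct=1.0, idle_fraction=0.95):
    """True if the server is idle almost all the time and only management systems connect."""
    if not cpu_samples_pct:
        raise ValueError("at least one CPU sample is required")
    idle = sum(1 for u in cpu_samples_pct if u < idle_threshold_pct)
    mostly_idle = idle / len(cpu_samples_pct) >= idle_fraction
    return mostly_idle and not non_management_sources(connection_sources, management_hosts)


# Hypothetical data: mostly idle, with two housekeeping spikes.
samples = [0.4] * 98 + [35.0, 40.0]
management = {"mgmt01", "backup01"}

print(looks_unused(samples, ["mgmt01", "backup01", "mgmt01"], management))  # True
print(looks_unused(samples, ["mgmt01", "hr-app03"], management))            # False
print(non_management_sources(["mgmt01", "hr-app03"], management))           # ['hr-app03']
```

Any host returned by non_management_sources is a lead for follow-up with its owners rather than proof that the server is still needed.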

Once the server has been categorically confirmed as unused, it can either be decommissioned or turned off and put aside as stock, ready for deployment should users develop a relevant requirement.

Keeping a legacy server around simply because it is available may be poor efficiency practice. New servers available today offer better performance with significantly reduced energy demands. If the decision is made to retire legacy servers, they should be processed for recycling and/or repurposing. Most server manufacturers offer global recycling programs.
