A staggering 80% of data centres fail to exceed 60% of their original design specification. Paul Rivett, group operations director, CNet Training, identifies common poor practices and offers an overview of the key factors that need to be considered to ensure efficient, resilient data centre operations.
The cost of energy required by high-density power and cooling systems that support mission-critical compute makes energy consumption a key business consideration for the operation of data centres. Striving for energy efficiency is easier said than done when balancing operational effectiveness with energy use. Governments and regulatory bodies are also taking a greater interest in the energy consumption of the data centre sector, which adds further emphasis to energy efficiency endeavours.
Contributions to sustainability through energy efficiency can be achieved by fully optimising current data centre assets: ensuring they are operated efficiently and effectively, and maximising the use of their capabilities, since idle time is rarely effective or efficient. Achieving this requires a structured dialogue to fully understand the business needs, which in turn allows the data centre to be evaluated and its assets to be deployed usefully.
A comprehensive audit of the data centre is the best way to validate the operational capabilities against the business expectations. With the critical components of the data centre dependent upon one another, it is essential that the audit does not have an impact on normal operations, so an audit plan should be created.
Given that the power and cooling infrastructures are subservient to the IT assets, the IT systems are the logical starting point for an audit, the results of which can facilitate asset optimisation. All too often, the IT systems are not paid the level of attention required to guarantee service provision while effectively utilising the power and cooling infrastructures. This leads to common poor practices:
• Failure to adopt a blanking plate policy for IT equipment cabinets
• Failure to remove redundant IT hardware which is often left powered up
• Poor cable management in the rear of cabinets reducing the removal of heat
• Blockages in raised access floors restricting or even preventing air flow
• Failure to understand the operating parameters of the IT hardware and its potential
These issues have a significant impact on effectiveness and efficiency.
Many data centres continually struggle to effectively provide cooling for increasing compute demands, but before considering actions such as adjusting flow rates or changing operating environments, the effectiveness of airflow management needs to be clarified. Poor airflow management wastes a significant amount of energy but is relatively easy to correct. Failing to maximise the supply and return air temperature cycles leads to unnecessary energy consumption and can cause bypass and recirculation of cool and hot air, simply wasting the capability of the cooling system.
The priority is to maximise the supply air temperature, so that the conditioned air effectively reaches the air intakes of the IT equipment. IT and facilities management need to work together to understand the operating parameters of the IT environment and adjust the supply air temperature accordingly.
This is especially relevant with modern IT equipment now being able to operate at up to 35°C (95°F). This provides an opportunity to raise the operating temperature, which in turn reduces energy consumption.
The return air temperature brings us back to the actual primary purpose of the cooling system in a data centre, and that is the removal of heat. An efficient cooling system should be able to remove heat from the data centre or recycle it. To optimise the cooling system, it is essential that the following are regularly monitored:
• Cabinet supply and return air temperatures
• Floor plenum pressure if a raised floor is in use
• Relative humidity and dew point
• CRAC airflow rates
• Delta T across the IT equipment
• Thermal quantification (TQ) testing following any major operational changes
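As a minimal sketch of how the cabinet temperature readings above might be reviewed, the following Python example calculates delta T for a cabinet and flags readings that may point to recirculation or bypass. The threshold values, the `CabinetReading` class and the `review` function are hypothetical and for illustration only; real limits should come from the documented operating parameters of the IT hardware, and a flagged reading is a prompt for investigation rather than a diagnosis.

```python
from dataclasses import dataclass
from typing import List

# Assumed thresholds for illustration only; they are not taken from any
# standard and should be replaced with the limits of the installed hardware.
MAX_INTAKE_C = 27.0   # assumed upper limit for supply air at the cabinet intake
MIN_DELTA_T_C = 8.0   # assumed minimum temperature rise across the IT equipment


@dataclass
class CabinetReading:
    cabinet: str
    supply_c: float   # air temperature at the IT equipment intake, in Celsius
    return_c: float   # air temperature at the cabinet exhaust, in Celsius

    @property
    def delta_t(self) -> float:
        """Temperature rise across the IT equipment (return minus supply)."""
        return self.return_c - self.supply_c


def review(reading: CabinetReading) -> List[str]:
    """Return a list of possible airflow concerns for one cabinet reading."""
    concerns = []
    if reading.supply_c > MAX_INTAKE_C:
        concerns.append("high intake temperature: possible hot air recirculation")
    if reading.delta_t < MIN_DELTA_T_C:
        concerns.append("low delta T: possible bypass of cool air")
    return concerns


# Illustrative readings, not measurements from a real facility.
for r in [CabinetReading("A01", 22.0, 35.0), CabinetReading("A02", 29.0, 33.0)]:
    print(r.cabinet, r.delta_t, review(r))
```

Trending these values over time, rather than checking single readings, gives a clearer picture of whether changes such as fitting blanking plates or clearing floor blockages have had the intended effect.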
The use of computational fluid dynamics (CFD) applications has become an integral planning component, enabling data centre operators to map, evaluate and adjust the thermal footprint of their facilities.
After reviewing and aligning IT and cooling, a better understanding of the power demands should be revealed and the potential energy savings become more apparent.
Historically, many data centres have been designed with unnecessarily high levels of resilience and redundant components to cover all potential incidents. However, surveys suggest that 80% of data centres fail to exceed 60% of their original design specification, usually due to under-utilisation and a general acceptance of transmission losses. As data centres attempt to become more energy efficient and to optimise their capabilities, these transmission losses need to be controlled.
It is essential to understand what energy capacity you currently have and how it is distributed. A baseline needs to be established and this is where a simple metric such as power usage effectiveness (PUE) comes into play. There are varying opinions on PUE and how it should be applied but that aside, it does provide a simple equation of the total facility power divided by the IT equipment power. That simple comparison exposes how much power actually reaches the IT equipment and how much is therefore used or lost elsewhere. A measuring and monitoring strategy should be implemented across the components in the power distribution path, identifying areas of power losses and inefficiency such as:
• Transmission losses along the electrical distribution system
• Conversion losses across electrical components
• UPS input versus output (load capacity)
• Generator heater block demands
• Excessive or under-utilised electrical components
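To make the PUE calculation concrete, here is a short Python sketch of the equation described above (total facility power divided by IT equipment power), together with a simple way of attributing losses to each stage of the power distribution path. The function names, the stage names and all of the figures are assumptions chosen for illustration; they are not measurements from any real facility.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def stage_losses(path):
    """Given (stage name, input kW, output kW) tuples measured along the
    power distribution path, return the power lost at each stage in kW."""
    return {name: kw_in - kw_out for name, kw_in, kw_out in path}


# Illustrative figures only.
total_kw = 1000.0   # power drawn by the whole facility
it_kw = 500.0       # power that actually reaches the IT equipment

print(pue(total_kw, it_kw))   # 2.0
print(it_kw / total_kw)       # 0.5, the share of power reaching the IT equipment

# Hypothetical measurement points: input and output power at each stage.
path = [("UPS", 600.0, 540.0), ("PDU", 540.0, 520.0), ("cabling", 520.0, 500.0)]
print(stage_losses(path))     # {'UPS': 60.0, 'PDU': 20.0, 'cabling': 20.0}
```

A PUE of 2.0 in this example means that for every kilowatt delivered to the IT equipment, another kilowatt is used or lost elsewhere. The per-stage breakdown is only as good as the metering behind it, which is why the measuring and monitoring strategy comes first.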
With the power distribution consumption known, it is not simply a matter of reducing energy consumption. There are many factors that can have an unforeseen impact on the operational capability. An accurate baseline provides an opportunity to revisit the business needs and to re-evaluate what is required. Commonly known as ‘continuous commissioning’, this is where the sequence of operations is revisited. In many cases the business may have changed but the operational capabilities have not. This information can be used to establish an energy efficiency strategy to optimise energy distribution across the data centre.
Examples of actions are:
• Identify primary areas of concern
• Identify under-utilised components
• Determine key areas of improvement
• Understand what is achievable
• Understand potential costs, resource demands and timelines
The next step is to get the buy-in of both senior management and key stakeholders. This is essential to assess any perceived risk and to set priorities for what is both acceptable and achievable. In many data centres, actions taken in pursuit of energy efficiency are treated with scepticism, as the belief is that they will inevitably lead to more unforeseen implementation costs. It can take a concerted effort and drive to optimise what already exists and to fully utilise its capability, but it is worth it.
Once the status of the data centre environment is properly understood, it is essential that the processes, procedures and working practices are also reviewed to ensure that they are aligned to the data centre's operational requirements.
Businesses must be prepared to invest in effective measuring and monitoring assets to ensure they have access to real-time data to show, with confidence, the on-going optimisation of energy use. Remember, if you cannot measure it, how can you manage it?