Power Systems are servers with a complete software stack optimized by IBM. This does not prevent them from being open systems, in both hardware and software.
In June 2018, the United States regained the top spot in the TOP500 ranking of the most powerful supercomputers on the planet. And IBM can be proud: the number one machine, Summit at Oak Ridge National Laboratory, and Sierra at Lawrence Livermore National Laboratory, which rose to second place in the November 2018 list, are both based on its latest processor, the POWER9, which combines performance and energy efficiency.
This success will undoubtedly reassure customers who have opted for a processor architecture other than the classic x86. Many have made this choice for their critical applications: the top 10 global players in the banking, insurance and telecommunications sectors have opted for POWER. Most run their Power Systems servers on premises, since the data is critical. But because the transition to the cloud, at least in hybrid mode, is inevitable, IBM is offering more and more options for operating Power servers in the cloud.
Openness: the engine of success for Power Systems servers
There are several reasons for the success of Power Systems servers. Since 2013, the POWER architecture has been open. The five founding members of the OpenPOWER Foundation (including Google and Nvidia) have been joined by more than 300 partners. By sharing the hardware specifications of its processor, IBM aimed to grow the ecosystem around this high-performance architecture, particularly in scientific and technical computing. In addition, there is a native hypervisor, PowerVM, which consumes few hardware resources. Virtualization can be done in more than one way: with PowerVM, or with KVM for virtualization on Linux.
The server is also open from a software point of view, as it supports three operating systems: AIX, IBM i and Linux (most distributions on the market). Because IBM markets a complete platform, hardware and software, the whole stack is optimized.
The processor, but also the bandwidth
The first POWER9-based solution went on sale at the end of 2017. The improvements are mainly in multithreading (eight parallel instruction units, versus two previously). Memory bandwidth has been increased, and the NVLink bus connects CPU and GPU: the GPU can thus access the entire memory capacity of the machine, which avoids bottlenecks.
Several factors make the performance of Power Systems improve faster than that of other platforms. With each new processor generation, the entire architecture of the machine evolves, so that the power available to the processor can be exploited to the fullest. For example, higher bandwidth reduces the time the processor spends waiting for data. On the chip itself, many tasks, such as encryption and compression, are handled by specialized circuits built directly into the silicon, which offloads the processor cores. For example, a dedicated module handles 90% of the memory compression/decompression cycles. The processor is therefore more available, and the server administrator has two options: either reduce the fraction of the CPU allocated to each workload (and add more workloads) on the same environment, or keep the same environment and accommodate more users.
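The effect of this offloading can be illustrated with a little arithmetic. The sketch below is only a back-of-the-envelope model, not an IBM sizing tool: the function names are hypothetical, and the 20% share of CPU time assumed to be spent on compression is an invented figure chosen for illustration. Only the 90% offload ratio comes from the text above.

```python
def freed_cpu_fraction(compression_share: float, offload_fraction: float = 0.90) -> float:
    """Fraction of total CPU time released when part of the
    compression/decompression work moves to dedicated circuits.

    compression_share: share of CPU time spent on compression before
                       offloading (hypothetical input, between 0 and 1).
    offload_fraction:  share of that work taken over by the dedicated
                       module (0.90 in the article).
    """
    if not 0.0 <= compression_share <= 1.0:
        raise ValueError("compression_share must be between 0 and 1")
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    return compression_share * offload_fraction


def capacity_gain(compression_share: float, offload_fraction: float = 0.90) -> float:
    """Factor by which the same cores could serve more work, assuming
    every unit of work costs proportionally less CPU after offloading."""
    remaining = 1.0 - freed_cpu_fraction(compression_share, offload_fraction)
    if remaining == 0.0:
        return float("inf")  # all CPU work was offloaded in this model
    return 1.0 / remaining


if __name__ == "__main__":
    share = 0.20  # assumed: 20% of CPU time spent on compression
    print(f"CPU time freed: {freed_cpu_fraction(share):.0%}")
    print(f"Capacity factor: {capacity_gain(share):.2f}x")
```

Under these assumptions, the administrator could either shrink each workload's CPU allocation by the freed fraction and host more workloads, or leave allocations unchanged and let the same environment absorb proportionally more users.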
Adapted to machine learning
The POWER architecture is particularly well suited to artificial intelligence applications, and to machine learning in particular. PowerAI (now called Watson Machine Learning Accelerator), running on the AC922 server, is a platform dedicated to developing algorithms in this field. Its high bandwidth makes it a unique solution on the market. In practice, data scientists can test their models faster, change their parameters more often and, therefore, arrive at the final model in the shortest possible time. In addition, the software stack that accompanies PowerAI includes applications such as H2O (which facilitates the creation of models adapted to the data) and PowerAI Vision (computer vision).
Many data scientists will share their experience during POWER Week, an event dedicated to Power Systems, from May 20 to 24, 2019.