Japan Announces 2+ Petaflop Supercomputer
Published in Blog
At a press meeting, the Tokyo Institute of Technology announced the details of “Tsubame 2.0,” the university’s next-generation supercomputer system, which will start operation in the fall of 2010. The computation capacity of the system is 2.39 PFLOPS (petaflops, double-precision value), which would rank second in the “Top500,” a ranking of supercomputers, as of June 2010. “It will be the first petaflops computer in Japan,” said Satoshi Matsuoka, professor at the Global Scientific Information and Computing Center (GSIC) of the university. “And it will be the first world-class supercomputer system for our university.”
However, the actual construction of the system, which will be conducted by NEC Corp and Hewlett-Packard Co, has yet to be done. The system has a “vector-scalar mixture architecture,” Matsuoka said. But its graphics processing units (GPUs) account for 90% of the total computation capacity, making the system more like a vector computer. As a result, the performance of the system differs somewhat depending on the type of calculation. Specifically, the performance target on the Linpack benchmark is 1.0 to 1.4 PFLOPS, which would rank third or fourth in the Top500 as of June 2010. On the other hand, for calculations that are suited to vector computers, such as weather prediction, the performance can exceed 150 TFLOPS (teraflops), which is much higher than the world record (50 TFLOPS).

The backbone of the supercomputer system consists of 2,816 six-core 2.93 GHz Intel Xeon 5600 (Westmere-EP) microprocessors and 4,224 Nvidia Tesla M2050 GPUs. The double-precision arithmetic performance of the Tesla M2050 is much higher than that of the existing Tesla GPUs, which were developed mainly for single-precision arithmetic. Each Tesla M2050 has a performance of 515 GFLOPS (double-precision). The performance is 1.6 TFLOPS per node, or 51.2 TFLOPS per rack.
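The figures above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not an official calculation: it multiplies the quoted GPU count by the quoted per-GPU performance and compares the result with the announced 2.39 PFLOPS. The function names are made up for illustration, and the nodes-per-rack value is only inferred from the per-node and per-rack figures; the article does not state it.

```python
# Figures quoted in the article.
GPU_COUNT = 4224              # Nvidia Tesla M2050 units
GPU_GFLOPS_EACH = 515.0       # double precision, per GPU
SYSTEM_PEAK_PFLOPS = 2.39     # announced computation capacity
NODE_TFLOPS = 1.6             # performance per node
RACK_TFLOPS = 51.2            # performance per rack


def gpu_peak_pflops(count=GPU_COUNT, gflops_each=GPU_GFLOPS_EACH):
    """Combined double-precision capacity of all GPUs, in PFLOPS."""
    return count * gflops_each / 1e6  # 1 PFLOPS = 1e6 GFLOPS


def gpu_share(system_peak_pflops=SYSTEM_PEAK_PFLOPS):
    """Fraction of the announced capacity that comes from the GPUs."""
    return gpu_peak_pflops() / system_peak_pflops


def nodes_per_rack(rack_tflops=RACK_TFLOPS, node_tflops=NODE_TFLOPS):
    """Nodes per rack implied by the per-rack and per-node figures (inferred)."""
    return round(rack_tflops / node_tflops)


if __name__ == "__main__":
    print(f"GPU capacity: {gpu_peak_pflops():.3f} PFLOPS")
    print(f"GPU share of total: {gpu_share():.1%}")
    print(f"Implied nodes per rack: {nodes_per_rack()}")
```

The GPUs alone come to roughly 2.18 PFLOPS, or about 91% of the announced total, which matches the 90% share Matsuoka described.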

The university made two major improvements to enhance the performance of the system. First, it improved the memory bandwidth. Specifically, the network bisection bandwidth (the minimum communication capacity across a cut that divides the system into two halves) is about 200 Tbps, which is 33 times higher than that of the Tsubame 1.0, a supercomputer system the university constructed in 2006.
The other improvement was made to the memory and its composition. The university built a multilevel storage structure using not only DRAM such as DDR3 but also SSDs (solid state drives) based on flash memory. While the total DRAM capacity of the backbone system is 80.6 Tbytes for the microprocessors and 12.7 Tbytes for the GPUs, the total capacity of the SSDs is 173.9 Tbytes. SSDs offer high data input/output performance.
The new supercomputer system has one more noteworthy feature: low power consumption. While the power consumption of the Tsubame 1.0, including its cooling system, is 0.85 MW, that of the Tsubame 2.0, which has 30 times the computation capacity, is only 1 MW. The power consumption per unit of computation capacity was therefore reduced to about 1/25. The performance per watt (in terms of the Linpack benchmark) is expected to exceed 1,000 MFLOPS (megaflops) per watt and will possibly rank first in the Green500, a ranking of supercomputers’ energy efficiency, the university said.
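The power figures can be reproduced with a short calculation. The sketch below is illustrative only and uses hypothetical function names. It assumes that the 1 MW figure for the Tsubame 2.0 is the power drawn while running Linpack at the 1.0 to 1.4 PFLOPS target, which the article does not state explicitly.

```python
def power_per_capacity_ratio(old_mw=0.85, new_mw=1.0, speedup=30.0):
    """Power per unit of computation for the new system relative to the old one."""
    return (new_mw / old_mw) / speedup


def mflops_per_watt(pflops, megawatts):
    """Energy efficiency in MFLOPS per watt."""
    mflops = pflops * 1e9      # 1 PFLOPS = 1e9 MFLOPS
    watts = megawatts * 1e6    # 1 MW = 1e6 W
    return mflops / watts


if __name__ == "__main__":
    ratio = power_per_capacity_ratio()
    print(f"Power per capacity: about 1/{1 / ratio:.1f} of the Tsubame 1.0")
    # Linpack target range from the article, at an assumed 1 MW draw.
    for linpack_pflops in (1.0, 1.4):
        eff = mflops_per_watt(linpack_pflops, 1.0)
        print(f"{linpack_pflops} PFLOPS at 1 MW: {eff:.0f} MFLOPS/W")
```

Under these assumptions the ratio comes out at about 1/25.5, and the Linpack target range corresponds to 1,000 to 1,400 MFLOPS per watt, in line with the university’s expectation.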
The drastic decrease in power consumption per unit of computation capacity is also an advantage in terms of cost. The cost of the entire system plus basic maintenance for four years amounts to ¥3.2 billion (US$35 million), which is low: while the normal cost to introduce a supercomputer is about ¥10 million per TFLOPS, the cost to introduce the Tsubame 2.0 is about ¥3 million per TFLOPS. This cost does not include electricity, which costs about ¥100 million per year. If the electricity costs had increased in the same ratio as the computation capacity, they could have reached ¥2.5 billion per year.
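The cost figures can be checked in the same way. The article does not say which performance value the ¥3 million per TFLOPS is based on, so the sketch below, using hypothetical function names, simply evaluates the ¥3.2 billion total against both the 2.39 PFLOPS capacity and the Linpack target range. The electricity estimate assumes the factor of about 25 implied by the reduction in power per computation capacity.

```python
SYSTEM_COST_YEN = 3.2e9          # system plus four years of basic maintenance
ELECTRICITY_YEN_PER_YEAR = 100e6  # quoted annual electricity cost


def yen_per_tflops(cost_yen, pflops):
    """Introduction cost per TFLOPS for a given performance figure."""
    return cost_yen / (pflops * 1000.0)  # 1 PFLOPS = 1000 TFLOPS


def scaled_electricity_yen(actual_yen=ELECTRICITY_YEN_PER_YEAR, factor=25.0):
    """Annual electricity cost if power had grown in step with capacity."""
    return actual_yen * factor


if __name__ == "__main__":
    for label, pflops in (("capacity 2.39", 2.39),
                          ("Linpack 1.4", 1.4),
                          ("Linpack 1.0", 1.0)):
        cost = yen_per_tflops(SYSTEM_COST_YEN, pflops)
        print(f"{label} PFLOPS: {cost / 1e6:.2f} million yen per TFLOPS")
    print(f"Scaled electricity: {scaled_electricity_yen() / 1e9:.1f} billion yen per year")
```

Measured against the lower end of the Linpack target the cost is about ¥3.2 million per TFLOPS, which suggests, though the article does not confirm it, that the quoted ¥3 million refers to Linpack performance rather than the 2.39 PFLOPS capacity.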
21st June, 2010
