China has done it again, except this time with a brand new supercomputer. The Sunway TaihuLight is now the fastest system in the world, according to the twice-yearly TOP500 list, with a stunning Linpack benchmark result of 93 petaflops. That makes it nearly three times as fast as the prior champion, China’s Tianhe-2, which we’ve covered numerous times on ExtremeTech and which had sat atop the list since it first went online in 2013.
What’s even more interesting this time around is what’s under TaihuLight’s hood: a locally developed ShenWei processor and custom interconnect, instead of parts sourced elsewhere. The ShenWei 26010 is a 260-core, 64-bit RISC chip that exceeds 3 teraflops at maximum tilt, putting it on par with Intel’s Knights Landing Xeon Phi. TaihuLight contains 40,960 ShenWei 26010s, one per node, and each node also carries 32GB of RAM. That adds up to a total of over 10 million cores.
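The system totals follow from simple multiplication. Here is a rough sketch in Python using only the figures quoted above. The variable names are illustrative, and the 3-teraflop figure is treated as a floor for per-chip peak (the chip is said to exceed it), so the peak is a lower bound and the efficiency an upper bound.

```python
# Figures quoted in the article.
NODES = 40_960             # one ShenWei 26010 per node
CORES_PER_CHIP = 260
RAM_PER_NODE_GB = 32
LINPACK_PFLOPS = 93.0      # measured Linpack result

# Assumption: 3 teraflops is used as a floor for per-chip peak,
# since the chip is only described as exceeding that figure.
MIN_CHIP_PEAK_TFLOPS = 3.0

total_cores = NODES * CORES_PER_CHIP
total_ram_gb = NODES * RAM_PER_NODE_GB
min_peak_pflops = NODES * MIN_CHIP_PEAK_TFLOPS / 1000   # teraflops -> petaflops
max_linpack_efficiency = LINPACK_PFLOPS / min_peak_pflops

print(f"Total cores: {total_cores:,}")
print(f"Total RAM: {total_ram_gb:,} GB")
print(f"Theoretical peak: at least {min_peak_pflops:.2f} petaflops")
print(f"Linpack efficiency: at most {max_linpack_efficiency:.1%}")
```

That works out to 10,649,600 cores and roughly 1.3 petabytes of memory across the machine.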
As HPCwire illustrates, each SW26010 processor chip has four main components, which are grouped together in the basic layout of a node.
The chip has four core groups, each with 64 compute elements and a single management processing element, for a total of 65 cores per group (four groups of 65 give the 260-core total). Each group sports a 136.5 GB/sec memory controller; there’s no word on the process technology node used to manufacture the chip. The TOP500 report said that the chip also lacks any traditional L1-L2-L3 cache hierarchy, and instead has 12KB of instruction cache and a 64KB “local scratchpad” that works sort of like an L1 cache.
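To make the core arithmetic concrete, here is a minimal Python sketch of the chip layout as described. The class and field names (`CoreGroup`, `Chip`, `compute_elements`, `management_elements`) are hypothetical, chosen for illustration, and do not come from any Sunway toolchain.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CoreGroup:
    """One core group: 64 compute elements plus 1 management element."""
    compute_elements: int = 64
    management_elements: int = 1

    @property
    def cores(self) -> int:
        return self.compute_elements + self.management_elements


@dataclass(frozen=True)
class Chip:
    """A chip modeled as a fixed collection of core groups."""
    groups: Tuple[CoreGroup, ...]

    @property
    def cores(self) -> int:
        return sum(group.cores for group in self.groups)


# Four identical core groups, as described in the article.
sw26010 = Chip(groups=tuple(CoreGroup() for _ in range(4)))

print(f"Cores per group: {sw26010.groups[0].cores}")
print(f"Cores per chip: {sw26010.cores}")
```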
The custom interconnect, called the Sunway Network, is based on PCIe 3.0 and delivers 16 GB/sec of peak bandwidth between nodes, with latency of about 1 microsecond.
TaihuLight will be used for climate, weather, and earth systems modeling; life science research; manufacturing; and data analytics, according to TOP500’s official report. The system is located at the National Supercomputing Center in Wuxi, which sits about two hours west of Shanghai.
This is the first time China has the most systems on the list, with 167, ahead of the US, which is down to 165. China also now has the two fastest systems. Europe has 105 systems, down two from November 2015. Cray continues to lead in total performance share at 19.9%, but that’s down from 25%. China’s National Research Center of Parallel Computing Engineering & Technology, which developed TaihuLight, takes the second spot here with this single machine, at a 16.4% share, while IBM takes third with 10.7%.
Total combined performance of all 500 supercomputers has jumped significantly, from 420 petaflop/s six months ago to 566.7 now. Ninety-five systems on the list now exceed one petaflop.
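For a sense of scale, the quoted totals can be turned into a growth rate and a single-machine share. This is a quick Python sketch using only figures from the article; the variable names are illustrative.

```python
# Figures quoted in the article, in petaflop/s.
TOTAL_PREVIOUS_PFLOPS = 420.0   # combined list performance six months ago
TOTAL_CURRENT_PFLOPS = 566.7    # combined list performance now
TAIHULIGHT_PFLOPS = 93.0        # TaihuLight's Linpack result

# Fractional growth of the whole list over six months.
growth = TOTAL_CURRENT_PFLOPS / TOTAL_PREVIOUS_PFLOPS - 1

# Fraction of the list's combined performance from TaihuLight alone.
taihulight_share = TAIHULIGHT_PFLOPS / TOTAL_CURRENT_PFLOPS

print(f"Six-month growth: {growth:.1%}")
print(f"TaihuLight share of total: {taihulight_share:.1%}")
```

The combined total grew by about 35% in six months, and TaihuLight alone accounts for about 16.4% of it, which lines up with the performance share credited to its developer.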
Here’s the current list of the 10 fastest supercomputers in the world:
3. Titan: A Cray XK7 system at the Department of Energy’s Oak Ridge National Laboratory (17.59 petaflop/s).
4. Sequoia: An IBM BlueGene/Q system located at the Department of Energy’s Lawrence Livermore National Lab in California, with 1.57 million cores.
5. K Computer: A SPARC64 system with 705k cores at the RIKEN Advanced Institute for Computational Science in Japan.
8. Piz Daint: A Cray XC30 with 116k Xeon and Nvidia cores, located at the Swiss National Supercomputing Centre in Switzerland.
10. Shaheen II: A Cray XC40 at King Abdullah University of Science and Technology in Saudi Arabia, marking the first appearance of a Middle East supercomputer in the top 10 (5.536 petaflop/s).