China Shakes Up ARM Servers
64-core chip leapfrogs competition
Rick Merritt
8/25/2015 08:00 PM EDT
A startup in China described at the Hot Chips event here the most aggressive ARM-based server processor to date. In the same session, Oracle described its first Sparc processor with integrated Infiniband.
Little-known Phytium Technology Co. Ltd., founded in 2012, described a processor using 64 custom ARMv8 cores that will run at up to 2 GHz at 28nm. It can issue up to four instructions per cycle to hit up to 512 GFlops. The massive chip consumes 120W and fits in a 640mm² die with about 3,000 pins.
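The 512-GFlops peak is consistent with a simple product of the quoted figures, if one assumes that each of the four instructions issued per cycle counts as one floating-point operation. That accounting is an assumption for illustration; Phytium's own breakdown is not given here. A rough back-of-envelope sketch in Python:

```python
# Back-of-envelope check of the figures quoted for Mars.
# Assumption (not confirmed by Phytium): each core completes 4
# floating-point operations per cycle, matching its four-wide issue.
CORES = 64
CLOCK_GHZ = 2.0
FLOPS_PER_CYCLE = 4  # assumed
POWER_W = 120
DIE_MM2 = 640


def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak throughput in GFlops."""
    return cores * clock_ghz * flops_per_cycle


peak = peak_gflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
gflops_per_watt = peak / POWER_W
gflops_per_mm2 = peak / DIE_MM2

print(f"peak: {peak:.0f} GFlops")
print(f"efficiency: {gflops_per_watt:.2f} GFlops/W")
print(f"density: {gflops_per_mm2:.2f} GFlops/mm^2")
```

Under that assumption the quoted numbers work out to roughly 4.3 GFlops per watt. These are theoretical peaks for a chip that has not taped out, not measured results.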
The so-called Mars design surpasses existing high-end ARM-based server chips such as the 48-core ThunderX now sampling from Cavium and a high-end part still in the works at Broadcom.
In February, EZchip announced a 100-core ARMv8 chip made in a 28nm process, but it may not ship until 2017.
The Mars design has not yet taped out, but nevertheless impressed analysts and observers at the annual gathering of microprocessor designers here, in part because few had heard of the company.
Like IBM's Power 8, Mars uses external L3 cache and memory controllers.
“My God, who knew…this is by far the most aggressive 64-bit ARM chip to be announced – it’s just awesome, and it was definitely the surprise of this event,” said Nathan Brookwood, principal of Insight64 (Saratoga, Calif.).
Sam Naffziger, a fellow at AMD who moderated the session, called Mars a respectable design with a “good cache hierarchy and good bandwidth match.”
Hot Chips organizers were surprised to get a paper proposal from Phytium, a company they had not heard from previously. They had accepted several papers in the past from a China government- and university-backed team building the so-called Godson processor.
“I was surprised we didn’t hear from [the Godson team] again this year,” said Ralph Wittig, a Hot Chips organizer. “When we got the Phytium paper we heard from ARM they were confident the startup was doing real stuff…their external memory modules are like IBM’s work on Power 8…we were highly impressed as a program committee,” Wittig said.
Adding to the mystery, a Phytium engineering manager was not able to get a U.S. visa in time for the event. He presented his slides by phone from China, where the company has offices in Tianjin and Guangzhou.
One attendee familiar with Phytium said the team was not from the Godson project. The company’s Tianjin offices did suffer broken glass and shrapnel from the recent explosions there, he said.
In simulations on the SpecCPU 2006 rate benchmark, Mars hit 672 in integer and 585 in floating-point performance for a 64-core chip. However, observers noted its scaling from single-core performance was modest.
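For a sense of scale, the simulated rate scores can be divided by the core count to get an average contribution per core. This is a minimal sketch using only the two scores above; it is not the same as a single-core SpecCPU result, which Phytium's figures here do not include.

```python
# Average per-core contribution to the simulated SpecCPU 2006 rate scores.
# Note: this is a plain average, not a single-core benchmark result.
SPEC_INT_RATE = 672
SPEC_FP_RATE = 585
CORES = 64


def per_core(rate_score, cores):
    """Average share of a rate score attributable to each core."""
    return rate_score / cores


int_per_core = per_core(SPEC_INT_RATE, CORES)
fp_per_core = per_core(SPEC_FP_RATE, CORES)

print(f"integer: {int_per_core:.2f} per core")
print(f"floating point: {fp_per_core:.2f} per core")
```

Comparing these averages with a true single-core score would show how much throughput is lost to sharing caches and memory bandwidth, which is the scaling question observers raised.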
The chip is organized into eight-core panels in which four cores share a 4-MByte cache. Eight external chips provide a total of 128 MBytes of L3 cache and 16 DDR3-1600 channels.
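The totals implied by that organization can be worked out with a little arithmetic. The sketch below assumes that every group of four cores has its own 4-MByte cache, that the L3 and memory channels are split evenly across the eight external chips, and that each DDR3-1600 channel is a standard 64-bit channel running at 1,600 MT/s. None of these splits is spelled out in the description above, so treat the results as estimates.

```python
# Memory-hierarchy totals implied by the quoted Mars organization.
# Assumptions (not stated by Phytium): one 4-MByte cache per group of
# four cores, an even split across external chips, and standard
# 64-bit (8-byte) DDR3-1600 channels.
CORES = 64
CORES_PER_PANEL = 8
CORES_PER_SHARED_CACHE = 4
SHARED_CACHE_MB = 4
EXTERNAL_CHIPS = 8
L3_TOTAL_MB = 128
DDR_CHANNELS = 16
DDR_MEGATRANSFERS = 1600   # DDR3-1600
BYTES_PER_TRANSFER = 8     # 64-bit channel (assumed)

panels = CORES // CORES_PER_PANEL
shared_caches = CORES // CORES_PER_SHARED_CACHE
shared_cache_total_mb = shared_caches * SHARED_CACHE_MB
l3_per_chip_mb = L3_TOTAL_MB // EXTERNAL_CHIPS
channels_per_chip = DDR_CHANNELS // EXTERNAL_CHIPS

# Peak bandwidth in GB/s: megatransfers/s * bytes per transfer / 1000
channel_gb_s = DDR_MEGATRANSFERS * BYTES_PER_TRANSFER / 1000
total_gb_s = channel_gb_s * DDR_CHANNELS

print(f"{panels} panels, {shared_caches} shared caches "
      f"({shared_cache_total_mb} MBytes total)")
print(f"{l3_per_chip_mb} MBytes L3 and {channels_per_chip} channels "
      f"per external chip")
print(f"peak memory bandwidth: {total_gb_s:.1f} GB/s")
```

On these assumptions the 16 channels provide a theoretical peak of about 205 GB/s, or roughly 3.2 GB/s per core.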
Phytium’s custom 64-bit ARM core has 192 physical registers. A reorder buffer can hold up to 160 instructions, and about 210 instructions can be in-flight in the overall pipeline.
Phytium designed its own 64-bit ARM core code-named Xiaomi.
The chip dispatches and retires instructions in-order and executes them out-of-order. It uses an aggressive branch predictor and implements multithreading.
Mars supports MPI and OpenMP interfaces for multiprocessing systems. Another processor in the works, called Earth, will be a lower-cost, lower-power device aimed more at today’s large data center
“I’m pretty sure [Mars] will be the first 64-core ARMv8 processor in the world,” said Charles Zhang, director of research for Phytium, speaking via a phone line to Hot Chips attendees. “It’s a good beginning…in next few years we will develop more powerful CPUs,” he said.
One of the biggest drawbacks of Mars is its size, said analysts. Achieving good yields on such a large chip will be difficult, they noted.
http://www.eetimes.com/document.asp?doc_id=1327526