China Dominates the World TOP500 Supercomputers

Let's keep this thread focused on supercomputers only. If you want to talk about other topics, please go to a more appropriate thread. Thanks.

See, the talk here is already way off supercomputers. Not my fault.
 
Let's get back to business.

There are currently three competing domestic CPUs that will power the 100+ petaflops supercomputers: ShenWei, Loongson Godson, and Phytium Mars. Each has its own architecture: ShenWei is a RISC design, Loongson is MIPS-based, and Phytium Mars is ARM-based. So there are very interesting times ahead. Right now the 4th-gen ShenWei CPU is powering Sunway TaihuLight, and I heard a 5th gen is done and ready to power a 200-300 petaflops supercomputer. We will see which design is best at powering our next exascale supercomputer in 2020, as all three competing CPU architectures are well suited to exascale. It is always good to see intense competition among our CPU scientists. As for which design can be used in commercial computers, the desktop at your home so to speak, I've gotta go with Loongson Godson, because an emulator can translate x86 instructions into MIPS instructions, which means commercial operating systems, Windows-based and iOS, can be usable on Godson CPUs.
 
You forgot the DSP chip to be used in Tianhe-2A.

--
Dr Lu, who leads the design of China's Tianhe supercomputers, said homegrown digital-signal processors (DSPs) will power the upgrade to the Tianhe-2A super, our sister website The Platform reports. Dr Lu revealed the development at the International Supercomputing Conference in Germany on Wednesday.

The boosted Tianhe-2A is due to go live before the end of 2016, and is apparently expected to perform 100PFLOPs – 100,000 trillion calculations per second – at its peak. It will, according to Dr Lu, consume up to 18MW of power, pack about three petabytes of system RAM, and use Intel Xeon E5-2692 processors from the Tianhe-2 plus the new homegrown accelerators.

Today's Tianhe-2 – the world's most powerful publicly known supercomputer – uses a mix of E5-2692 CPUs and Xeon Phi accelerators. Essentially, the 2A will use the China-crafted DSPs instead of the Phis, alongside the Xeon E5 processors, it appears. The Tianhe-2A will be built from 18,000 nodes, and run off a 30PB file system, we're told.

The Matrix2000 DSPs are 1GHz 64-bit chips, they draw 200W of power, and can perform 4.8TFLOPS of single and 2.4TFLOPS of double precision math. They are interfaced using x16 PCIe 3.0 links, and performance-wise, give GPUs and the Phi a run for their money.

"The Tianhe-2 machine (and its eventual successor sporting the DSP accelerators) is housed at the National University of Defense Technology (NUDT) in China," The Platform co-editor Nicole Hemsoth reports.
Source: http://www.theregister.co.uk/2015/07/15/china_supercomputer_chips/
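
For anyone who wants to sanity-check the numbers in that article, here is a rough Python sketch. It uses only the figures quoted above (100 PFLOPS peak, 18 MW, 18,000 nodes, and 4.8/2.4 TFLOPS at 200 W per Matrix2000). The variable and function names are made up for illustration, and these are peak figures, so they say nothing about sustained performance.

```python
# Figures quoted in the article; all are peak/expected values, not measurements.
PEAK_PFLOPS = 100.0      # Tianhe-2A expected peak
POWER_MW = 18.0          # expected maximum power draw
NODES = 18_000           # node count
DSP_SP_TFLOPS = 4.8      # Matrix2000 single precision
DSP_DP_TFLOPS = 2.4      # Matrix2000 double precision
DSP_WATTS = 200.0        # Matrix2000 power draw


def gflops_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency in GFLOPS/W from raw FLOPS and watts."""
    return flops / watts / 1e9


# Whole-system efficiency implied by 100 PFLOPS at 18 MW.
system_eff = gflops_per_watt(PEAK_PFLOPS * 1e15, POWER_MW * 1e6)

# Accelerator-only efficiency implied by the per-chip figures.
dsp_dp_eff = gflops_per_watt(DSP_DP_TFLOPS * 1e12, DSP_WATTS)
dsp_sp_eff = gflops_per_watt(DSP_SP_TFLOPS * 1e12, DSP_WATTS)

# Average peak per node if the 100 PFLOPS were spread evenly.
tflops_per_node = PEAK_PFLOPS * 1e3 / NODES

print(f"system:        {system_eff:.1f} GFLOPS/W")
print(f"Matrix2000 DP: {dsp_dp_eff:.1f} GFLOPS/W")
print(f"Matrix2000 SP: {dsp_sp_eff:.1f} GFLOPS/W")
print(f"per node:      {tflops_per_node:.2f} TFLOPS")
```

The gap between the chip-level figure (about 12 GFLOPS/W in double precision) and the system-level one (about 5.6 GFLOPS/W) is expected, since the 18 MW also covers the Xeon CPUs, memory, interconnect and so on.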

[Image: Matrix2000 GPDSP]
 
Using a 14nm process, the upcoming exascale machine is envisioned to achieve an energy efficiency, measured in performance per watt, of up to 60 GFLOPS/W (vs 6 GFLOPS/W with the SW26010), with new cutting-edge cooling tech supporting an extreme power density of over 100 kW/m³ for the whole system.
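
To put those efficiency numbers in perspective, here is a small Python sketch of the power needed to reach 1 exaflops at each figure. It assumes, purely for illustration, that the quoted GFLOPS/W applies to the whole compute load at peak; the function name is hypothetical and real systems add overhead for memory, interconnect and cooling.

```python
EXAFLOPS = 1e18  # 1 exaflops in FLOPS


def power_megawatts(target_flops: float, gflops_per_watt: float) -> float:
    """Power in MW needed to deliver target_flops at a given efficiency."""
    watts = target_flops / (gflops_per_watt * 1e9)
    return watts / 1e6


# Efficiency figures from the post: SW26010 today vs the 14nm target.
sw26010_mw = power_megawatts(EXAFLOPS, 6.0)
target_mw = power_megawatts(EXAFLOPS, 60.0)

print(f"1 EFLOPS at  6 GFLOPS/W: {sw26010_mw:.1f} MW")
print(f"1 EFLOPS at 60 GFLOPS/W: {target_mw:.1f} MW")
```

At 6 GFLOPS/W an exaflops would need roughly 167 MW, while at 60 GFLOPS/W it drops to roughly 17 MW, which is why the tenfold efficiency gain matters so much for an exascale machine.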


:coffee::D

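Exascale hardware system research

Research was carried out mainly in three areas: processor architecture, interconnect networks, and whole-system infrastructure. The core of the processor research is the energy-efficiency constraint. This project proposed several different technical routes, including high-performance GPDSP, dataflow SPU, and heterogeneous general-purpose many-core, and studied each of them separately; on a 14nm process, processor energy efficiency is expected to reach 30-60 GFLOPS/W. The interconnect research mainly proposed two different technical routes, a high-dimensional scalable interconnect network and an optical-electrical hybrid interconnect network, and for each architecture proposed a high-speed interconnect scheme that effectively supports a scale of 100,000 nodes. For whole-system infrastructure, the research focused on cooling technology and proposed new cooling techniques, including fin-type enhanced heat-transfer liquid cold plates and phase-change cold plates, which can effectively meet the requirement of a volumetric heat-dissipation power density of over 100 kW/m³ in an exascale environment.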

@Bussard Ramjet
 

Who the hell is Phytium? Never heard of their chips before. Is the Chinese HPC chip business about to experience a period of exponential growth, and then the Chinese are going to sell their HPC chips by the pound? It would be fun to see Chinese supers take over the top spots on the TOP500 list in a few years with 4 different kinds of Chinese home-brewed HPC chips.
 

China Shakes Up ARM Servers

64-core chip leapfrogs competition

Rick Merritt

8/25/2015 08:00 PM EDT

A little-known Chinese company described at the Hot Chips event here the most aggressive ARM-based server processor to date. In the same session, Oracle described its first Sparc processor with integrated Infiniband.

Little known Phytium Technology Co. Ltd., founded in 2012, described a processor using 64 custom ARMv8 cores that will run at up to 2 GHz at 28nm. It can issue up to four instructions per cycle to hit up to 512 GFlops. The massive chip consumes 120W and fits in a 640 mm² die with about 3,000 pins.

The so-called Mars design surpasses existing high-end ARM-based server chips such as the 48-core ThunderX now sampling from Cavium and a high-end part still in the works at Broadcom. In February EZchip said it will ship a 100-core ARMv8 made in a 28nm process, but it may not ship until 2017.

The Mars design has not yet taped out, but nevertheless impressed analysts and observers at the annual gathering of microprocessor designers here, in part because few had heard of the company.


Like IBM's Power 8, Mars uses external L3 cache and memory controllers.

“My God, who knew…this is by far the most aggressive 64-bit ARM chip to be announced – it’s just awesome, and it was definitely the surprise of this event,” said Nathan Brookwood, principal of Insight64 (Saratoga, Calif.).

Sam Naffziger, a fellow at AMD who moderated the session, called Mars a respectable design with a “good cache hierarchy and good bandwidth match.”

Hot Chips organizers were surprised to get a paper proposal from Phytium, a company they had not heard from previously. It had accepted several papers in the past from a China government- and university-backed team building the so-called Godson processor.

“I was surprised we didn’t hear from [the Godson team] again this year,” said Ralph Wittig, a Hot Chips organizer. “When we got the Phytium paper we heard from ARM they were confident the startup was doing real stuff…their external memory modules are like IBM’s work on Power 8…we were highly impressed as a program committee,” Wittig said.

Adding to the mystery, a Phytium engineering manager was not able to get a U.S. visa in time for the event. He presented his slides by phone from China, where the company has offices in Tianjin and Guangzhou.

One attendee familiar with Phytium said the team was not from the Godson project. The company’s Tianjin offices did suffer broken glass and shrapnel from the recent explosions there, he said.

In simulations on the SpecCPU 2006 rate benchmark, Mars hit 672 in integer and 585 in floating-point performance for a 64-core chip. However, observers noted its scaling from single-core performance was modest.

The chip is organized into eight-core panels in which four cores share a 4-MByte cache. Eight external chips provide a total of 128 Mbytes L3 cache and 16 DDR3-1600 channels.

Phytium’s custom 64-bit ARM core has 192 physical registers. A reorder buffer can hold up to 160 instructions, and about 210 instructions can be in-flight in the overall pipeline.


Phytium designed its own 64-bit ARM core code-named Xiaomi.

The chip dispatches and retires instructions in-order and executes them out-of-order. It uses an aggressive branch predictor and implements multithreading.

Mars supports MPI and OpenMP interfaces for multiprocessing systems. Another processor in the works, called Earth, will be a lower cost, lower power device aimed more at today’s large data center.

“I’m pretty sure [Mars] will be the first 64-core ARMv8 processor in the world,” said Charles Zhang, director of research for Phytium, speaking via a phone line to Hot Chips attendees. “It’s a good beginning…in next few years we will develop more powerful CPUs,” he said.

One of the biggest drawbacks of Mars is its size, said analysts. Achieving good yields on such a large chip will be difficult, they noted.


http://www.eetimes.com/document.asp?doc_id=1327526
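
A few per-core numbers fall out of the figures in that article, worked through in the rough Python sketch below. Everything comes from the article except the 12.8 GB/s peak per DDR3-1600 channel, which is the standard figure for a 64-bit channel and is an assumption about how Mars is wired. The variable names are made up for illustration, and the chip had not taped out, so these are design targets rather than measurements.

```python
# Design targets quoted in the article (pre-tape-out, simulated results).
CORES = 64
CLOCK_GHZ = 2.0
PEAK_GFLOPS = 512.0
CHIP_WATTS = 120.0
L3_MBYTES = 128
DDR3_CHANNELS = 16
SPEC_INT_RATE = 672
SPEC_FP_RATE = 585

# Assumption, not from the article: peak bandwidth of one 64-bit
# DDR3-1600 channel (1600 MT/s * 8 bytes).
DDR3_1600_GB_PER_S = 12.8

# Floating-point operations each core must retire per cycle to reach peak.
flops_per_cycle_per_core = PEAK_GFLOPS / (CORES * CLOCK_GHZ)

# Chip-level energy efficiency at peak.
gflops_per_watt = PEAK_GFLOPS / CHIP_WATTS

# External L3 capacity and memory bandwidth shared across the cores.
l3_mbytes_per_core = L3_MBYTES / CORES
mem_bandwidth_gb_per_s = DDR3_CHANNELS * DDR3_1600_GB_PER_S
bytes_per_flop = mem_bandwidth_gb_per_s / PEAK_GFLOPS

# Simulated SPEC CPU2006 rate scores divided evenly over the cores.
spec_int_per_core = SPEC_INT_RATE / CORES
spec_fp_per_core = SPEC_FP_RATE / CORES

print(f"FLOPs/cycle/core: {flops_per_cycle_per_core:.1f}")
print(f"GFLOPS/W:         {gflops_per_watt:.2f}")
print(f"L3 per core:      {l3_mbytes_per_core:.1f} MB")
print(f"memory bandwidth: {mem_bandwidth_gb_per_s:.1f} GB/s")
print(f"bytes per FLOP:   {bytes_per_flop:.2f}")
print(f"SPECint rate/core: {spec_int_per_core:.2f}")
print(f"SPECfp rate/core:  {spec_fp_per_core:.2f}")
```

The peak works out to 4 FLOPs per cycle per core and a bit over 4 GFLOPS/W. The roughly 10.5 SPECint rate points per core is consistent with the observers' remark that scaling from single-core performance was modest.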
 

Very impressive, and this startup came out of the blue!
 
I had a strong feeling that most of these 20 satellites were nanosatellites.

Thanks for exposing the fraudulent Indian 'achievement'.

How is it fraudulent? :what:

The achievement is the number of satellites put into orbit, not their weight. :coffee:

https://en.wikipedia.org/wiki/Polar_Satellite_Launch_Vehicle#Launch_history

Though I think the Falcon 9 launch with 10 Iridium NEXT satellites will be a more impressive showing. :pop:
[Image: Iridium NEXT satellite configuration]

https://en.wikipedia.org/wiki/List_of_Falcon_9_and_Falcon_Heavy_launches#2016
 
The Matrix2000 DSP for Tianhe-2A is an accelerator.

What a beast!

It will be an exciting time, seeing competing CPU architectures put to the test at the highest level of computation. We are the only ones in the world testing multiple chip designs on supercomputers, thanks in part to the stupid ban. LOL
 
Neither Sunway nor Tianhe is the frontrunner of China's supercomputing technology; supercomputers built on conventional microchips are becoming obsolete and will soon hit a bottleneck.

China's true supercomputing goal for this century is the quantum supercomputer. The quantum satellite launching this coming July will be even more exciting news than the TaihuLight.

Holy, the quantum methodology killed my mathematics knowledge. :o::o:
 