@Martian2
I really enjoyed reading your analysis on TSMC and IBM.
I am interested in your opinion regarding Chinese homegrown CPUs (to be precise, the Loongson MIPS family and the Shenwei Alpha family). Do you think Chinese CPUs can compete against x86 processors in the foreseeable future?
Thank you Martian2 as always for your insight.
China designs and manufactures excellent computer hardware. As you have pointed out, the Loongson/Godson chip is an excellent example. There is more information about the Loongson/Godson chip in the citations below.
However, the fundamental problem in competing with x86 processors is not matching their raw power. It is software compatibility. Most of the world uses the Microsoft Windows operating system. The software that runs on it (e.g. Microsoft Word, Microsoft Excel, Microsoft Exchange, Microsoft PowerPoint, etc.) is written for Intel's x86 processors.
The lack of widespread native software, together with the fact that most archived files are in Microsoft formats (e.g. Microsoft Word's .doc file format), creates a compatibility problem for any new non-x86 microprocessor. To run x86-compatible software, a Loongson/Godson chip has to operate in emulation mode (i.e. pretend to be an x86 processor through software emulation). Emulation significantly slows down the Loongson/Godson chip.
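To put a rough number on that slowdown, here is a minimal sketch using the roughly 30 per cent emulation penalty reported in the Register article cited below. The function names are hypothetical, and the sketch assumes the penalty acts as a uniform 30 per cent loss of throughput, which real workloads will not follow exactly.

```python
PENALTY = 0.30  # "about 30 per cent", per the Register article cited below


def emulated_throughput(native_throughput, penalty):
    """Throughput left after a fractional emulation penalty (0.30 = 30 per cent)."""
    if not 0.0 <= penalty < 1.0:
        raise ValueError("penalty must be in the range [0, 1)")
    return native_throughput * (1.0 - penalty)


def emulated_runtime(native_seconds, penalty):
    """Wall-clock time for the same job when throughput drops by `penalty`."""
    # A job that runs at 70% of native speed takes 1 / 0.7 times as long.
    return native_seconds / emulated_throughput(1.0, penalty)


if __name__ == "__main__":
    print(f"relative speed under emulation: {emulated_throughput(1.0, PENALTY):.2f}")
    print(f"a 60 s native job takes about {emulated_runtime(60.0, PENALTY):.1f} s")
```

Under that assumption, emulated code runs at about 70 per cent of native speed, so a job that needs one minute natively needs closer to a minute and a half.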
You have a chicken-and-egg dilemma. To promote the wider use of China's Loongson/Godson chip, you need native software programs written for the Loongson/Godson chip. To motivate software writers to create programs for the Loongson/Godson chip, you need widespread adoption of the Loongson/Godson chip.
Only the Chinese government can break the chicken-and-egg paradox. At some point, the Chinese government must require every Chinese government entity to use the Loongson/Godson chip and its software (by claiming a national security imperative to protect China from foreign electronic spying). From this massive base, the Loongson/Godson chip will migrate into the Chinese consumer sector. This is the only way to break the x86 monopoly, also known as the Wintel duopoly (Windows software on Intel hardware).
----------
China's Homemade Supercomputer May be the Most Efficient Ever - Technology Review
"
China's Homemade Supercomputer May be the Most Efficient Ever
Technology Review
Published by MIT
Christopher Mims 03/01/2011
As processors hit the power wall, performance per watt means everything.
China's home-grown supercomputer, the Dawning 6000, finally has a launch date: Summer 2011. In terms of raw performance, the machine is not going to be a record breaker, but it will be the first machine in the Top500 to be powered entirely by chips designed entirely by China's Institute of Computing Technology. Long term, they could be a major threat to Intel, AMD, NVIDIA and their ilk.
Dawning 6000 chassis with 10 Godson 3B-powered blade servers installed
Weiwu Hu, lead architect of the Loongson line of chips, announced the launch of the forthcoming supercomputer at the International Solid State Circuits Conference held last week. (Technology Review has been covering the development of this supercomputer for over a year, since the first intimation of its construction in January 2010.)
What's new as of Hu's latest announcement is the scale of the machine: 300 teraflops, achieved with 3,000 of the 1-GHz, 8-core Godson 3B chips. That's a far cry from the #1 position on the world's list of the top 500 supercomputers, currently occupied by the 2.56-petaflop Tianhe-1A machine, also built in China, but with Intel CPUs and NVIDIA GPUs.
The absolute power of the machine might not matter much, however, because as the platform matures, performance per watt will become the dominant metric, as it has for all other high performance systems (and even the chips in your laptop and cell phone). As HPC Wire reports,
"[T]he Godson-3B appears to be a very power-efficient design, and the upcoming Dawning machine could rival even Blue Gene/Q systems for performance per watt supremacy."
This efficiency is achieved because of the relatively low clock speed of the Godson 3B -- only 1.0 GHz -- and the chip's reliance on the MIPS architecture, which is also used in set-top boxes and is making its way into the smartphone market.
Significantly, in June 2010 the Chinese government was rumored to be contemplating the purchase of a 20 percent stake in MIPS Technologies, holder of the rights to the MIPS instruction set.
Hu also announced the launch in 2012 or 2013 of the Godson 3C chip, which at twice the clock speed and twice the cores of the 3B, will be four times as fast. This chip will be used to build a petascale supercomputer. Were such a system to debut today, it would likely be among the ten fastest supercomputers on earth."
----------
Godson: China shuns US silicon with faux x86 superchip
"
Who needs GPU co-processors?
By Timothy Prickett Morgan
Posted in HPC, 25th February 2011 21:07 GMT
ISSCC If the Chinese government is scaring the world with its hybrid CPU-GPU clusters, what do you think the reaction will be when Chinese supercomputers shun American-made x64 processors and GPU co-processors and start using their own energy-efficient, MIPS-derived, x86-emulating Godson line of 64-bit processors?
Apoplexy? Disbelief? A polite bow of respect? A bunch of orders for Godson chips is more likely, once you see what China is up to.
One of the more interesting presentations at this week's International Solid-State Circuits Conference, hosted by the IEEE in San Francisco, was by Weiwu Hu, the lead designer of the Godson family of processors being created by the Institute of Computing Technology at the Chinese Academy of Sciences.
China started developing its own processors in 2002, explained Hu, and the Godson family of chips, which is based on the MIPS architecture created by Silicon Graphics, is part of a holistic technology investment program. The Godson chip effort is one of 16 different projects, in fact, that are each funded with between $5bn and $10bn.
The massive projects focus on specific technology areas that China reckons are key for its technological independence and economic future, including processors and operating systems, chip process technology, 4G wireless networks, nuclear fission power plants, water pollution control and treatment, aircraft design and construction, high-resolution satellite imaging, and manned spaceflight and lunar exploration.
As El Reg reported a year ago when China's ICT was bragging about its plans to build a petaflops-scale supercomputer with server maker Dawning, ICT originally got access to MIPS technology through its partnership with wafer-baker STMicroelectronics. But in June 2009, as it got serious about its Godson chips (also known by the name Loongson) it licensed the MIPS32 and MIPS64 architectures straight from MIPS Technologies, the chip-designing division of Silicon Graphics that was spun out in an initial public offering in 1998.
The initial Godson-1 processors were 32-bit chips running at a mere 266 MHz, and the Godson-2 moved to 64 bits and was revved up to 1.2 GHz. With the Godson-2F chip in 2007 and 2008, ICT came out with a design that has a four-issue core running at 800 MHz, rated at 3.2 gigaflops. The Godson-3A chip was delayed nearly a year and was aimed solely at servers. ICT shifted to a four-core design and also did something else very clever: it added x64 instruction emulation right into the hardware. Hu only alluded to this emulation capability, but as El Reg explained a year ago, the Godson-3 chips have instructions added to help the QEMU hypervisor (the one that's at the heart of Red Hat's KVM hypervisor) to translate instructions from x86 to MIPS format. According to early benchmarks, the emulation penalty is about 30 per cent.
ICT's Godson family of chips for servers, PCs, and consumer electronics
The Godson-3A chip was implemented in a 65 nanometer process and ran at 1 GHz to deliver 16 gigaflops of floating point oomph. The chip has 425 million transistors, an area of 174.5 square millimeters, and burned only 10 watts under load. The chip included two 16-bit HyperTransport ports (licensed from Advanced Micro Devices), 4 MB of L2 cache, and two on-chip memory controllers that support either DDR2 or DDR3 main memory.
Oh Godson!
With the Godson-3B, which is what Hu was there to talk about in San Francisco, ICT is sticking with the same 65 nanometer CMOS process and running the chip at the same 1 GHz. But the chip is bumped up to eight cores from four and has two 256-bit vector co-processors per core. The chip has two HyperTransport ports and two DDR3 memory controllers, and weighs in at 583 million transistors in a 300 square millimeter area. Running at 1 GHz, peak performance on those vector units is 128 gigaflops, with the chip only emitting 40 watts. According to early tests, the cores burn about 28.9 watts, while the uncore parts of the chip (HT, memory controllers, and crossbar switches for linking chips together) consume 11.1 watts.
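The 128-gigaflops peak can be reproduced from the figures quoted above. The following is a back-of-the-envelope sketch, not something taken from the article: it assumes 64-bit (double-precision) elements and one fused multiply-add per vector lane per cycle, counted as two floating-point operations. The function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(cores, vector_units_per_core, vector_bits, clock_ghz,
                element_bits=64, flops_per_lane_per_cycle=2):
    """Theoretical peak in gigaflops, counting the vector units only.

    element_bits=64 and flops_per_lane_per_cycle=2 (one fused multiply-add)
    are assumptions chosen because they reproduce the quoted figure.
    """
    lanes = vector_bits // element_bits  # 256-bit vector -> 4 double-precision lanes
    return cores * vector_units_per_core * lanes * flops_per_lane_per_cycle * clock_ghz


if __name__ == "__main__":
    peak = peak_gflops(cores=8, vector_units_per_core=2, vector_bits=256, clock_ghz=1.0)
    total_watts = 28.9 + 11.1  # cores + uncore, as quoted
    print(f"peak: {peak:.0f} gigaflops")
    print(f"power: {total_watts:.1f} W")
    print(f"efficiency: {peak / total_watts:.1f} gigaflops per watt")
```

With those assumptions the eight cores give 8 x 2 x 4 x 2 = 128 operations per cycle, and the two power figures add up to the quoted 40 watts, or about 3.2 gigaflops per watt at peak.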
According to Hu, the vector extension units in the Godson-3B and Godson-2H processors have 128-entry, 256-bit register files and more than 300 SIMD instructions that have been added to the MIPS architecture.
Here's what the Godson-3B chip looks like:
ICT's Godson-3B MIPS processor, x86 emulation included
The Godson-3B processor will be used in the Dawning 6000 petaflops supercomputer, which China will be tweaking in 2012. Here's an early version of the blade equipped for the Godson-3B chips:
Dawning's two-socket Godson-3A and Godson-3B blade server
And this is what the blade server chassis looks like for the Dawning 6000:
The Dawning 6000 supercomputer blade server chassis
The Dawning 6000 blade design is used by the National Supercomputing Center in Shenzhen for its hybrid Xeon 5650-Nvidia M2050 system, which ranked number three on the Top 500 list from November 2010. That machine had an aggregate 1.27 petaflops of sustained performance running the Linpack Fortran benchmark test.
Another Dawning 6000 blade cluster with 3,000 of the Godson-3B chips, and rated at around 300 sustained teraflops, is expected to be up and running this summer, Hu said. (That would be about 384 peak theoretical teraflops just counting the vector units, not the cores.)
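The parenthetical peak figure is simply the chip count multiplied by the per-chip peak quoted earlier, and dividing the sustained figure by it gives the implied Linpack efficiency. A small sketch of that arithmetic follows; the variable names are illustrative only, and the efficiency is derived here rather than stated in the article.

```python
CHIPS = 3_000                 # Godson-3B chips in the cluster, as quoted
PEAK_GFLOPS_PER_CHIP = 128    # vector-unit peak per chip, as quoted
SUSTAINED_TERAFLOPS = 300     # "around 300 sustained teraflops", as quoted

# 1 teraflops = 1,000 gigaflops
peak_teraflops = CHIPS * PEAK_GFLOPS_PER_CHIP / 1_000
efficiency = SUSTAINED_TERAFLOPS / peak_teraflops

if __name__ == "__main__":
    print(f"peak: {peak_teraflops:.0f} teraflops")
    print(f"sustained / peak: {efficiency:.1%}")
```

On these numbers the cluster would sustain roughly 78 per cent of its theoretical peak.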
Those Dawning 6000 blades are by no means the highest density that ICT can come up with. Check out this system board for a 1U rack server that Hu showed off at ISSCC this week:
ICT's 1U2T Godson-3B system board
This 1U2T system board packs 16 of the eight-core Godson-3B processors onto a single board, rated at 2 teraflops. So a rack of these puppies would yield 42 teraflops. So instead of hundreds of cabinets to reach 1 petaflops of raw number-crunching performance, as it can take with big x64-based machines, ICT could, in theory, do it with 24 racks.
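The board rating and the rack count can be checked the same way. This sketch takes the article's 42 teraflops per rack at face value and only verifies the arithmetic around it; the variable names are illustrative.

```python
import math

CHIPS_PER_BOARD = 16
PEAK_GFLOPS_PER_CHIP = 128    # Godson-3B vector-unit peak, as quoted
RACK_TERAFLOPS = 42           # per-rack figure, taken from the article as is
TARGET_TERAFLOPS = 1_000      # 1 petaflops

# 16 chips x 128 gigaflops = 2,048 gigaflops, i.e. the "2 teraflops" rating
board_teraflops = CHIPS_PER_BOARD * PEAK_GFLOPS_PER_CHIP / 1_000

# Round up: a partly filled rack is still a rack
racks_needed = math.ceil(TARGET_TERAFLOPS / RACK_TERAFLOPS)

if __name__ == "__main__":
    print(f"board peak: {board_teraflops:.3f} teraflops")
    print(f"racks for 1 petaflops: {racks_needed}")
```

Dividing 1,000 by 42 gives about 23.8, which rounds up to the 24 racks mentioned.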
ICT is not going to stop here. The Godson-3C design will shift to a 28 nanometer process and will come in eight-core variants like the Godson-3B as well as a 16-core variant. The Godson-3C will have faster clock speeds, too, running at between 1.5 GHz and 2 GHz. The roadmap says the chip is also capable of expanding up to 16 cores, too. ICT says the Godson-3C will deliver 512 gigaflops of raw performance on math work, and the way the math works, that is twice as much math moving from 1 GHz to 2 GHz and then a doubling again as the core count goes from 8 to 16. This chip is expected sometime around late 2012 or early 2013.
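The 512-gigaflops figure is the Godson-3B peak scaled by both ratios. A minimal sketch of that scaling is below. It assumes peak performance grows linearly with clock speed and core count and that the per-core vector hardware stays the same as in the 3B, which the roadmap as described here does not spell out. The function name is hypothetical.

```python
def scaled_peak_gflops(base_gflops, base_clock_ghz, base_cores, new_clock_ghz, new_cores):
    """Scale a peak rating linearly by clock ratio and core-count ratio."""
    clock_ratio = new_clock_ghz / base_clock_ghz
    core_ratio = new_cores / base_cores
    return base_gflops * clock_ratio * core_ratio


if __name__ == "__main__":
    # Godson-3B: 128 gigaflops at 1 GHz with 8 cores (figures quoted above)
    top_3c = scaled_peak_gflops(128, 1.0, 8, new_clock_ghz=2.0, new_cores=16)
    low_3c = scaled_peak_gflops(128, 1.0, 8, new_clock_ghz=1.5, new_cores=8)
    print(f"16 cores at 2.0 GHz: {top_3c:.0f} gigaflops")
    print(f" 8 cores at 1.5 GHz: {low_3c:.0f} gigaflops")
```

Only the 16-core part at the top 2 GHz clock reaches 512 gigaflops under this model; an eight-core part at 1.5 GHz would come in at 192.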
Wouldn't it be funny if Silicon Graphics started building systems with these Godson-3 chips? They could dust off Irix and take it out for a spin on some new iron and allow it to run x64-based Linux applications in emulation mode."