Friday, May 8, 2009

PSCs and Streamcomputing without FPGA

The most seductive FPGA application is in the realm of reconfigurable supercomputing.

Ages ago, in April 2005, Cray Inc. unveiled their XD1 supercomputer, which used an array of Xilinx Virtex2P FPGAs for CPU acceleration. The XD1 came with an SDK that allowed programmers to move time-critical sections of 'C' code into dedicated computing hardware within the FPGA (for C-to-Hardware translation tools, check Starbridge Viva, ImpulseC, Mentor Catapult-C, Altera C2H, Celoxica Handel-C, Cadence C2Silicon, the free http://www.c-to-verilog.com /*thanks Nadav*/, Forte Cynthesizer, Nallatech Dime-C, Mitrion-C and Synfora PICO). This ability to create and extend the CPU instruction set on demand, per application and per need, is the basis of Reconfigurable Computing. Stream computing is then a natural extension, as parallelism is an inherent property of FPGA-based computing engines.
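To make "time-critical sections of 'C' code" a bit more concrete, here is a minimal sketch of the kind of loop one might hand to a C-to-Hardware tool. It is plain, portable C and does not use the API of any tool listed above; the function name `fir_filter`, the tap count `NUM_TAPS` and the fixed-point types are assumptions chosen purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_TAPS 4  /* hypothetical filter length, chosen for illustration */

/*
 * Hypothetical hot spot: a fixed-point FIR filter.
 * Every output sample depends only on a small window of inputs, so the
 * inner multiply-accumulate can be unrolled into parallel multipliers and
 * the outer loop can be pipelined. That regularity is what makes loops
 * like this natural candidates for dedicated hardware.
 * Writes n - NUM_TAPS + 1 results to out (nothing if n < NUM_TAPS).
 */
void fir_filter(const int16_t *in, int32_t *out, size_t n,
                const int16_t coeff[NUM_TAPS])
{
    for (size_t i = 0; i + NUM_TAPS <= n; i++) {
        int32_t acc = 0;
        for (size_t t = 0; t < NUM_TAPS; t++) {
            acc += (int32_t)in[i + t] * coeff[t];
        }
        out[i] = acc;
    }
}
```

On a CPU the multiply-accumulates run one after another; in an FPGA the same arithmetic could, in principle, be laid out side by side. How well that works depends on the tool and the target device.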

Then, in July 2006, AMD got hold of ATI and announced plans to design 'a new kind of processor': the Fusion processor, which would integrate the GPU and use it for computing tasks other than 3D video rendering, shading, and display-related math and data movement.

The first step, however, was to take the hard-coded, inaccessible data paths and parallel compute engines in the deep core of a GPU and turn them into flexible, multi-purpose, fully exposed stream processors that a C programmer could control directly through a special SDK.
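The stream model itself is simple enough to sketch in plain C. The idea is that a small "kernel" function sees one input element and produces one output element, with no dependency on its neighbours, so the hardware is free to run many copies at once. The code below is only a serial stand-in for that idea: `stream_kernel`, `scale_and_offset` and `stream_map` are made-up names, not part of any vendor SDK.

```c
#include <stddef.h>

/* A kernel maps exactly one input element to one output element. */
typedef float (*stream_kernel)(float);

/* Hypothetical example kernel: y = 2x + 1. */
float scale_and_offset(float x)
{
    return 2.0f * x + 1.0f;
}

/*
 * Serial stand-in for a stream processor: applies the kernel to every
 * element of the input stream. Because no iteration reads another
 * iteration's result, the iterations could run in any order, or all at
 * the same time on parallel hardware.
 */
void stream_map(stream_kernel kernel, const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = kernel(in[i]);
    }
}
```

A real SDK would replace the loop with a launch across the GPU's stream processors; the programmer's job is mostly to write the kernel and describe the streams.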

While AMD is now very close to presenting the world with their GPU-fused-to-CPU silicon, GPU-based supercomputing has already arrived! It comes in the form of a PCIe accelerator card (AMD FireStream, Nvidia Tesla) and an SDK.

Amazingly, these cards are designed for ordinary PCs and can be bought at retail for under $1000! This makes Cray-like supercomputing available for far less than millions of dollars, on the desktop, outside the big air-conditioned, water-cooled room! Supercomputers are becoming very, very personal: Personal Super Computers, or PSCs.

Will this fusion of technologies steal the story from the FPGAs?


Or will the fusion continue and add some FPGA-like gate-level programmability to CPUs, in the same way that FPUs have been absorbed and GPUs are being absorbed now?


5 comments:

Nadav said...

Hi. You forgot to mention http://www.c-to-verilog.com ; Much like Impulse, they compile C to gates.

~cool substance with hot flavor~ said...

Thanks for pointing it out. I honestly had never heard of it or seen it before. It looks interesting at first glimpse of the site.

What's your experience with it?

Jasmin said...

I have followed up on this question with ImpulseC and asked them to comment on their advantages compared to the free http://www.c-to-verilog.com

Here are some of their thoughts:

1) We (ImpulseC) offer customizable platform extensions, platform support that includes custom HW/SW interfaces so you can make C function calls to hardware

2) We offer either VHDL or Verilog

3) We offer pipeline and system level visualization tools for debug and iterative improvement

4) We offer a VHDL test bench generator that produces C and HDL test vectors and then opens up ModelSim™ so you can make sure the C and HDL versions are performing the same

5) We offer full engineer-based customer support

Nadav said...

Hi Jasmin, Nadav here.
You are correct about ImpulseC. Unlike C-to-Verilog.com, they provide better software wrappers. They support VHDL and they provide ModelSim integration. However, C-to-Verilog.com provides much better performance. C-to-Verilog.com implements state-of-the-art pipelining algorithms to create optimized designs.

~cool substance with hot flavor~ said...

Nadav,
It sounds like you tested and compared ImpulseC against C-to-Verilog.com.

Could you please share more detail on the performance advantage of the latter and, ideally, provide some examples?