Nvidia announces Cuda 4.0 toolkit
Open source C++ algorithms included
CHIP DESIGNER Nvidia has announced the latest version of its Cuda toolkit for developing parallel applications using the firm's graphics processing units (GPUs).
New features in Cuda toolkit 4.0 include automated performance analysis in the visual profiler, support for C++ virtual functions, a new GPU binary disassembler and added support for Mac OS X. A release candidate of Cuda toolkit 4.0 will be available free of charge from 4 March to those registered with the developer program.
It is not entirely clear whether all of them are new, but according to Nvidia the three main features of Cuda 4.0 are support for peer-to-peer communication among GPUs within a single server or workstation, unified virtual addressing across main system memory and GPU memories, and open source C++ parallel algorithms.
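For readers who want to see what the first two features look like in practice, here is a minimal sketch using the Cuda runtime API. The helper name `roundtrip_between_gpus` is hypothetical, and the sketch assumes a 64-bit machine whose GPUs support unified virtual addressing; it simply reports failure if fewer than two GPUs are present.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from Nvidia): sends data host -> GPU 0 -> GPU 1 -> host.
// Returns false if fewer than two GPUs are present or any CUDA call fails.
bool roundtrip_between_gpus(const std::vector<int>& input, std::vector<int>& output) {
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device_count < 2) return false;
    if (input.empty()) { output.clear(); return true; }

    const size_t bytes = input.size() * sizeof(int);
    int* on_gpu0 = nullptr;
    int* on_gpu1 = nullptr;
    output.assign(input.size(), 0);

    cudaSetDevice(0);
    bool ok = cudaMalloc(reinterpret_cast<void**>(&on_gpu0), bytes) == cudaSuccess;
    cudaSetDevice(1);
    ok = ok && cudaMalloc(reinterpret_cast<void**>(&on_gpu1), bytes) == cudaSuccess;

    if (ok) {
        // Ask whether GPU 1 can reach GPU 0's memory directly. If so, enable it,
        // so the peer copy below need not be staged through system memory.
        int can_access = 0;
        cudaDeviceCanAccessPeer(&can_access, 1, 0);
        if (can_access) cudaDeviceEnablePeerAccess(0, 0);  // current device is 1

        // Unified virtual addressing: cudaMemcpyDefault lets the runtime work out
        // from the pointer values which side is host and which is device.
        ok = cudaMemcpy(on_gpu0, input.data(), bytes, cudaMemcpyDefault) == cudaSuccess;

        // Copy from GPU 0 to GPU 1. This call works with or without peer access;
        // peer access only changes the path the data takes.
        ok = ok && cudaMemcpyPeer(on_gpu1, 1, on_gpu0, 0, bytes) == cudaSuccess;

        ok = ok && cudaMemcpy(output.data(), on_gpu1, bytes, cudaMemcpyDefault) == cudaSuccess;
    }

    // Release each buffer with its owning GPU selected.
    cudaSetDevice(0);
    cudaFree(on_gpu0);
    cudaSetDevice(1);
    cudaFree(on_gpu1);
    return ok;
}
```

On success, `output` holds a copy of `input` that has passed through both GPUs.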
According to the Green Goblin, the open source C++ algorithms mean that "routines such as parallel sorting are 5X to 100X faster than with Standard Template Library and Threading Building Blocks".
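Nvidia's statement does not name the library, but the open source C++ template library bundled with Cuda 4.0 is Thrust. As a rough illustration of what a parallel sort looks like with it, here is a minimal sketch; the helper name `gpu_sort` is hypothetical, and the sketch makes no attempt to reproduce the quoted speed-ups.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <vector>

// Hypothetical helper: sorts a host vector on the GPU and returns the result.
std::vector<int> gpu_sort(const std::vector<int>& input) {
    // Constructing a device_vector from host iterators copies the data to GPU memory.
    thrust::device_vector<int> on_gpu(input.begin(), input.end());

    // Same call shape as std::sort, but the work runs in parallel on the device.
    thrust::sort(on_gpu.begin(), on_gpu.end());

    // Copy the sorted values back to system memory.
    std::vector<int> sorted(input.size());
    thrust::copy(on_gpu.begin(), on_gpu.end(), sorted.begin());
    return sorted;
}
```

The transfers to and from the GPU are part of the cost, so any comparison with a CPU sort should time the whole function, not only the `thrust::sort` call.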
As well as the open source C++ algorithms, Cuda 4.0 integrates with MPI implementations such as OpenMPI, which can automatically move data to and from GPU memory over Infiniband when an application makes an MPI send or receive call.
It also lets multiple CPU host threads share contexts on a single GPU and, according to Nvidia, lets a single CPU host thread access all GPUs in a system. µ
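The single-thread, many-GPU claim can be sketched as follows. This is a minimal illustration, not Nvidia's sample code: the kernel `fill` and the helper `fill_on_every_gpu` are hypothetical names, `n` is assumed to be greater than zero, and error handling is reduced to skipping a GPU whose allocation fails.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Writes `value` into every element of a device array.
__global__ void fill(int* data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Hypothetical helper: one host thread fills an n-element buffer on each GPU
// with that GPU's index and returns one host vector per GPU.
// Assumes n > 0. Elements stay at -1 for a GPU whose allocation failed.
std::vector<std::vector<int> > fill_on_every_gpu(int n) {
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess) device_count = 0;

    std::vector<int*> buffers(device_count, nullptr);

    // First pass: switch between GPUs with cudaSetDevice and queue work on each.
    // Kernel launches are asynchronous, so the GPUs can run concurrently.
    for (int d = 0; d < device_count; ++d) {
        cudaSetDevice(d);
        if (cudaMalloc(reinterpret_cast<void**>(&buffers[d]), n * sizeof(int)) != cudaSuccess) {
            buffers[d] = nullptr;
            continue;
        }
        fill<<<(n + 255) / 256, 256>>>(buffers[d], n, d);
    }

    // Second pass: collect results. cudaMemcpy waits for the kernel queued
    // on that GPU to finish before copying.
    std::vector<std::vector<int> > results(device_count, std::vector<int>(n, -1));
    for (int d = 0; d < device_count; ++d) {
        if (buffers[d] == nullptr) continue;
        cudaSetDevice(d);
        cudaMemcpy(results[d].data(), buffers[d], n * sizeof(int), cudaMemcpyDeviceToHost);
        cudaFree(buffers[d]);
    }
    return results;
}
```

Splitting the work into a launch pass and a collect pass is a design choice: it keeps the host thread from waiting on one GPU before it has given the others anything to do.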