SAN JOSE: AT THE GPU TECHNOLOGY CONFERENCE (GTC) today, Nvidia said its graphics cards are usually IO bound by field-programmable gate array (FPGA) video capture boards.
Nvidia often promotes its work to the video industry, claiming that its GPUs accelerate the workflow for video editors. However, the firm said that the performance of its GPUs is limited by the amount of data that can be fed from video capture devices, which typically use FPGA boards.
Shailendra Mathur, video chief architect at Avid, said during a panel discussion that video workflow performance is IO bound and that workloads must balance their use of both CPU and GPU.
He said, "The transfers are important. For video processing you are IO bound primarily. It's not just a question about just throwing everything at the GPU; you have to balance it all across a heterogeneous architecture. So that's another thing that prompts us to primarily put stuff on the CPU and use the GPU as a co-processor. There's certain things you have to do on the GPU. Anything with motion graphics should be going to the GPU; you don't want to do anything on the CPU."
Andrew Page, senior product manager of Advanced Technology at Nvidia, confirmed that Nvidia's GPUs are IO bound and said that this was due to the FPGA video capture boards that feed the raw video to the GPU.
"Part of the IO bound-ness actually comes from the PCI [Express] bus. [...] Since we're an ASIC, we can run PCI-Express Gen 2 16x and in the future PCI-Express Gen 3 16x. FPGA based solutions, which are what most [video] capture boards are based off, are not capable of going beyond PCI-Express 8x, [with] the bulk [of] capture boards at PCI-Express 4x," he said.
Following on, Mathur said, "Codecs are still [running] on the CPU. So you go from storage host memory, CPU uncompressed and pass that information around. What if the codecs are on the board?"
Page interjected, "Don't move the full frame, move the compressed frame and then either decompress by CUDA or dedicated silicon."
Mathur said the bottleneck could be mitigated by adopting a heterogeneous architecture and redesigning video codecs to run on the GPU rather than the CPU. Page agreed, adding that a pragmatic way of mitigating the effects of FPGA bandwidth restrictions is to work with compressed video feeds.
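To put rough numbers on the argument, the sketch below compares the theoretical bandwidth of a PCI-Express link with the data rate of a video stream, uncompressed and compressed. None of these figures come from the panel. The function names are hypothetical, the per-lane rates are the nominal specification values, and the Gen 2 4x capture board, the 4K 60fps stream at 20 bits per pixel (10-bit 4:2:2) and the 10:1 compression ratio are all assumptions chosen for illustration.

```python
# Nominal per-lane signalling rate (transfers per second) and line-encoding
# efficiency for each PCI-Express generation. These are theoretical figures;
# real throughput is lower once packet and protocol overhead is counted.
PCIE_GENERATIONS = {
    1: (2.5e9, 8 / 10),     # 8b/10b encoding
    2: (5.0e9, 8 / 10),     # 8b/10b encoding
    3: (8.0e9, 128 / 130),  # 128b/130b encoding
}


def pcie_bytes_per_second(generation, lanes):
    """Theoretical one-direction bandwidth of a PCIe link, in bytes/s."""
    transfers_per_second, efficiency = PCIE_GENERATIONS[generation]
    usable_bits_per_lane = transfers_per_second * efficiency
    return usable_bits_per_lane / 8 * lanes


def video_bytes_per_second(width, height, bits_per_pixel, fps):
    """Data rate of an uncompressed video stream, in bytes/s."""
    bytes_per_frame = width * height * bits_per_pixel / 8
    return bytes_per_frame * fps


def link_utilisation(stream_rate, link_rate):
    """Fraction of the link's theoretical bandwidth the stream would use."""
    return stream_rate / link_rate


if __name__ == "__main__":
    # Assumption: the capture board sits on a PCIe Gen 2 4x link.
    capture_link = pcie_bytes_per_second(generation=2, lanes=4)

    # Assumption: 4K UHD at 60fps, 10-bit 4:2:2 (20 bits per pixel).
    raw_stream = video_bytes_per_second(3840, 2160, 20, 60)

    # Assumption: an illustrative 10:1 compression ratio.
    compressed_stream = raw_stream / 10

    for label, rate in [("uncompressed", raw_stream),
                        ("compressed", compressed_stream)]:
        share = link_utilisation(rate, capture_link)
        print(f"{label}: {rate / 1e9:.2f} GB/s, "
              f"{share:.0%} of the capture link")
```

Under these assumptions a single uncompressed stream already takes a large share of the capture link before any protocol overhead, while the compressed stream leaves plenty of headroom. Several simultaneous streams, or higher resolutions and frame rates, would change the picture quickly.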
Mathur and Page's comments highlight the chief problem that Nvidia has with its video cards and GPGPU accelerators: feeding the GPU with enough data. For Nvidia, the frustrating thing is that as the video industry moves towards 4K resolution and higher frame rates, the bandwidth bottleneck between the FPGA and the GPU might limit the performance improvements it can achieve. µ