The Inquirer-Home

IBM to open source speech recognition

Not quite a full Viavoice for Linux, but close
Tue Sep 14 2004, 08:48
"I hate to see great software technology and many thousand man-hours of work frozen forever and buried. Hope you understand..." - July 28 2004 email from me to Scott Handy, VP Linux at IBM, asking why IBM didn't release great but forgotten technologies like speech recognition, Hot.Media, VisualAge for Basic, and IBM PEN (handwriting recognition) to the open source community.

IT'S NICE when your comments are taken seriously. The NY Times is reporting that IBM will open up the source code of its speech recognition engine and donate that code, worth $10 million in development cost, to two open source groups, the Apache Software Foundation and the Eclipse project, with each receiving different chunks of code. At least one analyst, from Opus Research, was enthusiastic, and is quoted in the NYT article saying "that should drastically reduce the cost of building speech applications".

A bit of History
Speech recognition technology is not something new for IBM. The company introduced its first software-only speech recognition product, Voicetype, back in 1996 by bundling it with the company's then-flagship 32-bit operating system, IBM OS/2 Warp version 4.0. It was even integrated with the operating system and the OS/2 port of Netscape Navigator, thus making it the first 32-bit desktop OS that shipped with voice recognition. Ahh, the opportunities missed...

Lots of water went under the bridge. First Big Blue dumped Voicetype, which was a discrete-speech engine (meaning that you had to pause briefly between words to get your speech recognised), and created Viavoice, the first "continuous speech" engine. By that time, the product ran on Windows, and faced strong competition from other Windows players like Lernout & Hauspie.

Five years ago, IBM released the "Viavoice Toolkit for Linux", but only in binary form. That particular effort didn't fare well, and the project was quickly forgotten, ignored, and ultimately abandoned.

Linux to get a competitive edge?
IBM currently has an agreement with US-based firm Scansoft to market Big Blue's Viavoice product for Windows end-users on the retail market. The latest Viavoice version for Windows is retailing for $160 when bundled with a USB microphone, and $69 without it. A version for Mac OS X is also available. In the end, this move will give Linux a competitive edge, when and if the speech recognition engine is bundled and integrated into popular Linux distros.

However, from what can be learned from this report, the source code contributed is just the speech recognition engine, that is, the text-mode "back-end", without any graphical user interface. And it is not quite clear whether the released source will be capable of taking continuous speech, or will just provide basic "navigation" (simple words and phrases). In the end, it will be the task of open source groups like Gnome and KDE to build the hooks between this voice recognition engine and the most popular Linux graphical desktops, allowing for direct dictation into applications, for instance. µ
