Dragon Naturally Speaking 9 Preferred Edition £150
30th Sep 2006 | 23:00
Claims to be more accurate than ever before - as usual
Dragon Naturally Speaking 9 Preferred is designed to transcribe your spoken words into text. It works with most popular applications, including word processors, email clients and spreadsheets. It also has basic navigation and control functions for a limited number of applications, although for full computer control you'll need to look to the Professional version.
Television and Hollywood movies portray a digital world where people communicate with computers using conversational speech, but the reality is not quite so rosy.
The problem is that an average speaker may have a vocabulary of 10,000 to 20,000 words or more - each of which may be pronounced with a regional dialect, as well as personal inflections and adaptations (for example, "fursday" or "nucular").
In the past, one way for dictation software to narrow down the vast range of permutations was to force you to speak in a stilted manner, with clear spaces between words. Another was to make you spend anything from 30 minutes to a few hours building an individual user profile, reading out loud so that the system could identify how you pronounce words.
Naturally Speaking 9 claims to be the first program that you can use straight out of the box, without training. Well, there's 'use', and there's 'use'... Dragon's marketing proudly proclaims: "No Training Required." The instructions say using it without training is "not recommended". Isn't that like a car manufacturer advertising "New convertible with fold-back roof - not recommended for use with roof folded back"?
The disappointing truth is that the program is still almost unusable until you spend the time to train it. Fortunately, training is a much faster process than previously, requiring perhaps five minutes of easy reading. It's impressive how high the accuracy can leap after such a modest amount of training, but there will be lots of words and phrases that the program steadfastly refuses to recognise.
Teaching it to listen
There are numerous ways to teach the program to recognise unknown words: high accuracy depends upon taking the trouble to correct each and every mistake. Because Naturally Speaking is an adaptive learning system, correcting each unrecognised word will also improve the program's overall grasp of idiosyncrasies in the way that you speak.
To increase the accuracy still further, the program can analyse every document you've ever written, including emails, to understand your grammatical and sentence construction habits. It then uses a probability table to work out the most likely word groupings that you use.
Thus, if you use the phrase "power play" often, then dictate a phrase that the system could alternatively interpret as "pour play", the former phrase has a much higher rating in Naturally Speaking's recognition engine.
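To make the idea concrete, here is a minimal Python sketch of how a word-pair frequency table could settle the "power play" versus "pour play" question. It is purely illustrative: the function names (count_word_pairs, pick_most_likely) and the sample documents are our own assumptions, and Nuance does not publish how its recognition engine actually scores phrases.

```python
from collections import Counter


def count_word_pairs(documents):
    """Count adjacent word pairs across a user's past documents (hypothetical helper)."""
    pairs = Counter()
    for text in documents:
        words = text.lower().split()
        # zip(words, words[1:]) yields each word together with its successor
        pairs.update(zip(words, words[1:]))
    return pairs


def pick_most_likely(candidates, pairs):
    """Return the candidate phrase whose word pairs were seen most often (hypothetical helper)."""
    def score(phrase):
        words = phrase.lower().split()
        # A Counter returns 0 for pairs it has never seen
        return sum(pairs[pair] for pair in zip(words, words[1:]))
    return max(candidates, key=score)


# Made-up stand-ins for the user's existing documents and emails
documents = [
    "the power play worked in the second period",
    "another power play goal",
]
pairs = count_word_pairs(documents)

# Two interpretations that sound alike when spoken
best = pick_most_likely(["pour play", "power play"], pairs)
print(best)  # power play
```

A real engine would presumably combine a score like this with the acoustic evidence rather than rely on word counts alone, but the principle is the same: the more often you have written a phrase, the more readily it wins a close call.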
We were initially tremendously disappointed with the program, but it's amazing how quickly it improves its accuracy. Nuance claims 99 per cent accuracy (as it has for every release since version 3). The big differences now are that you can use more conversational speech styles and, if you are disciplined and vocally nimble enough, dictate sentences at up to 160 words per minute.
We say "disciplined" because although Naturally Speaking 9 will automatically add commas and full stops, you still need to dictate every other character, symbol, tab stop, table definition and item of punctuation. That takes a lot of practice, because you need to visualise your documents with much greater clarity than when you type.
Furthermore, the program still responds best when you enunciate well. We also found that it struggled with the differences between morning and afternoon tones - it's not unusual for a voice to be deeper and more gravelly in the morning, which can make the system recognise you as two different people.
Naturally Speaking 9 will also transcribe audio files from just about every popular audio format, including MP3 and WMA, and will autotranscribe the contents of entire folders if required. The problem is that it needs to profile the person speaking in the recordings, which reduces its effectiveness for multispeaker archives.
This version has more options than ever for teaching it about the way that you speak, as well as teaching additional words that you commonly use, such as scientific or medical terminology. It is still a significant way short of true out-of-the-box recognition, but it's the first version that works well enough (once trained) for us to seriously consider using it professionally. Mat Broomfield