Speech recognition technology has reached the point where the skill of the user has become as big a factor as the power of the software. Basically, the user has to learn to dictate. That may sound trivial, but for people who have spent years "thinking with their fingers" when it comes to writing, the transition can be rough.
The common advice is that the user should speak naturally and should not over-enunciate.
I disagree.
My experience is that if you speak naturally you will run phrases together and the articles, pronouns, and other short words (the, he, and, etc.) will lose distinctiveness. Recognition will suffer. Worse yet, those short words are the ones your eyes will probably skip over during proofreading, and you won't catch the resulting mistakes. (Using text-to-speech to proofread will cause these problems to leap out at you, but text-to-speech is much slower than visual reading.)
Meanwhile, speaking words aloud apparently involves different neural connections than does typing. The flow of thought must be grabbed and turned into words at a different (probably earlier) point than is done during the typing process -- that's my impression, anyway.
So I have come up with two different techniques that seem to help the dictation process, which I will hereby name Arch Pronunciation and Textual Visualization. (They are not mutually exclusive, incidentally.)
Arch Pronunciation
Instead of just speaking naturally, you should speak as if you were talking to someone whom you do not quite trust to understand you, and to whom you are willing to fully display this mistrust. You should adopt verbal mannerisms that would be rather insulting if used with another person.
And in this case you really cannot trust the listener to understand you. The listener is just a machine with a lot of flashing lights. Ultimately it has no more common sense or feeling than a slide rule. If you want it to heed each separate word you say, you must speak as if every separate word should be heeded.
This does not mean you have to give up speaking continuously. You don't have to pause between words. But every word has to be distinctly punched out, or you will lose a few of them. You may end up talking more slowly, but there will be fewer mistakes to correct, which puts you ahead in the long run.
(Admittedly, there are people whose natural speech is highly "arch." They should do well with speech recognition, although you have to wonder about their social lives.)
Textual Visualization
Instead of thinking about what you are going to type with your fingers, think of the words that you want to appear on the screen. Then begin saying them, with an appropriately arch pronunciation. Remember, what matters is not the words that are winging through your head, but the words that you want to appear on the screen.
At this point there are two ways to proceed:
Your choice should be governed by the results you are getting on the screen. With a few adjustments and a few adaptations you should be able to make text appear on the screen at least twice as fast as you could by keyboarding. The process -- even making corrections -- will seem almost effortless.
Other Considerations