I've been experimenting with Dragon Dictate software. It is generally regarded as the best commercially available speech recognition system, and you can buy it for Windows or Mac for under $100. I wanted to see how difficult it would be to dictate my novel. Wouldn't it be great to lie on the couch with a headset?
I was extremely impressed with the quality right out of the box. The software took me through a handful of short known sentences so it could tune to my voice. I could have kept doing this for as long as I wanted, and even gone back later to train it on phrases that it repeatedly got wrong, but for the sake of laziness I decided to do the minimum amount of training. It did remarkably well given my English accent and the fact that I'm not a clear speaker.
I spoke at a normal conversational speed, just as if I were talking to another person. It's very natural. Typically nothing appeared on the screen until I said comma or period, or some other punctuation. The recognition engine likes to obtain the full context of a sentence or clause before transcribing. A fraction of a second later, the entire text appeared. I didn't have to pause – I just kept going.
Let's look at a couple of samples. I actually recorded myself speaking these snippets from my upcoming book, but chose not to include the recordings since I sound horrible when recorded. Ugh! Maybe I need elocution lessons – or a better mic.
Here's the text I read:
Moving faster than a Djinn out of a bottle, one of the creatures leapt up onto the roof of a dormer window that overhung the street. Worn tiles slid and crashed to the ground. It sprang again, pushed off the wall and landed beside the man. Talon-like fingernails flashed in the lantern light, and the wight raked the man's forearm, shredding it.
And here's how it emerged from Dragon Dictate:
Moving faster than a gene out of a bottle, one of the creatures leapt up onto the roof of a dorm window that overhung the street. One tile Slate and crashed to the ground. It sprang again, pushed off the wall and landed beside the man. Talent like fingernails flashed in the lantern light, and the white rate to the man's forearm, shredding it.
Not bad! You can see exactly why it went wrong, largely because of a lack of knowledge of a creature called a wight and the pronunciation of a couple of words.
Here's another sample:
“I want to be a necromancer.” Her eyes locked on mine.
“Right. Do you even know what one is?”
She rolled her eyes. “Everyone knows what you do, though I bet only half of the stories are true.”
“It's dirty and dangerous and not at all becoming for a girl.”
And how it came out:
“I want to be a necromancer.” Her eyes locked on mine.
“Right. Do you even know what one is?”
She rolled her eyes. “Everyone knows what to do, though I bet only half of the stories are true.”
“It's dirty and dangerous and not at all coming for a girl.”
Almost perfect.
That second piece was trickier because I had to say open quote and close quote, and this is one of the things that made it awkward to use. After hours of practice, I remembered most of the time (and went back and added the missing ones later), but it definitely broke my concentration. I had to say new line for paragraph breaks too. I could edit by telling it to select a word/phrase and then to replace or insert, but since I had to proofread it anyway, I found it easier to make fixes using the keyboard. It was fun dictating a page or so and then going back to clean it up, but it definitely took discipline.
This leads to my final point. Apart from the overhead of the extra words (which I think could be overcome after days or weeks of practice), I just couldn't think verbally. Neural pathways have been strengthened between the creative parts of my brain and my fingers, and that's how I have trained my body to write. It just wasn't natural to dictate. I had expected it to be like having a conversation, but I suspect that while we are transcribing our creativity, our eyes are subtly scanning the paragraph and lines we have written to retain context and keep our mental place. Dictating took more conscious effort (perhaps because it's unnatural) and I regularly lost my place or forgot what I had just said. I suspect this would have been even worse if I had attempted to dictate into my iPhone away from my computer, without the visual cue of the screen.
So much for my dream of dictating my novel on my drive to work.
I have, however, found the perfect use for it. For me, it works great when I want to describe setting and mood. I put on my headset, close my eyes and just say what I picture in my mind. Description comes easily that way. Dialog, not so much.