The system can transcribe conversational language as accurately as professional transcriptionists
Microsoft researchers have announced details of the company’s latest speech recognition technology, which they claim can transcribe conversational speech as accurately as a human.
The team of researchers and engineers at Microsoft’s Artificial Intelligence (AI) and Research division noted that the speech recognition system they developed makes the same or fewer errors than professional transcriptionists.
They reported a word error rate of 5.9 percent, about equal to that of people asked to transcribe the same conversation the system was tested against.
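For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. At 5.9 percent, that works out to roughly six errors per 100 words. The article does not describe Microsoft's scoring tooling, so the Python below is only a minimal sketch of the standard definition; the function name and the sample sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative sketch of the conventional WER definition, not
    Microsoft's actual scoring code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # word missing from hypothesis
                curr[j - 1] + 1,                  # extra word in hypothesis
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word and one dropped word
# against a nine-word reference.
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

Because the denominator is the reference length, a transcript with many inserted words can score above 100 percent, which is one reason WER figures are only comparable when measured on the same test conversations.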
It’s a bold claim, but given that speech recognition in virtual assistants such as Cortana and Apple’s Siri can be hit and miss, such improvements could turn speech recognition tools and smart software from gimmicks and nice-to-have features into genuinely useful day-to-day tools.
It is also indicative of the rapid evolution of AI and smart systems, which lends weight to concerns that the impact of intelligent machines and software needs to be considered sooner rather than later.
“Even five years ago, I wouldn’t have thought we could have achieved this. I just wouldn’t have thought it would be possible,” said Harry Shum, the executive vice president who heads the Microsoft Artificial Intelligence and Research group.
To reach transcription parity with humans, Microsoft made use of deep learning neural networks, which replicate in part how the human brain learns, to train the system to recognise patterns in sounds rather than being manually programmed to make sense of each sound.
Using Microsoft’s Computational Network Toolkit, the researchers were able to run deep learning algorithms across multiple computers equipped with graphics processing chips for parallel processing, an important technique for crunching the vast amount of information a neural network needs to ingest. This allowed the researchers to carry out their testing and training at quite a lick.
While the researchers have some way to go before they can make sure the speech recognition technology works well in real-world settings with background noise, the current discoveries are likely to find their way into existing speech recognition features found in Windows and Xbox platforms.
“This will make Cortana more powerful, making a truly intelligent assistant possible,” Shum said.
Microsoft’s AI efforts are timely, given that Google is looking to make waves with the AI-powered Assistant found in its new Pixel smartphones and has figured out how to make its speech-based technology replicate human speech.