Microsoft researchers report that they have created a speech recognition system that transcribes conversational speech as accurately as a human being does. The system's word error rate is reported to be 5.9 percent, roughly the same as that of professional transcriptionists who were asked to work on the same recordings, according to Microsoft.
"We have reached human parity," said scientist Xuedong Huang in a statement, calling the milestone a "historic achievement."
To reach the milestone, the team used Microsoft's Computational Network Toolkit, a homegrown deep learning system that the research team has made available on GitHub under an open source license. The system uses neural network technology in which similar words are grouped together, allowing the models to generalize efficiently from word to word.
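For readers unfamiliar with the metric behind the 5.9 percent figure, a word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The article does not describe how Microsoft scored its system, so the following is only a minimal sketch of that conventional definition; the function name `word_error_rate` and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative sketch of the conventional metric, not Microsoft's scoring code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of six in the reference.
rate = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{rate:.1%}")
```

Under this definition, a 5.9 percent rate corresponds to roughly six word-level mistakes for every hundred words in the reference transcript.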
Neural networks are built from large amounts of data, called "training data," and are used to teach transcription systems to recognize patterns in sounds. Microsoft plans to use the technology in Cortana, its personal voice assistant on Windows and Xbox One, as well as in speech-to-text transcription software.
However, the technology still has a long way to go before it can process meaning (semantics) and contextual knowledge, key characteristics of everyday language use that personal assistants such as Siri need to grasp in order to process requests and act on them in a useful way.
"We are moving away from a world where people must understand computers to a world in which computers must understand us," said Harry Shum, who heads Microsoft's AI research group. However, it will be a long time before computers can understand the true meaning of what is being said, he warned. "True artificial intelligence is still on the distant horizon."
I think it is a tremendous step. The day we can interact with devices without using peripherals will completely change how we understand the relationship between humans and machines.