Living creatures are gifted with an auditory system that provides valuable information about the world, such as the location and nature of sound sources.
For humans, hearing means being able to follow a conversation, notice a train or a bus, enjoy music, hear a ringtone; the list is endless. For a person with no hearing impairment, it is difficult to imagine life without sound.
It is believed that equipping robots with auditory sensors such as microphones will help improve the lives of the hearing impaired, and will also enable robots to recognize speech and converse with humans.
Microphones are transducers that detect sound signals and convert them into electric signals. Each microphone comprises a capsule that acts as a sensitive transducer element.
It also includes a housing that routes the signal from the element to other equipment, and an electronic circuit that conditions the capsule's output for transmission to other devices.
Most microphones employ methods such as electromagnetic induction, capacitance change, piezoelectric generation or light modulation to produce an electrical voltage signal from mechanical vibration.
Microphones are classified by their directional characteristics and by their transducer principle (dynamic, condenser, etc.), as well as by other properties such as the orientation of the principal sound input relative to the microphone's principal axis and the diaphragm size.
In addition, application-specific designs exist to accomplish particular purposes, such as detecting the sounds of small objects or insects (contact microphone) or allowing hands-free operation (lavalier microphone).
In 2005, Jean-Marc Valin of the Université de Sherbrooke proposed an artificial audition system that gives a robot the ability to localize and monitor sounds, separate simultaneous sound sources, and recognize simultaneous speech.
The research team showed that it was possible to implement these capabilities by using a microphone array without imitating the human auditory system.
The team developed a sound source localization and tracking algorithm that locates sources with a steered beamformer and then tracks them with a multi-source particle filter.
Simultaneous sound sources were separated using a variant of the geometric source separation (GSS) algorithm combined with a multi-source post-filter that further reduces interference, noise and reverberation.
The study showed that the robot can track four simultaneous sound sources even in reverberant and noisy environments. The study also demonstrated real-time control of a robot following a sound source.
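As a rough illustration of the steered-beamformer idea (not the team's actual implementation), the sketch below scans candidate directions for a far-field source using a small linear array and picks the direction whose delay-and-sum output carries the most energy. The array geometry, sample rate and angle grid are all made up for the demo:

```python
import numpy as np

def steered_beamformer_doa(signals, mic_positions, fs, angles_deg, c=343.0):
    """Delay-and-sum direction scan: phase-align the channels for each
    candidate angle and return the angle giving the highest output energy."""
    n_samples = signals.shape[1]
    freqs = np.fft.rfftfreq(n_samples, 1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    best_angle, best_energy = None, -np.inf
    for ang in angles_deg:
        u = np.array([np.cos(np.radians(ang)), np.sin(np.radians(ang))])
        delays = mic_positions @ u / c                  # per-mic delay (s)
        align = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
        energy = np.sum(np.abs(np.sum(spectra * align, axis=0)) ** 2)
        if energy > best_energy:
            best_angle, best_energy = ang, energy
    return best_angle

# Demo: 4 mics, 5 cm apart on the x-axis; a noise source placed at 60 degrees.
fs, c = 16000, 343.0
mic_positions = np.array([[i * 0.05, 0.0] for i in range(4)])
rng = np.random.default_rng(0)
s = rng.standard_normal(4096)
u = np.array([np.cos(np.radians(60.0)), np.sin(np.radians(60.0))])
delays = mic_positions @ u / c
freqs = np.fft.rfftfreq(len(s), 1.0 / fs)
S = np.fft.rfft(s)
signals = np.fft.irfft(
    S[None, :] * np.exp(-2j * np.pi * freqs[None, :] * delays[:, None]), n=len(s))
estimate = steered_beamformer_doa(signals, mic_positions, fs,
                                  np.arange(0.0, 181.0, 5.0))
print(estimate)  # 60.0
```

At the true angle the alignment phases cancel the propagation delays exactly, so all channels add coherently; every other angle sums only partially. A real system would then hand the per-frame direction estimates to the particle filter for tracking.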
The team's sound source separation approach achieved a 13.7 dB improvement in signal-to-noise ratio over a single microphone when three speakers talk at once. In these conditions the system reached over 80% accuracy on digit recognition, more than most human listeners achieve.
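To give the 13.7 dB figure some intuition, converting it out of the logarithmic scale shows it corresponds to roughly a 23-fold improvement in signal-to-noise power ratio:

```python
# dB -> linear power ratio: ratio = 10 ** (dB / 10)
snr_gain_db = 13.7
power_ratio = 10 ** (snr_gain_db / 10)
print(round(power_ratio, 1))  # 23.4
```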
Takahashi et al. (2010) conducted a study to improve the listening capability of the humanoid robot HRP-2, focusing on enhanced sound source separation. Separation errors and interference from other sources cause recognition errors in such a system.
Conventional separation methods use geometric source separation (GSS) with a simulated head-related transfer function (HRTF), determined from the distance between the sound source and the microphone; this causes a significant mismatch between the measured and simulated transfer functions.
The research team evaluated an approach in which a closer initial separation matrix is obtained from a transfer function derived from an optimal separation matrix rather than from a simulated one.
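The "simulated transfer function determined from distance" mentioned above is typically a free-field model: a pure propagation delay plus spherical attenuation. A minimal sketch of such a model (the function name and geometry are illustrative, not HARK's API):

```python
import numpy as np

def free_field_tf(src_pos, mic_pos, freqs, c=343.0):
    """Free-field transfer function from a point source to one microphone:
    1/d spherical spreading plus a phase term for the propagation delay d/c.
    Measured transfer functions also contain room reflections, which is
    where the mismatch discussed above comes from."""
    d = np.linalg.norm(np.asarray(src_pos, float) - np.asarray(mic_pos, float))
    return np.exp(-2j * np.pi * np.asarray(freqs, float) * d / c) / d

# A source twice as far away arrives at half the amplitude:
near = free_field_tf([1.0, 0.0], [0.0, 0.0], [500.0])
far = free_field_tf([2.0, 0.0], [0.0, 0.0], [500.0])
print(round(abs(near[0]), 6), round(abs(far[0]), 6))  # 1.0 0.5
```

GSS stacks one such transfer function per microphone into a steering matrix and adapts a demixing matrix against it; initializing that matrix from measured rather than simulated responses is the improvement the study evaluates.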
These features were incorporated into HARK, an open-source robot audition software package. HARK was installed on the humanoid, which carries an 8-element microphone array.
The listening ability of the HRP-2 was evaluated by recognizing a target speech signal separated from the simultaneous speech of three talkers.
Robots with auditory capabilities will find applications in assisting the hearing impaired: they can recognize several sounds simultaneously and can raise an alert if someone cries for help.
One of the latest developments in robots with auditory capabilities is HEARBO, a robot developed at the Honda Research Institute, Japan, to understand the world of sound.
The future of hearing robots lies in systems that capture sound, reduce noise, and then use automatic speech recognition to understand what a person is saying.
Normally the beamforming technique is used, but HEARBO goes a step further. Imagine a scenario with several sounds occurring at the same time: the doorbell rings, music is playing, and children are playing on one side of the room.
According to its developers, HEARBO can tease these apart using a three-step approach: localization, separation and recognition.
HEARBO has been taught the concepts of human voice, music and environmental sound. By capturing all incoming sounds, recognizing them and determining where they come from, it can even distinguish between a person talking and a person singing.
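The localization, separation and recognition steps form a pipeline, which can be sketched as a chain of three stages. Everything below is illustrative scaffolding with toy stand-in stage functions, not Honda's implementation:

```python
def audition_pipeline(mixture, localize, separate, recognize):
    """Run the three stages in order: find where sounds come from,
    isolate one stream per direction, then label each stream."""
    angles = localize(mixture)
    streams = separate(mixture, angles)
    return [(angle, recognize(stream)) for angle, stream in zip(angles, streams)]

# Toy stand-ins: a doorbell at 30 degrees and music at 120 degrees.
result = audition_pipeline(
    mixture="multichannel-audio-placeholder",
    localize=lambda m: [30.0, 120.0],
    separate=lambda m, angles: [f"stream@{a}" for a in angles],
    recognize=lambda s: "doorbell" if "30" in s else "music",
)
print(result)  # [(30.0, 'doorbell'), (120.0, 'music')]
```

The design point is the ordering: separation needs the source directions from localization, and recognition only works well once each stream has been isolated from the mixture.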
Sources and Further Reading