Abstract:
The invention is directed to a robot auditory apparatus for a human- or animal-like robot, e.g., a human-like robot (10) having a noise generating source such as a driving system in its interior. The apparatus includes a sound insulating cover (14) with which at least a head part (13) of the robot is covered; a pair of outer microphones (16; 16a and 16b) installed outside of the cover, located spaced apart at a pair of positions where a pair of ears may be provided for the robot, for primarily collecting an external sound; at least one inner microphone (17; 17a and 17b) installed inside of the cover for primarily collecting noise from the noise generating source in the robot interior; and a processing module (18) which, on the basis of signals from the outer and inner microphones, removes from the sound signals of the outer microphones (16a and 16b) the noise signal originating from the internal noise generating source. The robot auditory apparatus of the invention is thus capable of active perception, allowing an external sound from a target to be collected unaffected by noise from inside the robot, such as that of the driving system.
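The abstract does not specify how the processing module (18) removes the internal noise; the following is a minimal sketch, assuming a simple spectral-subtraction scheme in which the inner microphone acts as a noise reference that is scaled and subtracted from each outer-microphone magnitude spectrum. The function name, the fixed scaling factor alpha, and the spectral floor are illustrative assumptions, not the patented method.

    import numpy as np

    def cancel_internal_noise(outer_spec, inner_spec, alpha=1.0, floor=1e-3):
        """Subtract a scaled estimate of internal noise (inner mic) from an
        outer-mic magnitude spectrum for one frame; clamp the result to a small
        fraction of the original magnitude to avoid negative values."""
        cleaned = np.abs(outer_spec) - alpha * np.abs(inner_spec)
        return np.maximum(cleaned, floor * np.abs(outer_spec))

    # Example: one 512-bin frame per microphone.
    rng = np.random.default_rng(0)
    left_outer = np.abs(rng.normal(size=512))
    inner_noise = np.abs(rng.normal(size=512))
    left_clean = cancel_internal_noise(left_outer, inner_noise)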
Abstract:
A robot visuoauditory system is disclosed that makes it possible to process data in real time so that an object can be tracked both visually and auditorily without fail, with visual and auditory information on the object integrated, and with the real-time processing visualized. In the system, the audition module (20), in response to sound signals from microphones, extracts pitches therefrom, separates sound sources from each other and locates them so as to identify a sound source as at least one speaker, thereby extracting an auditory event (28) for each object speaker. The vision module (30), on the basis of an image taken by a camera, identifies each such speaker by face and locates him or her, thereby extracting a visual event (39) therefor. The motor control module (40), which turns the robot horizontally, extracts a motor event (49) from the rotary position of the motor. The association module (60), which controls these modules, forms from the auditory, visual and motor control events an auditory stream (65) and a visual stream (66) and then associates these streams with each other to form an association stream (67). The attention control module (64) effects attention control designed to make a plan of the course in which to control the drive motor, e.g., upon locating the sound source for the auditory event and locating the face for the visual event, thereby determining the direction in which each speaker lies. The system also includes displays (27, 37, 48, 68) for displaying at least a portion of the auditory, visual and motor information. The attention control module (64) servo-controls the robot on the basis of the association stream or streams.
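The abstract describes events being grouped into streams and streams being associated; below is a minimal sketch of one plausible association rule, pairing an auditory stream with a visual stream when their estimated azimuths agree within a tolerance. The data layout (dicts with an 'azimuth' field) and the 10-degree threshold are assumptions made for illustration, not the patented criterion.

    def associate_streams(auditory_streams, visual_streams, tol_deg=10.0):
        """Pair each auditory stream with the nearest visual stream whose
        azimuth (degrees) lies within tol_deg; unpaired streams stay independent."""
        associations = []
        for a in auditory_streams:
            best, best_diff = None, tol_deg
            for v in visual_streams:
                diff = abs(a["azimuth"] - v["azimuth"])
                if diff <= best_diff:
                    best, best_diff = v, diff
            if best is not None:
                associations.append({"auditory": a, "visual": best})
        return associations

    # Example: a speaker heard at 32 deg and seen at 29 deg form one association stream.
    print(associate_streams([{"azimuth": 32.0}], [{"azimuth": 29.0}]))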
Abstract:
A robot visual and auditory system is provided which is capable of accurately localizing the sound source of a target by associating visual and auditory information with respect to that target. It is provided with an audition module (20), a face module (30), a stereo module (37), a motor control module (40), an association module (50) for generating streams by associating events from each of these modules (20, 30, 37 and 40), and an attention control module (57) for conducting attention control based on the streams generated by the association module (50). The association module (50) generates an auditory stream (55) and a visual stream (56) from an auditory event (28) from the audition module (20), a face event (39) from the face module (30), a stereo event (39a) from the stereo module (37), and a motor event (48) from the motor control module (40), as well as an association stream (57) which associates said streams. The audition module (20) collects sub-bands having an interaural phase difference (IPD) or interaural intensity difference (IID) within a preset range by means of an active direction pass filter (23a), whose pass range, in accordance with auditory characteristics, is minimum in the frontal direction and becomes larger as the angle widens to the left and right, based on accurate sound source directional information from the association module (50), and conducts sound source separation by reconstructing the waveform of the sound source.
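A minimal sketch of the sub-band selection idea behind the active direction pass filter: for each frequency bin the measured IPD is compared with the IPD expected for the direction supplied by the association module, and the bin is kept when the mismatch falls inside a pass range that is narrowest at the front and widens toward the sides. The widening law, the microphone spacing, and the conversion of the angular pass range into an IPD tolerance are rough illustrative assumptions.

    import numpy as np

    SOUND_SPEED = 343.0   # m/s
    MIC_DISTANCE = 0.18   # assumed inter-microphone spacing in metres

    def pass_range(azimuth_deg, base_deg=20.0, slope=0.5):
        """Pass range (degrees): smallest at the front (0 deg), growing toward the
        sides, per the auditory characteristics in the abstract; the growth law
        itself is an assumption."""
        return base_deg + slope * abs(azimuth_deg)

    def select_subbands(left_spec, right_spec, freqs, azimuth_deg):
        """Boolean mask of bins whose measured IPD matches the IPD expected for
        a source at azimuth_deg."""
        expected_delay = MIC_DISTANCE * np.sin(np.radians(azimuth_deg)) / SOUND_SPEED
        expected_ipd = 2 * np.pi * freqs * expected_delay
        measured_ipd = np.angle(left_spec) - np.angle(right_spec)
        mismatch = np.angle(np.exp(1j * (measured_ipd - expected_ipd)))  # wrap to [-pi, pi]
        # Rough heuristic: convert the angular pass range into an IPD tolerance.
        tol = np.radians(pass_range(azimuth_deg)) * 2 * np.pi * freqs * MIC_DISTANCE / SOUND_SPEED
        return np.abs(mismatch) <= np.minimum(tol, np.pi)

The selected bins would then be collected per sound source and the waveform reconstructed from them, as the abstract describes.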
Abstract:
A robot auditory apparatus and system are disclosed which are capable of active perception, collecting a sound from an external target without being influenced by noises generated in the interior of the robot, such as those emitted from the robot's driving elements. The apparatus and system are for a robot having a noise generating source in its interior, and include: a sound insulating cladding (14) with which at least a portion of the robot is covered; at least two outer microphones (16 and 16) disposed outside of the cladding (14) for primarily collecting an external sound; at least one inner microphone (17) disposed inside of the cladding (14) for primarily collecting noises from the noise generating source in the robot interior; a processing section (23, 24) responsive to signals from the outer and inner microphones (16 and 16; and 17) for canceling, from the respective sound signals of the outer microphones (16 and 16), the noise signal from the interior noise generating source and then issuing left and right sound signals; and a directional information extracting section (27) responsive to the left and right sound signals from the processing section (23, 24) for determining the direction from which the external sound is emitted. The processing section (23, 24) is adapted to detect burst noises caused by the noise generating source from a signal of the at least one inner microphone (17) and to remove, from the sound signals, signal portions in bands containing the burst noises.
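A minimal sketch of the burst-noise handling step, assuming the burst detector simply thresholds the inner-microphone band energy against a running noise floor and that the affected bands are zeroed in both outer-microphone spectra. The threshold ratio and smoothing constant are illustrative assumptions.

    import numpy as np

    def remove_burst_bands(left_spec, right_spec, inner_spec, noise_floor, ratio=4.0):
        """Zero outer-mic bands where the inner mic shows a burst, i.e. where the
        inner-mic band energy exceeds `ratio` times its running noise floor.
        Returns the cleaned left/right spectra and the updated noise floor."""
        inner_energy = np.abs(inner_spec) ** 2
        burst = inner_energy > ratio * noise_floor
        left_out = np.where(burst, 0.0, left_spec)
        right_out = np.where(burst, 0.0, right_spec)
        # Update the running floor only on non-burst bands.
        new_floor = np.where(burst, noise_floor, 0.9 * noise_floor + 0.1 * inner_energy)
        return left_out, right_out, new_floor

The surviving bands of the left and right signals would then feed the directional information extracting section (27).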
Abstract:
A robot visual and auditory system is provided with an audition module (20), a face module (30), a stereo module (37), a motor control module (40), and an association module (50) that controls these respective modules. The audition module (20) collects sub-bands having an interaural phase difference (IPD) or interaural intensity difference (IID) within a predetermined range by means of an active direction pass filter (23a), whose pass range, in accordance with auditory characteristics, is minimum in the frontal direction and becomes larger as the angle widens to the left and right, based on accurate sound source directional information from the association module (50), and conducts sound source separation by reconstructing the waveform of each sound source. It then performs speech recognition of the separated sound signals from the respective sound sources using a plurality of acoustic models (27d), integrates the speech recognition results from the acoustic models by means of a selector, and judges which among them is the most reliable speech recognition result.
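A minimal sketch of the selector idea: the same separated utterance is decoded by several recognizers, one per acoustic model, and the hypothesis with the highest confidence is kept. The recognizer interface and the use of a plain maximum over confidence scores are assumptions; the abstract does not specify the actual integration rule.

    def select_most_reliable(recognizers, audio):
        """Run each recognizer (one per acoustic model) on the separated audio and
        return the hypothesis with the highest confidence score.
        Each recognizer is a callable returning (text, confidence)."""
        results = [rec(audio) for rec in recognizers]
        return max(results, key=lambda r: r[1])

    # Example with stub recognizers standing in for different acoustic models.
    recognizers = [
        lambda audio: ("hello robot", 0.82),
        lambda audio: ("halo robot", 0.64),
    ]
    print(select_most_reliable(recognizers, audio=None))  # ('hello robot', 0.82)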
Abstract:
A voice recognition system (10) for improving the robustness of voice recognition for a voice input whose degraded feature amounts cannot be completely identified. The system comprises at least two sound detecting means (16a, 16b) for detecting a sound signal, a sound source localizing unit (21) for determining the direction of a sound source based on the sound signal, a sound source separating unit (23) for separating the sound of each source from the sound signal based on the sound source direction, a mask producing unit (25) for producing a mask value according to the reliability of the separation results, a feature extracting unit (27) for extracting feature amounts of the sound signal, and a voice recognizing unit (29) for applying the mask to the feature amounts to recognize a voice from the sound signal.
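A minimal sketch of the mask-producing and mask-applying steps, assuming a soft mask derived from the ratio of separated-source energy to total energy in each spectral band, then used to weight the features before recognition. Marginalizing unreliable features inside the decoder, as in missing-feature recognition, would be the more faithful treatment; the simple weighting shown here and the function names are illustrative simplifications.

    import numpy as np

    def make_mask(separated_energy, total_energy, eps=1e-10):
        """Soft reliability mask in [0, 1]: close to 1 where the separated source
        dominates the band, close to 0 where interference or separation error dominates."""
        return separated_energy / (total_energy + eps)

    def apply_mask(features, mask):
        """Down-weight unreliable feature dimensions before recognition."""
        return features * mask

    # Example: 24 filter-bank bands for one frame.
    sep = np.random.rand(24)
    tot = sep + np.random.rand(24)
    masked_feats = apply_mask(np.log(tot + 1e-10), make_mask(sep, tot))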
Abstract:
A system capable of reducing the influence of sound reverberation or reflection to improve sound-source separation accuracy. An original signal X(ω,f) is separated from an observed signal Y(ω,f) according to a first model and a second model to extract an unknown signal E(ω,f). According to the first model, the original signal X(ω,f) of the current frame f is represented as a combined signal of known signals S(ω,f−m+1) (m=1 to M) that span a certain number M of current and previous frames. This enables extraction of the unknown signal E(ω,f) without changing the window length while reducing the influence of reverberation or reflection of the known signal S(ω,f) on the observed signal Y(ω,f).
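The first model expresses the current original signal as a combination of a window of past known signals; below is a minimal per-frequency-bin sketch in which the combination weights are estimated by least squares over a block of frames and the unknown signal is obtained as the residual. Batch least-squares estimation is an assumption made for illustration; an actual system would likely estimate the weights adaptively.

    import numpy as np

    def extract_unknown(Y, S, M):
        """Per frequency bin: fit Y(f) ~ sum_{m=1..M} c_m * S(f-m+1) over all frames
        by least squares, then return the residual E = Y - X as the unknown component
        (the part not explained by reverberation/reflection of the known signal).
        Y, S: complex arrays of shape (num_frames,) for one frequency bin."""
        F = len(Y)
        rows = [[S[f - m + 1] for m in range(1, M + 1)] for f in range(M - 1, F)]
        A = np.array(rows)                       # (F-M+1, M) regression matrix
        y = Y[M - 1:]
        c, *_ = np.linalg.lstsq(A, y, rcond=None)
        X = A @ c                                # modelled reverberant/reflected part
        E = np.copy(Y)
        E[M - 1:] = y - X
        return E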
Abstract:
A musical score position estimating device includes an audio signal acquiring unit, a musical score information acquiring unit acquiring musical score information corresponding to an audio signal acquired by the audio signal acquiring unit, an audio signal feature extracting unit extracting a feature amount of the audio signal, a musical score feature extracting unit extracting a feature amount of the musical score information, a beat position estimating unit estimating a beat position of the audio signal, and a matching unit matching the feature amount of the audio signal with the feature amount of the musical score information using the estimated beat position to estimate a position of a portion in the musical score information corresponding to the audio signal.
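A minimal sketch of the matching step, assuming chroma-like feature vectors for both the audio and the score, with dynamic time warping evaluated only at the estimated beat positions so that the alignment is computed beat-by-beat rather than frame-by-frame. The feature choice, the cosine distance, and the DTW formulation are illustrative assumptions.

    import numpy as np

    def beat_aligned_match(audio_feats, score_feats, beat_frames):
        """Align beat-level audio features to score features with simple DTW.
        audio_feats: (num_frames, D); score_feats: (num_events, D);
        beat_frames: indices of estimated beats in the audio.
        Returns a list of (beat_index, score_event_index) pairs along the path."""
        A = audio_feats[beat_frames]                         # features at beats: (B, D)
        def dist(a, b):
            return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10)
        B, E = len(A), len(score_feats)
        D = np.full((B + 1, E + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, B + 1):
            for j in range(1, E + 1):
                d = dist(A[i - 1], score_feats[j - 1])
                D[i, j] = d + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
        # Backtrack to recover which score event each beat matched.
        path, i, j = [], B, E
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        return list(reversed(path))

The last matched score event for the most recent beat gives the estimated current position in the musical score.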
Abstract:
A sound source localization system using a light emitting device for visualizing sound information, including: a light emitting device (40) having a microphone for receiving sound from a sound source (1, 2) and a light emitting means for emitting light based on the sound from the microphone; a generating section for generating light emitting information for the light emitting device (40); and a sound source localization section (60) for determining a position of the sound source based on the light emitting information from the generating section.
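The abstract does not detail how the light emitting information is turned into a source position; the following is a minimal sketch under a simple assumption: each device reports its own location and a brightness proportional to the sound level it receives, and the localization section estimates the source position as the brightness-weighted centroid of the device locations. This weighting rule is purely illustrative and not taken from the patent.

    import numpy as np

    def localize_from_brightness(device_positions, brightness):
        """Estimate a sound source position as the brightness-weighted centroid of
        the light-emitting devices' positions.
        device_positions: (N, 2) array; brightness: (N,) array of emission levels."""
        w = np.asarray(brightness, dtype=float)
        w = w / (w.sum() + 1e-10)
        return w @ np.asarray(device_positions, dtype=float)

    # Example: three devices; the one nearest the source glows brightest.
    print(localize_from_brightness([[0, 0], [1, 0], [0, 1]], [0.2, 1.5, 0.3]))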