Abstract:
A robot includes: a sound collecting unit collecting and converting a musical sound into a musical acoustic signal; a voice signal generating unit generating a self-vocalized voice signal; a sound outputting unit converting the self-vocalized voice signal into a sound and outputting the sound; a self-vocalized voice regulating unit receiving the musical acoustic signal and the self-vocalized voice signal; a filtering unit performing a filtering process; a beat interval reliability calculating unit performing a time-frequency pattern matching process and calculating a beat interval reliability; a beat interval estimating unit estimating a beat interval; a beat time reliability calculating unit calculating a beat time reliability; a beat time estimating unit estimating a beat time on the basis of the calculated beat time reliability; a beat time predicting unit predicting a beat time after the current time; and a synchronization unit synchronizing the self-vocalized voice signal.
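The beat time predicting step can be illustrated with a minimal sketch, assuming the predictor simply extrapolates the most recent estimated beat time by whole beat intervals until it passes the current time. The function name `predict_next_beat` and its parameters are hypothetical and are not taken from the abstract.

```python
import math


def predict_next_beat(last_beat_time, beat_interval, current_time):
    """Return the first extrapolated beat strictly after current_time (seconds).

    Assumption (not from the abstract): the tempo stays constant over the
    prediction horizon, so beats fall on last_beat_time + k * beat_interval.
    """
    if beat_interval <= 0:
        raise ValueError("beat_interval must be positive")
    # Whole beat intervals elapsed since the last estimated beat.
    elapsed = math.floor((current_time - last_beat_time) / beat_interval)
    return last_beat_time + (elapsed + 1) * beat_interval
```

A synchronization step could then schedule the self-vocalized voice against the returned time rather than against a beat that has already passed.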
Abstract:
A beat tracking apparatus includes: a filtering unit configured to perform a filtering process on an input acoustic signal and to accentuate an onset; a beat interval reliability calculating unit configured to perform a time-frequency pattern matching process employing a mutual correlation function on the acoustic signal of which the onset is accentuated and to calculate a beat interval reliability; and a beat interval estimating unit configured to estimate a beat interval on the basis of the calculated beat interval reliability.
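The pattern matching step can be sketched as follows, assuming the onset-accentuated signal is available as a list of frames, each holding one value per frequency bin, and assuming the reliability is a normalized cross-correlation between the most recent frames and the same frames shifted back by a candidate lag. The names `beat_interval_reliability` and `estimate_beat_interval`, and the `window` parameter, are hypothetical.

```python
def beat_interval_reliability(onsets, lag, window):
    """Normalized cross-correlation between the latest `window` frames of the
    onset-accentuated time-frequency pattern and the pattern `lag` frames earlier.
    """
    n = len(onsets)
    if lag <= 0 or window <= 0 or n < window + lag:
        raise ValueError("need at least window + lag frames and positive lag/window")
    num = 0.0
    energy_now = 0.0
    energy_past = 0.0
    for t in range(n - window, n):
        # Compare every frequency bin of frame t with the same bin at t - lag.
        for a, b in zip(onsets[t], onsets[t - lag]):
            num += a * b
            energy_now += a * a
            energy_past += b * b
    denom = (energy_now * energy_past) ** 0.5
    return num / denom if denom > 0.0 else 0.0


def estimate_beat_interval(onsets, min_lag, max_lag, window):
    """Pick the candidate lag (in frames) with the highest reliability."""
    return max(
        range(min_lag, max_lag + 1),
        key=lambda lag: beat_interval_reliability(onsets, lag, window),
    )
```

Restricting the search to `min_lag..max_lag` is a design choice of this sketch: it keeps the estimate within a plausible tempo range and avoids picking a multiple of the true interval.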
Abstract:
A system capable of reducing the influence of sound reverberation or reflection to improve sound-source separation accuracy. An original signal X(ω,f) is separated from an observed signal Y(ω,f) according to a first model and a second model to extract an unknown signal E(ω,f). According to the first model, the original signal X(ω,f) of the current frame f is represented as a combined signal of known signals S(ω,f−m+1) (m=1 to M) that span a certain number M of current and previous frames. This enables extraction of the unknown signal E(ω,f) without changing the window length while reducing the influence of reverberation or reflection of the known signal S(ω,f) on the observed signal Y(ω,f).
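The first model can be sketched for a single frequency bin as follows. This is a simplification under two assumptions the abstract does not state: the combining weights for the M frames are already known, and the observed signal is the sum of the original signal and the unknown signal. The function name `extract_unknown` and the `coeffs` parameter are hypothetical.

```python
def extract_unknown(y, s, coeffs):
    """Extract E(omega, f) for one frequency bin omega.

    y, s   : lists of complex spectra indexed by frame f (same length).
    coeffs : M complex weights; coeffs[m - 1] multiplies S(omega, f - m + 1).
    """
    if len(y) != len(s):
        raise ValueError("y and s must cover the same frames")
    e = []
    for f in range(len(y)):
        x = 0j
        for m, a in enumerate(coeffs, start=1):
            idx = f - m + 1
            if idx >= 0:  # frames before the recording starts count as silence
                x += a * s[idx]
        # Assumed relation: Y = X + E, so E is what remains after removing X.
        e.append(y[f] - x)
    return e
```

Because the model reaches back over M frames instead of enlarging the analysis window, the tail of the known signal that spills into later frames is still accounted for.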
Abstract:
A result of a sound source direction measurement based on an output of an REMA (first microphone array) (11) and a result of a sound source position measurement based on an output of an IRMA (second microphone array) (12) are integrated through a particle filter or in space. Thus, the two different microphone arrays, i.e., the REMA (11) and the IRMA (12), can compensate for each other's defects or ambiguities. As a result, sound source localization improves in both accuracy and robustness.
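One way to picture the integration is the weight update of a particle filter over candidate source positions. The sketch below assumes 2D positions, a first array at a known location `array_xy` that reports an azimuth, a second array that reports an (x, y) position, and independent Gaussian measurement errors. All names and the default `sigma` values are hypothetical and are not taken from the abstract.

```python
import math


def gaussian(err, sigma):
    """Unnormalized Gaussian likelihood of an error value."""
    return math.exp(-0.5 * (err / sigma) ** 2)


def angle_diff(a, b):
    """Smallest signed difference between two angles in radians."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def update_weights(particles, weights, array_xy, azimuth, position,
                   sigma_az=0.2, sigma_pos=0.5):
    """Reweight particles by the product of both measurement likelihoods."""
    updated = []
    for (px, py), w in zip(particles, weights):
        # Direction the first array would observe if the source were here.
        predicted_az = math.atan2(py - array_xy[1], px - array_xy[0])
        l_dir = gaussian(angle_diff(azimuth, predicted_az), sigma_az)
        # Distance between this particle and the second array's position fix.
        l_pos = gaussian(math.hypot(px - position[0], py - position[1]), sigma_pos)
        updated.append(w * l_dir * l_pos)
    total = sum(updated)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(updated)] * len(updated)
    return [w / total for w in updated]
```

Multiplying the two likelihoods means a particle keeps a high weight only if it agrees with both arrays, which is how one array's ambiguity can be resolved by the other in this sketch.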
Abstract:
A voice recognition system (10) for improving the robustness of voice recognition for a voice input for which a deteriorated feature amount cannot be completely identified. The system comprises at least two sound detecting means (16a, 16b) for detecting a sound signal, a sound source localizing unit (21) for determining the direction of a sound source based on the sound signal, a sound source separating unit (23) for separating the sound of the sound source from the sound signal based on the sound source direction, a mask producing unit (25) for producing a mask value according to the reliability of the separation results, a feature extracting unit (27) for extracting the feature amount of the sound signal, and a voice recognizing unit (29) for applying the mask to the feature amount to recognize a voice from the sound signal.
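Applying the mask during recognition can be sketched as a weighted acoustic score, assuming a diagonal Gaussian model per feature dimension and mask values between 0 (unreliable) and 1 (reliable). The abstract only says the mask is applied to the feature amount, so the scoring rule and the name `masked_log_likelihood` are assumptions.

```python
import math


def masked_log_likelihood(features, mask, means, variances):
    """Sum per-dimension Gaussian log-likelihoods, scaled by the mask.

    A dimension with mask 0 contributes nothing, so a feature corrupted by
    imperfect separation cannot drag the score down.
    """
    total = 0.0
    for x, m, mu, var in zip(features, mask, means, variances):
        log_p = -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
        total += m * log_p
    return total
```

For example, a feature vector whose second dimension is badly distorted scores the same as the clean vector once that dimension is masked out.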
Abstract:
A sound source localization system using a light emitting device for visualizing sound information includes: a light emitting device (40) including a microphone for receiving sound from a sound source (1, 2) and a light emitting means for emitting light based on the sound received by the microphone; a generating section for generating light emitting information for the light emitting device (40); and a sound source localization section (60) for determining a position of the sound source based on the light emitting information from the generating section.
Abstract:
A moving object 1 equipped with an ultra-directional speaker is provided with an emitter 44 that measures the distance to a target 11, to which a voice is to be delivered, by using an ultrasonic transmit sensor 45 and an ultrasonic receive sensor 46, and that emits an output signal having a predetermined sound level, which is adjusted by an amplifier 34 with a sound level adjusting function. The moving object 1 can thus transmit a voice at an optimal volume only to the specific target through parametric action.
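The distance measurement and level adjustment can be sketched as follows, assuming the distance comes from the round-trip time of an ultrasonic pulse and assuming the level is compensated with a simple free-field 1/r law. A real parametric speaker beam does not necessarily follow that law, and the abstract does not describe the adjustment rule, so the functions and constants here are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for dry air at about 20 degrees C


def distance_from_echo(round_trip_s):
    """Distance to the target from the pulse's round-trip time in seconds."""
    if round_trip_s < 0:
        raise ValueError("round_trip_s must be non-negative")
    # The pulse travels to the target and back, hence the division by two.
    return SPEED_OF_SOUND_M_S * round_trip_s / 2.0


def gain_db_for_distance(distance_m, reference_m=1.0):
    """Extra gain (dB) over the reference level, assuming 1/r spreading."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(distance_m / reference_m)
```

Under these assumptions a target twice as far away as the reference distance needs roughly 6 dB more output to hear the same level.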