Abstract:
The audio signal processing method in accordance with one embodiment receives an audio signal, obtains a first image, estimates room information based on the obtained first image, sets an acoustic parameter according to the estimated room information, applies sound processing to the audio signal according to the set acoustic parameter, and outputs the audio signal subjected to the sound processing.
Abstract:
A sound collection device includes a sensor, a database, a microphone, and an electronic controller. The sensor detects a state of the sound collection device, of a device equipped with the sound collection device, or of both. The database stores noise sounds. The electronic controller includes a signal processing unit configured to read at least one noise sound from the database based on a detection value of the sensor and to carry out a noise reduction process that reduces noise in a sound signal acquired by the microphone, based on the at least one noise sound read from the database.
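The lookup-then-reduce flow can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the abstract: the sensor is taken to report a fan speed, the database is a small in-memory dictionary of per-band noise magnitudes (`NOISE_DB`), and the noise reduction is plain magnitude subtraction with a floor at zero.

```python
from typing import Dict, List, Sequence

# Hypothetical database: device state -> noise magnitude per frequency band.
NOISE_DB: Dict[str, List[float]] = {
    "fan_off": [0.0, 0.0, 0.0],
    "fan_low": [0.25, 0.5, 0.5],
    "fan_high": [0.5, 1.0, 1.0],
}


def state_from_sensor(fan_rpm: float) -> str:
    """Map a sensor detection value to a database key (thresholds are made up)."""
    if fan_rpm <= 0:
        return "fan_off"
    return "fan_low" if fan_rpm < 2000 else "fan_high"


def reduce_noise(
    signal_mag: Sequence[float],
    fan_rpm: float,
    db: Dict[str, List[float]] = NOISE_DB,
) -> List[float]:
    """Subtract the stored noise magnitudes selected by the sensor value.

    Each band is floored at zero so that over-subtraction cannot produce
    a negative magnitude.
    """
    noise = db[state_from_sensor(fan_rpm)]
    return [max(s - n, 0.0) for s, n in zip(signal_mag, noise)]
```

Calling `reduce_noise(band_magnitudes, fan_rpm)` once per frame would apply whichever stored noise profile matches the current sensor reading.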
Abstract:
An audio signal processing method includes receiving an audio signal corresponding to a voice of a talker, obtaining an image of the talker, estimating position information of the talker using the image of the talker, generating, according to the estimated position information, a correction filter configured to compensate for an attenuation of the voice of the talker, performing filter processing on the audio signal using the generated correction filter, and outputting the audio signal on which the filter processing has been performed.
Abstract:
A filtering method includes: receiving a first audio signal and a second audio signal that include sound emitted from a same sound source at different volumes; generating a filter signal by convolving the second audio signal with adaptive filter coefficients; removing components of the filter signal from the first audio signal; and limiting a gain of the adaptive filter coefficients to 1.0 or less.
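A minimal sketch of these steps follows, under two assumptions that the abstract does not state: the coefficients are updated with a normalized LMS (NLMS) rule, and the "gain" of the coefficients is measured as the sum of their absolute values, which bounds the filter's gain at every frequency. The function names and default parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def limit_gain(coeffs: Sequence[float], max_gain: float = 1.0) -> List[float]:
    """Scale the coefficients so the sum of absolute values is <= max_gain.

    Assumption: gain is measured as the L1 norm of the coefficients.
    """
    gain = sum(abs(c) for c in coeffs)
    if gain <= max_gain:
        return list(coeffs)
    scale = max_gain / gain
    return [c * scale for c in coeffs]


def nlms_cancel(
    first: Sequence[float],
    second: Sequence[float],
    taps: int = 4,
    mu: float = 0.5,
    eps: float = 1e-8,
) -> Tuple[List[float], List[float]]:
    """Remove from `first` the part that can be predicted from `second`.

    Returns the residual signal and the final adaptive filter coefficients.
    """
    w = [0.0] * taps        # adaptive filter coefficients
    history = [0.0] * taps  # most recent samples of `second`, newest first
    residual = []
    for d, x in zip(first, second):
        history = [x] + history[:-1]
        # Filter signal: convolution of `second` with the coefficients.
        y = sum(wi * hi for wi, hi in zip(w, history))
        e = d - y  # first signal with the filter signal removed
        power = sum(h * h for h in history) + eps
        w = [wi + mu * e * hi / power for wi, hi in zip(w, history)]
        w = limit_gain(w)   # keep the coefficient gain at 1.0 or less
        residual.append(e)
    return residual, w
```

Capping the gain reflects the premise that the shared source is present in both signals at different volumes: a filter that only ever attenuates the second signal cannot amplify it into the first.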
Abstract:
A signal processing method includes obtaining, by a signal processing apparatus, a network delay time with respect to a device connected to the signal processing apparatus via a network, obtaining an input signal, determining an allowable upper limit of a delay time for an output signal corresponding to the obtained input signal based on the obtained network delay time and a total allowable delay time, selecting a signal processing having a longest delay time that is less than or equal to the allowable upper limit of the delay time, performing the selected signal processing on the obtained input signal, and transmitting the obtained input signal on which the selected signal processing has been performed, as the output signal, to the device connected to the signal processing apparatus via the network.
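The selection step can be sketched as follows. The abstract says only that the upper limit is determined from the network delay time and the total allowable delay time; the sketch assumes the simplest reading, a subtraction. The `Processing` type and the candidate names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Processing:
    """A candidate signal processing and the delay it introduces."""
    name: str
    delay_ms: float


def allowable_upper_limit(total_allowable_ms: float, network_delay_ms: float) -> float:
    # Assumption: the processing budget is whatever the network leaves over.
    return total_allowable_ms - network_delay_ms


def select_processing(
    candidates: Sequence[Processing],
    network_delay_ms: float,
    total_allowable_ms: float,
) -> Optional[Processing]:
    """Pick the candidate with the longest delay that still fits the budget.

    Returns None when no candidate fits.
    """
    limit = allowable_upper_limit(total_allowable_ms, network_delay_ms)
    eligible = [p for p in candidates if p.delay_ms <= limit]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p.delay_ms)
```

For example, with candidates of 0 ms, 10 ms, and 40 ms and a total budget of 50 ms, a measured network delay of 30 ms leaves a 20 ms limit, so the 10 ms candidate is selected; at a network delay of 5 ms the 40 ms candidate fits instead.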
Abstract:
A talker prediction method obtains a voice from a plurality of talkers, records a conversation history of the plurality of talkers, identifies a talker of the obtained voice, and predicts a next talker among the plurality of talkers based on the identified talker and the conversation history.
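One simple way to predict from a conversation history is to count who has followed whom. The sketch below assumes talker identification has already happened upstream and uses first-order transition counts as the predictor. That choice, and the `TalkerPredictor` class itself, are illustrative assumptions rather than the method the abstract describes.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Hashable, Optional


class TalkerPredictor:
    """Predicts the next talker from counts of past talker-to-talker handovers."""

    def __init__(self) -> None:
        # transitions[a][b] = number of times talker b spoke right after talker a
        self._transitions: DefaultDict[Hashable, Counter] = defaultdict(Counter)
        self._last: Optional[Hashable] = None

    def record(self, talker: Hashable) -> None:
        """Append an identified talker to the conversation history."""
        # Consecutive utterances by the same talker count as one turn.
        if self._last is not None and talker != self._last:
            self._transitions[self._last][talker] += 1
        self._last = talker

    def predict_next(self, current: Hashable) -> Optional[Hashable]:
        """Return the talker who most often followed `current`, or None if unseen."""
        counts = self._transitions.get(current)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

After a history of A, B, A, B, A, C has been recorded, `predict_next("A")` returns B because B followed A twice and C only once.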
Abstract:
A sound emission and collection device includes a speaker, a filter that processes a sound emission signal, a plurality of microphones, a plurality of echo cancellers that cancel regression sound signals of the sound emitted by the speaker from the sound collection signals of the corresponding microphones, a first integration section that integrates adaptive filter coefficients taken out from the plurality of echo cancellers, a reverberation time estimation section that estimates, on the basis of the integrated adaptive filter coefficients, the reverberation time for each frequency band in the space in which the speaker and the plurality of microphones are present, and an arithmetic operation section that specifies a frequency band having a long reverberation time from the sound emission signal based on the estimated reverberation time, calculates a filter coefficient for suppressing power of the specified frequency band, and sets the filter coefficient to the filter.
Abstract:
A first calculating portion calculates a model sound index value which is an index value of the maximum value of power for each frequency band of the model sound which is a model of a target sound. A second calculating portion calculates a source sound index value which is an index value of power for each frequency band with respect to each of frames extracted by a predetermined time length from a source sound signal. A third calculating portion calculates a performance index value indicating performance of masking the model sound by a sound represented by a block formed of a predetermined number of consecutive frames extracted from the source sound signal, by using the model sound index value and the source sound index value. A frame selecting portion determines a block to be used for generating the masker sound based on the performance index value.
Abstract:
A sound processing method includes: receiving, using a communication device, from a remote apparatus, a first sound signal representing first sound generated by a user of the remote apparatus; emitting, using a sound emitting apparatus, the first sound represented by the first sound signal; receiving, using a sound receiving apparatus, sound that includes second sound generated by a user of a sound processing system; generating a second sound signal by applying sound processing, using processing parameters, to a reception sound signal generated by the sound receiving apparatus; transmitting, using the communication device, the second sound signal to the remote apparatus; updating the processing parameters based on the first sound signal or the reception sound signal; and stopping the updating of the processing parameters in a state where musical sound is included in at least one of the first sound or the second sound.
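The update-gating step can be sketched with a single hypothetical processing parameter, a smoothed level estimate. Music detection is treated as an external input here, since the abstract does not say how it is performed. The class name, the smoothing rule, and the default values are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class GatedLevelEstimator:
    """Hypothetical processing parameter: an exponentially smoothed level."""
    level: float = 0.0
    smoothing: float = 0.9

    def update(
        self,
        frame_level: float,
        music_in_first: bool,
        music_in_second: bool,
    ) -> float:
        """Update the estimate unless music is present in either sound.

        While music is flagged, the parameter is frozen at its last value,
        so musical content is not absorbed into the estimate.
        """
        if music_in_first or music_in_second:
            return self.level  # updating is stopped
        self.level = self.smoothing * self.level + (1.0 - self.smoothing) * frame_level
        return self.level
```

Because the estimate is frozen rather than reset, updating resumes from the last pre-music value once both flags clear.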
Abstract:
A sound processing apparatus includes sound collection circuitry that collects a sound and generates a first sound signal, and processing circuitry that estimates noise, controls a gain of the first sound signal and outputs a second sound signal based on the estimated noise, and performs filter processing to reduce a component of a predetermined frequency band of the second sound signal based at least in part on the estimated noise.