TARGET SOURCE SIGNAL GENERATION APPARATUS, TARGET SOURCE SIGNAL GENERATION METHOD, AND PROGRAM

    Publication No.: US20240038253A1

    Publication date: 2024-02-01

    Application No.: US18265909

    Filing date: 2020-12-14

    Abstract: Provided is a sound source signal generation technology based on an optimization algorithm that enables high-speed sound source extraction. A sound source signal generation device includes an optimization unit that optimizes a separation matrix W(f)=[w1(f), ..., wK(f), WZ(f)] using an observed signal x(f, t). The optimization unit includes an auxiliary function calculation unit that calculates auxiliary functions Vi(f) (i=1, ..., K) according to a predetermined equation, a first separation filter calculation unit that calculates separation filters wi(f) (i=1, ..., K) using the auxiliary functions Vi(f) (i=1, ..., K) and Vz(f), and a second separation filter calculation unit that calculates a separation filter WZ(f) according to a predetermined equation when a convergence condition is satisfied.

    TARGET SOUND SIGNAL GENERATION APPARATUS, TARGET SOUND SIGNAL GENERATION METHOD, AND PROGRAM

    Publication No.: US20230239616A1

    Publication date: 2023-07-27

    Application No.: US18010790

    Filing date: 2020-06-19

    Abstract: Provided is a target sound extraction technique based on a steering vector generation method that prevents the calculation from becoming unstable when a neural network is trained by an error back propagation method to reduce the estimation error of a beamformer. A target sound signal generation apparatus generates a target sound signal y_{t,f}, corresponding to a target sound included in an observed sound, from an observed signal vector x_{t,f} corresponding to the observed sound collected by a plurality of microphones. The target sound signal generation apparatus includes a mask generation unit, a steering vector generation unit, a beamformer vector generation unit, and a target sound signal generation unit. The mask generation unit is configured as a neural network trained by the error back propagation method. The steering vector generation unit generates a steering vector h_f by using a power method to determine the eigenvector corresponding to the maximum eigenvalue of a predetermined matrix generated from the observed signal vector x_{t,f} and a mask γ_{t,f}.
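
    The power method named in this abstract can be illustrated with a minimal sketch. It assumes the "predetermined matrix" is Hermitian and positive semidefinite with a single largest eigenvalue; the abstract does not say how that matrix is built, so it is simply passed in. The function names `power_method` and `rayleigh_quotient`, the starting vector, and the iteration count are all hypothetical choices made for illustration.

```python
import math


def power_method(matrix, num_iters=50):
    """Approximate the unit-norm eigenvector of the largest eigenvalue.

    Assumes `matrix` is a square Hermitian positive semidefinite matrix
    given as a list of rows of complex numbers (an assumption; the
    abstract does not specify the matrix).
    """
    n = len(matrix)
    # Fixed starting vector (assumption): it must not be orthogonal to
    # the dominant eigenvector, otherwise the iteration cannot reach it.
    v = [complex(1.0, 0.0)] * n
    for _ in range(num_iters):
        # Matrix-vector product w = matrix @ v.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in w))
        if norm == 0.0:
            raise ValueError("iteration collapsed to the zero vector")
        # Renormalize so the vector keeps unit length.
        v = [c / norm for c in w]
    return v


def rayleigh_quotient(matrix, v):
    """Estimate the eigenvalue associated with the vector v."""
    n = len(matrix)
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    num = sum(v[i].conjugate() * mv[i] for i in range(n))
    den = sum(abs(c) ** 2 for c in v)
    return (num / den).real
```

    Each iteration uses only a matrix-vector product and a normalization, both differentiable operations. That is one plausible reason a power method suits training by back propagation, although the abstract itself does not spell out the reasoning.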

    SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, AND PROGRAM

    Publication No.: US20230087982A1

    Publication date: 2023-03-23

    Application No.: US17802090

    Filing date: 2020-02-26

    Abstract: A signal processing device applies a convolutional separation filter to a mixed acoustic signal string that includes a mixed acoustic signal and a delayed signal of the mixed acoustic signal, where the mixed acoustic signal is obtained by converting an observed mixture of source signals into the time-frequency domain. The convolutional separation filter combines a rear reverberation removal filter, which suppresses the rear reverberation component of the mixed acoustic signal, with a sound source separation filter, which emphasizes the components of the mixed acoustic signal that correspond to the source signals. The device also estimates the model parameters of a model for obtaining information corresponding to signals in which the rear reverberation component is suppressed and the target signals emitted from target sound sources among the source signals are emphasized.
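
    As a rough illustration of applying one filter to a signal string made of the current frame and its delayed copies, the following sketch stacks the frames and then takes inner products with the filter coefficients. It covers a single frequency bin. The function names, the zero padding before the first frame, and the choice of delays are assumptions for illustration, not details taken from the abstract.

```python
def stack_with_delays(frames, delays):
    """Build the signal string: each frame followed by its delayed copies.

    frames: list of per-frame channel vectors (complex values).
    delays: list of positive frame delays (hypothetical parameter).
    Frames before the start of the signal are treated as zeros.
    """
    num_ch = len(frames[0])
    stacked = []
    for t in range(len(frames)):
        vec = list(frames[t])
        for d in delays:
            if t - d >= 0:
                vec.extend(frames[t - d])
            else:
                vec.extend([0j] * num_ch)
        stacked.append(vec)
    return stacked


def apply_filter(filters, stacked):
    """Apply each filter as an inner product w^H x with a stacked vector.

    filters: one coefficient vector per output signal, with the same
    length as a stacked vector.
    """
    return [
        [sum(w.conjugate() * x for w, x in zip(filt, vec)) for filt in filters]
        for vec in stacked
    ]
```

    Because dereverberation and separation are both linear in the stacked vector, they can be merged into one set of coefficients, which is the sense in which the abstract describes a combined filter.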

    SIGNAL PROCESSING DEVICE, SIGNAL PROCESSING METHOD, AND SIGNAL PROCESSING PROGRAM

    Publication No.: US20230067132A1

    Publication date: 2023-03-02

    Application No.: US17794266

    Filing date: 2020-02-14

    Abstract: A signal processing apparatus includes a neural network ("NN"), a sorting unit, and a spatial covariance matrix calculation unit. The NN receives a mixed signal, in which sounds from a plurality of sound sources input over a plurality of channels are mixed, and converts it directly in the time domain into separated signals, one for each sound source, which it outputs. The sorting unit sorts the separated signals that the NN outputs for each channel so that the sound sources of the separated signals are aligned across the plurality of channels. The spatial covariance matrix calculation unit calculates a spatial covariance matrix corresponding to each sound source from the sorted separated signals of each channel output from the sorting unit.
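
    A spatial covariance matrix for one sound source is commonly computed as the average outer product of that source's multichannel signal vectors. The sketch below follows that common definition. It assumes the sorted separated signals have already been arranged as one complex channel vector per frame (for example, one frequency bin after a transform), which the abstract does not state. The function name `spatial_covariance` is hypothetical.

```python
def spatial_covariance(frames):
    """Average the outer products y y^H over all frames.

    frames: list of channel vectors for one sound source, one vector
    per frame, each a list of complex values (assumed layout).
    Returns a num_ch x num_ch Hermitian matrix as a list of rows.
    """
    num_ch = len(frames[0])
    cov = [[0j] * num_ch for _ in range(num_ch)]
    for y in frames:
        for a in range(num_ch):
            for b in range(num_ch):
                # Entry (a, b) accumulates y[a] times conj(y[b]).
                cov[a][b] += y[a] * y[b].conjugate()
    return [[c / len(frames) for c in row] for row in cov]
```

    The computation only makes sense after sorting: if the same source appeared in different output slots on different channels, the vector for a frame would mix sources and the resulting matrix would not describe any single one.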

    MASK ESTIMATION APPARATUS, MASK ESTIMATION METHOD, AND MASK ESTIMATION PROGRAM

    Publication No.: US20190267019A1

    Publication date: 2019-08-29

    Application No.: US15998742

    Filing date: 2016-12-20

    Abstract: A feature extraction unit in a mask estimation apparatus extracts, from a plurality of observation signals obtained by observing a plurality of acoustic signals at different positions, feature vectors obtained by collecting time-frequency components of the observation signals for each time-frequency point. The probability distribution of the feature vectors is modeled by a mixture distribution consisting of a plurality of component distributions. A mask update unit estimates masks, each indicating the proportion in which a component distribution contributes to a time-frequency point, by using the feature vectors, the mixture weight of each component distribution, and a shape parameter, which is a model parameter capable of controlling the shape of each component distribution. A mixture weight update unit updates the mixture weight based on the updated masks. A parameter update unit updates the shape parameter by using the feature vectors and the masks.
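
    The alternation between the mask update and the mixture weight update resembles the standard expectation-maximization steps for a mixture model, and the sketch below shows those two generic steps. It assumes the likelihood of each feature vector under each component distribution has already been evaluated, because the abstract does not give the form of the component distributions. The shape parameter update is left out for the same reason. The function names and the choice to average over all time-frequency points are assumptions.

```python
def update_masks(likelihoods, weights):
    """Compute the share of each component at each time-frequency point.

    likelihoods[k][n]: likelihood of the feature vector at point n
    under component distribution k (assumed to be precomputed).
    weights[k]: current mixture weight of component k.
    """
    num_comp = len(weights)
    num_pts = len(likelihoods[0])
    masks = [[0.0] * num_pts for _ in range(num_comp)]
    for n in range(num_pts):
        total = sum(weights[k] * likelihoods[k][n] for k in range(num_comp))
        for k in range(num_comp):
            # Normalize so the masks at point n sum to one.
            masks[k][n] = weights[k] * likelihoods[k][n] / total
    return masks


def update_weights(masks):
    """Set each mixture weight to the average of its masks."""
    return [sum(row) / len(row) for row in masks]
```

    In a full estimator these two steps would alternate with the shape parameter update until the masks stop changing.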