-
Publication No.: US10777209B1
Publication Date: 2020-09-15
Application No.: US16499935
Application Date: 2018-04-17
Inventor: Hiroyuki Ehara , Akihisa Kawamura , Kai Wu , Srikanth Nagisetty , Sua Hong Neo
IPC: G10L19/008 , H04R1/40 , G10L19/032
Abstract: A sound source estimation unit (101) estimates, in a space targeted by sparse sound field decomposition, an area where a sound source is present at a second granularity that is coarser than the first granularity of the positions where a sound source is assumed to be present in the sparse sound field decomposition. A sparse sound field decomposition unit (102) decomposes an acoustic signal observed by a microphone array into a sound source signal and an ambient noise signal by performing the sparse sound field decomposition process at the first granularity on the acoustic signal within the area, at the second granularity, where the sound source is estimated to be present.
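The coarse-to-fine idea in this abstract can be illustrated with a small sketch. It is not the patented method: it only shows how a coarse estimate of where a source lies can shrink the set of fine-grid candidate positions that a sparse decomposition would have to consider. The function name `restrict_fine_grid`, the 2-D square grid, the spacings, and the `active_cells` input are all assumptions made for illustration.

```python
def restrict_fine_grid(fine_points, cell_size, active_cells):
    """Keep only fine-grid candidates that fall inside coarse cells flagged as active.

    fine_points  : list of (x, y) candidate source positions (fine granularity)
    cell_size    : side length of a coarse cell (coarse granularity)
    active_cells : set of (col, row) coarse-cell indices where a source is estimated
    """
    kept = []
    for x, y in fine_points:
        cell = (int(x // cell_size), int(y // cell_size))
        if cell in active_cells:
            kept.append((x, y))
    return kept


# Hypothetical 4 m x 4 m area: fine candidates every 0.5 m, coarse cells of 2 m.
fine_points = [(i * 0.5, j * 0.5) for i in range(8) for j in range(8)]

# Assumed result of the coarse estimation step: one active cell.
active_cells = {(0, 1)}

reduced = restrict_fine_grid(fine_points, 2.0, active_cells)
print(len(fine_points), len(reduced))
```

In this toy setup the candidate set drops from 64 positions to 16, so a sparse solver run afterwards would work with a dictionary a quarter of the original size.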
-
Publication No.: US10555107B2
Publication Date: 2020-02-04
Application No.: US16341861
Application Date: 2017-10-11
Inventor: Hiroyuki Ehara , Kai Wu , Sua Hong Neo
IPC: G10L19/008 , H04S7/00
Abstract: The present disclosure relates to the design of a fast binaural rendering for multiple moving audio sources. This disclosure takes the audio source signals which can be object-based, channel-based or a mixture of both, associated metadata, user head tracking data and binaural room impulse response (BRIR) database to generate the headphone playback signals. The present disclosure applies a frame-by-frame binaural rendering module which takes parameterized components of BRIRs for rendering moving sources. In addition, the present disclosure applies hierarchical source clustering and downmixing in the rendering process to reduce computational complexity.
-
Publication No.: US11653171B2
Publication Date: 2023-05-16
Application No.: US17725097
Application Date: 2022-04-20
Inventor: Hiroyuki Ehara , Kai Wu , Sua Hong Neo
IPC: H04S7/00 , H04S1/00 , G10L19/008
CPC classification number: H04S7/304 , G10L19/008 , H04S1/005 , H04S7/305 , H04S2400/01 , H04S2420/01
Abstract: A method that generates binaural headphone playback signals given multiple audio source signals with associated metadata and a binaural room impulse response (BRIR) database, wherein the audio source signals are channel-based, object-based, or a mixture of both channel-based and object-based signals. The method includes parameterizing the BRIR to be used for rendering, dividing each audio source signal to be rendered into a number of blocks and frames, and averaging the parameterized BRIR sequences. The method also includes downmixing the divided audio source signals using the diffuse blocks of BRIRs, and performing late reverberation processing on the downmixed version of the previous blocks of the audio source signals.
-
Publication No.: US11337026B2
Publication Date: 2022-05-17
Application No.: US17097829
Application Date: 2020-11-13
Inventor: Hiroyuki Ehara , Kai Wu , Sua Hong Neo
IPC: G10L19/008 , H04S7/00 , H04S1/00
Abstract: A method generates binaural headphone playback signals given multiple audio source signals with associated metadata and a binaural room impulse response (BRIR) database, where the audio source signals can be channel-based, object-based, or a mixture of both signals. The method groups the audio source signals according to positions of the audio sources, divides BRIR into blocks and frames, where the BRIR is divided into a direct block and diffuse blocks, and divides each audio source signal into blocks and frames, wherein the source signal is divided into a current block and previous blocks, and the current block is further divided into the frames. The method further averages, for each of previous frames of the source signals, the divided BRIR identified with the grouping result by downmixing the previous frames of the source signals according to the grouping result, and performs a convolution with the downmixed previous frame.
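The block and frame division described in this abstract can be sketched as plain list slicing. This is an illustrative assumption, not the claimed procedure: the helper `split_into_chunks`, the equal block lengths, the choice of the first BRIR block as the direct block, and all sample values are hypothetical.

```python
def split_into_chunks(samples, chunk_len):
    """Split a sample list into consecutive chunks of chunk_len samples."""
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


# Toy BRIR of 8 samples, divided into blocks of 4 samples.
brir = [0.9, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
brir_blocks = split_into_chunks(brir, 4)

# Assumption: the earliest block carries the direct part, the rest are diffuse.
direct_block, diffuse_blocks = brir_blocks[0], brir_blocks[1:]

# Toy source signal of 12 samples, divided into blocks of 4 samples.
source = list(range(12))
source_blocks = split_into_chunks(source, 4)

# The most recent block is the current block; earlier ones are previous blocks.
previous_blocks, current_block = source_blocks[:-1], source_blocks[-1]

# Only the current block is further divided into frames (2 samples each here).
current_frames = split_into_chunks(current_block, 2)
```

Under this layout the current frames would be paired with the direct block, while the previous blocks would be paired with the diffuse blocks, which is where downmixing can reduce the number of convolutions.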
-
Publication No.: US10735886B2
Publication Date: 2020-08-04
Application No.: US16724921
Application Date: 2019-12-23
Inventor: Hiroyuki Ehara , Kai Wu , Sua Hong Neo
IPC: G10L19/008 , H04S7/00 , H04S1/00
Abstract: A method of generating binaural headphone playback signals given multiple audio source signals with associated metadata and a binaural room impulse response (BRIR) database, wherein the multiple audio source signals can be channel-based, object-based, or a mixture of both signals. The method includes grouping the multiple audio source signals according to positions of the audio sources in a hierarchical manner, and parameterizing the BRIR to be used for rendering. The method also includes dividing each audio source signal to be rendered into a number of blocks and frames, averaging the parameterized BRIR sequences identified with a hierarchical grouping result, and downmixing the divided audio source signals identified with the hierarchical grouping result.
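Hierarchical grouping by source position followed by downmixing can be sketched as below. This is a simplified illustration under stated assumptions, not the patented algorithm: sources are described only by azimuth, groups are fixed angular sectors, the two sector widths (30 and 180 degrees) stand in for two hierarchy levels, and downmixing is a plain sample-wise sum. The function names are hypothetical.

```python
from collections import defaultdict


def group_by_sector(azimuths_deg, sector_width_deg):
    """Map each source index to an angular sector of the given width."""
    groups = defaultdict(list)
    for idx, az in enumerate(azimuths_deg):
        groups[int((az % 360) // sector_width_deg)].append(idx)
    return dict(groups)


def downmix(signals, groups):
    """Sum the signals of all sources in each group, sample by sample."""
    n = len(signals[0])
    return {
        g: [sum(signals[i][t] for i in members) for t in range(n)]
        for g, members in groups.items()
    }


# Four toy sources: azimuth in degrees and a two-sample signal each.
azimuths = [10, 20, 100, 200]
signals = [[1, 1], [2, 2], [3, 3], [4, 4]]

# Two hierarchy levels: narrow sectors (fine) and wide sectors (coarse).
fine_groups = group_by_sector(azimuths, 30)
coarse_groups = group_by_sector(azimuths, 180)

coarse_mix = downmix(signals, coarse_groups)
```

With these inputs the fine level keeps three groups and the coarse level only two, so a renderer could use the fine grouping where spatial precision matters and the coarse grouping where fewer convolutions are preferred.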
-