-
Publication number: US12131744B2
Publication date: 2024-10-29
Application number: US18487232
Application date: 2023-10-16
Inventor: Dirk Jeroen Breebaart , David Matthew Cooper , Leif Jonas Samuelsson , Jeroen Koppens , Rhonda J. Wilson , Heiko Purnhagen , Alexander Stahlmann
CPC classification number: G10L19/008 , G06F3/16 , H04L65/70 , H04L65/75 , H04S1/007 , H04S7/305 , H04S2400/01 , H04S2400/03 , H04S2400/07
Abstract: A method for encoding an input audio stream including the steps of obtaining a first playback stream presentation of the input audio stream intended for reproduction on a first audio reproduction system, obtaining a second playback stream presentation of the input audio stream intended for reproduction on a second audio reproduction system, determining a set of transform parameters suitable for transforming an intermediate playback stream presentation to an approximation of the second playback stream presentation, wherein the transform parameters are determined by minimization of a measure of a difference between the approximation of the second playback stream presentation and the second playback stream presentation, and encoding the first playback stream presentation and the set of transform parameters for transmission to a decoder.
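The parameter-fitting step in this abstract can be illustrated with a deliberately simplified sketch. It assumes the "measure of a difference" is a squared error and that the transform is a single real-valued gain applied to one channel; neither assumption comes from the abstract, and the name `fit_gain` is hypothetical.

```python
def fit_gain(intermediate, target):
    """Hypothetical sketch: gain w minimizing sum((w*x - y)**2).

    intermediate: samples of the intermediate playback presentation
    target:       samples of the second playback presentation
    """
    # Closed-form least-squares solution: w = <x, y> / <x, x>
    num = sum(x * y for x, y in zip(intermediate, target))
    den = sum(x * x for x in intermediate)
    # Guard against an all-zero input (assumed fallback: gain of 0)
    return num / den if den else 0.0


w = fit_gain([1.0, 2.0], [2.0, 4.0])
approximation = [w * x for x in [1.0, 2.0]]
```

In this toy case the target is an exact multiple of the input, so the fitted gain reproduces it; with real signals the approximation would only be the closest match under the assumed error measure.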
-
Publication number: US20240355342A1
Publication date: 2024-10-24
Application number: US18763087
Application date: 2024-07-03
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Xingtao ZHANG , Haiting LI , Zexin LIU , Lei MIAO
IPC: G10L19/008 , G10L19/032 , H04S3/00
CPC classification number: G10L19/008 , G10L19/032 , H04S3/008 , H04S2400/03 , H04S2420/03
Abstract: The present disclosure discloses an inter-channel phase difference parameter encoding method, where a current frame is obtained; a signal type and a previous IPD parameter encoding scheme of a previous frame are obtained; a current IPD parameter encoding scheme is obtained at least based on the signal type of the previous frame and the previous IPD parameter encoding scheme; and an IPD parameter of the current frame is processed based on the current IPD parameter encoding scheme.
-
Publication number: US12089033B2
Publication date: 2024-09-10
Application number: US18108663
Application date: 2023-02-13
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Kuan-Chieh Yen , Dirk Jeroen Breebaart , Grant A. Davidson , Rhonda Wilson , David M. Cooper , Zhiwei Shuang
IPC: H04S7/00 , G10L19/008 , H04S3/00
CPC classification number: H04S7/306 , G10L19/008 , H04S3/004 , H04S7/307 , H04S2400/03 , H04S2400/13 , H04S2420/01
Abstract: In some embodiments, virtualization methods for generating a binaural signal in response to channels of a multi-channel audio signal, which apply a binaural room impulse response (BRIR) to each channel including by using at least one feedback delay network (FDN) to apply a common late reverberation to a downmix of the channels. In some embodiments, input signal channels are processed in a first processing path to apply to each channel a direct response and early reflection portion of a single-channel BRIR for the channel, and the downmix of the channels is processed in a second processing path including at least one FDN which applies the common late reverberation. Typically, the common late reverberation emulates collective macro attributes of late reverberation portions of at least some of the single-channel BRIRs. Other aspects are headphone virtualizers configured to perform any embodiment of the method.
-
Publication number: US12087310B2
Publication date: 2024-09-10
Application number: US17148638
Application date: 2021-01-14
Inventor: Arne Borsum , Stephan Schreiner , Harald Fuchs , Michael Kratz , Bernhard Grill , Sebastian Scharrer
CPC classification number: G10L19/008 , G10L19/02 , G10L19/173 , H04S3/002 , H04S3/02 , H04S5/005 , H04S2400/03 , H04S2400/11 , H04S2420/03
Abstract: An apparatus for downmixing three or more audio input channels to obtain two or more audio output channels is provided. The apparatus includes a receiving interface for receiving the three or more audio input channels and for receiving side information. Moreover, the apparatus includes a downmixer for downmixing the three or more audio input channels depending on the side information to obtain the two or more audio output channels. The number of the audio output channels is smaller than the number of the audio input channels. The side information indicates a characteristic of at least one of the three or more audio input channels, or a characteristic of one or more sound waves recorded within the one or more audio input channels, or a characteristic of one or more sound sources which emitted one or more sound waves recorded within the one or more audio input channels.
-
Publication number: US12028701B2
Publication date: 2024-07-02
Application number: US18106261
Application date: 2023-02-06
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Grant A. Davidson , Kuan-Chieh Yen , Dirk Jeroen Breebaart
IPC: H04S7/00
CPC classification number: H04S7/304 , H04S7/306 , H04S2400/03 , H04S2420/01 , H04S2420/07
Abstract: Methods and systems for designing binaural room impulse responses (BRIRs) for use in headphone virtualizers, and methods and systems for generating a binaural signal in response to a set of channels of a multi-channel audio signal, including by applying a BRIR to each channel of the set, thereby generating filtered signals, and combining the filtered signals to generate the binaural signal, where each BRIR has been designed in accordance with an embodiment of the design method. Other aspects are audio processing units configured to perform any embodiment of the inventive method. In accordance with some embodiments, BRIR design is formulated as a numerical optimization problem based on a simulation model (which generates candidate BRIRs) and at least one objective function (which evaluates each candidate BRIR), and includes identification of a best one of the candidate BRIRs as indicated by performance metrics determined for the candidate BRIRs by each objective function.
-
Publication number: US11985179B1
Publication date: 2024-05-14
Application number: US17101108
Application date: 2020-11-23
Applicant: Amazon Technologies, Inc.
Inventor: Berkant Tacer , Nikhil Shankar
IPC: H04L65/75 , G06N3/045 , G06N3/048 , G06N3/067 , G06N3/08 , G10L19/00 , G10L21/00 , H04L65/403 , H04S3/00
CPC classification number: H04L65/75 , G06N3/045 , G06N3/048 , G06N3/08 , G10L19/00 , G10L21/00 , H04L65/403 , H04S3/008 , G06N3/0675 , H04S2400/03 , H04S2400/05 , H04S2420/07
Abstract: A system configured to improve a voice quality during a communication session by performing bandwidth extension on a narrowband speech signal to generate a wideband speech signal with higher audio quality. For example, a system can extend a speech bandwidth from a narrowband signal having a first bandwidth (e.g., 4 kHz) to a wideband signal having a second bandwidth (e.g., 8 kHz or higher). To perform bandwidth extension, the system may include cascaded neural networks, such as two or more sub-pixel convolutional neural networks (CNNs) connected in series. In some examples, a first sub-pixel CNN may extend the speech bandwidth from 4 kHz to 6 kHz and a second sub-pixel CNN may extend the speech bandwidth from 6 kHz to 8 kHz. Alternatively, the system may use three or more cascaded neural networks and/or may extend the speech bandwidth above 8 kHz without departing from the disclosure.
-
Publication number: US20240144941A1
Publication date: 2024-05-02
Application number: US18505996
Application date: 2023-11-09
Applicant: DOLBY INTERNATIONAL AB
Inventor: Tobias FRIEDRICH , Alexander MUELLER , Karsten LINZMEIER , Claus-Christian SPENGER , Tobias R. WAGENBLASS
IPC: G10L19/008 , G10L19/16 , H04S3/00
CPC classification number: G10L19/008 , G10L19/167 , H04S3/008 , H04S2400/01 , H04S2400/03 , H04S2420/03
Abstract: The present document relates to audio coding systems. In particular, the present document relates to efficient methods and systems for parametric multi-channel audio coding. An audio encoding system configured to generate a bitstream indicative of a downmix signal and spatial metadata for generating a multi-channel upmix signal from the downmix signal is described. The system comprises a downmix processing unit configured to generate the downmix signal from a multi-channel input signal; wherein the downmix signal comprises m channels and wherein the multi-channel input signal comprises n channels; n, m being integers with m
-
Publication number: US11962997B2
Publication date: 2024-04-16
Application number: US17883440
Application date: 2022-08-08
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Charles Q. Robinson , Nicolas R. Tsingos , Christophe Chabanne
CPC classification number: H04S7/308 , G10L19/008 , G10L19/20 , H04R5/02 , H04R5/04 , H04R27/00 , H04S3/008 , H04S5/005 , H04S7/30 , H04S7/305 , H04S5/00 , H04S7/302 , H04S2400/01 , H04S2400/03 , H04S2400/11 , H04S2420/01 , H04S2420/03 , H04S2420/11 , H04S2420/13
Abstract: Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
-
Publication number: US20240121567A1
Publication date: 2024-04-11
Application number: US18525910
Application date: 2023-12-01
Applicant: KONINKLIJKE PHILIPS N.V.
Inventor: ERIK G.P. SCHUIJERS
IPC: H04S5/00 , G10L19/008 , H04S3/02
CPC classification number: H04S5/00 , G10L19/008 , H04S3/02 , H04S2400/03 , H04S2420/03
Abstract: A parametric stereo upmix method for generating a left signal and a right signal from a mono downmix signal based on spatial parameters includes predicting a difference signal comprising a difference between the left signal and the right signal based on the mono downmix signal scaled with a prediction coefficient. The prediction coefficient is derived from the spatial parameters. The method further includes deriving the left signal and the right signal based on a sum and a difference of the mono downmix signal and said difference signal.
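The upmix in this abstract can be sketched roughly as follows. The function name `upmix_stereo`, the per-sample list representation, and the fixed coefficient `alpha` are assumptions for illustration; the abstract does not say how the prediction coefficient is derived from the spatial parameters, and any normalization of the downmix is omitted here.

```python
def upmix_stereo(mono, alpha):
    """Hypothetical sketch: predict the difference signal, then form L/R.

    mono:  samples of the mono downmix signal
    alpha: prediction coefficient (assumed to be given)
    """
    left, right = [], []
    for m in mono:
        d = alpha * m        # predicted difference signal
        left.append(m + d)   # sum of downmix and difference signal
        right.append(m - d)  # difference of downmix and difference signal
    return left, right


left, right = upmix_stereo([1.0, -2.0], 0.5)
```

With `alpha = 0` both outputs equal the downmix; a nonzero coefficient shifts energy toward one side.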
-
Publication number: US20240114307A1
Publication date: 2024-04-04
Application number: US18465636
Application date: 2023-09-12
Inventor: Stefan BRUHN
IPC: H04S3/02
CPC classification number: H04S3/02 , H04S2400/03
Abstract: There is provided encoding and decoding methods for representing spatial audio that is a combination of directional sound and diffuse sound. An exemplary encoding method includes inter alia creating a single- or multi-channel downmix audio signal by downmixing input audio signals from a plurality of microphones in an audio capture unit capturing the spatial audio; determining first metadata parameters associated with the downmix audio signal, wherein the first metadata parameters are indicative of one or more of: a relative time delay value, a gain value, and a phase value associated with each input audio signal; and combining the created downmix audio signal and the first metadata parameters into a representation of the spatial audio.