-
Publication No.: US20240274137A1
Publication date: 2024-08-15
Application No.: US18568526
Filing date: 2021-06-10
Applicant: NOKIA TECHNOLOGIES OY
IPC: G10L19/008 , H04S3/00
CPC classification number: G10L19/008 , H04S3/008 , H04S2420/03
Abstract: An apparatus (317) comprising means configured to: receive a spatial audio signal, the spatial audio signal comprising at least one audio signal and spatial metadata (122) associated with the at least one audio signal; generate a mixing value (320) based on the spatial metadata (122) and a predefined parameter (322) which imparts effects of a rendering of a multichannel audio signal having a multichannel configuration to a further multichannel audio signal having a further multichannel configuration on generated output signals; and generating the output audio signals having the further multichannel configuration based on the mixing value (320) and the spatial audio signal.
-
Publication No.: US20240171927A1
Publication date: 2024-05-23
Application No.: US18283493
Filing date: 2022-02-25
Applicant: Nokia Technologies Oy
Inventor: Mikko-Ville LAITINEN , Juha Tapio VILKAMO
IPC: H04S7/00
CPC classification number: H04S7/30 , H04S2420/03
Abstract: An apparatus for processing at least two audio signals and associated metadata, the apparatus including circuitry configured to: obtain the audio signals, the audio signals including at least one audio object portion and at least one non-audio object portion; obtain the associated metadata, wherein the associated metadata is configured to define at least one audio object position and at least one audio object energy proportion; obtain object position control information; determine mixing information based on the object position control information and the at least one audio object position and at least one audio object energy proportion; and process the at least two audio signals based on the mixing information, wherein the processing is configured to enable the at least one object portion of a first of the at least two audio signals to be at least partially moved to a second of the at least two audio signals.
-
Publication No.: US20250008285A1
Publication date: 2025-01-02
Application No.: US18576795
Filing date: 2022-06-20
Applicant: Nokia Technologies Oy
Inventor: Juha Tapio VILKAMO , Miikka Tapani VILERMO , Mikko Tapio TAMMI
Abstract: Examples of the disclosure relate to an apparatus (201) for determining whether one or more microphones (207) within a plurality of microphones is blocked. In examples of the disclosure correlation between at least two microphones is estimated so as to provide an indication of whether or not incoherent noise, such as wind noise (205), is present. This can be used to avoid incorrectly identifying a microphone as being blocked and so can help to maintain a higher quality level for the audio signals captured by the microphones.
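The idea in this abstract, using inter-microphone correlation to keep incoherent noise from being mistaken for a blocked microphone, can be illustrated with a minimal sketch. It is not the method claimed in the application: the level-drop test for blockage, the function names (`rms`, `correlation`, `looks_blocked`) and the thresholds (`level_drop_db`, `min_corr`) are assumptions chosen only for illustration.

```python
import math
import random


def rms(x):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def correlation(a, b):
    """Pearson correlation of two equal-length blocks (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den > 0 else 0.0


def looks_blocked(ref, mic, level_drop_db=6.0, min_corr=0.5):
    """Hypothetical blockage test (assumed, not taken from the application).

    `mic` is flagged as blocked only if it is clearly quieter than `ref`
    AND the two blocks are still correlated. Low correlation suggests
    incoherent noise such as wind, so the level difference is not trusted.
    """
    if correlation(ref, mic) < min_corr:
        return False  # incoherent noise likely present; do not flag
    drop_db = 20.0 * math.log10(rms(ref) / max(rms(mic), 1e-12))
    return drop_db >= level_drop_db


# Coherent case: the same tone, attenuated at the second microphone.
tone = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(2000)]
print(looks_blocked(tone, [0.2 * s for s in tone]))

# Incoherent case: independent noise at each microphone, one much quieter.
rng = random.Random(0)
wind_a = [rng.gauss(0, 1) for _ in range(2000)]
wind_b = [0.2 * rng.gauss(0, 1) for _ in range(2000)]
print(looks_blocked(wind_a, wind_b))
```

A practical detector would presumably work per frequency band and smooth its estimates over time; the sketch uses a single broadband block to keep the decision logic visible.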
-
Publication No.: US20240284134A1
Publication date: 2024-08-22
Application No.: US18571311
Filing date: 2022-05-16
Applicant: Nokia Technologies Oy
Inventor: Juha Tapio VILKAMO , Mikko Johannes HONKALA
IPC: H04S7/00 , G10L19/008
CPC classification number: H04S7/302 , G10L19/008 , H04S2400/11 , H04S2400/15 , H04S2420/07
Abstract: Examples of the disclosure relate to obtaining spatial metadata for use in rendering, or otherwise processing spatial audio. In examples of the disclosure a machine learning model can be used to process microphone signals, or data obtained from microphone signals, to obtain the spatial metadata. The machine learning model can be trained to enable high quality spatial metadata to be obtained from sub-optimal or low-quality microphone arrays. Examples of the disclosure include an apparatus including circuitry for: accessing a trained machine learning model; determining input data for the machine learning model based on two or more microphone signals; enabling using the machine learning model to process the input data to obtain spatial metadata; and associating the obtained spatial metadata with at least one signal based on the two or more microphone signals to enable processing of the at least one signal based on the obtained spatial metadata.
-
Publication No.: US20240087589A1
Publication date: 2024-03-14
Application No.: US18367510
Filing date: 2023-09-13
Applicant: Nokia Technologies Oy
Inventor: Juha Tapio VILKAMO , Mikko-Ville LAITINEN , Sampo VESA
IPC: G10L21/0364 , H04R5/00
CPC classification number: G10L21/0364 , H04R5/00
Abstract: Examples of the disclosure relate to apparatus, methods and computer programs for spatial processing audio scenes with improved intelligibility for speech or other key sounds. In examples of the disclosure at least one audio signal including two or more channels is obtained. The audio signal is processed with program code to identify at least a first portion of the audio signal wherein the first portion predominantly includes audio of interest. The first portion is processed using a first process. The second portion is processed using a second process including spatial audio processing. The first process includes no spatial audio processing or a low level of spatial audio processing compared to the second process and the second portion predominantly includes a remainder. The processed first portion and second portion can be played back using two or more loudspeakers.
-
Publication No.: US20250071497A1
Publication date: 2025-02-27
Application No.: US18723930
Filing date: 2022-12-09
Applicant: Nokia Technologies Oy
Inventor: Mikko-Ville LAITINEN , Juha Tapio VILKAMO
Abstract: Examples of the disclosure enable spatial audio rendering in a different format to the format that is used for the spatial audio coding. In examples of the disclosure spatial audio and first spatial metadata in a first format are obtained. The first spatial metadata enables rendering of spatial audio in a first audio format. In order to enable rendering of the spatial audio in a different format the spatial metadata is converted to second spatial metadata corresponding to a second audio format. The spatial audio can then be rendered for the second format using the second spatial metadata.
-
Publication No.: US20240267678A1
Publication date: 2024-08-08
Application No.: US18420157
Filing date: 2024-01-23
Applicant: Nokia Technologies OY
Inventor: Mikko Olavi HEIKKINEN , Matti Sakari HÄMÄLÄINEN , Juha Petteri OJANPERÄ , Juha Tapio VILKAMO
CPC classification number: H04R5/04 , H04R3/005 , H04R5/027 , H04R5/033 , H04S7/30 , H04S2400/11 , H04S2400/15
Abstract: A method comprising capturing, using a first capturing mode, immersive audio using a first capturing device comprising a first microphone and a second capturing device comprising a second microphone, recognizing, based on obtaining data from one or more sensors, movement of the first capturing device, wherein the movement is with respect to the second capturing device, recognizing the movement as movement for changing from the first capturing mode to a second capturing mode, wherein the second capturing mode is for capturing immersive audio, and capturing the immersive audio using the second capturing mode.
-
Publication No.: US20240236611A9
Publication date: 2024-07-11
Application No.: US18490359
Filing date: 2023-10-19
Applicant: Nokia Technologies Oy
Inventor: Mikko-Ville Laitinen , Juha Tapio VILKAMO , Jussi Kalevi VIROLAINEN
CPC classification number: H04S7/306 , H04R5/027 , H04S1/007 , H04S7/304 , H04S2400/11 , H04S2400/13 , H04S2400/15 , H04S2420/03
Abstract: A method for generating a parametric spatial audio stream, the method including: obtaining at least one mono-channel audio signal from at least one close microphone; obtaining at least one of: at least one reverberation parameter; at least one control parameter configured to control spatial features of the parametric spatial audio stream; generating, based on the at least one reverberation parameter, at least one reverberated audio signal from a respective at least one mono-channel audio signal; generating at least one spatial metadata parameter based on at least one of: the at least one mono-channel audio signal; the at least one reverberated audio signal; the at least one control parameter; and the at least one reverberation parameter; and encoding the at least one reverberated audio signal and the at least one spatial metadata parameter to generate the spatial audio stream.
-
Publication No.: US20240137728A1
Publication date: 2024-04-25
Application No.: US18490359
Filing date: 2023-10-18
Applicant: Nokia Technologies Oy
Inventor: Mikko-Ville Laitinen , Juha Tapio VILKAMO , Jussi Kalevi VIROLAINEN
CPC classification number: H04S7/306 , H04R5/027 , H04S1/007 , H04S7/304 , H04S2400/11 , H04S2400/13 , H04S2400/15 , H04S2420/03
Abstract: A method for generating a parametric spatial audio stream, the method including: obtaining at least one mono-channel audio signal from at least one close microphone; obtaining at least one of: at least one reverberation parameter; at least one control parameter configured to control spatial features of the parametric spatial audio stream; generating, based on the at least one reverberation parameter, at least one reverberated audio signal from a respective at least one mono-channel audio signal; generating at least one spatial metadata parameter based on at least one of: the at least one mono-channel audio signal; the at least one reverberated audio signal; the at least one control parameter; and the at least one reverberation parameter; and encoding the at least one reverberated audio signal and the at least one spatial metadata parameter to generate the spatial audio stream.
-
Publication No.: US20230110257A1
Publication date: 2023-04-13
Application No.: US17960459
Filing date: 2022-10-05
Applicant: Nokia Technologies Oy
Inventor: Mikko-Ville Laitinen , Archontis POLITIS , Lauros Anton PAJUNEN , Juha Tapio VILKAMO , Antti Johannes ERONEN
Abstract: An apparatus for generating a spatialized audio output based on a listener position, the apparatus including circuitry configured to: obtain two or more audio signal sets; obtain a listener position within an audio environment, wherein the audio environment includes one or more area having one or more inside and outside regions in relation to the respective audio signal set positions; obtain metadata based on a processing of the at least two audio signals; determine, for the listener position within an audio environment outside the inside region, a second listener position; determine modified metadata for the second listener position based on the metadata; determine at least two modified audio signals for the second listener position based on the at least two audio signals; determine spatial metadata for the listener position; and output the at least two modified audio signals and the spatial metadata.