Abstract:
Embodiments of the present disclosure relate to processing audio or video signals captured by multiple devices. An apparatus for processing video and audio signals includes an estimating unit and a processing unit. The estimating unit may estimate at least one aspect of an array at least based on at least one video or audio signal captured respectively by at least one of portable devices arranged in an array. The processing unit may apply the aspect at least based on video to a process of generating a surround sound signal via the array, or apply the aspect at least based on audio to a process of generating a combined video signal via the array. With cross-referencing visual or acoustic hints, an improvement can be achieved in generating an audio or video signal.
Abstract:
Example embodiments disclosed herein relate to filter coefficient updating in time domain filtering. A method of processing an audio signal is disclosed. The method includes obtaining a predetermined number of target gains for a first portion of the audio signal by analyzing the first portion of the audio signal. Each of the target gains is corresponding to a subband of the audio signal. The method also includes determining filter coefficients for time domain filtering the first portion of the audio signal so as to approximate a frequency response given by the target gains. The filter coefficients are determined by iteratively selecting at least one target gain from the target gains and updating the filter coefficient based on the selected at least one target gain. Corresponding system and computer program product for processing an audio signal are also disclosed.
Abstract:
Example embodiments disclosed herein relate to orientation-aware surround sound playback. A method for processing audio on an electronic device that includes a plurality of loudspeakers is disclosed, the loudspeakers arranged in more than one dimension of the electronic device. The method includes, responsive to receipt of a plurality of received audio streams, generating a rendering component associated with the plurality of received audio streams, determining an orientation dependent component of the rendering component, processing the rendering component by updating the orientation dependent component according to an orientation of the loudspeakers and dispatching the received audio streams to the plurality of loudspeakers for playback based on the processed rendering component. Corresponding system and computer program products are also disclosed.
Abstract:
A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta.
Abstract:
Example embodiments disclosed herein relate to orientation-aware surround sound playback. A method for processing audio on an electronic device that includes a plurality of loudspeakers is disclosed, the loudspeakers arranged in more than one dimension of the electronic device. The method includes, responsive to receipt of a plurality of received audio streams, generating a rendering component associated with the plurality of received audio streams, determining an orientation dependent component of the rendering component, processing the rendering component by updating the orientation dependent component according to an orientation of the loudspeakers and dispatching the received audio streams to the plurality of loudspeakers for playback based on the processed rendering component. Corresponding system and computer program products are also disclosed.
Abstract:
A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta.
Abstract:
The present application relates to packet loss concealment apparatus and method, and audio processing system. According to an embodiment, the packet loss concealment apparatus is provided for concealing packet losses in a stream of audio packets, each audio packet comprising at least one audio frame in transmission format comprising at least one monaural component and at least one spatial component. The packet loss concealment apparatus may comprises a first concealment unit for creating the at least one monaural component for a lost frame in a lost packet and a second concealment unit for creating the at least one spatial component for the lost frame. According to the embodiment, spatial artifacts such as incorrect angle and diffuseness may be avoided as far as possible in PLC for multi-channel spatial or sound field encoded audio signals.
Abstract:
A method of determining a near optimal forward error correction scheme for the transmission of audio data over a lossy packet switched network having preallocated estimated bandwidth, delay and packet losses, between at least a first and second communications devices, the method including the steps of: determining a first coding rate for the audio data; determining a peak redundancy coding rate for redundant versions of the audio data; determining an average redundancy coding rate over a period of time for redundant versions of the audio data; determining an objective function which maximizes a bitrate-perceptual audio quality mapping of the transmitted audio data including a playout function formulation; and optimizing the objective function to produce a forward error correction scheme providing a high bitrate perceptual audio quality.
Abstract:
Example embodiments disclosed herein relate to spatial congruency adjustment. A method for adjusting spatial congruency in a video conference is disclosed. The method in unwarping a visual scene captured by a video endpoint device into at least one rectilinear scene, the video endpoint device being configured to capture the visual scene in an omnidirectional manner, detecting spatial congruency between the at least one rectilinear scene and an auditory scene captured by an audio endpoint device that is positioned in relation to the video endpoint device. The spatial congruency being a degree of alignment between the auditory scene and the at least one rectilinear scene and in response to the detected spatial congruency being below the threshold, adjusting the spatial congruency. Corresponding system and computer program products are also disclosed.
Abstract:
Embodiments are described for harmonicity estimation, audio classification, pitch determination and noise estimation. Measuring harmonicity of an audio signal includes calculation a log amplitude spectrum of audio signal. A first spectrum is derived by calculating each component of the first spectrum as a sum of components of the log amplitude spectrum on frequencies. In linear frequency scale, the frequencies are odd multiples of the component's frequency of the first spectrum. A second spectrum is derived by calculating each component of the second spectrum as a sum of components of the log amplitude spectrum on frequencies. In linear frequency scale, the frequencies are even multiples of the component's frequency of the second spectrum. A difference spectrum is derived subtracting the first spectrum from the second spectrum. A measure of harmonicity is generated as a monotonically increasing function of the maximum component of the difference spectrum within predetermined frequency range.