Abstract:
Example embodiments disclosed herein relate to orientation-aware surround sound playback. A method for processing audio on an electronic device that includes a plurality of loudspeakers is disclosed, the loudspeakers arranged in more than one dimension of the electronic device. The method includes, responsive to receipt of a plurality of received audio streams, generating a rendering component associated with the plurality of received audio streams, determining an orientation dependent component of the rendering component, processing the rendering component by updating the orientation dependent component according to an orientation of the loudspeakers and dispatching the received audio streams to the plurality of loudspeakers for playback based on the processed rendering component. Corresponding system and computer program products are also disclosed.
Abstract:
A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta.
Abstract:
Some implementations involve controlling a jitter buffer size during a teleconference according to a jitter buffer size estimation algorithm based, at least in part, on a cumulative distribution function (CDF). The CDF may be based, at least in part, on a network jitter parameter. The CDF may be initialized according to a parametric model. At least one parameter of the parametric model may be based, at least in part, on legacy network jitter information.
Abstract:
In a conference call having a plurality of participants interacting in a conference exchange of information in a digital transmission environment, the interaction being across a variable network transmission resource, a method of allocating the level of transmission resource, the methods including the steps of: (a) monitoring predetermined aspects of the participant's behavior during the conference call; (b) determining a divergence of participants behavior from normative values; (c) utilising any divergence as an indicator of aberrant operation of the participants; and (d) allocating the resource determinative on the divergence of participants behavior from normative values.
Abstract:
Embodiments of the present invention relate to speaker identification using spatial information. A method of speaker identification for audio content being of a format based on multiple channels is disclosed. The method comprises extracting, from a first audio clip in the format, a plurality of spatial acoustic features across the multiple channels and location information, the first audio clip containing voices from a speaker, and constructing a first model for the speaker based on the spatial acoustic features and the location information, the first model indicating a characteristic of the voices from the speaker. The method further comprises identifying whether the audio content contains voices from the speaker based on the first model. Corresponding system and computer program product are also disclosed.
Abstract:
Example embodiments disclosed herein relate to orientation-aware surround sound playback. A method for processing audio on an electronic device that includes a plurality of loudspeakers is disclosed, the loudspeakers arranged in more than one dimension of the electronic device. The method includes, responsive to receipt of a plurality of received audio streams, generating a rendering component associated with the plurality of received audio streams, determining an orientation dependent component of the rendering component, processing the rendering component by updating the orientation dependent component according to an orientation of the loudspeakers and dispatching the received audio streams to the plurality of loudspeakers for playback based on the processed rendering component. Corresponding system and computer program products are also disclosed.
Abstract:
In an audio processing system (300), a filtering section (350, 400): receives subband signals (410, 420, 430) corresponding to audio content of a reference signal (301) in respective frequency subbands; receives subband signals (411, 421, 431) corresponding to audio content of a response signal (304) in the respective subbands; and forms filtered inband references (412, 422, 432) by applying respective filters (413, 423, 433) to the subband signals of the reference signal. For a frequency subband: filtered crossband references (424, 425) are formed by multiplying, by scalar factors (426, 427), filtered inband references of other subbands; a composite filtered reference (428) is formed by summing the filtered inband reference of the subband (422) and the filtered crossband references; a residual signal (429) is computed as a difference between the composite filtered reference and the subband signal of the response signal corresponding to the subband; and the scalar factors and the filter applied to the subband signal of the reference signal corresponding to the subband are adjusted based on the residual signal.
Abstract:
Example embodiments disclosed herein relate to filter coefficient updating in time domain filtering. A method of processing an audio signal is disclosed. The method includes obtaining a predetermined number of target gains for a first portion of the audio signal by analyzing the first portion of the audio signal. Each of the target gains is corresponding to a subband of the audio signal. The method also includes determining filter coefficients for time domain filtering the first portion of the audio signal so as to approximate a frequency response given by the target gains. The filter coefficients are determined by iteratively selecting at least one target gain from the target gains and updating the filter coefficient based on the selected at least one target gain. Corresponding system and computer program product for processing an audio signal are also disclosed.
Abstract:
A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta.
Abstract:
Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve receiving speech recognition results data, including a plurality of speech recognition lattices and a word recognition confidence score for each of a plurality of hypothesized words of the speech recognition lattices, for a conference recording. A primary word candidate and alternative word hypotheses may be determined for hypothesized words in the speech recognition lattices. A term frequency metric may be calculated for sorting the primary word candidates and the alternative word hypotheses. Hypothesized words may be rescored according to an alternative hypothesis list.