Abstract:
Some disclosed teleconferencing methods may involve detecting a howl state during a teleconference. The teleconference may involve two or more teleconference client locations and a teleconference server. The teleconference server may be configured for providing full-duplex audio connectivity between the teleconference client locations. The howl state may be a state of acoustic feedback involving two or more teleconference devices in a teleconference client location. Detecting the howl state may involve an analysis of both spectral and temporal characteristics of teleconference audio data. Some disclosed teleconferencing methods may involve determining which client location is causing the howl state. Some such methods may involve mitigating the howl state and/or sending a howl state detection message.
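The abstract describes detecting a howl state via both spectral and temporal analysis, without fixing an algorithm. A minimal sketch of one plausible approach: flag a howl when a tonal spectral peak (low spectral flatness) stays at the same frequency bin for several consecutive frames. The thresholds and the flatness-plus-persistence criterion here are assumptions, not the claimed method.

```python
import numpy as np

def detect_howl(frames, flatness_threshold=0.1, min_sustained_frames=5):
    """Flag a howl-like state when a narrowband spectral peak persists
    over time. `frames` is an iterable of 1-D sample arrays.
    Thresholds are illustrative assumptions, not the patented values."""
    sustained = 0
    prev_peak_bin = None
    for frame in frames:
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) + 1e-12
        # Spectral flatness: geometric mean / arithmetic mean (low => tonal)
        flatness = np.exp(np.mean(np.log(spectrum))) / np.mean(spectrum)
        peak_bin = int(np.argmax(spectrum))
        if flatness < flatness_threshold and peak_bin == prev_peak_bin:
            sustained += 1
            if sustained >= min_sustained_frames:
                return True  # temporal persistence of a spectral peak
        else:
            sustained = 0
        prev_peak_bin = peak_bin
    return False
```

A sustained sine (feedback-like tone) trips the detector; broadband noise does not, since its flatness stays high and its peak bin wanders.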
Abstract:
Some implementations involve analyzing audio packets received during a time interval that corresponds with a conversation analysis segment to determine network jitter dynamics data and conversational interactivity data. The network jitter dynamics data may provide an indication of jitter in a network that relays the audio data packets. The conversational interactivity data may provide an indication of interactivity between participants of a conversation represented by the audio data. A jitter buffer size may be controlled according to the network jitter dynamics data and the conversational interactivity data. The time interval may include a plurality of talkspurts.
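The abstract pairs a jitter statistic with a conversational-interactivity measure to size the buffer, but leaves the policy open. A hedged sketch of one such policy: start from a high jitter percentile, then add headroom when the conversation is mostly one-way (latency is cheap) and keep the buffer lean when turn-taking is rapid. The percentile, the headroom factor, and the 0..1 interactivity scale are assumptions.

```python
def target_jitter_buffer_ms(jitter_samples_ms, interactivity):
    """Pick a jitter-buffer target from jitter statistics and interactivity.

    jitter_samples_ms: per-packet jitter measurements over the segment.
    interactivity: 0.0 (one-way monologue) .. 1.0 (rapid turn-taking).
    Illustrative policy only; the actual control rule is not specified
    in the abstract."""
    ordered = sorted(jitter_samples_ms)
    p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    # Interactive conversation: keep buffer lean; monologue: add headroom.
    headroom = 1.0 + 0.5 * (1.0 - interactivity)
    return p95 * headroom
```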
Abstract:
Some implementations involve controlling a jitter buffer size during a teleconference according to a jitter buffer size estimation algorithm based, at least in part, on a cumulative distribution function (CDF). The CDF may be based, at least in part, on a network jitter parameter. The CDF may be initialized according to a parametric model. At least one parameter of the parametric model may be based, at least in part, on legacy network jitter information.
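The abstract describes initializing a jitter CDF from a parametric model seeded with legacy jitter information, then sizing the buffer from the CDF. A minimal sketch, assuming an exponential model whose rate comes from a historical mean-jitter figure and a 95% coverage target; both choices are illustrative, not the claimed parametric model.

```python
import math

def init_cdf_from_legacy(mean_jitter_ms, delays_ms):
    """Initialize a jitter CDF from an exponential parametric model whose
    rate is derived from legacy (historical) mean network jitter.
    The exponential form is an assumption for illustration."""
    lam = 1.0 / mean_jitter_ms
    return [1.0 - math.exp(-lam * d) for d in delays_ms]

def buffer_size_from_cdf(delays_ms, cdf, quantile=0.95):
    """Smallest candidate delay whose CDF value covers the target quantile."""
    for d, p in zip(delays_ms, cdf):
        if p >= quantile:
            return d
    return delays_ms[-1]
```

As packets arrive, the initialized CDF could be updated with measured jitter, so the buffer estimate converges from the legacy prior to live conditions.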
Abstract:
Embodiments of a client device and a method for audio or video conferencing are described. An embodiment includes an offset detecting unit, a configuring unit, an estimator and an output unit. The offset detecting unit detects an offset of speech input to the client device. The configuring unit determines a voice latency from the client device to every far end. The estimator estimates, based on the voice latency, a time when a user at the far end perceives the offset. The output unit outputs, based on the time estimated for the far end, a perceivable signal indicating that the user at the far end has perceived the offset. The perceivable signal helps to avoid collisions between parties.
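The estimator's job reduces to simple clock arithmetic: add each far end's one-way latency to the local speech-offset time, and signal "clear to speak" once every far end has perceived the offset. A minimal sketch; the per-far-end latency map and function names are hypothetical.

```python
def far_end_perception_times(offset_time_s, latencies_ms):
    """Estimate when each far end perceives the near-end speech offset.

    offset_time_s: local clock time at which speech input ended.
    latencies_ms: hypothetical map of far-end id -> one-way voice latency.
    """
    return {end: offset_time_s + ms / 1000.0 for end, ms in latencies_ms.items()}

def all_clear_time(offset_time_s, latencies_ms):
    """Time after which every far end has perceived the offset, i.e. when
    a 'safe to speak' indicator can be shown without risking collision."""
    return max(far_end_perception_times(offset_time_s, latencies_ms).values())
```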
Abstract:
A system for real-time monitoring of user-generated audio content for audio anomalies and a related method are disclosed. In some embodiments, the system is programmed to receive, in real time, audio data generated by a first mobile device, such as a smartphone. The system is programmed to determine, in real time, whether an audio anomaly has occurred from the audio data. The system is programmed to cause, in real time, a presentation of an alert to a second mobile device, which could be the same smartphone, in response to detecting an occurrence of the audio anomaly.
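The abstract does not define what counts as an "audio anomaly." One simple stand-in for the real-time check: flag frames whose RMS level deviates sharply from the recent running average. The window length and deviation factor are assumptions for illustration.

```python
def detect_anomalies(frame_rms_values, window=10, factor=3.0):
    """Flag frame indices whose RMS level deviates sharply from the
    recent average; a simple stand-in for the system's real-time
    anomaly check (the actual criterion is unspecified)."""
    anomalies = []
    history = []
    for i, rms in enumerate(frame_rms_values):
        if len(history) >= window:
            mean = sum(history[-window:]) / window
            # Anomaly: far above or far below the recent level.
            if rms > factor * mean or (mean > 0 and rms < mean / factor):
                anomalies.append(i)
        history.append(rms)
    return anomalies
```

In a streaming deployment each flagged index would trigger the alert to the second device immediately, rather than being collected in a list.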
Abstract:
Example embodiments disclosed herein relate to the estimation of reverberant energy components from audio sources. A method of estimating a reverberant energy component from an active audio source (100) is disclosed. The method comprises determining a correspondence between the active audio source and a plurality of sample sources by comparing one or more spatial features of the active audio source with one or more spatial features of the plurality of sample sources, each of the sample sources being associated with an adaptive filtering model (101); obtaining an adaptive filtering model for the active audio source based on the determined correspondence (102); and estimating the reverberant energy component from the active audio source over time based on the adaptive filtering model (103). A corresponding system (800) and computer program product (900) are also disclosed.
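The two stages of the claimed method can be sketched compactly: match the active source to the closest sample source by spatial features, then run that source's filtering model over the observed energies. Here the match is nearest-neighbor Euclidean distance and the "adaptive filtering model" is a one-pole decay; both are illustrative assumptions.

```python
import math

def nearest_sample_source(active_features, sample_sources):
    """Match the active source to the sample source with the closest
    spatial feature vector (Euclidean distance, as an assumed metric)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(sample_sources, key=lambda s: dist(active_features, s["features"]))

def estimate_reverb_energy(direct_energies, decay):
    """Estimate reverberant energy over time with a one-pole decay model
    standing in for the matched source's adaptive filtering model."""
    reverb, out = 0.0, []
    for e in direct_energies:
        reverb = decay * reverb + (1.0 - decay) * e
        out.append(reverb)
    return out
```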
Abstract:
A method of encoding audio information for forward error correction reconstruction of a transmitted audio stream over a lossy packet switched network, the method including the steps of: (a) dividing the audio stream into audio frames; (b) determining a series of corresponding audio frequency bands for the audio frames; (c) determining a series of power envelopes for the frequency bands; (d) encoding the envelopes as a low bit rate version of the audio frame in a redundant transmission frame.
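Steps (b)–(d) above amount to summarizing each frame as a handful of per-band power values cheap enough to piggyback in a later packet. A minimal sketch, assuming an FFT split into equal-width bands; the band count and splitting scheme are illustrative.

```python
import numpy as np

def encode_redundant_frame(frame, num_bands=8):
    """Compute a coarse per-band power envelope for one audio frame; this
    envelope serves as the low-bit-rate redundant copy carried in a
    later packet for FEC reconstruction. Equal-width bands are an
    assumption; a real codec would likely use perceptual band edges."""
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    bands = np.array_split(spectrum, num_bands)
    return [float(np.sum(b)) for b in bands]  # one power value per band
```

On packet loss, the receiver can shape noise or a decoded neighbor frame to this envelope, recovering an intelligible approximation of the lost frame.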
Abstract:
A method for channel identification of a multi-channel audio signal comprising X>1 channels is provided. The method comprises the steps of: identifying, among the X channels, any empty channels, thus resulting in a subset of Y≤X non-empty channels; determining whether a low frequency effect (LFE) channel is present among the Y channels, and upon determining that an LFE channel is present, identifying the determined channel among the Y channels as the LFE channel; dividing the remaining channels among the Y channels not being identified as the LFE channel into any number of pairs of channels by matching symmetrical channels; and identifying any remaining unpaired channel among the Y channels not being identified as the LFE channel or divided into pairs as a center channel.
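The four steps above can be sketched end to end. The concrete tests used here are assumptions: "empty" means near-zero RMS, "LFE" means energy concentrated below ~120 Hz, and symmetric channels are paired greedily by signal correlation; the abstract does not commit to these criteria.

```python
import numpy as np

def classify_channels(channels, fs=48000, lfe_cutoff_hz=120.0, empty_thresh=1e-6):
    """Rough channel identification following the claimed steps: drop empty
    channels, tag a channel whose energy sits below ~120 Hz as LFE, pair
    symmetric channels by correlation, and call any remaining single
    channel the center. All thresholds are illustrative assumptions."""
    result = {"empty": [], "lfe": None, "pairs": [], "center": None}
    active = []
    for i, x in enumerate(channels):
        if np.sqrt(np.mean(np.square(x))) < empty_thresh:
            result["empty"].append(i)
        else:
            active.append(i)
    # LFE: dominant energy below the cutoff frequency.
    remaining = []
    for i in active:
        spec = np.abs(np.fft.rfft(channels[i])) ** 2
        freqs = np.fft.rfftfreq(len(channels[i]), 1.0 / fs)
        if result["lfe"] is None and spec[freqs < lfe_cutoff_hz].sum() > 0.9 * spec.sum():
            result["lfe"] = i
        else:
            remaining.append(i)
    # Pair symmetric channels greedily by absolute correlation.
    while len(remaining) > 1:
        i = remaining.pop(0)
        best = max(remaining, key=lambda j: abs(np.dot(channels[i], channels[j])))
        remaining.remove(best)
        result["pairs"].append((i, best))
    if remaining:
        result["center"] = remaining[0]  # last unpaired non-LFE channel
    return result
```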