Abstract:
Teleconference audio data, including a plurality of individual uplink data packet streams, may be received during a teleconference. Each uplink data packet stream may correspond to a telephone endpoint used by one or more teleconference participants. The teleconference audio data may be analyzed to determine a plurality of suppressive gain coefficients, which may be applied to first instances of the teleconference audio data during the teleconference to produce first gain-suppressed audio data that is provided to the telephone endpoints during the teleconference. Second instances of the teleconference audio data, as well as gain coefficient data corresponding to the plurality of suppressive gain coefficients, may be sent to a memory system as individual uplink data packet streams. The second instances of the teleconference audio data may be less gain-suppressed than the first gain-suppressed audio data.
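The split between the live path and the recorded path can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the `Packet` type, the `process_uplink_packet` function, and the use of a single scalar gain per packet are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    """Hypothetical uplink packet: one endpoint, one block of samples."""
    endpoint_id: str
    samples: List[float]


def process_uplink_packet(packet: Packet, gain: float) -> Tuple[Packet, Tuple[Packet, float]]:
    """Return (live_packet, stored_record) for one uplink packet.

    live_packet   -- first instance, with the suppressive gain applied,
                     intended for the other endpoints during the call.
    stored_record -- second instance, left unsuppressed, paired with the
                     gain coefficient so the suppression can be re-applied
                     or changed at playback time.
    """
    if not 0.0 <= gain <= 1.0:
        raise ValueError("a suppressive gain is assumed to lie in [0, 1]")
    live = Packet(packet.endpoint_id, [s * gain for s in packet.samples])
    stored = (packet, gain)
    return live, stored
```

In this sketch the stored audio is entirely unsuppressed, which is one way of satisfying "less gain-suppressed"; a partially suppressed stored copy would fit the description equally well.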
Abstract:
Disclosed are an apparatus and a method operative to receive packets of media from a network. The apparatus includes a receiver unit operative to receive the packets from the network; a jitter buffer data structure for receiving the packets in an ordered queue, the jitter buffer data structure having a tail into which the packets are input; a plurality of heads defining points in the jitter buffer data structure from which the ordered queue of packets is to be played back, the heads comprising an adjustable actual playback head coupled to an actual playback unit and at least one prototype head, each prototype head having an associated target latency; a processor having decision logic operable to determine a cost of achieving the associated target latency for each prototype head, wherein the decision logic compares the costs determined for the prototype heads to identify a particular target latency and head location for the actual playback head of the buffer; and a playback unit coupled to the processor for actual playback from the playback head of the buffer. The particular target latency of the jitter buffer data structure is thereby determined at playback of the buffer rather than upon input of the packets into the jitter buffer data structure.
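The cost comparison across prototype heads can be sketched in Python as follows. The abstract does not define the cost, so the function below assumes a simple hypothetical one: a penalty for packets that would arrive too late for a given target latency, plus a penalty for the latency itself. The name `choose_target_latency` and both weights are illustrative assumptions.

```python
from typing import Sequence


def choose_target_latency(
    delays_ms: Sequence[float],
    candidate_latencies_ms: Sequence[float],
    loss_weight: float = 100.0,
    latency_weight: float = 1.0,
) -> float:
    """Pick the candidate target latency with the lowest assumed cost.

    Each candidate stands in for one prototype head. A packet counts as
    late for a candidate if its observed network delay exceeds that
    candidate's target latency.
    """
    if not delays_ms or not candidate_latencies_ms:
        raise ValueError("need at least one delay and one candidate latency")

    def cost(latency: float) -> float:
        late = sum(1 for d in delays_ms if d > latency)
        late_fraction = late / len(delays_ms)
        return loss_weight * late_fraction + latency_weight * latency

    # The cheapest prototype head supplies the latency for the actual head.
    return min(candidate_latencies_ms, key=cost)
```

With observed delays of 10, 20, 30 and 80 ms and candidates of 20, 40 and 100 ms, the assumed costs are 70, 65 and 100, so the sketch selects 40 ms: it accepts one late packet rather than paying for 100 ms of latency.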
Abstract:
Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve receiving speech recognition results data, including a plurality of speech recognition lattices and a word recognition confidence score for each of a plurality of hypothesized words of the speech recognition lattices, for a conference recording. A primary word candidate and alternative word hypotheses may be determined for hypothesized words in the speech recognition lattices. A term frequency metric may be calculated for sorting the primary word candidates and the alternative word hypotheses. Hypothesized words may be rescored according to an alternative hypothesis list.
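One possible reading of the term frequency metric is sketched below in Python. The abstract does not specify how the metric is computed, so this example assumes a confidence-weighted count: each hypothesized word contributes its word recognition confidence score to the total for its term. The function name `rank_terms` and the case-folding step are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_terms(hypotheses: Iterable[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort terms by an assumed confidence-weighted term frequency.

    hypotheses -- (word, confidence) pairs, covering both primary word
                  candidates and alternative word hypotheses.
    Returns (term, metric) pairs, highest metric first; ties are broken
    alphabetically so the ordering is deterministic.
    """
    totals: Dict[str, float] = defaultdict(float)
    for word, confidence in hypotheses:
        totals[word.lower()] += confidence
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))
```

A list sorted this way could then serve as the alternative hypothesis list against which hypothesized words are rescored, although that step is not shown here.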
Abstract:
Embodiments are described for a soundfield system that receives a transmitting soundfield, wherein the transmitting soundfield includes a sound source at a location in the transmitting soundfield. The system determines a rotation angle for rotating the transmitting soundfield based on a desired location for the sound source. The transmitting soundfield is rotated by the determined angle and the system obtains a listener's soundfield based on the rotated transmitting soundfield. The listener's soundfield is transmitted for rendering to a listener.
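The rotation step can be sketched in Python under one assumed representation. The abstract does not name a soundfield format, so this example assumes a horizontal first-order format with an omnidirectional component `w` and two directional components `x` and `y`, where rotation about the vertical axis is an ordinary 2D rotation of `(x, y)`. The function names are hypothetical.

```python
import math
from typing import Tuple


def rotation_angle(source_azimuth: float, desired_azimuth: float) -> float:
    """Angle in radians that moves the source to the desired azimuth,
    wrapped to the range [-pi, pi]."""
    diff = desired_azimuth - source_azimuth
    return math.atan2(math.sin(diff), math.cos(diff))


def rotate_wxy(w: float, x: float, y: float, angle: float) -> Tuple[float, float, float]:
    """Rotate one (w, x, y) sample about the vertical axis.

    The omnidirectional component w carries no direction and is left
    unchanged; x and y are rotated counterclockwise by `angle`.
    """
    c, s = math.cos(angle), math.sin(angle)
    return w, c * x - s * y, s * x + c * y
```

For example, a source at an azimuth of 90 degrees, encoded as `(w, x, y) = (1, 0, 1)`, moves to the front when rotated by `rotation_angle(math.pi / 2, 0.0)`, giving approximately `(1, 1, 0)`.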