Abstract:
Multiple simultaneous calls are controlled. At least one processor is used to display an interface including indicators corresponding to at least two audio presentation devices. Each interface includes call contextual controls which change according to the state of a call. A swap control swaps an active call between the audio presentation devices corresponding to the indicators.
Abstract:
The present document relates to audio conference systems. In particular, the present document relates to improving the perceptual continuity within an audio conference system. According to an aspect, a method for multiplexing first and second continuous input audio signals is described, to yield a multiplexed output audio signal which is to be rendered to a listener. The first and second input audio signals (123) are indicative of sounds captured by a first and a second endpoint (120, 170), respectively. The method comprises determining a talk activity (201, 202) in the first and second input audio signals (123), respectively; and determining the multiplexed output audio signal based on the first and/or second input audio signals (123) and subject to one or more multiplexing conditions. The one or more multiplexing conditions comprise: at a time instant, when there is talk activity (201) in the first input audio signal (123), determining the multiplexed output audio signal at least based on the first input audio signal (123); at a time instant, when there is talk activity (202) in the second input audio signal (123), determining the multiplexed output audio signal at least based on the second input audio signal (123); and at a silence time instant, when there is no talk activity (201, 202) in the first and in the second input audio signals (123), determining the multiplexed output audio signal based on only one of the first and second input audio signals (123).
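The three multiplexing conditions above can be sketched per frame as follows. This is a minimal illustration, not the claimed method: the function name `multiplex`, the per-frame scalar values, and the rule of holding on to the most recently active input during silence are all assumptions. The abstract only requires that exactly one input feeds the output at a silence instant.

```python
from typing import List


def multiplex(first: List[float], second: List[float],
              talk_first: List[bool], talk_second: List[bool]) -> List[float]:
    """Build one output frame per input frame pair under the three conditions."""
    out = []
    last = 0  # assumption: index of the input that most recently fed the output alone
    for a, b, ta, tb in zip(first, second, talk_first, talk_second):
        if ta and tb:
            out.append(a + b)   # talk in both: output is based on both inputs
        elif ta:
            out.append(a)       # talk in first: output is based on the first input
            last = 0
        elif tb:
            out.append(b)       # talk in second: output is based on the second input
            last = 1
        else:
            # silence: pass through only one input (here, the last sole talker)
            out.append(a if last == 0 else b)
    return out
```

Passing through a single input during silence, instead of mixing both or muting, keeps one continuous background sound in the output, which appears to be the perceptual-continuity goal the abstract names.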
Abstract:
The present document relates to audio communication systems. In particular, the present document relates to the control of the level of audio signals within audio communication systems. A method for leveling a near-end audio signal (211) using a leveling gain (214) is described. The near-end audio signal (211) comprises a sequence of segments, wherein the sequence of segments comprises a current segment and one or more preceding segments. The method comprises determining a nuisance measure (416) which is indicative of an amount of aberrant voice activity within the sequence of segments of the near-end audio signal (211); and determining the leveling gain (214) for the current segment of the near-end audio signal (211), at least based on the leveling gain (214) for the one or more preceding segments of the near-end audio signal (211), and by taking into account - according to a variable degree - an estimate of the level of the current segment of the near-end audio signal (211); wherein the variable degree is dependent on the nuisance measure (416).
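One way to read the gain update above is as a recursive smoother whose adaptation rate shrinks as the nuisance measure grows. The sketch below assumes a nuisance measure normalized to [0, 1], levels and gains in dB, and a linear mapping from nuisance to the "variable degree". The names `target_level_db` and `max_rate` are hypothetical and do not come from the abstract.

```python
def update_leveling_gain(prev_gain_db: float,
                         level_estimate_db: float,
                         nuisance: float,
                         target_level_db: float = -26.0,
                         max_rate: float = 0.5) -> float:
    """Return the leveling gain (dB) for the current segment.

    The current level estimate is taken into account to a degree that
    decreases as the nuisance measure increases (assumed linear mapping).
    """
    nuisance = min(max(nuisance, 0.0), 1.0)      # clamp to [0, 1]
    degree = max_rate * (1.0 - nuisance)         # variable degree of adaptation
    desired_gain_db = target_level_db - level_estimate_db
    # Move from the previous gain toward the desired gain by `degree`.
    return prev_gain_db + degree * (desired_gain_db - prev_gain_db)
```

With `nuisance = 1.0` the gain is frozen at its previous value, so a burst of aberrant voice activity does not pull the level; with `nuisance = 0.0` the gain moves halfway toward the value that would bring the segment to the target level.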
Abstract:
A system, device, and method for generating an audio output includes a master computing device and a plurality of client computing devices. Each client computing device includes a microphone to record audio signals. The client computing devices generate audio data based on the audio signals and transmit the audio data to the master computing device. The master computing device generates a final, higher quality audio output as a function of the audio data received from the collection of participating client computing devices.
Abstract:
A voice communication method and apparatus, and a method and apparatus for operating a jitter buffer, are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence immediately precedes the present audio block. The subsequence has a predetermined length, and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
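The sender-side behaviour described above can be sketched with a small bounded cache of non-voice blocks. Several details are assumptions rather than statements from the abstract: voice onset is taken to mean a voice block following a non-voice block (or the start of the stream), the cached blocks are sent before the onset block, voiced blocks after the onset are sent as they arrive, and no pre-roll is sent when fewer than `preroll_len` non-voice blocks precede the onset. The names `process_blocks`, `is_voice`, and `preroll_len` are hypothetical.

```python
from collections import deque
from typing import Callable, List, Sequence, Tuple


def process_blocks(blocks: Sequence[str],
                   is_voice: Callable[[str], bool],
                   preroll_len: int = 2) -> List[Tuple[str, bool]]:
    """Return transmitted blocks as (block, reprocessed_flag) pairs."""
    cache = deque(maxlen=preroll_len)   # most recent consecutive non-voice blocks
    sent = []
    prev_voice = False
    for block in blocks:
        voice = is_voice(block)
        if voice and not prev_voice:
            # Voice onset: send the immediately preceding non-voice
            # subsequence, flagged as reprocessed, then the present block.
            if len(cache) == preroll_len:
                sent.extend((b, True) for b in cache)
            cache.clear()
            sent.append((block, False))
        elif voice:
            sent.append((block, False))  # ongoing voice (assumed behaviour)
        else:
            cache.append(block)          # non-voice: cache, do not transmit
        prev_voice = voice
    return sent
```

Because the cache is cleared whenever a voice block is seen, whatever it holds at an onset is guaranteed to be a run of non-voice blocks ending right before the present block.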
Abstract:
A multipoint connection apparatus (200) includes a video/audio-signal receiving unit (201) that receives video/audio signals from video/audio terminals (100); a volume-level calculating unit (205) that calculates volume levels from the video/audio signals; a volume-display-image generating unit (207) that generates volume display images indicating volume from the volume levels; a layout-setting-information receiving unit (209) that receives layout setting information indicating information about arrangement of videos to be displayed on the video/audio terminal (100); a combined-video/audio-signal generating unit (211) that generates a combined video/audio signal by combining the video/audio signals and the volume display images based on the layout setting information; and a transmitting unit (215) that transmits the video/audio signal to the video/audio terminal (100).
Abstract:
The invention relates to a method for managing a packet switched, centralized conference call between a plurality of terminals (13). In order to enable an enhancement of the user comfort, it is proposed that the method comprises at a conference call server (12) receiving data packets from all terminals (13). Based on these data packets, at least one terminal (13) currently providing voice data is then determined. In a next step, the data received in the data packets is mixed, and the mixed data is inserted into new data packets together with at least one identifier associated with one of the terminals (13) which were determined to provide voice data, such that the at least one identifier can be distinguished from any other information in the data packets. Finally, the new data packets are transmitted to terminals (13) participating in the conference call. The invention relates equally to a corresponding server and to a corresponding terminal.
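The server-side steps above (detect who is providing voice, mix, and attach identifiers in a distinguishable field) might look like the sketch below. The peak-amplitude threshold used to decide which terminals provide voice data, the `MixedPacket` layout, and all names are assumptions for illustration; the abstract does not specify how voice is detected or how the packet is structured.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MixedPacket:
    speaker_ids: List[str]   # identifiers kept in their own field, apart from the audio
    samples: List[int]       # mixed audio payload


def mix_conference_frame(frames: Dict[str, List[int]],
                         voice_threshold: int = 100) -> MixedPacket:
    """Mix one frame from every terminal and tag it with the active speakers."""
    # Assumed voice detection: peak absolute amplitude at or above a threshold.
    speakers = sorted(
        tid for tid, samples in frames.items()
        if max((abs(x) for x in samples), default=0) >= voice_threshold
    )
    length = max((len(s) for s in frames.values()), default=0)
    mixed = [0] * length
    for samples in frames.values():
        for i, x in enumerate(samples):
            mixed[i] += x    # plain sample-wise sum; no clipping protection here
    return MixedPacket(speaker_ids=speakers, samples=mixed)
```

A receiving terminal could read `speaker_ids` to show who is talking without inspecting the mixed audio.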
Abstract:
A method and associated apparatus for indicating the voice of each talker from a plurality of talkers to be heard by a listener. A talker indicator (Fig. 2, 32) is provided proximate to the listener. Talker identification information is generated in the talker indicator that can be used to indicate to the listener the identity of each talker who is speaking at any given time. A device (Fig. 1, 23) is coupled to the talker indicator that can transmit the voice signal from each talker to the listener. In different aspects, the talker identification information can include such varied indicators as audio, video, or an announcement combined with a temporally compressed voice signal. In another aspect, emotographic figures are displayed to the listener, each representing a distinct talker (Fig. 12). The mood of each emotographic figure is configured to reflect the mood of the talker, as indicated by the talker's voice (Fig. 14).
Abstract:
Numerous packet-based terminals coupled within a packet-based network can establish a voice conference without the use of a conference bridge if the packet-based terminals can support specific operations. These specific operations include receiving voice data packets from each of the other packet-based terminals within the voice conference, determining a set of talkers within the voice conference, and processing the received media data packets appropriately for the selected set of talkers so as to output uncompressed voice signals corresponding to the talkers to a speaker coupled to the packet-based terminal. The removal of the conference bridge can allow the packet-based apparatus to become independent from the packet-based network administration. Further, the removal of the conference bridge allows a reduction in transcoding and hence allows a better quality signal to be received at the individual apparatus.
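A terminal-side version of the operations above could be sketched as follows. It assumes the received packets have already been decoded to integer PCM frames, and it picks talkers by frame energy with a fixed cap; neither the selection criterion nor the names `select_talkers`, `render_local_mix`, and `max_talkers` come from the abstract.

```python
from typing import Dict, List, Tuple


def select_talkers(frames: Dict[str, List[int]], max_talkers: int = 2) -> List[str]:
    """Pick up to `max_talkers` terminals with the highest non-zero frame energy."""
    energy = {tid: sum(x * x for x in s) for tid, s in frames.items()}
    ranked = sorted(energy, key=lambda tid: (-energy[tid], tid))
    return [tid for tid in ranked[:max_talkers] if energy[tid] > 0]


def render_local_mix(frames: Dict[str, List[int]],
                     max_talkers: int = 2) -> Tuple[List[str], List[int]]:
    """Mix only the selected talkers into the signal sent to the local speaker."""
    talkers = select_talkers(frames, max_talkers)
    length = max((len(frames[t]) for t in talkers), default=0)
    out = [0] * length
    for t in talkers:
        for i, x in enumerate(frames[t]):
            out[i] += x
    return talkers, out
```

Since each terminal decodes and mixes for itself, no intermediate node has to decode, mix, and re-encode the audio, which is the reduction in transcoding the abstract refers to.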