Abstract:
Teleconferencing and, in particular, distributed teleconferencing may use methods and systems for location grouping to reduce feedback and other audio anomalies. Terminals and users connected to the same teleconference and in the same location might not need to receive audio signals from the other terminals and users in the same location. As such, by detecting and analyzing the location of each participating terminal, the terminals (and thus, the users thereof) may be organized into location groups to provide proper audio mixing. In one example, first and second terminals in the same location might not receive each other's audio in a downstream teleconference signal. The location and grouping of terminals may be processed using context fingerprint information derived from sensor readings of each terminal. Sensors may include GPS sensors, cameras, BLUETOOTH sensors and the like. Context fingerprint information may further be synchronized to enhance location determination and grouping.
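The location-group mixing described above can be illustrated with a minimal sketch. It assumes the location groups have already been determined (for instance from context fingerprints) and simply builds, for each terminal, a downstream mix that leaves out every terminal in its own group. The function name `mix_by_location_group` and the dictionary-based data layout are hypothetical and chosen only for illustration.

```python
def mix_by_location_group(signals, group_of):
    """Build one downstream mix per terminal, excluding co-located terminals.

    signals:  terminal id -> list of audio samples (all lists the same length)
    group_of: terminal id -> location-group label (assumed already known)
    """
    if not signals:
        return {}
    length = len(next(iter(signals.values())))
    mixes = {}
    for listener in signals:
        mix = [0.0] * length
        for talker, samples in signals.items():
            # Skip the listener itself and every terminal in the same location group.
            if group_of[talker] == group_of[listener]:
                continue
            for i, sample in enumerate(samples):
                mix[i] += sample
        mixes[listener] = mix
    return mixes


if __name__ == "__main__":
    signals = {"a": [1.0, 2.0], "b": [10.0, 20.0], "c": [100.0, 200.0]}
    group_of = {"a": "room1", "b": "room1", "c": "room2"}
    print(mix_by_location_group(signals, group_of))
```

In this example terminals `a` and `b` share a location, so neither receives the other's audio, while `c` receives both.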
Abstract:
A network entity, method and computer program product are provided for effectuating a conference session. The method may include receiving a plurality of signals representative of voice communication of the participants. In this regard, the signals may be received from a plurality of terminals of a respective plurality of participants at one of the locations, each of at least some of the terminals otherwise being configured for voice communication independent of at least some of the other terminals. The method of this aspect also includes classifying speech activity of the conference session according to a speech pause, or one or more actively-speaking participants, during the conference session. The signals of the respective participants may then be mixed into at least one mixed signal for output to one or more other participants at one or more other locations, the signals being mixed based upon classification of the speech activity.
Abstract:
A method for distinguishing speakers in a conference call of a plurality of participants, in which method speech frames of the conference call are received in a receiving unit, which speech frames include encoded speech parameters. At least one parameter of the received speech frames is examined in an audio codec of the receiving unit, and the speech frames are classified to belong to one of the participants, the classification being carried out according to differences in the examined at least one speech parameter. These functions may be carried out in a speaker identification block, which is applicable in various positions of a teleconferencing processing chain. Finally, a spatialization effect is created in a terminal reproducing the audio signal according to notified differences by placing the participants at distinct positions in an acoustical space of the audio signal.
Abstract:
The invention relates to audio conferencing. Audio signals are received and transformed to a spectrum, and then modified by mel-frequency scaling and logarithmic scaling before a second-order transform. The obtained coefficients can be further processed before carrying out the similarity comparison between signals. Voice activity detection and other information like mute signalling can be used in the formation of the similarity information. The resulting similarity information can be used to form groups, and the resulting groups can be analyzed topologically. The similarity information can then be used to form a control signal for audio conferencing, e.g. to control an audio conference so that a signal of a co-located audio source is removed.
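The processing chain described above (spectrum, mel-frequency scaling, logarithmic scaling, a second transform, then a similarity comparison) resembles a conventional mel-frequency cepstral front end. The sketch below is one possible reading of it, not the claimed method. It assumes the second transform is a DCT-II and that similarity is measured with cosine similarity; neither choice is stated in the abstract. All function names and parameter defaults are hypothetical.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale up to Nyquist."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) * n_fft / sample_rate
             for i in range(num_filters + 2)]  # fractional bin positions
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def cepstral_coefficients(frame, sample_rate, num_filters=20, num_coeffs=12):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Mel scaling followed by logarithmic scaling (small floor avoids log(0)).
    log_energy = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10) for row in bank]
    # Assumed second transform: DCT-II, skipping the 0th (overall level) coefficient.
    return [
        sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters) for j, e in enumerate(log_energy))
        for c in range(1, num_coeffs + 1)
    ]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0
```

Because the 0th coefficient is dropped, a uniform gain change between two terminals capturing the same source has little effect on the comparison. A high similarity between two terminals' coefficient vectors could then feed the grouping and control-signal steps the abstract mentions.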
Abstract:
A method comprising: receiving a plurality of audio input signals in a mixer apparatus; selecting a predetermined number of active audio input signals to be used as the basis for room effect signal generation; applying the predetermined number of dedicated room effect processing units based at least partly on the selected predetermined number of audio input signals; creating a set of spatialized signals for a plurality of audio output signals; and creating the plurality of audio output signals by combining, for each output signal m, spatialized signals created for the output signal m and room effect signals from all room effect processing units.
Abstract:
A method including: obtaining phase information dependent upon a time-varying phase difference between captured audio channels; obtaining sampling information relating to time-varying spatial sampling of the captured audio channels; and processing the phase information and the sampling information to determine audio control information for controlling spatial rendering of the captured audio channels.
Abstract:
A method comprising: modifying a sound stage produced by an input audio signal comprising two or more audio channels such that spatial room is relieved for one or more additional sound sources; and inserting said one or more additional sound sources in the relieved spatial room of the modified sound stage of the input audio signal without introducing spatial interference with the modified sound stage of the input audio signal.
Abstract:
A method including: determining a time difference between at least a first audio channel and a second audio channel of the same acoustic space; and enabling a corrective time shift between the first audio channel and the second audio channel when the time difference exceeds a threshold.
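As a rough illustration of the two steps above, the sketch below estimates the time difference between two channels and applies a corrective shift only when that difference exceeds a threshold. It assumes the time difference is estimated by brute-force cross-correlation over a bounded lag range and is measured in whole samples; the abstract does not specify either. The function names are hypothetical.

```python
def estimate_lag(first, second, max_lag):
    """Return the lag (in samples) that best aligns `second` with `first`.

    A positive result means `second` is delayed relative to `first`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            first[t] * second[t + lag]
            for t in range(len(first))
            if 0 <= t + lag < len(second)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def shift(signal, lag):
    """Advance `signal` by `lag` samples, zero-padding to keep its length."""
    n = len(signal)
    if lag >= 0:
        return signal[lag:] + [0.0] * min(lag, n)
    return [0.0] * min(-lag, n) + signal[:max(n + lag, 0)]


def align_if_needed(first, second, max_lag, threshold):
    """Shift `second` only when the estimated time difference exceeds `threshold`."""
    lag = estimate_lag(first, second, max_lag)
    if abs(lag) > threshold:
        return first, shift(second, lag), lag
    return first, second, lag


if __name__ == "__main__":
    a = [0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
    b = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 0.0]  # `a` delayed by two samples
    print(align_if_needed(a, b, max_lag=3, threshold=1))
```

Small differences are left alone here, so that differences below the threshold do not trigger a correction.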
Abstract:
Provided are multichannel architectures, systems, methods, and computer program products for distributed teleconferencing using one or more master devices and/or a centralized conferencing switch. Multichannels enhance functionality of a master device in distributed teleconferencing and allow for compatibility with 3D capable teleconferencing. Multichannel distributed teleconferencing involves multichannel, monophonic, and/or a fixed number of uplink and downlink channels. A multichannel distributed teleconferencing system may perform active talker detection of near-end participants and communicate an ID signal on an uplink channel identifying the active near-end participants. A multichannel distributed teleconferencing system may also receive an ID signal on a downlink channel identifying the active far-end participants. A multichannel distributed teleconferencing system may perform various uplink and downlink processing. Uplink processing may involve multimixing and spatialization. Multimixing may be used to separate speech signals of near-end participants. Spatialization, also used in downlink processing, introduces spatial separation of active participants.
Abstract:
Detection from sensors may be used to configure or modify the configuration of audio directional processing to improve user safety and/or communication by processing at least one control parameter dependent on at least one sensor input parameter, processing at least one audio signal dependent on the processed at least one control parameter, and outputting the processed at least one audio signal.