Abstract:
A network entity, method and computer program product are provided for effectuating a conference session. The method may include receiving a plurality of signals representative of voice communication of the participants. In this regard, the signals may be received from a plurality of terminals of a respective plurality of participants at one of the locations, each of at least some of the terminals otherwise being configured for voice communication independent of at least some of the other terminals. The method of this aspect also includes classifying speech activity of the conference session according to a speech pause, or one or more actively-speaking participants, during the conference session. The signals of the respective participants may then be mixed into at least one mixed signal for output to one or more other participants at one or more other locations, the signals being mixed based upon classification of the speech activity.
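The classification-driven mixing described in this abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration: the energy-threshold classifier, the function names (`frame_energy`, `classify`, `mix`), the threshold, and the choice to attenuate the mix during a speech pause are not taken from the abstract, which only states that mixing depends on the classification.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def classify(frames, threshold):
    """Return indices of terminals judged to be actively speaking.

    An empty list is treated as a speech pause. The energy threshold
    is a hypothetical stand-in for a real voice activity detector.
    """
    return [i for i, f in enumerate(frames) if frame_energy(f) > threshold]


def mix(frames, threshold=0.01, pause_gain=0.1):
    """Mix one frame per terminal into a single output frame.

    Assumes all frames are non-empty and of equal length.
    """
    active = classify(frames, threshold)
    n = len(frames[0])
    if not active:
        # Speech pause (assumed policy): attenuated average of all terminals.
        return [pause_gain * sum(f[k] for f in frames) / len(frames)
                for k in range(n)]
    # Active speech (assumed policy): average only the active terminals.
    return [sum(frames[i][k] for i in active) / len(active) for k in range(n)]
```

With one talker and one silent terminal, `mix([[0.5, -0.5], [0.0, 0.0]])` passes the talker's frame through unchanged, since only that terminal is classified as active.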
Abstract:
A method, device, system, and computer program product expand narrowband speech signals to wideband speech signals. The method includes determining signal type information from a signal, obtaining characteristics for forming an upper band signal using the determined signal type information, determining signal noise information, using the determined signal noise information to modify the obtained characteristics for forming the upper band signal, and forming the upper band signal using the modified characteristics.
Abstract:
A method, device, system, and computer program product calculate a gradient index as a sum of magnitudes of gradients of speech signals from a received frame at each change of direction; and provide an indication that the frame contains babble noise if the gradient index, energy information, and background noise level exceed pre-determined thresholds or a voice activity detector algorithm and sound level indicate babble noise.
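The gradient index in this abstract can be sketched in a few lines of Python. This is an illustrative reading, not the patented method: the function names, the threshold parameters, and the `vad_says_babble` flag (standing in for the voice activity detector and sound level branch) are hypothetical, and no normalization is applied because the abstract does not mention one.

```python
def gradient_index(frame):
    """Sum of gradient magnitudes at each change of gradient direction."""
    total = 0.0
    prev_sign = 0
    for k in range(1, len(frame)):
        grad = frame[k] - frame[k - 1]
        sign = (grad > 0) - (grad < 0)  # -1, 0, or +1
        if sign != 0:
            # Count the gradient only where the direction flips.
            if prev_sign != 0 and sign != prev_sign:
                total += abs(grad)
            prev_sign = sign
    return total


def is_babble(frame, noise_level, gi_thresh, energy_thresh, noise_thresh,
              vad_says_babble=False):
    """Flag a frame as babble noise.

    True if the gradient index, frame energy, and background noise level
    all exceed their (hypothetical) thresholds, or if an external
    VAD/sound-level check, passed in as a boolean, indicates babble.
    """
    energy = sum(x * x for x in frame)
    over_thresholds = (gradient_index(frame) > gi_thresh
                       and energy > energy_thresh
                       and noise_level > noise_thresh)
    return over_thresholds or vad_says_babble
```

A monotonic frame such as `[0, 1, 2, 3]` has no direction changes and yields a gradient index of zero, while a rapidly alternating frame accumulates one gradient magnitude per flip.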
Abstract:
Apparatus comprising: an input amplitude and phase calculator configured to determine at least one amplitude value and phase value dependent on a first audio signal; a synthesis amplitude calculator configured to synthesize a further amplitude value associated with each amplitude value dependent on a determined harmonic shaping function; a synthesis phase calculator configured to synthesize a further phase value associated with each phase value; and a signal synthesizer configured to generate a bandwidth extension signal dependent on the further amplitude values and the further phase values.
Abstract:
Provided are multichannel architectures, systems, methods, and computer program products for distributed teleconferencing using one or more master devices and/or a centralized conferencing switch. Multichannels enhance functionality of a master device in distributed teleconferencing and allow for compatibility with 3D capable teleconferencing. Multichannel distributed teleconferencing involves multichannel, monophonic, and/or a fixed number of uplink and downlink channels. A multichannel distributed teleconferencing system may perform active talker detection of near-end participants and communicate an ID signal on an uplink channel identifying the active near-end participants. A multichannel distributed teleconferencing system may also receive an ID signal on a downlink channel identifying the active far-end participants. A multichannel distributed teleconferencing system may perform various uplink and downlink processing. Uplink processing may involve multimixing and spatialization. Multimixing may be used to separate speech signals of near-end participants. Spatialization, also used in downlink processing, introduces spatial separation of active participants.
Abstract:
An apparatus for extending the bandwidth of an audio signal, the apparatus being configured to: generate an excitation signal from an audio signal, wherein the audio signal comprises a plurality of frequency components; extract a feature vector from the audio signal, wherein the feature vector comprises at least one frequency domain component feature and at least one time domain component feature; determine at least one spectral shape parameter from the feature vector, wherein the at least one spectral shape parameter corresponds to a sub band signal comprising frequency components which belong to a further plurality of frequency components; and generate the sub band signal by filtering the excitation signal through a filter bank and weighting the filtered excitation signal with the at least one spectral shape parameter.
Abstract:
Techniques for applying artificial bandwidth expansion to a multichannel signal are described. Aspects of a system for applying artificial bandwidth expansion to a multichannel signal include an estimation component for receiving a multichannel signal and estimating delay and energy level differences for each channel of the multichannel signal. An artificial bandwidth expansion component artificially expands the bandwidth of each of the channels of the multichannel signal separately. Each one of a plurality of adjustment components is configured to modify a different one of the artificial bandwidth expanded channels of the multichannel signal based upon the estimated delay and energy level differences. The multichannel signal may be a binaural speech signal.
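The estimation step in this abstract, measuring delay and energy level differences between channels, can be sketched as below. The abstract does not say how these quantities are estimated; the brute-force cross-correlation search, the decibel energy ratio, and the names `estimate_delay` and `level_difference_db` are assumptions chosen for illustration.

```python
import math


def estimate_delay(ref, other, max_lag):
    """Lag in samples of `other` relative to `ref`.

    Picks the lag in [-max_lag, max_lag] that maximizes the
    cross-correlation; a positive result means `other` lags `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for k, r in enumerate(ref):
            j = k + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def level_difference_db(ref, other):
    """Energy of `other` relative to `ref`, in decibels.

    Assumes both channels have nonzero energy.
    """
    e_ref = sum(x * x for x in ref)
    e_other = sum(x * x for x in other)
    if e_ref == 0 or e_other == 0:
        raise ValueError("both channels must have nonzero energy")
    return 10.0 * math.log10(e_other / e_ref)
```

For a binaural pair, these two estimates taken before expansion could then be reapplied to each separately expanded channel so that the interaural cues survive the processing.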