Abstract:
Media processing of Real-time Transport Protocol (RTP) packets used in time-sensitive applications makes efficient use of network resources, e.g., by dropping or resizing the packets, but hinders measuring and reporting end-to-end reception quality. Because media processing causes a difference between what is sent and what is received, end-to-end reception quality cannot be measured validly without accounting for this difference. Accordingly, a method and corresponding apparatus are provided to track changes to RTP packets of an RTP session caused by media processing, modify RTP packet information of the RTP packets based on the tracked changes, correct RTP Control Protocol (RTCP) packets corresponding to the RTP session based on the tracked changes, the corrected RTCP packets being a measure of the end-to-end reception quality of the RTP session, and report the end-to-end reception quality of the RTP session by forwarding the corrected RTCP packets. Thus, end-to-end reception quality can be validly measured and reported.
Abstract:
Media processing of Real-time Transport Protocol (RTP) packets used in Voice over Internet Protocol (VoIP) and other time-sensitive applications makes efficient use of network resources, e.g., by dropping or changing the size of certain packets, but hinders measuring and reporting end-to-end reception quality. Because media processing changes RTP packets between a sender and a receiver, causing a difference between what is sent and what is received, end-to-end reception quality cannot be measured validly without accounting for these changes. Accordingly, a method and corresponding apparatus are provided to track changes to RTP packets of an RTP session caused by media processing of the RTP packets, modify RTP packet information of the RTP packets based on the tracked changes, correct RTP Control Protocol (RTCP) packets corresponding to the RTP session based on the tracked changes, the corrected RTCP packets being a measure of the end-to-end reception quality of the RTP session, and report the end-to-end reception quality of the RTP session by forwarding the corrected RTCP packets.
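The track, modify, and correct steps described above can be illustrated with a minimal Python sketch. It is one possible reading, not the claimed method: the class `RtpChangeTracker`, its method names, and the dictionary keys are hypothetical. It assumes packets arrive in order, ignores 16-bit sequence number wraparound, and treats dropping as the only media-processing change. Forwarded packets are renumbered so the receiver sees no gap for intentional drops, and the highest sequence number in a returning receiver report is mapped back into the sender's numbering.

```python
from bisect import bisect_right


class RtpChangeTracker:
    """Hypothetical tracker for packets a media processor drops mid-path."""

    def __init__(self):
        self.dropped = 0      # packets dropped so far by media processing
        self._out_seqs = []   # rewritten sequence numbers, in ascending order
        self._drops_at = []   # drop count at the time each packet was forwarded

    def forward(self, seq, drop):
        """Return the rewritten sequence number, or None if the packet is dropped."""
        if drop:
            self.dropped += 1
            return None
        out_seq = seq - self.dropped  # close the gap left by earlier drops
        self._out_seqs.append(out_seq)
        self._drops_at.append(self.dropped)
        return out_seq

    def correct_report(self, highest_seq_received, cumulative_lost):
        """Map a receiver report back into the sender's sequence space."""
        i = bisect_right(self._out_seqs, highest_seq_received) - 1
        offset = self._drops_at[i] if i >= 0 else 0
        return {
            "highest_seq_received": highest_seq_received + offset,
            # Renumbering hides intentional drops from the receiver, so this
            # count already reflects network loss only and is passed through.
            "cumulative_lost": cumulative_lost,
        }
```

With this arrangement the sender reads the corrected report in its own sequence space, and the loss figure is not inflated by packets the media processor removed on purpose.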
Abstract:
Adaptive Gain Control (AGC) is performed directly in a coded domain. A Coded Domain Adaptive Gain Control (CD-AGC) system modifies at least one parameter of a first encoded signal, resulting in corresponding modified parameter(s). The CD-AGC system replaces the parameter(s) of the first encoded signal with the modified parameter(s), resulting in a second encoded signal. In a decoded state, the second encoded signal approximates a target signal that is a function of two signals, including the first encoded signal and a third encoded signal, in at least partially decoded states. Thus, the first encoded signal does not have to go through intermediate decode/re-encode processes, which can degrade overall speech quality. Computational resources required for a complete re-encoding are not needed. Overall delay of the system is minimized. The CD-AGC system can be used in any network in which signals are communicated in a coded domain, such as a Third Generation (3G) wireless network.
Abstract:
In some communications systems, unsynchronized near-end and far-end packets of communications signals can reduce or impair performance of processing of packets, such as in the case of Coded Domain Media Quality Enhancement. Therefore, a system may synchronize the incoming signals to enhance quality. A relative delay determination module according to an example embodiment of this invention determines a synchronization and relative delay between packets belonging to different packet streams arriving at a network node in a packet-based network by computing a time synchronization parameter based on a time reference of timestamps of the signals. The module reports the relative delay to a module that uses it, such as a voice quality enhancement module or an echo control module. By synchronizing the packets at that location within the network, source clocks at end or edge nodes of the network can operate with reduced synchronization, simplifying network operations and management thereof.
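The abstract does not say how the time synchronization parameter is computed. As a rough sketch under stated assumptions, the Python below estimates, for each stream, the offset between local arrival time at the node and the media time carried in the packet timestamps, then reports the difference of the two offsets as the relative delay. The function names, the `(timestamp, arrival_time)` tuple layout, the 8000 Hz default clock rate, and the use of a median to suppress jitter are all illustrative choices, not details from the source.

```python
from statistics import median


def clock_offset(packets, clock_rate):
    """Estimate the offset in seconds between arrival time and media time.

    packets is an iterable of (rtp_timestamp, arrival_time_seconds) pairs.
    The median is used so that a few jittered packets do not skew the estimate.
    """
    return median(arrival - ts / clock_rate for ts, arrival in packets)


def relative_delay(near_packets, far_packets, clock_rate=8000):
    """Return the near-end offset minus the far-end offset, in seconds."""
    return clock_offset(near_packets, clock_rate) - clock_offset(far_packets, clock_rate)
```

A downstream module, such as an echo control module, could use the returned value to decide which far-end frame to pair with a given near-end frame.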
Abstract:
A method, apparatus, system, and program, for evaluating a call communicated between communicating devices through at least one communication path. The method comprises segmenting, into first segments, at least one first communication signal traveling from a first one of the communicating devices to a second one of the communicating devices through the at least one communication path, and segmenting, into second segments, at least one second communication signal traveling from the second one of the communicating devices to the first one of the communicating devices through the at least one communication path. The method also comprises determining predetermined call characteristics based on the first and second segments, and identifying whether an echo is present in the call based on a result of the determining.
Abstract:
A two-pass classification system and method that post-processes HMM scores with additional confidence scores to derive a value that may be applied to a threshold on which a keyword versus non-keyword determination may be based. The first stage comprises Generalized Probabilistic Descent (GPD) analysis which uses feature vectors of the spoken words and the HMM segmentation information (developed by the HMM detector during processing) as inputs to develop a first set of confidence scores through a linear combination (a weighted sum) of the feature vectors of the speech. The second stage comprises a linear discrimination method that combines the HMM scores and the confidence scores from the GPD stage with a weighted sum to derive a second confidence score. The output of the second stage may then be compared to a predetermined threshold to determine whether the spoken word or words include a keyword.
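The two weighted-sum stages can be sketched in a few lines of Python. This only illustrates the data flow: the function names are hypothetical, the weights are supplied by the caller rather than trained by GPD or linear discriminant analysis, and the HMM segmentation information is assumed to be already reflected in the feature vector.

```python
def weighted_sum(weights, values):
    """Linear combination of values with the given weights."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * v for w, v in zip(weights, values))


def second_stage_score(hmm_scores, features, gpd_weight_sets, stage2_weights):
    """Combine HMM scores with first-stage confidence scores."""
    # Stage 1: one confidence score per weight vector (assumed pre-trained).
    confidences = [weighted_sum(w, features) for w in gpd_weight_sets]
    # Stage 2: weighted sum over the HMM scores and the stage-1 confidences.
    return weighted_sum(stage2_weights, list(hmm_scores) + confidences)


def is_keyword(hmm_scores, features, gpd_weight_sets, stage2_weights, threshold):
    """Apply the predetermined threshold to the second-stage score."""
    score = second_stage_score(hmm_scores, features, gpd_weight_sets, stage2_weights)
    return score >= threshold
```

Whether a score equal to the threshold counts as a keyword is not stated in the abstract; the sketch treats it as a keyword.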
Abstract:
A method and corresponding apparatus for coded-domain acoustic echo control are presented. An echo control problem is considered as that of perceptually matching an echo signal to a reference signal. A perceptual similarity function that is based on the coded spectral parameters produced by the speech codec is defined. Since codecs introduce a significant degree of non-linearity into the echo signal, the similarity function is designed to be robust against such effects. The similarity function is incorporated into a coded-domain echo control system that also includes spectrally-matched noise injection for replacing echo frames with comfort noise. Using actual echoes recorded over a commercial mobile network, it is shown herein that the similarity function is robust against both codec non-linearities and additive noise. Experimental results further show that the echo control is effective at suppressing echoes compared to a Normalized Least Mean Squared (NLMS)-based echo cancellation system.
Abstract:
A high reliability digit string recognizer/rejection system that processes spoken words through an HMM recognizer to determine a string of candidate digits, a filler model for each digit in the digit string, and other information. Next, a weighted sum is generated for each digit in the string and for a filler model for each digit in the string. A confidence score is generated for each digit by subtracting the filler weighted sum from the digit weighted sum. The confidence score for each digit is then compared to a threshold and, if the confidence score for any of the digits is less than the threshold, the entire digit string is rejected. If the confidence scores for all of the digits in the digit string are equal to or greater than the threshold, then the candidate digit string is accepted as a digit string.
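The accept/reject rule described above is easy to sketch in Python. The function names and the tuple layout of each candidate are hypothetical, and the weights are assumed to be given; the abstract does not say which quantities are weighted, so they are shown as generic feature lists.

```python
def weighted_sum(weights, values):
    """Linear combination of values with the given weights."""
    return sum(w * v for w, v in zip(weights, values))


def digit_confidence(digit_feats, filler_feats, digit_weights, filler_weights):
    """Confidence = digit weighted sum minus filler weighted sum."""
    return weighted_sum(digit_weights, digit_feats) - weighted_sum(filler_weights, filler_feats)


def accept_digit_string(candidates, digit_weights, filler_weights, threshold):
    """Accept only if every digit's confidence is at or above the threshold.

    candidates is a list of (digit_feats, filler_feats) pairs, one per digit.
    A single low-confidence digit rejects the whole string.
    """
    return all(
        digit_confidence(d, f, digit_weights, filler_weights) >= threshold
        for d, f in candidates
    )
```

Rejecting the whole string on one weak digit trades a higher rejection rate for fewer misrecognized strings, which matches the stated goal of high reliability.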
Abstract:
A system, apparatus, method, and computer-readable medium for coded-domain echo cancellation. The method includes receiving a signal including at least one packet, and replacing the at least one packet with a replacement packet. In one example, the replacement packet is a comfort noise packet (such as a SID_UPDATE packet) or a NO_DATA packet. In an example embodiment, the at least one packet included in the signal includes one or more comfort noise packets, and, prior to the replacing, the one or more comfort noise packet(s) are stored in a buffer. In another example, prior to the replacing, the at least one packet is compared to a reference packet to determine whether the at least one packet is an echo packet. The packet, in one example, is encoded based on an adaptive multi-rate (AMR) (e.g., AMR-NB or AMR-WB) codec.
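The buffering and replacement steps can be sketched as follows in Python. The `Packet` type, the `EchoReplacer` class, and the caller-supplied `is_echo` predicate are hypothetical; the abstract does not specify how a packet is compared to the reference, so that comparison is left as a parameter. The frame type strings follow the names used in the abstract.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    frame_type: str        # e.g. "SPEECH", "SID_UPDATE", "NO_DATA"
    payload: bytes = b""


class EchoReplacer:
    """Replace packets judged to be echo with buffered comfort noise."""

    def __init__(self, is_echo, buffer_size=4):
        self.is_echo = is_echo                   # predicate (packet, reference) -> bool
        self.noise = deque(maxlen=buffer_size)   # recent comfort noise packets

    def process(self, packet, reference):
        # Comfort noise packets are stored for later use and passed through.
        if packet.frame_type == "SID_UPDATE":
            self.noise.append(packet)
            return packet
        # Echo packets are replaced by the newest stored comfort noise packet,
        # or by a NO_DATA packet when nothing has been buffered yet.
        if self.is_echo(packet, reference):
            return self.noise[-1] if self.noise else Packet("NO_DATA")
        return packet
```

Because only whole packets are swapped, the stream never has to be decoded and re-encoded.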