Abstract:
An exemplary multi-channel speech processor comprises a controller capable of interfacing with a plurality of channels, and at least one signal processing unit (SPU) coupled to the controller, where the multi-channel speech processor has a maximum execution time for processing all frames, one channel at a time, by processing a single frame from each of the plurality of channels. The signal processing unit encodes each of the single frames from each of the plurality of channels, one channel at a time, to generate encoded frames until the maximum execution time elapses or is about to elapse. The controller also transmits a predetermined frame for each of the plurality of channels not processed during the encoding step, due to the maximum execution time elapsing or being about to elapse, such that the predetermined frame causes a decoder which receives the predetermined frame to generate a frame erase frame.
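The deadline-bounded loop described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `encode_channels`, `ERASE_FRAME`, the `guard` margin (standing in for "about to elapse") and the injected `now` clock are all hypothetical names introduced here.

```python
from typing import Callable, List

# Hypothetical placeholder for the predetermined frame; a real codec would
# define a specific bit pattern that makes the decoder emit a frame erasure.
ERASE_FRAME = b"\x00"


def encode_channels(
    frames: List[bytes],
    encode: Callable[[bytes], bytes],
    now: Callable[[], float],
    max_exec_time: float,
    guard: float = 0.0,
) -> List[bytes]:
    """Encode one frame per channel until the time budget is (nearly) spent.

    Channels that cannot be encoded in time receive ERASE_FRAME instead.
    """
    start = now()
    out: List[bytes] = []
    for index, frame in enumerate(frames):
        elapsed = now() - start
        # "Elapsed or about to elapse": stop once we are within `guard` of the limit.
        if elapsed + guard >= max_exec_time:
            out.extend([ERASE_FRAME] * (len(frames) - index))
            break
        out.append(encode(frame))
    return out
```

Passing the clock in as a parameter keeps the sketch testable with a simulated clock; a real controller would presumably read a hardware timer.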
Abstract:
Provided is a method and computer program product for producing an enhanced audio signal for an output device from audio signals received by two or more microphones in close proximity to each other. For example, one embodiment of the present invention comprises the steps of receiving a first input audio signal from a first microphone, digitizing the first input audio signal to produce a first digitized audio input signal, receiving a second input audio signal from a second microphone, digitizing the second input audio signal to produce a second digitized audio input signal, using the first digitized audio input signal as a reference signal to an adaptive prediction filter, using the second digitized audio input signal as input to said adaptive prediction filter, and finally adding a prediction result signal from the adaptive prediction filter to the first digitized audio input signal to produce the enhanced audio signal. In other embodiments, any number of microphones can be used, and in all embodiments there is no requirement to detect or locate the source or direction of arrival of the input audio signals.
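The two-microphone embodiment can be sketched as below. The abstract does not name the adaptation rule, so this sketch assumes a normalized LMS (NLMS) update as one common choice; `enhance`, `taps`, `mu` and `eps` are hypothetical names and parameters.

```python
from typing import List, Sequence


def enhance(
    first: Sequence[float],
    second: Sequence[float],
    taps: int = 8,
    mu: float = 0.5,
    eps: float = 1e-8,
) -> List[float]:
    """Predict the first (reference) signal from the second and add the prediction.

    NLMS is an assumption of this sketch; the source only says
    "adaptive prediction filter".
    """
    weights = [0.0] * taps
    enhanced: List[float] = []
    for n in range(len(first)):
        # Most recent `taps` samples of the second signal, newest first, zero-padded.
        x = [second[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        prediction = sum(w * xi for w, xi in zip(weights, x))
        error = first[n] - prediction
        norm = eps + sum(xi * xi for xi in x)
        weights = [w + mu * error * xi / norm for w, xi in zip(weights, x)]
        # Enhanced output: reference signal plus the filter's prediction of it.
        enhanced.append(first[n] + prediction)
    return enhanced
```

Because the filter only predicts what the two microphones have in common, correlated content is reinforced in the sum, and nothing in the loop needs the source position or direction of arrival.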
Abstract:
There is provided a method of detecting and reporting poor voice quality for use by a gateway device. The method comprises facilitating a connection between a telephone and a remote telephone via a network, and detecting a poor voice quality indicator during the connection. The method further comprises capturing, for a predetermined period of time, telephone voice data being exchanged between the gateway and the telephone, network voice data being exchanged between the gateway and the network, and gateway parameters. The method also comprises packetizing the telephone voice data, the network voice data and the gateway parameters into a plurality of packets having a network address of a network storage, and transmitting the plurality of packets destined for the network storage via the network. In one aspect, the poor voice quality indicator may be generated by a user of the telephone in response to a poor voice quality of the connection.
Abstract:
There is provided a method of selecting a pitch lag value for a portion of a speech signal, the method comprising: computing a weighted correlation function of the portion of the speech signal for a range of delay times, wherein the weighting of the correlation function depends on both the delay time and a characteristic of one or more previous portions of the speech signal; and selecting the pitch lag value based on a delay time from the range of delay times that maximizes the weighted correlation function.
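A minimal sketch of a weighted correlation search follows. The abstract leaves the weighting function open. This sketch assumes that the "characteristic" of previous portions is the previous pitch lag and that the weight falls off linearly with distance from it. `select_pitch_lag`, `normalized_correlation` and `alpha` are hypothetical.

```python
import math
from typing import Sequence


def normalized_correlation(signal: Sequence[float], lag: int) -> float:
    """Normalized correlation between the signal and itself delayed by `lag`."""
    num = e_cur = e_past = 0.0
    for n in range(lag, len(signal)):
        num += signal[n] * signal[n - lag]
        e_cur += signal[n] ** 2
        e_past += signal[n - lag] ** 2
    denom = math.sqrt(e_cur * e_past)
    return num / denom if denom > 0.0 else 0.0


def select_pitch_lag(
    signal: Sequence[float],
    min_lag: int,
    max_lag: int,
    prev_lag: int,
    alpha: float = 0.3,
) -> int:
    """Return the lag in [min_lag, max_lag] that maximizes the weighted correlation.

    Assumes max_lag > min_lag. The linear weight around prev_lag is an
    illustrative assumption, not the weighting defined by the source.
    """
    span = max_lag - min_lag
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        weight = 1.0 - alpha * abs(lag - prev_lag) / span
        score = weight * normalized_correlation(signal, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For a signal with a 40-sample period, lags 40 and 80 correlate almost equally well, so the weighting decides between them, which helps avoid pitch doubling or halving across portions.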
Abstract:
There is provided a method of selecting a pitch lag value from a plurality of pitch lag candidates for coding a speech signal. The method comprises identifying the plurality of pitch lag candidates from a frame of the speech signal using correlation; classifying the speech signal to obtain a voice classification; determining whether one or more of the plurality of pitch lag candidates are in a temporal neighborhood of one or more previous pitch lag values; favoring the one or more of the plurality of pitch lag candidates determined to be in the temporal neighborhood of the one or more previous pitch lag values, by adaptive weighting, over other ones of the plurality of pitch lag candidates; and selecting the pitch lag value based on the voice classification and the one or more of the plurality of pitch lag candidates favored by the adaptive weighting.
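The candidate-favoring step might look like the sketch below. The abstract does not say how the voice classification enters the decision. This sketch assumes a boolean voiced flag that switches the adaptive weighting on, plus a fixed `boost` and `neighborhood`. All of these names and values are hypothetical.

```python
from typing import List, Sequence, Tuple


def pick_pitch_lag(
    candidates: List[Tuple[int, float]],
    prev_lags: Sequence[int],
    voiced: bool,
    neighborhood: int = 5,
    boost: float = 1.2,
) -> int:
    """Choose a lag from (lag, correlation) candidates.

    Candidates within `neighborhood` samples of any previous lag are favored.
    Assumption: the pitch history is trusted only when the frame is voiced.
    """

    def weighted(candidate: Tuple[int, float]) -> float:
        lag, corr = candidate
        near_history = any(abs(lag - p) <= neighborhood for p in prev_lags)
        return corr * (boost if (near_history and voiced) else 1.0)

    return max(candidates, key=weighted)[0]
```

With candidates `[(40, 0.80), (80, 0.85)]` and previous lags near 40, a voiced frame selects 40 because its boosted score (0.96) exceeds 0.85, while an unvoiced frame falls back to the raw maximum at 80.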
Abstract:
A multi-channel speech processor for encoding speech in a packet network environment is disclosed. In one illustrative aspect, a complexity resource manager (CRM) is executed by a controller or processor. The CRM manages the level of encoding complexity used by a signal processing unit (SPU) to convert the speech signal into packet data. In general, the CRM determines the level of encoding complexity from a calculated complexity budget, which is based on the time required to process prior speech signal channels and the time available to process the remaining channels. In this way, the CRM controls the overall complexity of the speech processor by signaling the SPU, under certain conditions, to encode the speech signal in a complexity-reduced mode based on the calculated complexity budget.
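One way to picture the complexity budget is as the time still available divided by the channels still waiting. The sketch below assumes that reading and simulates per-channel costs instead of measuring them. `plan_encoding_modes` and its parameters are hypothetical, and a real CRM may use a different budget formula.

```python
from typing import List


def plan_encoding_modes(
    num_channels: int,
    frame_budget: float,
    full_cost: float,
    reduced_cost: float,
) -> List[str]:
    """Pick "full" or "reduced" encoding per channel from a running budget.

    Costs are simulated constants here; a real system would measure the time
    actually spent on the prior channels.
    """
    elapsed = 0.0
    modes: List[str] = []
    for channel in range(num_channels):
        remaining_channels = num_channels - channel
        # Budget: time left in the frame interval, shared among remaining channels.
        budget = (frame_budget - elapsed) / remaining_channels
        if budget >= full_cost:
            modes.append("full")
            elapsed += full_cost
        else:
            modes.append("reduced")
            elapsed += reduced_cost
    return modes
```

With four channels, a budget of 10 time units, a full cost of 3 and a reduced cost of 1.5, the first two channels run in reduced mode. That frees enough time for the last two to run at full complexity.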
Abstract:
There are provided speech coding methods and systems for estimating a plurality of speech parameters of a speech signal for coding the speech signal using one of a plurality of speech coding algorithms, where the plurality of speech parameters includes pitch information and is calculated using a plurality of thresholds. An example method includes estimating a background noise level in the speech signal to determine a signal-to-noise ratio (SNR) for the speech signal, adjusting one or more of the plurality of thresholds based on the SNR to generate one or more SNR-adjusted thresholds, analyzing the speech signal to extract the pitch information using the one or more SNR-adjusted thresholds, and repeating the estimating, the adjusting and the analyzing to code the speech signal using one of the plurality of speech coding algorithms.
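The estimate-then-adjust steps can be sketched as follows. The abstract specifies neither the noise estimator nor the direction of adjustment, so this sketch assumes that the quietest frame approximates the background noise and that a correlation threshold is relaxed as the SNR drops. All names and constants are hypothetical.

```python
import math
from typing import Sequence


def estimate_snr_db(frame_energies: Sequence[float]) -> float:
    """Estimate the SNR in dB from per-frame energies.

    Assumption: the minimum frame energy approximates the background noise level.
    """
    noise = max(min(frame_energies), 1e-12)
    signal = sum(frame_energies) / len(frame_energies)
    return 10.0 * math.log10(signal / noise)


def snr_adjusted_threshold(
    base: float,
    snr_db: float,
    low_snr: float = 10.0,
    high_snr: float = 30.0,
    max_drop: float = 0.2,
) -> float:
    """Lower `base` by up to `max_drop` as the SNR falls from high_snr to low_snr."""
    position = (snr_db - low_snr) / (high_snr - low_snr)
    position = min(1.0, max(0.0, position))
    return base - max_drop * (1.0 - position)
```

In a coder these two functions would run once per frame, and the resulting threshold would feed the pitch analysis before the frame is encoded.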
Abstract:
A flexible variable rate vocoder and related method of operation. The vocoder selects a target average data rate responsive to at least one network parameter and at least one external parameter.