Abstract:
Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. To reduce the number of computations required to choose the optimal codebook vector, a subset of codevectors is selected based upon optimal pulse locations, wherein the subset of codevectors forms a subcodebook. Rather than searching the entire codebook, only the entries of the subcodebook are searched.
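As a rough illustration of the subcodebook idea, the following Python sketch ranks pulse positions by the magnitude of a correlation signal, keeps only the codevectors whose pulses all fall on the top-ranked positions, and searches that reduced set. The function names, the toy correlation values, and the simplified score (a sum of correlation magnitudes that ignores the energy term a real search would include) are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import combinations

def build_subcodebook(corr, codebook, keep):
    """Keep only codevectors whose pulses all lie on the `keep` strongest positions."""
    ranked = sorted(range(len(corr)), key=lambda i: abs(corr[i]), reverse=True)
    best_positions = set(ranked[:keep])
    return [cv for cv in codebook if all(p in best_positions for p in cv)]

def search(corr, candidates):
    """Simplified search (assumption): maximize the summed correlation magnitude."""
    return max(candidates, key=lambda cv: sum(abs(corr[p]) for p in cv))

# Hypothetical correlation between the target signal and each pulse position.
corr = [0.1, -0.9, 0.3, 0.05, 0.7, -0.2, 0.0, 0.4]

# Toy codebook: every pair of distinct pulse positions (28 codevectors).
codebook = list(combinations(range(len(corr)), 2))

subcodebook = build_subcodebook(corr, codebook, keep=4)
best = search(corr, subcodebook)
```

In this toy case the subcodebook holds 6 of the 28 codevectors, and searching it returns the same entry as searching the full codebook. That agreement is a property of this example and is not guaranteed in general.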
Abstract:
In a device configurable to encode speech, performing an open loop re-decision may comprise representing a speech signal by amplitude components and phase components for a current frame and a past frame. During the current frame, uncompressed amplitude components and uncompressed phase components may be extracted. The amplitude components and the phase components from the past frame may then be retrieved. A set of features may be generated based on the uncompressed amplitude components from the current frame, the uncompressed phase components from the current frame, the amplitude components from the past frame, and the phase components from the past frame. The set of features may be checked as part of the open loop re-decision, and a final encoding decision may be determined based on the checking. The final encoding decision may be an encoding mode and/or encoding rate.
Abstract:
Systems, methods, and apparatus described include waveform alignment operations in which a single set of evaluated cosines and sines is used to calculate cross-correlations of two periodic waveforms at two different phase shifts.
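One way to picture the shared-trigonometry idea is sketched below in Python. It assumes that each periodic waveform is described by complex harmonic coefficients and that the two phase shifts are +phi and -phi; neither assumption comes from the abstract. Under those assumptions, each harmonic needs one cosine and one sine, and the two cross-correlations differ only in the sign of the sine term. The function names are hypothetical.

```python
import cmath
import math

def xcorr_pair(X, Y, phi):
    """Cross-correlations at +phi and -phi from one cos/sin evaluation per harmonic."""
    r_plus = 0.0
    r_minus = 0.0
    for k, (x, y) in enumerate(zip(X, Y), start=1):
        c = x * y.conjugate()          # cross-spectrum term for harmonic k
        cos_k = math.cos(k * phi)      # evaluated once, used for both shifts
        sin_k = math.sin(k * phi)
        even = c.real * cos_k
        odd = c.imag * sin_k
        r_plus += even - odd           # Re(c * exp(+j*k*phi))
        r_minus += even + odd          # Re(c * exp(-j*k*phi))
    return r_plus, r_minus

def xcorr_direct(X, Y, phi):
    """Reference computation that evaluates a complex exponential per shift."""
    return sum(
        (x * y.conjugate() * cmath.exp(1j * k * phi)).real
        for k, (x, y) in enumerate(zip(X, Y), start=1)
    )

# Hypothetical harmonic coefficients for two waveforms.
X = [1 + 2j, 0.5 - 1j, -0.3 + 0.2j]
Y = [0.8 - 0.5j, 1.1 + 0.4j, 0.2 + 0.9j]
phi = 0.37

r_plus, r_minus = xcorr_pair(X, Y, phi)
```

The reference function is included only to check the shortcut against a direct evaluation at each shift.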
Abstract:
Methods and apparatus are provided for achieving an arbitrary average data rate for a variable rate coder. One method includes selecting a set (e.g., a pair) of initial composite rates surrounding the arbitrary average data rate. A reallocation fraction is then calculated based on the initial composite rates. The reallocation fraction is used to reassign a number of frames from one component rate of an initial composite rate to another in order to achieve the arbitrary average data rate. Such a method may be configured such that selecting an initial composite rate on one side of (e.g., less than) the arbitrary average data rate implicitly selects the initial composite rate on the other side of the arbitrary average data rate.
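The arithmetic behind a reallocation fraction can be sketched as follows. This Python example assumes a simple linear model in which moving a fraction of frames from the higher composite rate to the lower one lowers the average rate proportionally. The function names and the rates (in kbps) are hypothetical, and the abstract's actual procedure for reassigning frames between component rates may differ.

```python
def reallocation_fraction(rate_low, rate_high, target):
    """Fraction of frames to move from the higher to the lower composite rate."""
    if not (rate_low < rate_high and rate_low <= target <= rate_high):
        raise ValueError("target must lie between two distinct composite rates")
    return (rate_high - target) / (rate_high - rate_low)

def average_rate(rate_low, rate_high, fraction):
    """Average rate after moving `fraction` of the frames to the lower rate."""
    return rate_high - fraction * (rate_high - rate_low)

# Hypothetical composite rates surrounding a 4.5 kbps target.
fraction = reallocation_fraction(4.0, 6.0, 4.5)
achieved = average_rate(4.0, 6.0, fraction)
```

With these numbers, three quarters of the frames are moved to the lower rate and the resulting average matches the 4.5 kbps target.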
Abstract:
Systems, methods, and apparatus for the detection of signals having spectral peaks with narrow bandwidth are described herein. The range of described configurations includes implementations that perform such detection using parameters of a linear prediction coding (LPC) analysis scheme.
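A minimal sketch of one possible LPC-based detector is shown below. It assumes that the LPC parameter of interest is the prediction gain from a low-order Levinson-Durbin recursion, which tends to be high for a signal dominated by a narrow spectral peak and close to 1 for noise. The abstract does not say which LPC parameters are used, so the function name, the order, and the test signals are illustrative only.

```python
import math
import random

def lpc_prediction_gain(x, order):
    """Ratio of signal energy to LPC residual energy (Levinson-Durbin recursion)."""
    n = len(x)
    r = [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]
    if r[0] == 0:
        return 1.0
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / err                       # reflection coefficient for stage m
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k                   # residual energy after stage m
        if err <= 0:
            return math.inf                  # perfectly predictable signal
    return r[0] / err

# A pure tone (narrow spectral peak) versus white noise.
tone = [math.sin(2 * math.pi * 0.1 * i) for i in range(200)]
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]

tone_gain = lpc_prediction_gain(tone, 2)
noise_gain = lpc_prediction_gain(noise, 2)
```

A detector along these lines would compare the gain against a threshold; a suitable threshold would have to be tuned for the application.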
Abstract:
Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. In encoding schemes that use forward and backward pitch enhancement, storage and processor load are reduced by approximating a two-dimensional autocorrelation matrix with a one-dimensional autocorrelation vector. The approximation is possible when a cross-correlation element is configured to determine the autocorrelation matrix of an impulse response and a pulse energy determination element is configured to determine the energy of a pulse code vector that incorporates secondary pulse positions.
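The storage saving can be illustrated with a short Python sketch. It assumes the approximation takes the common Toeplitz form, where the matrix entry for pulse positions i and j is replaced by the impulse-response autocorrelation at lag |i - j|. The function names are hypothetical, and secondary pulses from pitch enhancement are modeled simply as extra entries in the position and amplitude lists.

```python
def autocorr_vector(h):
    """One-dimensional autocorrelation of the impulse response, r[k] for k >= 0."""
    n = len(h)
    return [sum(h[i] * h[i + k] for i in range(n - k)) for k in range(n)]

def pulse_energy(r, positions, amps):
    """Energy of the filtered pulse vector using only lag differences."""
    energy = 0.0
    for p_i, a_i in zip(positions, amps):
        for p_j, a_j in zip(positions, amps):
            energy += a_i * a_j * r[abs(p_i - p_j)]
    return energy

def filtered_energy(h, positions, amps):
    """Reference: energy of the full (untruncated) convolution of pulses with h."""
    y = [0.0] * (max(positions) + len(h))
    for p, a in zip(positions, amps):
        for i, h_i in enumerate(h):
            y[p + i] += a * h_i
    return sum(v * v for v in y)

# Hypothetical impulse response and a two-pulse code vector.
h = [1.0, 0.5, -0.25, 0.125]
positions = [0, 2]
amps = [1.0, -1.0]

r = autocorr_vector(h)
energy = pulse_energy(r, positions, amps)
```

The vector `r` holds one value per lag instead of one value per position pair. The lag-based energy matches the reference here because the reference convolution is not truncated at a subframe boundary; with truncation the vector form becomes an approximation.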
Abstract:
A speech converter in a speech processing system modifies various aspects of input speech. The speech converter receives a formants signal representing an input speech signal. The speech converter may also receive a formant scaling command or a user selection of one of multiple control signals, each specifying a manner of modifying one or more of the received signals (i.e., formants, voicing, pitch, gain). The speech converter modifies at least one of the formants, voicing, pitch, and/or gain signals as specified by the selected voice font.
Abstract:
Automatic white balance of captured images can be performed based on a gray world assumption. Initially, a flat field gray image is captured for one or more reference illuminations. The statistics of the captured gray image are determined and stored for each reference illumination during a calibration process. For each subsequent captured image, the image is filtered to determine a subset of gray pixels. The gray pixels are further divided into one or more gray clusters. The average weight of the one or more gray clusters is determined and a distance from the average weights to the reference illuminants is determined. An estimate of the illuminant is determined depending on the distances. White balance gains are applied to the image based on the estimated illuminant.
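The pipeline described above can be sketched in Python as follows. The sketch assumes that the stored statistics are (R/G, B/G) ratios of the flat gray field, that the gray filter is a simple box in that ratio space, and that there is a single gray cluster. The reference values, thresholds, and function names are hypothetical and do not come from the abstract.

```python
import math

# Hypothetical calibration data: (R/G, B/G) of a flat gray field per illuminant.
REFERENCES = {
    "daylight": (0.80, 0.70),
    "incandescent": (1.30, 0.45),
}

def estimate_illuminant(pixels, refs=REFERENCES, lo=0.3, hi=1.6):
    """Pick the reference illuminant closest to the mean of the near-gray pixels."""
    ratios = [(r / g, b / g) for r, g, b in pixels if g > 0]
    gray = [(rg, bg) for rg, bg in ratios if lo <= rg <= hi and lo <= bg <= hi]
    if not gray:
        return None
    mean_rg = sum(rg for rg, _ in gray) / len(gray)
    mean_bg = sum(bg for _, bg in gray) / len(gray)
    return min(
        refs,
        key=lambda name: math.hypot(mean_rg - refs[name][0], mean_bg - refs[name][1]),
    )

def white_balance(pixels, ref):
    """Scale R and B so that the reference gray maps to equal R, G, and B."""
    rg, bg = ref
    return [(r / rg, g, b / bg) for r, g, b in pixels]

# Two near-gray pixels under a daylight-like cast and one saturated red pixel.
pixels = [(80, 100, 70), (82, 100, 69), (250, 10, 5)]
illuminant = estimate_illuminant(pixels)
balanced = white_balance(pixels, REFERENCES[illuminant])
```

The saturated red pixel falls outside the gray box and does not influence the estimate, which is the purpose of the gray-pixel filtering step.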
Abstract:
Methods and apparatus are presented for supporting the transmission of variable-rate vocoder frames over non-compatible communication channels. Variable-rate vocoder frames are re-formatted as cargo in multi-rate vocoder frames. At the receiver, a determination is made as to whether a received multi-rate vocoder frame carries a variable-rate vocoder frame cargo. If a variable-rate vocoder frame is cargo, then a determination of the frame type is made. Various embodiments for conveying cargo information are presented.
Abstract:
A wideband speech encoder according to one embodiment includes a narrowband encoder and a highband encoder. The narrowband encoder is configured to encode a narrowband portion of a wideband speech signal into a set of filter parameters and a corresponding encoded excitation signal. The highband encoder is configured to encode, according to a highband excitation signal, a highband portion of the wideband speech signal into a set of filter parameters. The highband encoder is configured to generate the highband excitation signal by applying a nonlinear function to a signal based on the encoded narrowband excitation signal to generate a spectrally extended signal.
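The effect of a nonlinear function on the spectrum of an excitation signal can be sketched as follows. This Python example assumes the absolute value as the nonlinear function and uses a single sinusoid as a stand-in for the narrowband excitation; the abstract names neither. Other steps a real encoder would need, such as upsampling, are omitted, and the function names are hypothetical.

```python
import cmath
import math

def spectrally_extend(excitation):
    """Apply a memoryless nonlinearity (absolute value, by assumption)."""
    return [abs(x) for x in excitation]

def dft_magnitude(signal, k):
    """Normalized magnitude of DFT bin k."""
    n = len(signal)
    total = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
    return abs(total) / n

n = 64
# Stand-in excitation: a single tone at bin 4.
tone = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
extended = spectrally_extend(tone)

before = dft_magnitude(tone, 8)      # energy at twice the tone frequency
after = dft_magnitude(extended, 8)
```

The original tone has essentially no energy at bin 8, while the rectified signal has a clear component there, which shows how a nonlinearity creates content at frequencies absent from its input.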