Abstract:
In an exemplary embodiment, a wireless handset allows a user having a connection in an “on-hold” state to select one or more sources for play-out of media at a handset receiver while in the on-hold state, and then be signaled when the on-hold state is terminated. Such on-hold state might be indirectly detected, such as by detection of music-on-hold, or directly detected through on-hold notification. User selected media for play-out might be locally generated at the user's handset, or provided through a separate connection established between the wireless handset and the network.
Abstract:
A CELP speech decoder includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook. The decoder generates a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information. The decoder does this by classifying the speech signal to be generated as periodic (voiced) or non-periodic (unvoiced) and then generating an excitation signal based on this classification. If the speech signal is classified as periodic, the excitation signal is generated based on the output signal from said first portion and not on the output signal from said second portion. If the speech signal is classified as non-periodic, the excitation signal is generated based on the output signal from said second portion and not on the output signal from said first portion.
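The selection rule in this abstract can be illustrated with a minimal sketch. The function names, the use of a pitch-gain value for classification, and the 0.5 threshold are all assumptions made for illustration; the abstract does not say how the periodic/non-periodic decision is made.

```python
def classify_periodic(prev_pitch_gain, threshold=0.5):
    """Hypothetical classifier: treat the lost frame as periodic (voiced)
    when the last reliably received frame had a high pitch gain.
    Both the feature and the threshold are assumptions."""
    return prev_pitch_gain > threshold


def conceal_excitation(periodic, adaptive_out, fixed_out):
    """Choose the excitation for a frame that was not reliably received.

    periodic      -- True if the frame is classified as voiced
    adaptive_out  -- output of the first portion (adaptive codebook)
    fixed_out     -- output of the second portion (fixed codebook)
    """
    # Periodic: use only the adaptive-codebook output.
    # Non-periodic: use only the fixed-codebook output.
    source = adaptive_out if periodic else fixed_out
    return list(source)
```

For example, `conceal_excitation(classify_periodic(0.9), adaptive, fixed)` would return a copy of `adaptive`, while a pitch gain of 0.1 would select `fixed`.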
Abstract:
The present invention provides a novel method of analyzing speech signals in order to reduce the computational power required to perform both speech compression and voice recognition operations. Digital speech signals are provided to a speech analyzer which generates a linear predictive coded (LPC) speech analysis signal that is compatible for use in both the voice recognition circuit and the speech compression circuit. The speech analysis signal is then provided to the compression circuit, which further processes the signal into a form used by an encoder and then the encoder encodes the processed signal. The same speech analysis signal is also provided to a voice recognition circuit, which further processes the signal into a form used by a recognizer and then the recognizer performs recognition on the processed signal.
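The idea of computing one LPC analysis and feeding it to both consumers might be sketched as follows. This is only an illustration: the autocorrelation method with the Levinson-Durbin recursion is a common way to obtain LPC coefficients, not necessarily the one used in the invention, and `process_frame`, `compress`, and `recognize` are hypothetical names.

```python
def lpc_analysis(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns ([1, a1, ..., a_order], err) where the list holds the
    coefficients of A(z) = 1 + a1*z^-1 + ... and err is the final
    prediction-error energy.
    """
    n = len(frame)
    # Autocorrelation for lags 0..order.
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop the recursion early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this step
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def process_frame(frame, order, compress, recognize):
    """Run the analysis once and hand the same result to both circuits.

    compress and recognize are caller-supplied stand-ins (hypothetical)
    for the compression and voice recognition back ends.
    """
    analysis = lpc_analysis(frame, order)  # computed a single time
    return compress(analysis), recognize(analysis)
```

The point of the sketch is the structure of `process_frame`: the costly analysis step is shared, and only the lighter, consumer-specific post-processing is duplicated.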
Abstract:
In a system comprising a plurality of processors and a memory shared by at least a subset of the processors, a method for processing video data includes the steps of: (a) a first one of the processors receiving a first video frame and storing the first video frame in the memory; (b) the first one of the processors receiving at least a second video frame, receipt of the second video frame initiating a release of the first video frame from the memory; (c) the first one of the processors sending the first and second video frames to a second one of the processors together for processing by the second one of the processors; (d) the second one of the processors generating an output video frame based at least on the first and second video frames; (e) storing the output video frame in the memory by overwriting an available memory location therein, the output video frame becoming a new first video frame; and (f) repeating steps (b) through (e) until all video frames to be processed have been received.
Abstract:
An apparatus for providing at least first and second representations of an audio signal for use in a communications system is described. The apparatus comprises a first quantizer for quantizing at least a portion of the signal in accordance with a first multidimensional lattice to generate a first representation. The apparatus further comprises a second quantizer for quantizing at least a portion of the signal in accordance with a second, different multidimensional lattice to generate a second representation. In an illustrative embodiment, the first representation is a core representation containing core audio information. The second representation is an enhancement representation containing enhancement audio information. The core representation is necessary for recovering the audio signal with minimal acceptable quality. Audio quality is enhanced when the core representation, together with the enhancement representation, is used to recover the audio signal. A method for use in such an apparatus is also described.
Abstract:
A telephone answering device includes two separate coders: a first coder for encoding/decoding fixed voice prompts spoken by a single speaker, and a second coder for encoding/decoding incoming and outgoing voice messages spoken by multiple speakers. The first coder uses a first set of codebooks trained based on a first set of utterances spoken by a single speaker, while the second coder uses a second set of codebooks trained based on a second set of utterances spoken by multiple speakers. Because the first set of utterances is significantly smaller in size than the second set of utterances, and the range of pitch period is significantly smaller for the first set of utterances spoken by a single speaker than for the second set of utterances spoken by multiple speakers, the size of the first set of codebooks is significantly reduced relative to the size of the second set of codebooks. As a result, the fixed voice prompt messages may be compressed at a lower bit rate with a relatively high quality of encoding, thereby optimizing the codebook and reducing the amount of memory capacity necessary for storing the encoded fixed voice prompts. The memory required for the encoded fixed voice prompts is so small that they can be stored in a low cost DSP ROM.
Abstract:
A base station includes a digital signal processor, the base station operable to communicate with at least first and second wireless terminals using compressed digital signals modulated onto a radio frequency carrier, the base station further operable to communicate with an external network to facilitate a call between one or more wireless terminals and another party connected to the external network. The digital signal processor is operably connected to receive compressed signals from and provide compressed signals to the first and second wireless terminals, and further operably connected to communicate uncompressed signals with the external network. The digital signal processor is programmed to execute instructions to perform the following functions: monitor a first signal, the first signal received from the first wireless terminal; monitor a second signal, the second signal received from the second wireless terminal; monitor a network signal, the network signal received from the external network; select using predetermined criteria a priority signal, the priority signal comprising one of the first signal, the second signal and the network signal; and perform either a compression or decompression process on the priority signal and then provide the processed priority signal to at least one of the first wireless terminal, second wireless terminal and external network.
Abstract:
A radio communication device may be provided. The radio communication device may include: a receiver configured to receive data; a buffer configured to buffer a variable amount of the data; a reception condition determiner configured to determine a reception condition indicating a condition under which the receiver receives the data; and a buffer amount setter configured to set the amount of the data based on the determined reception condition.
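A minimal sketch of a buffer amount setter is given below. The abstract does not specify how the reception condition maps to a buffer amount; the linear mapping, the normalized `signal_quality` input, and the frame limits here are assumptions chosen only to illustrate the idea that poorer reception might call for a deeper buffer.

```python
def set_buffer_amount(signal_quality, min_frames=2, max_frames=10):
    """Map a reception condition to the number of frames to buffer.

    signal_quality -- hypothetical normalized metric in [0, 1], where
                      1.0 is the best reception (values are clamped)
    min_frames     -- buffer depth used under the best conditions
    max_frames     -- buffer depth used under the worst conditions
    """
    q = min(1.0, max(0.0, signal_quality))
    # Linear interpolation: worse reception leads to a larger buffer.
    return round(max_frames - q * (max_frames - min_frames))
```

With the default limits, a quality of 1.0 gives a 2-frame buffer and a quality of 0.0 gives a 10-frame buffer.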
Abstract:
A multi-pulse excitation linear-predictive speech coder operates in accordance with an analysis-by-synthesis method for determining the excitation. The coder (10) comprises an LPC-analyzer (11), a multi-pulse excitation generator (13), means (12, 14) for forming an error signal representative of the difference between an original speech signal (s(n)) and a synthetic speech signal (ŝ(n)), a filter (15) for perceptually weighting the error signal, and means (16) responsive to the weighted error signal (e(n)) for generating pulse parameters controlling the excitation generator (13) so as to minimize a predetermined measure of the weighted error signal. The LPC-parameters and the pulse parameters of the excitation signal (x(n)) are encoded for efficient storage or transmission. The bit capacity required for pulse position encoding of the excitation signal (x(n)) is considerably reduced by arranging the excitation generator (13) for an excitation signal (x(n)) which in each excitation interval (L) consists of a pulse pattern having a grid of a predetermined number (q) of equidistant pulses, and by arranging the control means (16) for generating pulse parameters characterizing the grid position (k) relative to the beginning of the excitation interval (L) and the variable amplitudes (b_k(j), 1 ≤ j ≤ q) of the pulses of the grid.
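The pulse grid described in this abstract can be illustrated with a short sketch. It assumes the pulses are `spacing` samples apart and that the grid position k ranges over 0..spacing-1; the function names are hypothetical, and the search for the best k and amplitudes (the analysis-by-synthesis loop) is not shown.

```python
def build_excitation(interval_len, spacing, k, amplitudes):
    """Build x(n) for one excitation interval of interval_len samples.

    The q = len(amplitudes) pulses sit at n = k + j*spacing; the code
    uses j = 0..q-1, whereas the abstract indexes them 1..q.
    """
    if not 0 <= k < spacing:
        raise ValueError("grid position k must satisfy 0 <= k < spacing")
    x = [0.0] * interval_len
    for j, b in enumerate(amplitudes):
        n = k + j * spacing
        if n >= interval_len:
            raise ValueError("grid does not fit in the excitation interval")
        x[n] = b
    return x


def position_bits(spacing):
    """Bits needed to code the grid position alone (assumed fixed-length
    code), instead of one position per pulse."""
    return (spacing - 1).bit_length()
```

For instance, `build_excitation(8, 4, 1, [2.0, -1.0])` places pulses at samples 1 and 5, and `position_bits(4)` indicates that two bits suffice to signal the position of the whole grid.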
Abstract:
Part of the spectrum of two or more input signals is encoded using conventional coding techniques, while the rest of the spectrum is encoded using binaural cue coding (BCC). In BCC coding, spectral components of the input signals are downmixed and BCC parameters (e.g., inter-channel level and/or time differences) are generated. In a stereo implementation, after converting the left and right channels to the frequency domain, pairs of left- and right-channel spectral components are downmixed to mono. The mono components are then converted back to the time domain, along with those left- and right-channel spectral components that were not downmixed, to form hybrid stereo signals, which can then be encoded using conventional coding techniques. For playback, the encoded bitstream is decoded using conventional decoding techniques. BCC synthesis techniques may then apply the BCC parameters to synthesize an auditory scene based on the mono components as well as the unmixed stereo components.
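The stereo downmix step might be sketched as follows, working directly on lists of spectral coefficients. Several details are assumptions: the abstract does not say which part of the spectrum is downmixed (the sketch uses all bins at or above a hypothetical `cutoff` index), the downmix is taken as a plain average, and only the inter-channel level difference is computed. The transforms to and from the frequency domain are omitted.

```python
import math


def hybrid_bcc_encode(left_spec, right_spec, cutoff, eps=1e-12):
    """Downmix bins >= cutoff to mono and compute level differences.

    Returns (hybrid_left, hybrid_right, icld_db). Bins below cutoff stay
    as true stereo; bins at or above cutoff carry the same mono value in
    both hybrid channels. icld_db holds one inter-channel level
    difference (in dB) per downmixed bin.
    """
    if len(left_spec) != len(right_spec):
        raise ValueError("channels must have the same number of bins")
    pairs = list(zip(left_spec[cutoff:], right_spec[cutoff:]))
    mono = [(l + r) / 2 for l, r in pairs]
    # eps guards against log of zero for silent bins.
    icld_db = [10 * math.log10((abs(l) ** 2 + eps) / (abs(r) ** 2 + eps))
               for l, r in pairs]
    hybrid_left = list(left_spec[:cutoff]) + mono
    hybrid_right = list(right_spec[:cutoff]) + mono
    return hybrid_left, hybrid_right, icld_db
```

The two hybrid spectra would then be converted back to the time domain and passed to a conventional stereo coder, with `icld_db` carried as side information for BCC synthesis at the decoder.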