Abstract:
A method and apparatus provide a user interface within a phone that responds to a limited vocabulary of user-trained voice commands. The interface allows users to perform all phone handset dialing functions using voice commands. Additionally, users can create and modify entries within a voice recognition phonebook, so that a number within the phonebook can be called by saying the name associated with it. The user interface provides a combination of voice and LCD-displayed user prompts and responses to voice input. The interface responds to user voice commands and performs the command functions based upon matches to previously user-trained voice command vocabulary words stored in memory.
Abstract:
A method for selecting a code vector in an algebraic codebook wherein the analysis window for the coder is extended beyond the length of the target speech frame. By extending the analysis window, the two-dimensional impulse response matrix can be stored as a one-dimensional autocorrelation matrix, greatly reducing the computational complexity and memory required for the search.
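The storage saving described above can be illustrated with a minimal sketch. It assumes the impulse response `h` is as long as the target frame and that the correlation entry for pulse positions `i` and `j` is the sum of products of the two shifted responses over the analysis window. The function names and the window lengths are illustrative assumptions, not taken from the abstract.

```python
def autocorrelation(h):
    """Return r[k] = sum_n h[n] * h[n + k] for k = 0 .. len(h) - 1."""
    n = len(h)
    return [sum(h[i] * h[i + k] for i in range(n - k)) for k in range(n)]


def phi_windowed(h, i, j, window_len):
    """Correlation of h shifted by i and by j, summed over the analysis window.

    With window_len == len(h) the shifted responses are cut off at the frame
    end, so the value depends on both i and j (a full 2-D table is needed).
    """
    n = len(h)
    total = 0.0
    for m in range(max(i, j), window_len):
        a, b = m - i, m - j
        if a < n and b < n:  # h is zero outside 0 .. n - 1
            total += h[a] * h[b]
    return total


h = [1.0, 0.5, 0.25]              # hypothetical impulse response
r = autocorrelation(h)            # one-dimensional table
frame_len = len(h)
extended_len = 2 * len(h) - 1     # long enough for every shifted response

# Truncated window: entries on the same diagonal differ.
print(phi_windowed(h, 0, 1, frame_len), phi_windowed(h, 1, 2, frame_len))
# Extended window: every entry equals r[|i - j|].
print(phi_windowed(h, 1, 2, extended_len), r[1])
```

In this sketch the extended window makes the matrix Toeplitz, so only the `len(h)` values in `r` need to be stored instead of a full square table.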
Abstract:
A speech processing system modifies various aspects of input speech according to a user-selected one of various preprogrammed voice fonts. Initially, the speech converter receives a formants signal representing an input speech signal and a pitch signal representing the input signal's fundamental frequency. One or both of the following may also be received: a voicing signal indicating whether the input speech signal is voiced, unvoiced, or mixed, and a gain signal representing the input speech signal's energy. The speech converter also receives user selection of one of multiple preprogrammed voice fonts, each specifying a manner of modifying one or more of the received signals (i.e., formants, voicing, pitch, gain). The speech converter modifies at least one of the formants, voicing, pitch, and gain signals as specified by the selected voice font.
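As a rough illustration of what a voice font might look like in code, the sketch below models a font as a set of scale factors applied to the received parameters. The `VoiceFont` fields and the purely multiplicative modification are assumptions made for this example; the abstract does not say how a font encodes its modifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceFont:
    """Hypothetical voice font: one scale factor per modifiable signal."""
    pitch_scale: float = 1.0
    formant_scale: float = 1.0
    gain_scale: float = 1.0


def apply_font(font, pitch_hz, formants_hz, gain):
    """Return the (pitch, formants, gain) triple modified by the selected font."""
    return (
        pitch_hz * font.pitch_scale,
        [f * font.formant_scale for f in formants_hz],
        gain * font.gain_scale,
    )


# A font that only raises the pitch leaves formants and gain untouched.
higher_pitch = VoiceFont(pitch_scale=2.0)
print(apply_font(higher_pitch, 100.0, [500.0, 1500.0], 1.0))
```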
Abstract:
A method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and control words, and nametags. Speaker-independent engines are combined with speaker-dependent engines. A Hidden Markov Model (HMM) engine is combined with Dynamic Time Warping (DTW) engines.
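For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of a textbook DTW distance used to pick the closest stored template. It uses one-dimensional feature sequences and an absolute-difference local cost for brevity. It does not show how the abstract's system combines DTW with the HMM engine, and the template names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def best_template(utterance, templates):
    """Return the name of the stored template with the smallest DTW cost."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))


# Hypothetical speaker-trained nametag templates.
templates = {"home": [1, 2, 3], "office": [3, 2, 1]}
print(best_template([1, 2, 3, 3], templates))
```

Because DTW stretches or compresses the time axis, an utterance spoken more slowly than its template can still match with zero cost, which is why it suits speaker-dependent templates trained from a few repetitions.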
Abstract:
A method of instructing a user how to use a wireless communication device is disclosed and may include determining whether a feature on the wireless communication device has been utilized. When it is determined that the feature has not been utilized, the method may include notifying the user of the presence of the feature, providing information associated with the feature, and prompting the user to use the feature. The information associated with the feature is configured to teach the user how the feature may be executed, the availability of additional features, how to utilize the feature better, or a combination thereof. Further, the feature is a number saving feature, a speed dialing feature, a voice dialing feature, or a combination thereof.
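The flow described above (check usage, then notify, inform, and prompt) can be sketched as follows. The dictionary layout, the feature names as keys, and the message wording are assumptions made for illustration only.

```python
def tutorial_prompts(feature_used, help_text):
    """Build one prompt for each feature that has not been utilized yet.

    feature_used: mapping of feature name -> True if the user has used it.
    help_text:    mapping of feature name -> short explanation of the feature.
    """
    prompts = []
    for feature, used in feature_used.items():
        if used:
            continue  # nothing to teach for features already in use
        prompts.append(
            f"{feature} is available. {help_text[feature]} Would you like to try it now?"
        )
    return prompts


usage = {"speed dialing": True, "voice dialing": False}
info = {
    "speed dialing": "Hold a digit key to call a saved number.",
    "voice dialing": "Say a saved name to place a call.",
}
for line in tutorial_prompts(usage, info):
    print(line)
```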
Abstract:
A method and apparatus for enhancing coding efficiency by reducing illegal or other undesirable packet generation while encoding a signal. The probability of generating illegal or other undesirable packets while encoding a signal is reduced by first analyzing a history of the frequency of codebook values selected while quantizing speech parameters. Codebook entries are then reordered so that the index/indices that create illegal or other undesirable packets contain the least frequently used entry/entries. Reordering multiple codebooks for various parameters further reduces the probability that an illegal or other undesirable packet will be created during signal encoding. The method and apparatus may be applied to reduce the probability of generating illegal null traffic channel data packets while encoding eighth rate speech.
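A minimal sketch of the reordering step, under the assumption that the usage history is a simple per-entry count and that the problematic index positions are known in advance. The function name and the example data are hypothetical.

```python
def reorder_codebook(codebook, usage_counts, bad_indices):
    """Return a reordered copy with the least-used entries at bad_indices.

    codebook:     list of codebook entries, position = transmitted index.
    usage_counts: how often each entry was selected in the observed history.
    bad_indices:  index values that contribute to undesirable packets.
    """
    size = len(codebook)
    bad = sorted(set(bad_indices))
    by_usage = sorted(range(size), key=lambda i: usage_counts[i])  # rarest first
    rare = by_usage[:len(bad)]
    rare_set = set(rare)
    rest = [i for i in range(size) if i not in rare_set]
    free_slots = [s for s in range(size) if s not in set(bad)]

    reordered = [None] * size
    for slot, src in zip(bad, rare):
        reordered[slot] = codebook[src]   # rarely used entries take the bad slots
    for slot, src in zip(free_slots, rest):
        reordered[slot] = codebook[src]   # everything else keeps its relative order
    return reordered


codebook = ["a", "b", "c", "d"]
counts = [50, 3, 40, 7]       # hypothetical selection history
print(reorder_codebook(codebook, counts, bad_indices=[0]))
```

The encoder and decoder must share the same reordered table, since only the meaning of each index changes and not the set of entries.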
Abstract:
A system and method for preparing and sending e-mail communications using a wireless communications device are disclosed. In one embodiment, input data comprising audio, image, and/or video data is encoded and transmitted to a cellular network. An integrated e-mail processor connected to the cellular network processes the coded data into a composite e-mail message and forwards the message to a server. The server then forwards the composite e-mail message to the indicated recipient(s). In another embodiment, the wireless communications device processes the coded data into a composite e-mail message and forwards it to a server via a cellular network. In either embodiment, the server may be dedicated to the cellular network. The invention thus enables users of handheld wireless communications devices, users of other devices lacking typing keyboards, or users presently unable to use a typing keyboard to prepare and send e-mail messages.
Abstract:
It is an objective of the present invention to provide an optimized method of selecting the encoding mode that provides rate-efficient coding of the input speech. It is a second objective of the present invention to identify and provide a means for generating a set of parameters ideally suited for this operational mode selection. Third, it is an objective of the present invention to identify two separate conditions that allow low-rate coding with minimal sacrifice in quality. The two conditions are the coding of unvoiced speech and the coding of temporally masked speech. It is a fourth objective of the present invention to provide a method for dynamically adjusting the average output data rate of the speech coder with minimal impact on speech quality.
Abstract:
It is a first objective of the present invention to provide a method by which to reduce the probability of coding low energy unvoiced speech as background noise. The present invention determines an encoding rate by examining subbands of the input signal; by this method, unvoiced speech can be distinguished from background noise. A second objective of the present invention is to provide a means by which to set the threshold levels that takes into account signal energy as well as background noise energy. In the present invention, the background noise alone is not used to determine threshold values; rather, the signal to noise ratio of an input signal is used to determine the threshold values. A third objective of the present invention is to provide a method for coding music passing through a variable rate vocoder. The present invention examines the periodicity of the input signal to distinguish music from background noise.
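The subband and signal to noise ratio ideas can be sketched as follows. Everything numeric here is a hypothetical placeholder: the rate names, the 20 dB breakpoint, the threshold scale factors, and the rule of taking the highest rate suggested by any subband are assumptions for illustration, not values from the abstract.

```python
import math

RATES = ("eighth", "half", "full")


def snr_db(signal_energy, noise_energy):
    """Signal to noise ratio of one subband in decibels."""
    return 10.0 * math.log10(signal_energy / noise_energy)


def thresholds(noise_energy, snr):
    """Pick (low, high) energy thresholds from the SNR (hypothetical table)."""
    low_scale, high_scale = (2.0, 4.0) if snr < 20.0 else (4.0, 8.0)
    return noise_energy * low_scale, noise_energy * high_scale


def band_rate(frame_energy, signal_energy, noise_energy):
    """Rate index suggested by a single subband."""
    low, high = thresholds(noise_energy, snr_db(signal_energy, noise_energy))
    if frame_energy > high:
        return 2
    if frame_energy > low:
        return 1
    return 0


def select_rate(bands):
    """bands: list of (frame_energy, signal_energy, noise_energy) per subband."""
    return RATES[max(band_rate(*band) for band in bands)]


# A quiet low band plus an active high band, as unvoiced speech might produce.
print(select_rate([(1.0, 1000.0, 1.0), (5.0, 10.0, 1.0)]))
```

Examining each subband separately lets energy concentrated in one band raise the rate even when the full-band energy looks like background noise.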
Abstract:
A method and apparatus are described for controlling the data rates for communications between a base station and a plurality of remote users. The usage of the communications resource, whether the forward link resource (from base station to remote users) or the reverse link resource (from remote users to base station), is measured. The measured usage value is compared against at least one predetermined threshold value, and the data rates of communications, or of a subset of communications, on said communications resource are modified in accordance with the comparison.
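The measure, compare, and modify loop can be sketched as below. It assumes a single threshold, usage expressed as a fraction of capacity, and a fixed reduction factor. These choices, along with the function and parameter names, are illustrative assumptions.

```python
def adjust_rates(usage, threshold, rates, affected=None, factor=0.5):
    """Return new per-user data rates after comparing usage to a threshold.

    usage:     measured link usage (e.g. fraction of capacity in use).
    threshold: predetermined usage threshold.
    rates:     mapping of user id -> current data rate.
    affected:  optional subset of users whose rates may be modified.
    factor:    hypothetical scale applied when usage exceeds the threshold.
    """
    if usage <= threshold:
        return dict(rates)  # resource not overloaded, leave rates alone
    targets = set(rates) if affected is None else set(affected)
    return {
        user: (rate * factor if user in targets else rate)
        for user, rate in rates.items()
    }


current = {"user_a": 9600, "user_b": 4800}
print(adjust_rates(0.9, 0.8, current, affected=["user_a"]))
```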