摘要:
A method, apparatus, and electronic device for voice navigation are disclosed. A voice input mechanism 310 may receive a verbal input from a user to a voice user interface program invisible to the user. A processor 104 may identify in a graphical user interface (GUI) a set of GUI items. The processor 104 may convert the set of GUI items to a set of voice searchable indices 400. The processor 104 may correlate a matching GUI item of the set of GUI items to a phonemic representation of the verbal input.
摘要:
An electronic device (300) for speech dialog includes functions that receive (305, 105) a speech phrase that comprises a request phrase that includes an instantiated variable (215), generate (335, 115) pitch and voicing characteristics (315) of the instantiated variable, and performs voice recognition (319, 125) of the instantiated variable to determine a most likely set of acoustic states (235). The electronic device may generate (335, 140) a synthesized value of the instantiated variable using the most likely set of acoustic states and the pitch and voicing characteristics of the instantiated variable. The electronic device may use a table of previously entered values of variables that have been determined to be unique, and in which the values are associated with a most likely set of acoustic states and the pitch and voicing characteristics determined at the receipt of each value to disambiguate (425, 430) a newly received instantiated variable.
摘要:
A method and apparatus is provided for identifying an input sequence entered by a user of a communication unit. The method includes the steps of providing a database containing a plurality of partial sequences from the user of the communication unit, recognizing an identity of at least some information items of the input sequence entered by the user, comparing the recognized sequence of information items with the plurality of partial sequences within the database and selecting a partial sequence of the plurality of sequences within the database with a closest relative match to the recognized sequence as the input sequence intended by the user.
摘要:
A method and apparatus for textual searching of a database is provided herein. During operation a user will input a letter into a search engine. The search engine will score words based on the letter and display results of the highest-scored words. Another letter will again be received and the process repeated. In situations where titles are returned to the user, additional steps of associating the words with a title and scoring the title take place. The highest-scored titles are provided to the user as the displayed results.
摘要:
A method and apparatus for ordering results from a query is provided herein. During operation, a spoken query is received and converted to a textual representation, such as a word lattice. Search strings are then created from the word lattice. For example a set search strings may be created from the N-grams, such as unigrams and bigrams, of the word lattice. The search strings may be ordered and truncated based on confidence values assigned to the n-grams by the speech recognition system. The set of search strings are sent to at least one search engine, and search results are obtained. The search results are then re-arranged or reordered based on a semantic similarity between the search results and the word lattice.
摘要:
A method, system and communication device for enabling voice-to-voice searching and ordered content retrieval via audio tags assigned to individual content, which tags generate uniterms that are matched against components of a voice query. The method includes storing content and tagging at least one of the content with an audio tag. The method further includes receiving a voice query to retrieve content stored on the device. When the voice query is received, the method completes a voice-to-voice search utilizing uniterms of the audio tag, scored against the phoneme latent lattice model generated by the voice query to identify matching terms within the audio tags and corresponding stored content. The retrieved content(s) associated with the identified audio tags having uniterms that score within the phoneme lattice model are outputted in an order corresponding to an order in which the uniterms are structured within the voice query.
摘要:
A method and apparatus for enabling multimodal tags in a communication device is disclosed. The method comprises receiving a first training signal and receiving a second training signal in conjunction with the first training signal. A multimodal tag is created to represent a combination of the first training signal and the second training signal and a function is associated with the created multimodal tag.
摘要:
A portable electronic communication device, designed for voice and data communication is utilized as a peripheral input device for transmitting/providing character inputs, entered in the first device's touch input mechanism, to a second electronic device. The first device has a mode switching utility that switches the first device between a first standard communication mode and a second peripheral input device mode. When the first device is in the second peripheral input device mode, the first device operates as a peripheral input device for the second device. A character input recognition utility executes on the first device to provide the functions of: detecting an input on the touch screen input mechanism; generating an electronic representation of the input; establishing a communication link between the second communication transmitter and an identified second device; and forwarding the electronic representation of the character input to the communication transmitter for transmission to the identified second device.
摘要:
A voice toolkit (100) and a method (700) for managing pronunciation dictionaries are provided. The visual toolkit can include a user-interface (110) for entering in a text and a corresponding spoken utterance, a text-to-speech system (120) for synthesizing a pronunciation from the text, a talking speech recognizer (132) for generating pronunciations of the spoken utterance, and a voice processor (130) for validating at least one pronunciation. A developer can type a text of a word into the toolkit and listen to the pronunciation to determine whether the pronunciation is acceptable. If the pronunciation is incorrect the developer can speak the word for providing a spoken utterance having a correct pronunciation.
摘要:
A communication device includes: (1) a wireless adapter at which a wireless headset is communicatively connected to the communication device and at which is received a first acoustic input that includes a speech input and a first ambient noise input; (2) a microphone that receives a second acoustic input, which includes a second ambient noise input; and (3) a dual-channel adaptive noise canceller that utilizes the second ambient noise input to filter the first ambient noise input out of the first acoustic input to generate an acoustic output that primarily comprises the speech input.