摘要:
The system retrieves information from the internet using multiple search engines that are simultaneously launched by the search engine commander. The commander is responsive to a speech-enabled system including a speech recognizer and natural language parser. The user speaks to the system in natural language requests, and the parser extracts the semantic content from the user's speech, based on a set of goal oriented grammars. The preferred system includes a fixed grammar and an updatable or downloaded grammar, allowing the system to be used without extensive training and yet capable of being customized for a particular user's purposes. Results obtained from the search engines are filtered based on information extracted from an electronic program guide and from prestored user profile data. The results may be displayed on screen or through synthesized speech.
摘要:
A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principle component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.
摘要:
Program content, recorded to a storage medium such as disk recorder, optical recorder or random access memory, is indexed by the replay file system. The file system maintains a storage location and program I.D. record for each recorded program. The file system further maintains other data obtained from an electronic program guide that may be accessed by downloading from the cable or satellite infrastructure or over the internet. The file system also may store additional user data, such as the date and time the program was last viewed, together with any user-recorded indexes. The file system may be accessed through natural language input speech. The system includes a speech recognizer and natural language parser, coupled to a dialog system that engages the user in a dialog to determine what the user is interested in accessing from the storage medium. The natural language parser operates with a task-based grammar that is keyed to the electronic program guide data and user data maintained by the file system.
摘要:
Speech input supplied by the user is evaluated by the speaker verification/identification module, and based on the evaluation, parameters are retrieved from a user profile database. These parameters adapt the speech models of the speech recognizer and also supply the natural language parser with customized dialog grammars. The user's speech is then interpreted by the speech recognizer and natural language parser to determine the meaning of the user's spoken input in order to control the television tuner. The parser works in conjunction with a command module that mediates the dialog with the user, providing on-screen prompts or synthesized speech queries to elicit further input from the user when needed. The system integrates with an electronic program guide, so that the natural language parser is made aware of what programs are available when conducting the synthetic dialog with the user. Speech can be input through either a microphone or over the telephone. In addition, the user can interact with the system using a suitable computer attached via the Internet. Regardless of the mode of access, the unified access controller interprets the semantic content of the user's request and supplies the appropriate control signals to the television tuner and/or recorder.
摘要:
Program content, recorded to a storage medium such as disk recorder, optical recorder or random access memory, is indexed by the replay file system. The file system maintains a storage location and program I.D. record for each recorded program. The file system further maintains other data obtained from an electronic program guide that may be accessed by downloading from the cable or satellite infrastructure or over the internet. The file system also may store additional user data, such as the date and time the program was last viewed, together with any user-recorded indexes. The file system may be accessed through natural language input speech. The system includes a speech recognizer and natural language parser, coupled to a dialog system that engages the user in a dialog to determine what the user is interested in accessing from the storage medium. The natural language parser operates with a task-based grammar that is keyed to the electronic program guide data and user data maintained by the file system.
摘要:
A speech understanding system (16) for receiving a spoken request from a user (12) and processing the request against a multimedia database of audio/visual (A/V) programming information (20) for automatically recording and/or retrieving an A/V program is disclosed. The system includes a database (20) of program records representing A/V programs which are available for recording. The system also includes an A/V recording device (40-42) for receiving a recording command and recording the A/\/ program. A speech recognizer (48) is provided for receiving the spoken request and translating the spoken request into a text stream having a plurality of words. A natural language processor (50) receives the text stream and processes the words for resolving a semantic content of the spoken request. The natural language processor (50) places the meaning of the words into a task frame (90) having a plurality of key word slots (92-98). A dialogue system (60) analyzes the task frame (90) for determining if a sufficient number of key word slots (92-98) have been filled and prompts the user for additional information for filling empty slots. The dialogue system (60) searches the database of program records (20) using the key words placed within the task frame (90) for selecting the A/V program and generating the recording command for use by the A/V recording device (40-42).