Abstract:
A spoken language interface between a user and at least one application or system includes a dialog manager operatively coupled to the application or system, an audio input system, an audio output system, a speech decoding engine, and a speech synthesizing engine, and at least one user interface data set operatively coupled to the dialog manager, the user interface data set representing spoken language interface elements and data recognizable by the application. The dialog manager enables connection between the audio input system and the speech decoding engine such that a spoken utterance provided by the user is passed from the audio input system to the speech decoding engine. The speech decoding engine decodes the spoken utterance to generate a decoded output, which is returned to the dialog manager. The dialog manager uses the decoded output to search the user interface data set for a corresponding spoken language interface element and data, which are returned to the dialog manager when found, and provides the data associated with the spoken language interface element to the application for processing in accordance therewith. The application, on processing that element, provides a reference to an interface element to be spoken. The dialog manager enables connection between the audio output system and the speech synthesizing engine such that the speech synthesizing engine, accepting data from that element, generates a synthesized output that expresses that element, the audio output system audibly presenting the synthesized output to the user.
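The round trip described in this abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `DialogManager` class, the `InterfaceElement` record, the split of the user interface data set into `commands` and `prompts`, and the stand-in decoder, synthesizer, and VCR application, which only imitate real engines with string operations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class InterfaceElement:
    phrase: str  # spoken form of the element
    data: str    # data the application recognizes (may be empty)


class DialogManager:
    """Hypothetical dialog manager wiring the engines to one application."""

    def __init__(self,
                 decode: Callable[[bytes], str],
                 synthesize: Callable[[str], bytes],
                 application: Callable[[str], str],
                 commands: Dict[str, InterfaceElement],
                 prompts: Dict[str, InterfaceElement]) -> None:
        self.decode = decode
        self.synthesize = synthesize
        self.application = application
        self.commands = commands  # elements the user may speak
        self.prompts = prompts    # elements the system may speak

    def handle_utterance(self, audio: bytes) -> Optional[bytes]:
        decoded = self.decode(audio)             # audio input -> decoding engine
        element = self.commands.get(decoded)     # search the data set
        if element is None:
            return None                          # nothing matched
        reference = self.application(element.data)  # application processes the data
        reply = self.prompts[reference]          # element to be spoken
        return self.synthesize(reply.phrase)     # synthesizing engine -> audio output


# Stand-ins for real engines and a real application (assumptions for illustration).
def fake_decode(audio: bytes) -> str:
    return audio.decode("utf-8").strip().lower()


def fake_synthesize(text: str) -> bytes:
    return ("<speech>" + text).encode("utf-8")


def vcr_application(data: str) -> str:
    return "ack_play" if data == "CMD_PLAY" else "ack_unknown"


manager = DialogManager(
    fake_decode,
    fake_synthesize,
    vcr_application,
    commands={"play tape": InterfaceElement("play tape", "CMD_PLAY")},
    prompts={
        "ack_play": InterfaceElement("Playing the tape", ""),
        "ack_unknown": InterfaceElement("Sorry, I cannot do that", ""),
    },
)
```

In this sketch the application never sees audio or text to be spoken; it receives only its own data and returns a reference, which mirrors the separation of roles the abstract describes.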
Abstract:
A method is provided for managing spoken language interface data structures and collections of user interface service engines in a spoken language dialog manager in a personal speech assistant. By this method, interfaces designed as part of applications may be added to or removed from the set of interfaces used by a dialog manager. Interface service engines that are required by new applications but not already present in the dialog manager may be made available to the new and subsequently added applications.
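One way to picture this bookkeeping is a small registry, sketched below in Python. The class `InterfaceRegistry`, its method names, and the use of factory callables to create engines are assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import Any, Callable, Dict, Optional


class InterfaceRegistry:
    """Hypothetical store of application interfaces and shared service engines."""

    def __init__(self) -> None:
        self.interfaces: Dict[str, Any] = {}  # application name -> interface data
        self.engines: Dict[str, Any] = {}     # engine name -> engine instance

    def add_interface(self,
                      app_name: str,
                      interface_data: Any,
                      engine_factories: Dict[str, Callable[[], Any]]) -> None:
        # Create only the engines that are not already present, so later
        # applications reuse engines brought in by earlier ones.
        for engine_name, factory in engine_factories.items():
            if engine_name not in self.engines:
                self.engines[engine_name] = factory()
        self.interfaces[app_name] = interface_data

    def remove_interface(self, app_name: str) -> Optional[Any]:
        # Engines are kept in this sketch so that subsequently added
        # applications can still use them.
        return self.interfaces.pop(app_name, None)
```

Keeping engines after an interface is removed is a design choice of this sketch; a real dialog manager might instead reference-count engines and unload unused ones.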
Abstract:
A Personal Speech Assistant (PSA) is a computing apparatus which provides a spoken language interface to another apparatus to which it is attached by supporting execution of a conversational dialog manager and its supporting service engines. In operation, a PSA is connected to a device which provides some service to a user. Any “appliance” is a candidate for enhancement with the PSA. Devices such as, for example, video cassette recorders (VCRs) or Personal Digital Assistants (PDAs), which offer rich, but frequently difficult interfaces, may be made more useful by the integration of a PSA according to the invention. It is a preferred feature of a dialog manager used by the PSA that the user interface properties, in terms of the vocabulary the device understands, the informative prompts it provides, and other aspects of its conversational behavior, are all easily modified to correspond to the preferences or limitations of the user.
Abstract:
Techniques are disclosed for overcoming errors in speech recognition systems. For example, a technique for processing acoustic data in accordance with a speech recognition system comprises the following steps/operations. Acoustic data is obtained in association with the speech recognition system. The acoustic data is recorded using a combination of a first buffer area and a second buffer area, such that the recording of the acoustic data using the combination of the two buffer areas at least substantially minimizes one or more truncation errors associated with operation of the speech recognition system.
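The abstract does not say how the two buffer areas cooperate. The Python sketch below shows one plausible arrangement, stated as an assumption: the first buffer is a small ring buffer that always holds the most recent frames, and the second collects frames once recording is switched on, so that speech beginning slightly before the switch is not cut off. The class `TwoBufferRecorder` and its methods are hypothetical names.

```python
from collections import deque
from typing import Deque, List


class TwoBufferRecorder:
    """Hypothetical recorder combining a pre-roll ring buffer and a main buffer."""

    def __init__(self, preroll_frames: int) -> None:
        # First buffer area: keeps only the newest `preroll_frames` frames.
        self.preroll: Deque[int] = deque(maxlen=preroll_frames)
        # Second buffer area: frames captured while recording is active.
        self.main: List[int] = []
        self.active = False

    def push(self, frame: int) -> None:
        """Feed one audio frame; it goes to whichever buffer is in use."""
        if self.active:
            self.main.append(frame)
        else:
            self.preroll.append(frame)

    def start(self) -> None:
        """Recording switched on (for example, a microphone button press)."""
        self.active = True

    def stop(self) -> List[int]:
        """End recording and return pre-roll plus main frames in order."""
        self.active = False
        utterance = list(self.preroll) + self.main
        self.preroll.clear()
        self.main = []
        return utterance
```

With a pre-roll of two frames, pushing frames 1, 2, 3, then calling `start()`, pushing 4 and 5, and calling `stop()` yields `[2, 3, 4, 5]`: the two frames just before the switch are retained.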
Abstract:
Video is scaled using area-weighted averaging of input pixels to calculate coefficients that are multiplied with the luminance and chrominance of the input pixels. Such coefficients are produced for both the vertical and horizontal scaling directions of the input video stream. When scaling down or scaling up, scaling is first performed in the vertical direction to produce partially scaled pixels, which are then used for scaling in the horizontal direction. When scaling up, a pre-interpolation or pre-replication process doubles the input pixel grid; the doubled grid is then scaled down to the desired pixel grid size, which is greater than the original input pixel grid size.
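The scaling scheme can be sketched in Python for a single image component; the same coefficients would be applied to each luminance and chrominance plane. The function names are hypothetical, pixel replication is used for the doubling step, and the sketch assumes a scale-up target of at most twice the input size in each direction.

```python
from typing import List, Tuple

Image = List[List[float]]


def area_weights(n_in: int, n_out: int) -> List[List[Tuple[int, float]]]:
    """For each output pixel, list (input index, coefficient) pairs.

    Spans are measured in units of 1/n_out of an input pixel so the
    overlaps are exact integers; coefficients for one output sum to 1.
    """
    taps_per_output = []
    for o in range(n_out):
        start, end = o * n_in, (o + 1) * n_in  # span of output pixel o
        taps = []
        for i in range(start // n_out, n_in):
            lo = max(start, i * n_out)
            hi = min(end, (i + 1) * n_out)
            if hi <= lo:
                break  # input pixel i lies past this output pixel
            taps.append((i, (hi - lo) / n_in))
        taps_per_output.append(taps)
    return taps_per_output


def scale_1d(values: List[float], n_out: int) -> List[float]:
    """Area-weighted resampling of one row or column."""
    return [sum(values[i] * w for i, w in taps)
            for taps in area_weights(len(values), n_out)]


def scale_down(image: Image, out_h: int, out_w: int) -> Image:
    """Scale vertically first, then horizontally."""
    in_w = len(image[0])
    columns = [scale_1d([row[x] for row in image], out_h) for x in range(in_w)]
    partial = [[columns[x][y] for x in range(in_w)] for y in range(out_h)]
    return [scale_1d(row, out_w) for row in partial]


def replicate_double(image: Image) -> Image:
    """Pre-replication: repeat every pixel twice in both directions."""
    doubled = []
    for row in image:
        wide = [p for p in row for _ in range(2)]
        doubled.append(wide)
        doubled.append(list(wide))
    return doubled


def scale_up(image: Image, out_h: int, out_w: int) -> Image:
    """Double the grid, then scale down to the (larger) target size."""
    return scale_down(replicate_double(image), out_h, out_w)
```

For example, `scale_1d([10, 20, 30, 40], 2)` averages adjacent pairs, and scaling three input pixels to two outputs gives each output a two-thirds and a one-third coefficient.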