Abstract:
The present invention relates to a method and apparatus that make the content of the audio easy to understand during special reproduction. An MPEG demultiplexing circuit separates digital data read from the optical disk into audio data and video data; a sound recognition text conversion circuit converts the audio data decoded by the MPEG audio decoder into text data by sound recognition; and an on-screen character processor generates video signals in which the characters representing the text data are superimposed on the reproduced images. During special reproduction, such as double-speed reproduction, the characters representing the text data are superimposed on the specially reproduced images.
Abstract:
A method is provided for developing a computer-based dialogue interface for an automated or computerized system using input device technology. The dialogue interface is disposed between the automated system and an end user, with the dialogue interface receiving input from the end user and providing output to the end user in response to the input. In an illustrative embodiment, the method comprises the following steps. One or more system designers define a plurality of requirements applicable to the dialogue interface. The dialogue interface is then designed to meet these requirements. The automated system is simulated with at least a first person, and the end user is simulated with at least a second person. An interaction between the first and second persons is facilitated through the dialogue interface, and the dialogue interface is evaluated based on that interaction. The dialogue interface is then refined based on the evaluation. After these steps are performed, the automated system is developed based upon the dialogue interface.
Abstract:
The signal processing method includes the steps of: wavelet-transforming an input signal in a computer; and extracting features of the signal by Mellin-transforming the output of the wavelet transform step in synchrony with the input signal in a computer.
Abstract:
A method and means for controlling the environment of disabled individuals by voice, including the operation of lights or any number of appliances and of a personal computer in which the keyboard and the mouse are separately controlled by voice commands, without interfering with normal applications (including dictation programs) running on the computer. In effect, the voice control provides mouse and keyboard commands in parallel with normal mouse and keyboard commands.
Abstract:
A speech recognition system includes a user interface configured to provide signals indicative of a user's speech. A speech recognizer of the system includes a processor configured to use the signals from the user interface to perform speech recognition operations to attempt to recognize speech indicated by the signals. A control mechanism is coupled to the speech recognizer and is configured to affect processor usage for speech recognition operations in accordance with a loading of the processor.
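One way such a control mechanism might be pictured is as a function that maps the current processor load to a search budget for the recognizer. The sketch below is only an illustration: the function name `choose_beam_width`, the use of a beam width as the controlled quantity, and the default limits are all assumptions, not details taken from the abstract.

```python
def choose_beam_width(cpu_load: float, max_beam: int = 16, min_beam: int = 2) -> int:
    """Return a narrower (cheaper) beam as the processor load rises.

    cpu_load is a fraction in [0, 1]; values outside that range are clamped.
    The beam width is a hypothetical stand-in for "processor usage".
    """
    load = min(max(cpu_load, 0.0), 1.0)
    # Linear interpolation from max_beam (idle) down to min_beam (fully loaded).
    width = max_beam - load * (max_beam - min_beam)
    return max(min_beam, round(width))
```

An idle processor gets the widest search and a saturated one the narrowest; any other monotone mapping would serve the same illustrative purpose.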
Abstract:
According to the present invention, network devices that can be controlled via a speech unit included in the network can each send a device-document describing its functionality and its speech interface to said speech unit. The speech unit combines those documents into a general document that forms the basis for translating recognized user-commands into user-network-commands to control the connected network-devices. A device-document comprises at least the vocabulary and the commands associated therewith for the corresponding device. Furthermore, pronunciation, grammar for word sequences, and rules for speech understanding and dialog can be contained in such documents, as well as the same information for multiple languages or information for dynamic dialogs in speech understanding. One device may contain several documents and dynamically send them to the speech unit when they are needed. Furthermore, the present invention enables a device to change its functionality dynamically based on changing content, since a network device sends its specifications regarding its speech capabilities to the speech unit while the speech unit is in use.
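The combination of device-documents into a general document can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `DeviceDocument` class, the phrase-to-command dictionary layout, and the command strings are hypothetical, and only the vocabulary and associated commands named in the abstract are modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DeviceDocument:
    """Hypothetical device-document: vocabulary mapped to device commands."""
    device_id: str
    commands: Dict[str, str]  # spoken phrase -> command for this device


def merge_documents(docs: List[DeviceDocument]) -> Dict[str, List[Tuple[str, str]]]:
    """Combine device-documents into one general document keyed by phrase."""
    general: Dict[str, List[Tuple[str, str]]] = {}
    for doc in docs:
        for phrase, command in doc.commands.items():
            general.setdefault(phrase.lower(), []).append((doc.device_id, command))
    return general


def translate(general: Dict[str, List[Tuple[str, str]]], recognized: str) -> List[Tuple[str, str]]:
    """Translate a recognized user-command into (device, command) pairs."""
    return general.get(recognized.lower(), [])
```

A phrase shared by several devices yields several (device, command) pairs, which is one simple way to represent an ambiguity that a dialog could later resolve.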
Abstract:
A method for overlapping stored audio elements in a system for providing a customized radio broadcast. The method includes the steps of dividing a first audio element into a plurality of audio element components; selecting one of said audio element components; decompressing the selected audio element component; selecting a second audio element; decompressing the second audio element; mixing the decompressed audio element component with the decompressed second audio element to form a mixed audio element component; and compressing the mixed audio element component to form a compressed overlapping audio element component. The compressed overlapping audio element component may replace the selected audio element component. The first audio element may be a song, while the second audio element may be a DJ introduction. Accordingly, the compressed overlapping audio element component may be broadcast followed by the remaining components of the song audio element.
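The divide, decompress, mix, and recompress steps can be sketched as below. This is an illustration only: `zlib` stands in for the unspecified audio codec, samples are assumed to be signed 16-bit integers, and the helper names and the clipping mix are hypothetical choices rather than anything stated in the abstract.

```python
import zlib
from array import array
from typing import List


def compress(samples: List[int]) -> bytes:
    """Stand-in codec (assumption): pack 16-bit samples and deflate them."""
    return zlib.compress(array("h", samples).tobytes())


def decompress(blob: bytes) -> List[int]:
    out = array("h")
    out.frombytes(zlib.decompress(blob))
    return list(out)


def split_element(samples: List[int], size: int) -> List[bytes]:
    """Divide an audio element into separately compressed components."""
    return [compress(samples[i:i + size]) for i in range(0, len(samples), size)]


def overlap(component: bytes, second_element: bytes) -> bytes:
    """Decompress both inputs, mix them with clipping, and recompress."""
    a, b = decompress(component), decompress(second_element)
    n = max(len(a), len(b))
    a += [0] * (n - len(a))  # pad the shorter signal with silence
    b += [0] * (n - len(b))
    mixed = [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]
    return compress(mixed)
```

Only the selected component is decoded and re-encoded; the remaining components of the first element stay compressed and can follow the overlapped component unchanged.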
Abstract:
A method and apparatus for adjusting a dialect for an oral presentation provided by an agent of an organization to a human target of the organization through a communications network. The method includes the steps of determining a dialect to be used by the agent for communicating with the target, modifying the dialect of the oral presentation of the agent for communicating with the target based upon the determined dialect and presenting the modified oral presentation to the target.
Abstract:
A method, system and product for modifying the dynamic range of an encoded audio signal. The method includes receiving the encoded audio signal, the encoded audio signal having a first set of scale factors associated with a first dynamic range, and identifying a playback destination for the encoded audio signal, the playback destination having a second dynamic range. The method also includes mapping the first set of scale factors to a second set of scale factors associated with the second dynamic range, and replacing the first set of scale factors in the encoded audio signal with the second set of scale factors to create a modified encoded audio signal for decoding and reassembly at the playback destination. The system includes control logic for performing the method. The product includes a storage medium having computer readable programmed instructions for performing the method.
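The mapping and replacement of scale factors might look like the following sketch. The abstract does not say how the mapping is computed, so the linear rescaling here is an assumption, as are the function names and the dictionary used to represent an encoded frame.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]


def map_scale_factors(factors: List[int], src: Range, dst: Range) -> List[int]:
    """Linearly rescale scale factors from the source range to the destination range."""
    (s_lo, s_hi), (d_lo, d_hi) = src, dst
    return [round(d_lo + (f - s_lo) * (d_hi - d_lo) / (s_hi - s_lo)) for f in factors]


def remap_frame(frame: Dict, src: Range, dst: Range) -> Dict:
    """Return a copy of the frame with only its scale factors replaced.

    The encoded payload is left untouched, so no decode/re-encode is needed.
    """
    new_factors = map_scale_factors(frame["scale_factors"], src, dst)
    return {**frame, "scale_factors": new_factors}
```

The point of the illustration is that only the scale factors change; the rest of the encoded signal passes through as-is for decoding at the playback destination.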
Abstract:
A spoken language interface between a user and at least one application or system includes a dialog manager operatively coupled to the application or system, an audio input system, an audio output system, a speech decoding engine and a speech synthesizing engine; and at least one user interface data set operatively coupled to the dialog manager, the user interface data set representing spoken language interface elements and data recognizable by the application. The dialog manager enables connection between the audio input system and the speech decoding engine such that a spoken utterance provided by the user is provided from the audio input system to the speech decoding engine. The speech decoding engine decodes the spoken utterance to generate a decoded output, which is returned to the dialog manager. The dialog manager uses the decoded output to search the user interface data set for a corresponding spoken language interface element and data, which are returned to the dialog manager when found, and provides the data associated with the spoken language interface element to the application for processing in accordance therewith. The application, on processing that element, provides a reference to an interface element to be spoken. The dialog manager enables connection between the audio output system and the speech synthesizing engine such that the speech synthesizing engine, accepting data from that element, generates a synthesized output that expresses that element, the audio output system audibly presenting the synthesized output to the user.
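The flow through the dialog manager can be sketched as follows. Everything concrete here is an assumption made for illustration: the `DialogManager` class, the `phrase` and `data` fields of the user interface data set, and the plain callables standing in for the decoding engine, the synthesizing engine, and the application.

```python
from typing import Callable, Dict, Optional


class DialogManager:
    """Hypothetical dialog manager wiring the engines to the application."""

    def __init__(
        self,
        ui_data: Dict[str, Dict[str, str]],   # element name -> {"phrase", "data"}
        decode: Callable[[bytes], str],        # stand-in speech decoding engine
        synthesize: Callable[[str], str],      # stand-in speech synthesizing engine
        application: Callable[[str], str],     # returns a reference to an element
    ) -> None:
        self.ui_data = ui_data
        self.decode = decode
        self.synthesize = synthesize
        self.application = application

    def handle_utterance(self, audio: bytes) -> Optional[str]:
        decoded = self.decode(audio)
        # Search the data set for the element matching the decoded output.
        element = next(
            (e for e in self.ui_data.values()
             if e.get("phrase") and e["phrase"] == decoded),
            None,
        )
        if element is None:
            return None
        # The application processes the data and names the element to be spoken.
        reply_ref = self.application(element["data"])
        reply = self.ui_data.get(reply_ref)
        if reply is None:
            return None
        return self.synthesize(reply["data"])
```

Keeping the phrases and prompts in the data set, rather than in the application, is what lets the application deal only in data and element references.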