Abstract:
A universal remote control adapted to receive a voice input. The voice input is received by the remote control and compared to a plurality of voice command templates that are stored in the memory of the remote control. If the voice input matches one or more of the plurality of voice command templates, a valid voice input has been received by the remote control. Valid voice input may be a remote control command or keystroke data, input as an entire word or as individual characters. In response to a valid voice input, the remote control may transmit an operational command code and/or alphanumeric symbol code corresponding to keystroke data to a consumer electronic device.
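The template comparison described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say how a voice input is represented or compared, so the feature vectors, the Euclidean distance measure, the threshold, and the command codes below are all hypothetical.

```python
import math

# Hypothetical stored templates: label -> (feature vector, code to transmit).
# The abstract does not specify the template format; short vectors are assumed.
TEMPLATES = {
    "power": ([0.9, 0.1, 0.3], 0x10),      # operational command
    "volume_up": ([0.2, 0.8, 0.5], 0x11),  # operational command
    "digit_5": ([0.4, 0.4, 0.9], 0x35),    # keystroke data (the character "5")
}

# Assumed maximum distance at which an input still counts as a match.
MATCH_THRESHOLD = 0.25


def match_voice_input(features):
    """Return (label, code) for the closest template, or None if no template is close enough."""
    best_label, best_dist = None, float("inf")
    for label, (template, _code) in TEMPLATES.items():
        dist = math.dist(features, template)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist <= MATCH_THRESHOLD:
        return best_label, TEMPLATES[best_label][1]
    return None  # not a valid voice input
```

In this sketch an input that matches no template closely enough is simply rejected, which corresponds to the case where no valid voice input has been received.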
Abstract:
A unified web-based voice messaging system provides voice application control between a web browser and an application server via a Hypertext Transfer Protocol (HTTP) connection on an Internet Protocol (IP) network. The application server executes the voice-enabled web application by runtime execution of extensible markup language (XML) documents that define the voice-enabled web application to be executed. Each voice application operation can be defined as any one of a user interface operation, a logic operation, or a function operation. Each XML document includes XML tags that specify the user interface operation, the logic operation and/or the function operation to be performed within a corresponding voice application operation, the XML tags being based on prescribed rule sets that specify the executable functions to be performed by the application runtime environment.
Abstract:
The invention provides an information recording medium, such as an optical disk, that has a large capacity and supports high-speed read/write operations. The recording medium includes an audio stream prepared for after-recording data and, as management information, audio attribute information containing bit rate information for the recorded audio stream. A recorder according to the invention has a check unit that checks in advance, with reference to the bit rate information in the audio attribute information, whether the recorder can perform an after-recording operation on the audio stream to be after-recorded.
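As a rough illustration of the advance check, the sketch below compares the stream's recorded bit rate against the highest bit rate a recorder can encode. The abstract does not state the actual criterion, so this comparison, the `AudioAttributes` fields, and the `recorder_max_kbps` parameter are assumptions.

```python
from dataclasses import dataclass


@dataclass
class AudioAttributes:
    """Hypothetical management information for one audio stream."""
    bit_rate_kbps: int                  # bit rate information for the stream
    reserved_for_after_recording: bool  # stream was prepared for after-recording


def can_after_record(attrs: AudioAttributes, recorder_max_kbps: int) -> bool:
    """Assumed rule: the stream must be reserved and within the recorder's encoding capability."""
    return attrs.reserved_for_after_recording and attrs.bit_rate_kbps <= recorder_max_kbps
```

A recorder would run such a check before starting, so that an operation it cannot complete is refused up front rather than failing partway through.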
Abstract:
A device is provided that generates the gestures and expressions of a human image on a computer without a great expenditure of labor. The words of the system response to user input and the state of the dialogue are described in a dialogue flow memory unit. A dialogue flow analysis unit analyzes the spoken text of the flow and extracts the key words associated with a movement pattern by referring to a text movement association table, and a movement expression generation unit generates the movements corresponding to the movement pattern. In the generation of the movement, movement patterns determined in advance are selected according to the state of the dialogue written in the dialogue flow, and the movement pattern is determined or modified by the key words. In addition, a text output control unit displays the words by switching between a “conversation balloon” display and a “message board” display according to the state of the dialogue written in the dialogue flow.
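The selection logic described above can be sketched as two lookups: a default movement chosen by dialogue state, then an override by any key word found in the spoken text. The table contents, state names, and movement names are invented for illustration and do not come from the abstract.

```python
# Hypothetical text movement association table: key word -> movement pattern.
TEXT_MOVEMENT_TABLE = {"hello": "bow", "sorry": "lower_head", "yes": "nod"}

# Hypothetical movement patterns determined in advance for each dialogue state.
STATE_DEFAULTS = {"greeting": "wave", "waiting": "idle", "error": "tilt_head"}


def select_movement(state, spoken_text):
    """Pick the state's default movement, then let the first key word found override it."""
    movement = STATE_DEFAULTS.get(state, "idle")
    for word in spoken_text.lower().split():
        key = word.strip(".,!?")  # drop surrounding punctuation
        if key in TEXT_MOVEMENT_TABLE:
            movement = TEXT_MOVEMENT_TABLE[key]
            break
    return movement
```

For example, `select_movement("waiting", "Please hold on.")` falls back to the state default because the text contains no key word.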
Abstract:
The operation mode control section sets an operation mode flag that authorizes the loudspeaker to replay the speech input through the microphone while the speech is being compressed and encoded into speech data by the compression/encoding section and then further processed by the encoding processing section. When an order is issued by the user to confirm the input speech by means of the replay operation section during the encoding operation, the speech output control section receives a permit signal for reproducing the speech through the loudspeaker after expanding the speech data by means of the speech data expansion processing section. The encoding operation proceeds concurrently during the speech reproducing operation.
Abstract:
In one embodiment of the method and apparatus for managing multiple speech applications, a common development platform and a common environment are provided. The common environment interfaces with the speech applications, receives information from an application information storage and a plurality of speech input sources, allows the speech applications to execute simultaneously and transitions from one said speech application to another seamlessly. In addition, the speech applications are developed based on the common development platform. Thus, application developers may utilize the common development platform to design and implement the speech applications independently.
Abstract:
A unified web-based voice messaging system provides voice application control between a web browser and an application server via a Hypertext Transfer Protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an HTML page from the application server having an XML element that defines data for an audio operation to be performed by an executable audio resource. The application server executes the voice-enabled web application by runtime execution of extensible markup language (XML) documents that define the voice-enabled web application to be executed. The application server, in response to receiving a user request from a user, accesses a selected XML page that defines at least a part of the voice application to be executed for the user. The application server then parses the XML page and executes the operation described by the XML page.
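The parse-then-execute step can be sketched with Python's standard `xml.etree.ElementTree` parser. The tag names (`voice-op`, `ui`, `logic`, `function`) and their attributes are hypothetical; the abstract does not give the actual XML vocabulary, and this sketch only records the actions it would perform rather than performing them.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML page defining one voice application operation.
PAGE = """
<voice-op name="main_menu">
  <ui prompt="Press 1 for messages."/>
  <logic input="1" next="read_messages"/>
  <function call="log_access"/>
</voice-op>
"""


def execute_page(xml_text):
    """Parse the page and return the actions its tags describe, in document order."""
    root = ET.fromstring(xml_text)
    actions = []
    for element in root:
        if element.tag == "ui":            # user interface operation
            actions.append(("play_prompt", element.get("prompt")))
        elif element.tag == "logic":       # logic operation
            actions.append(("branch", element.get("input"), element.get("next")))
        elif element.tag == "function":    # function operation
            actions.append(("call", element.get("call")))
        else:
            raise ValueError(f"unknown tag: {element.tag}")
    return actions
```

Rejecting unknown tags mirrors the idea that the tags follow prescribed rule sets; a real runtime might instead ignore or log them.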
Abstract:
An information processing apparatus includes an image-sensing controller that controls image sensing so as to take a picture upon detection of execution of a first operation, a word generator that recognizes speech upon detection of execution of a second operation and generates a word or phrase corresponding to the recognized speech, and a portion that associates the word or phrase with the picture. Accordingly, a generated word or phrase can be easily associated with an image-sensed still picture.
Abstract:
A system and process for voice-controlled information retrieval. A conversation template is executed. The conversation template includes a script of tagged instructions including voice prompts and information content. A voice command identifying information content to be retrieved is processed. A remote method invocation is sent requesting the identified information content to an applet process associated with a Web browser. The information content is retrieved on the Web browser responsive to the remote method invocation.
Abstract:
A voice-interactive docking station is provided for use with a portable computing device. The portable computing device includes at least one information management application and a corresponding database for storing the data associated with the information management application. The docking station generally includes a speech input device for receiving speech input, a speech recognizer for translating the speech input into voice command data, and an interface application for interacting with the applications residing on the portable computing device. In particular, the interface application, in response to voice command data, accesses the data associated with the information management application residing on the portable computing device. The docking station may further include a text-to-speech synthesizer for converting output data from the interface application into speech output data, and an audio system for generating audio output from the speech output data.