摘要:
Clients connecting to a VoiceXML browser obtain a control channel. Using this channel, clients may initialize a new VoiceXML session or attach to an existing VoiceXML session. The client after obtaining a session may perform a range of actions including controlling and monitoring actions.
摘要:
Systems and methods for building speech-based applications using reusable dialog components based on VoiceXML (Voice eXtensible Markup Language). VoiceXML reusable dialog components can be used for building a voice interface for use with multi-modal, multi-channel and conversational applications that offer universal access to information anytime, from any location, using any pervasive computing device regardless of its I/O modality. In one embodiment, a framework for reusable dialog components built within the VoiceXML specifications is based on the tag and ECMAScript parameter objects to pass parameters, configuration and results. This solution is interpreted at the client side (VoiceXML browser). In another embodiment, a framework for reusable dialog components is based on JSP (Java Server Pages) and beans that generate VoiceXML subdialogs. This solution can be evaluated at the server side. These frameworks can be mixed and matched depending on the application.
摘要:
Systems and methods for multi-modal messaging that enable a user to compose, send and retrieve messages, such as SMS, MMS, IM or ordinary e-mail messages, for example, using one or more I/O (input/output) modalities (e.g., speech I/O and/or GUI I/O). A method for composing messages combines the advantages of a multi-modal interface (e.g., grammar-based speech and touchscreen or similar input devices) and message templates, which allows a user to construct a message with significantly less effort in a fraction of the time required by conventional methods. The user can dictate his/her messages using speech and/or GUI input, for example, based on a library of message templates which can be personalized by the user to fit his/her social interaction needs.
摘要:
Systems and methods for multi-modal messaging that enable a user to compose, send and retrieve messages, such as SMS, MMS, IM or ordinary e-mail messages, for example, using one or more I/O (input/output) modalities (e.g., speech I/O and/or GUI I/O). A method for composing messages combines the advantages of a multi-modal interface (e.g., grammar-based speech and touchscreen or similar input devices) and message templates, which allows a user to construct a message with significantly less effort in a fraction of the time required by conventional methods. The user can dictate his/her messages using speech and/or GUI input, for example, based on a library of message templates which can be personalized by the user to fit his/her social interaction needs.
摘要:
Systems and methods for multi-modal messaging that enable a user to compose, send and retrieve messages, such as SMS, MMS, IM or ordinary e-mail messages, for example, using one or more I/O (input/output) modalities (e.g., speech I/O and/or GUI I/O). A method for composing messages combines the advantages of a multi-modal interface (e.g., grammar-based speech and touchscreen or similar input devices) and message templates, which allows a user to construct a message with significantly less effort in a fraction of the time required by conventional methods. The user can dictate his/her messages using speech and/or GUI input, for example, based on a library of message templates which can be personalized by the user to fit his/her social interaction needs.
摘要:
Systems and methods for building multi-modal browsers applications and, in particular, to systems and methods for building modular multi-modal browsers using a DOM (Document Object Model) and MVC (Model-View-Controller) framework that enables a user to interact in parallel with the same information via a multiplicity of channels, devices, and/or user interfaces, while presenting a unified, synchronized view of such information across the various channels, devices and/or user interfaces supported by the multi-modal browser. The use of a DOM framework (or specifications similar to DOM) allows existing browsers to be extended without modification of the underling browser code. A multi-modal browser framework is modular and flexible to allow various fat client and thin (distributed) client approaches.
摘要:
A textual message processing system and method are described for use in a mobile environment. A user messaging application processes at least one user textual message during a user messaging session. A semantic annotation module identifies one or more semantically salient terms in the user textual message, and annotates the user textual message with annotation terms having a low semantic distance to the semantically salient terms. A user message history stores the annotated textual messages. The semantic annotation module may further annotate the user textual message with situational meta-data characterizing the user textual message. There may be a message search module for using one or more keywords to search the user message history including the annotation terms, and identifying as a search match any annotated textual messages within a semantic distance threshold of the one or more keywords.
摘要:
Techniques for presenting data input as a plurality of data chunks including a first data chunk and a second data chunk. The techniques include converting the plurality of data chunks to a textual representation comprising a plurality of text chunks including a first text chunk corresponding to the first data chunk and a second text chunk corresponding to the second data chunk, respectively, and providing a presentation of at least part of the textual representation such that the first text chunk is presented differently than the second text chunk to, when presented, assist a user in proofing the textual representation.
摘要:
Techniques for disambiguating at least one text segment from at least one acoustically similar word and/or phrase. The techniques include identifying at least one text segment, in a textual representation having a plurality of text segments, having at least one acoustically similar word and/or phrase which has a different spelling, annotating the textual representation with disambiguating information to help disambiguate the at least one text segment from the at least one acoustically similar word and/or phrase, and synthesizing a speech signal, at least in part, by performing text-to-speech synthesis on at least a portion of the textual representation that includes the at least one text segment, wherein the speech signal includes speech corresponding to the disambiguating information located proximate the portion of the speech signal corresponding to the at least one text segment.
摘要:
Techniques for disambiguating at least one text segment from at least one acoustically similar word and/or phrase. The techniques include identifying at least one text segment, in a textual representation having a plurality of text segments, having at least one acoustically similar word and/or phrase, annotating the textual representation with disambiguating information to help disambiguate the at least one text segment from the at least one acoustically similar word and/or phrase, and synthesizing a speech signal, at least in part, by performing text-to-speech synthesis on at least a portion of the textual representation that includes the at least one text segment, wherein the speech signal includes speech corresponding to the disambiguating information located proximate the portion of the speech signal corresponding to the at least one text segment.