Abstract:
A system that incorporates teachings of the present disclosure may include, for example, an avatar engine having a controller to retrieve a user profile of a user, present the user an avatar having characteristics that correlate to the user profile, detect one or more responses of the user during a communication exchange between the avatar and the user, identify from the one or more responses a need to engage in an e-commerce transaction, engage in a commercial exchange with a merchant system according to the e-commerce transaction, identify a commercial status of the e-commerce transaction from the commercial exchange with the merchant system, and present the user by way of the avatar the commercial status of the e-commerce transaction. Other embodiments are disclosed.
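The controller steps listed in this abstract can be pictured as a short pipeline. The following is a minimal sketch only, not the disclosed implementation: `UserProfile`, `Avatar`, `detect_purchase_need`, `run_exchange`, and the `merchant` callable are all hypothetical names, and the keyword rule that decides whether the user wants to buy something is a toy stand-in for whatever response analysis an actual system would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class UserProfile:
    # Hypothetical profile fields; the abstract does not specify them.
    user_id: str
    preferred_voice: str


@dataclass
class Avatar:
    # The avatar's characteristics are reduced to a single "voice" trait here.
    voice: str

    def say(self, message: str) -> str:
        return f"[{self.voice}] {message}"


def detect_purchase_need(responses: List[str]) -> Optional[str]:
    """Return the item the user asked to buy, or None (toy keyword rule)."""
    for response in responses:
        text = response.lower().strip()
        if text.startswith("buy "):
            return text[len("buy "):].strip()
    return None


def run_exchange(
    profile: UserProfile,
    responses: List[str],
    merchant: Callable[[str], str],
) -> str:
    """Present an avatar, detect a purchase need, and report its status."""
    avatar = Avatar(voice=profile.preferred_voice)  # correlate to the profile
    item = detect_purchase_need(responses)          # identify the need
    if item is None:
        return avatar.say("No purchase requested.")
    status = merchant(item)                         # commercial exchange
    return avatar.say(f"Order for {item}: {status}")


if __name__ == "__main__":
    profile = UserProfile(user_id="u1", preferred_voice="calm")
    print(run_exchange(profile, ["hello", "Buy movie tickets"],
                       lambda item: "confirmed"))
```

Passing the merchant system in as a callable keeps the sketch independent of any particular e-commerce interface, which the abstract leaves open.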
Abstract:
A system that incorporates teachings of the present disclosure may include, for example, an Internet Protocol Television (IPTV) system having a controller to retrieve a user profile associated with a user of the IPTV system, cause a set-top box (STB) operating in the IPTV system to present an avatar having characteristics that correlate to the user profile, receive from the STB one or more responses of the user, wherein the one or more responses are collected by the STB from a communication exchange between the avatar and the user, identify from the one or more responses a need to communicate with a content source, establish a communication session with the content source, receive from the content source an avatar profile, adapt the characteristics of the avatar to correlate at least in part to the avatar profile, and cause the STB to present the adapted avatar. Other embodiments are disclosed.
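One way to read "adapt the characteristics of the avatar to correlate at least in part to the avatar profile" is as a partial merge of two sets of traits. The sketch below assumes, purely for illustration, that both the avatar's characteristics and the avatar profile received from the content source are flat dictionaries of strings, and that only some keys may be overridden. The function name `adapt_avatar` and the key names are hypothetical.

```python
from typing import Dict, FrozenSet

# Hypothetical set of traits a content source is allowed to override.
ADAPTABLE_TRAITS: FrozenSet[str] = frozenset({"voice", "clothing", "appearance"})


def adapt_avatar(
    current: Dict[str, str],
    avatar_profile: Dict[str, str],
    adaptable: FrozenSet[str] = ADAPTABLE_TRAITS,
) -> Dict[str, str]:
    """Return a copy of `current` with adaptable traits taken from the profile."""
    adapted = dict(current)  # leave the caller's dictionary untouched
    for trait, value in avatar_profile.items():
        if trait in adaptable:
            adapted[trait] = value
    return adapted


if __name__ == "__main__":
    user_avatar = {"voice": "calm", "name": "Ava"}
    source_profile = {"voice": "announcer", "name": "Other"}
    print(adapt_avatar(user_avatar, source_profile))
```

Traits outside the adaptable set (here, the avatar's name) stay tied to the user profile, which is one possible interpretation of "at least in part".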
Abstract:
A system that incorporates teachings of the present disclosure may include, for example, a first computing device having a controller to present an avatar having characteristics that correlate to a user profile and that conform to operating characteristics of the first computing device, and transmit to a second computing device operational information associated with the avatar for reproducing at least in part the avatar at said second computing device. Other embodiments are disclosed.
Abstract:
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating speech. One variation of the method is from a server side, and another variation of the method is from a client side. The server side method, as implemented by a network-based automatic speech processing system, includes first receiving, from a network client independent of knowledge of internal operations of the system, a request to generate a text-to-speech voice. The request can include speech samples, transcriptions of the speech samples, and metadata describing the speech samples. The system extracts sound units from the speech samples based on the transcriptions and generates an interactive demonstration of the text-to-speech voice based on the sound units, the transcriptions, and the metadata, wherein the interactive demonstration hides a back end processing implementation from the network client. The system provides access to the interactive demonstration to the network client.
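The server-side flow in this abstract (receive a request, extract sound units using the transcriptions, return a demonstration that hides the back end) can be sketched as follows. This is an illustrative sketch under loud assumptions: `VoiceRequest`, `SoundUnit`, `VoiceDemo`, `extract_units`, and `handle_request` are hypothetical names, and whole words stand in for sound units. A real system would more plausibly align audio to phonetic units, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VoiceRequest:
    # The three parts of the request named in the abstract.
    samples: List[bytes]
    transcriptions: List[str]
    metadata: Dict[str, str]


@dataclass(frozen=True)
class SoundUnit:
    label: str         # toy unit: one word of a transcription
    sample_index: int  # which speech sample the unit came from


class VoiceDemo:
    """Opaque handle given to the client; internals stay private."""

    def __init__(self, units: List[SoundUnit], metadata: Dict[str, str]) -> None:
        self._units = {unit.label: unit for unit in units}
        self._metadata = dict(metadata)

    def synthesize(self, text: str) -> List[str]:
        """Return the labels of the known units that cover the input text."""
        return [word for word in text.lower().split() if word in self._units]


def extract_units(request: VoiceRequest) -> List[SoundUnit]:
    """Pair each sample with its transcription and cut it into toy units."""
    if len(request.samples) != len(request.transcriptions):
        raise ValueError("each speech sample needs exactly one transcription")
    units: List[SoundUnit] = []
    for index, transcription in enumerate(request.transcriptions):
        for word in transcription.lower().split():
            units.append(SoundUnit(label=word, sample_index=index))
    return units


def handle_request(request: VoiceRequest) -> VoiceDemo:
    """Server entry point: the client only ever sees the returned demo."""
    return VoiceDemo(extract_units(request), request.metadata)


if __name__ == "__main__":
    request = VoiceRequest([b"...", b"..."], ["Hello world", "good day"],
                           {"speaker": "demo"})
    print(handle_request(request).synthesize("Hello day there"))
```

The client interacts only with `VoiceDemo.synthesize`, mirroring the abstract's point that the interactive demonstration hides the back-end processing from the network client.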
Abstract:
A method and system are disclosed that train a text-to-speech synthesis system for use in speech synthesis. The method includes generating a speech database of audio files comprising domain-specific voices having various prosodies, and training a text-to-speech synthesis system using the speech database by selecting audio segments having a prosody based on at least one dialog state. The system includes a processor, a speech database of audio files, and modules for implementing the method.
Abstract:
A system, method, and computer-readable medium that train a text-to-speech synthesis system for use in speech synthesis are disclosed. The method may include recording audio files of one or more live voices speaking language used in a specific domain, the audio files being recorded using various prosodies; storing the recorded audio files in a speech database; and training a text-to-speech synthesis system using the speech database, wherein the text-to-speech synthesis system selects audio segments having a prosody based on at least one dialog state and one speech act.
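The selection of audio segments by prosody can be illustrated with a small lookup. The sketch below is an assumption-laden illustration, not the patented method: the `Segment` record, the `PROSODY_FOR` table, the state and act names, and the fallback to a "neutral" prosody are all invented here to show how a dialog state and a speech act might jointly pick a prosody.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Segment:
    text: str     # what the recorded voice says
    prosody: str  # prosody label assigned when the audio was recorded
    path: str     # location of the audio file in the speech database


# Hypothetical mapping from (dialog state, speech act) to a prosody label.
PROSODY_FOR: Dict[Tuple[str, str], str] = {
    ("error_recovery", "apology"): "apologetic",
    ("greeting", "inform"): "cheerful",
}


def select_segments(
    database: List[Segment],
    text: str,
    dialog_state: str,
    speech_act: str,
    default_prosody: str = "neutral",
) -> List[Segment]:
    """Pick recordings of `text` whose prosody suits the dialog context."""
    wanted = PROSODY_FOR.get((dialog_state, speech_act), default_prosody)
    matches = [s for s in database if s.text == text and s.prosody == wanted]
    if matches:
        return matches
    # Fall back to the default prosody when no suitable recording exists.
    return [s for s in database
            if s.text == text and s.prosody == default_prosody]


if __name__ == "__main__":
    db = [
        Segment("sorry about that", "neutral", "a.wav"),
        Segment("sorry about that", "apologetic", "b.wav"),
    ]
    chosen = select_segments(db, "sorry about that", "error_recovery", "apology")
    print([s.path for s in chosen])
```

Recording the same domain phrase under several prosodies, as the abstract describes, is what makes this kind of context-dependent choice possible.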