Abstract:
An apparatus for providing efficient evaluation of feature transformation includes a training module and a transformation module. The training module is configured to train a Gaussian mixture model (GMM) using training source data and training target data. The transformation module is in communication with the training module. The transformation module is configured to produce a conversion function in response to the training of the GMM. The training module is further configured to determine a quality of the conversion function prior to use of the conversion function by calculating a trace measurement of the GMM.
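For orientation, one widely used form of a GMM-based conversion function is the joint-density regression below. It is shown only as an assumption about what such a function can look like; the abstract does not state which form is used, and it does not define the trace measurement, so that is not reproduced here. The symbols are illustrative: x is a source feature vector, M is the number of mixture components, \alpha_i are the mixture weights, and \mu_i and \Sigma_i are the component means and covariance blocks of a GMM trained on joint source-target vectors.

```latex
F(x) = \sum_{i=1}^{M} p_i(x)\,\Bigl[\mu_i^{y} + \Sigma_i^{yx}\bigl(\Sigma_i^{xx}\bigr)^{-1}\bigl(x - \mu_i^{x}\bigr)\Bigr],
\qquad
p_i(x) = \frac{\alpha_i\,\mathcal{N}\bigl(x;\mu_i^{x},\Sigma_i^{xx}\bigr)}
              {\sum_{j=1}^{M}\alpha_j\,\mathcal{N}\bigl(x;\mu_j^{x},\Sigma_j^{xx}\bigr)}
```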
Abstract:
The invention relates to pre-processing of a pronunciation dictionary for compression in a data processing device, the pronunciation dictionary comprising at least one entry, the entry comprising a sequence of character units and a sequence of phoneme units. According to one aspect of the invention the sequence of character units and the sequence of phoneme units are aligned using a statistical algorithm. The aligned sequence of character units and aligned sequence of phoneme units are interleaved by inserting each phoneme unit at a predetermined location relative to the corresponding character unit.
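The interleaving step can be sketched as follows, assuming the statistical alignment has already produced two sequences of equal length and that the "predetermined location" is immediately after the corresponding character unit. The function names, the `EPSILON` placeholder for an empty unit, and the example entry are hypothetical and not taken from the invention.

```python
from typing import List, Tuple

# Hypothetical placeholder an aligner might insert where a unit has no counterpart.
EPSILON = "_"


def interleave(chars: List[str], phonemes: List[str]) -> List[str]:
    """Place each phoneme unit directly after its aligned character unit."""
    if len(chars) != len(phonemes):
        raise ValueError("aligned sequences must have the same length")
    merged: List[str] = []
    for char_unit, phoneme_unit in zip(chars, phonemes):
        merged.append(char_unit)
        merged.append(phoneme_unit)
    return merged


def deinterleave(units: List[str]) -> Tuple[List[str], List[str]]:
    """Recover the two aligned sequences from an interleaved one."""
    return units[0::2], units[1::2]


# Example entry: the aligner padded the phoneme side so that "gh" maps to nothing.
aligned_chars = ["h", "i", "g", "h"]
aligned_phonemes = ["h", "ay", EPSILON, EPSILON]
entry = interleave(aligned_chars, aligned_phonemes)
```

Keeping each phoneme next to its character puts correlated units side by side, which is the kind of local regularity a later compression stage can exploit.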
Abstract:
This invention relates to a method, a device and a software application product for correcting a pronunciation of a speech object. The speech object is synthetically generated from a text object in dependence on a segmented representation of the text object. It is determined if an initial pronunciation of the speech object, which initial pronunciation is associated with an initial segmented representation of the text object, is incorrect. Furthermore, in case it is determined that the initial pronunciation of the speech object is incorrect, a new segmented representation of the text object is determined, which new segmented representation of the text object is associated with a new pronunciation of the speech object.
Abstract:
A method of multi-lingual speech recognition can include determining whether characters in a word are in a source list of a language-specific alphabet mapping table for a language, converting each character not in the source list according to a general alphabet mapping table, converting each converted character according to the language-specific alphabet mapping table, verifying that each character in the word is in a character set of the language, removing characters not in the character set of the language, and identifying a pronunciation of the word.
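The character-handling steps above can be sketched as below. This is one possible reading, not the claimed method: both mapping tables are modeled as plain dictionaries, the language-specific table is applied to every character after the optional general mapping, and pronunciation lookup is left out. The function name, parameter names, and sample tables are all hypothetical.

```python
from typing import Dict, Set


def normalize_word(
    word: str,
    language_map: Dict[str, str],
    general_map: Dict[str, str],
    charset: Set[str],
) -> str:
    """Map a word into the character set of one target language."""
    kept = []
    for ch in word:
        # Characters missing from the language-specific source list go
        # through the general table first.
        if ch not in language_map:
            ch = general_map.get(ch, ch)
        # Then apply the language-specific table (a no-op if there is no entry).
        ch = language_map.get(ch, ch)
        # Verify against the language's character set and drop anything outside it.
        kept.extend(c for c in ch if c in charset)
    return "".join(kept)


# Hypothetical tables for a language whose alphabet is plain a-z.
LANGUAGE_MAP = {"ñ": "n"}
GENERAL_MAP = {"ë": "e", "ø": "o"}
CHARSET = set("abcdefghijklmnopqrstuvwxyz")

cleaned = normalize_word("zoë!", LANGUAGE_MAP, GENERAL_MAP, CHARSET)
```

The cleaned string would then be passed to whatever component identifies the pronunciation of the word.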
Abstract:
An approach is provided for constructing dynamic latent models that determine consumer/social network intrinsic properties and automatically recommend user interactions with different social networks. A modeling platform determines one or more social networks associated with one or more users, one or more devices associated with the one or more users, or a combination thereof. A modeling platform processes and/or facilitates a processing of data associated with the one or more social networks to generate one or more latent models describing the one or more social networks. A modeling platform causes, at least in part, a presentation of a recommendation to interact with the one or more social networks, one or more other social networks, or a combination thereof based, at least in part, on the one or more latent models.
Abstract:
An approach is provided for performing multiple and hybrid forms of communication in the same communication session. A communication manager receives a request to establish a communication session using a first form of communication, wherein the communication session supports multiple and simultaneous forms of communication. Next, the communication manager selects a second form of communication to conduct the communication session. Then, the communication manager transcodes the second form of communication to the first form of communication. According to an embodiment of the invention, the different forms can be converted into one another to facilitate and enrich the communication capability.
Abstract:
An approach is provided for generating one or more recommendations to a user based on interactions the user may have with items or topics of interest. The approach involves processing and/or facilitating a processing of one or more interactions of a user with one or more content items. The approach further involves causing, at least in part, an accumulation of the one or more processed interactions of the user. The approach also involves causing, at least in part, a determination of one or more user preferences based, at least in part, on the accumulated one or more processed interactions. The approach additionally involves causing, at least in part, a generation of a rating score of the user for a topic based, at least in part, on the one or more user preferences.
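A minimal sketch of that pipeline, under loudly stated assumptions: each interaction is a (topic, kind) pair, each kind carries a fixed hypothetical weight, preferences are the normalized accumulated weights, and the rating score is a preference rescaled to a 0 to 5 range. None of these choices comes from the abstract, which does not say how preferences or scores are computed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical weights per interaction kind; unknown kinds count as zero.
WEIGHTS = {"view": 1.0, "like": 3.0, "share": 5.0}


def accumulate(interactions: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Sum the interaction weights per topic."""
    totals: Dict[str, float] = defaultdict(float)
    for topic, kind in interactions:
        totals[topic] += WEIGHTS.get(kind, 0.0)
    return dict(totals)


def preferences(totals: Dict[str, float]) -> Dict[str, float]:
    """Normalize accumulated weights so the preferences sum to 1."""
    overall = sum(totals.values())
    if overall == 0:
        return {topic: 0.0 for topic in totals}
    return {topic: value / overall for topic, value in totals.items()}


def rating_score(prefs: Dict[str, float], topic: str, scale: float = 5.0) -> float:
    """Rescale a topic's preference to a rating between 0 and `scale`."""
    return scale * prefs.get(topic, 0.0)


log = [("music", "view"), ("music", "like"), ("sports", "view")]
prefs = preferences(accumulate(log))
music_rating = rating_score(prefs, "music")
```

In this toy log, music accumulates a weight of 4 against 1 for sports, so music receives the higher rating.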
Abstract:
An apparatus for providing text independent voice conversion may include a first voice conversion model and a second voice conversion model. The first voice conversion model may be trained with respect to conversion of training source speech to synthetic speech corresponding to the training source speech. The second voice conversion model may be trained with respect to conversion to training target speech from synthetic speech corresponding to the training target speech. An output of the first voice conversion model may be communicated to the second voice conversion model to process source speech input into the first voice conversion model into target speech corresponding to the source speech as the output of the second voice conversion model.
Abstract:
Techniques to provide a secure, shared personal map layer include determining a geographic location. The geographic location is associated with operation of a device. The techniques also include determining an indication that describes a relationship between the geographic location and a first user of the device. The techniques also include determining a privacy level for the indication. Then, the first user of the device is associated with the indication, the geographic location, and the privacy level. In some embodiments, the techniques also include determining a personal description vocabulary word based, at least in part, on the geographic location and a context for the device. Then it is determined to present on the device a prompt that includes the personal description vocabulary word.
Abstract:
Methods and apparatuses are provided for user interest modeling. A method may include receiving an input from a user for specifying one or more topics from among a predetermined hierarchy of topics and subtopics. The method may additionally include retrieving one or more documents associated with the user and extracting language tokens from the documents based, at least in part, on the specified topics. Corresponding apparatuses are also provided.