摘要:
A method, apparatus, and electronic device for optimizing content acquisition are disclosed. A memory may store usage of a previous set of media content by the mobile device. An input/output device may receive a request for a current set of media content. A processor may create a user profile based on the usage and provides a first recommendation of a first digital rights agreement based on the user profile.
摘要:
A method (400, 500) of propagating an alert (116). The alert can be received on a first communication device (102). The alert can be associated with data indicating at least one peer-to-peer propagation parameter. Further, the alert can be automatically communicated from the first communication device to at least a second communication device in accordance with the peer-to-peer propagation parameter via peer-to-peer communications.
摘要:
A system includes a first communications device [105] to participate in a conversation with at least a second communication device [110]. An intelligent communication agent [120] monitors the conversation for at least one keyword. In response to detecting the at least one keyword, the intelligent communication agent performs a search for multimedia content corresponding to the at least one keyword and retrieves the multimedia content. A logic engine [135] determines relevant content of the multimedia content based on at least one of a conversation profile and at least one user profile for at least one of a user of the first communication device and at least a second user of the at least a second communication device. A transmission element [130] transmits the relevant content to at least one of the first communication device, the at least a second communication device, and a predetermined multimedia device [145].
摘要:
A method and apparatus for performing a voice search in a mobile communication device is disclosed. The method may include receiving a search query from a user of the mobile communication device, converting speech parts in the search query into linguistic representations, comparing the query linguistic representations to the linguistic representations of all items in the voice search database to find matches, wherein the voice search database has indexed all items that are associated with the device, displaying the matches to the user, receiving the user's selection from the displayed matches, and retrieving and executing the user's selection.
摘要:
A voice sample characterization front-end suitable for use in a distributed speech recognition context. A digitized voice sample 31 is split between a low frequency path 32 and a high frequency path 33. Both paths are used to determine spectral content suitable for use when determining speech recognition parameters (such as cepstral coefficients) that characterize the speech sample for recognition purposes. The low frequency path 32 has a thorough noise reduction capability. In one embodiment, the results of this noise reduction are used by the high frequency path 33 to aid in de-noising without requiring the same level of resource capacity as used by the low frequency path 32.
摘要:
A method and apparatus (100) for updating a voice tag comprising N stored voice tag phoneme sequences includes a function (110) for determining (205) an accepted stored voice tag phoneme sequence for an utterance, a function (140) for extracting(210) a current set of M phoneme sequences having highest likelihoods of representing the utterance, a function (160) for updating (215) a reference histogram associated with the accepted voice tag, and a function (160) for updating (225) the voice tag with N selected phoneme sequences that are selected from the current set of M phoneme sequences and the set of N voice tag phoneme sequences, wherein the N selected phoneme sequences have phoneme histograms most closely matching the reference histogram. The method and apparatus (100) also generates a voice tag using some functions (110, 140, 160) that are common with the method and apparatus to update the voice tag, such as the extracting (410) of the current set of M phoneme sequences.
摘要:
One provides (101) a plurality of frames of sampled audio content and then processes (102) that plurality of frames using a speech recognition search process that comprises, at least in part, searching for at least two of state boundaries, subword boundaries, and word boundaries using different search resolutions.
摘要:
A system for enhancing the signal-to-noise ratio of a speech signal is avoided. A plurality of local energy maximums associated with a speech signal are determined. Presumably, each of these local energy maximums defines a speech pitch period. Typically, human pitch periods are approximately 100-400 Hz depending on the sex and age of the speaker. Because human speech typically includes more energy near the beginning of a pitch period than at the end of the pitch period, and background noise tends to remain relatively constant throughout the pitch period, the speech signal may be enhanced by increasing the energy associated with the beginning of the pitch period and/or by decreasing the energy associated with the end of the pitch period. Preferably, the amount of energy increase in the earlier portion of the pitch period is approximately equal to the amount of energy reduction in the later portion of the pitch period. In this manner, the total energy remains the constant.
摘要:
A method for distributed voice searching may include receiving a search query from a user of the mobile communication device, generating a lattice of coarse linguistic representations from speech parts in the search query, extracting query features from the generated lattice of coarse linguistic representations, generating coarse search feature vectors based on the extracted query features, performing a coarse search using the generated coarse search feature vectors and transmitting the generated coarse search feature vectors to a remote voice search processing unit, receiving remote resultant web indices from the remote voice search processing unit, generating a lattice of fine linguistic representations from speech parts in the search query, generating fine search feature vectors from the lattice of fine linguistic representations, performing a fine search using the coarse search results, the remote resultant web indices and the generated fine search feature vectors, and displaying the fine search results to the user.
摘要:
One provides (101) a plurality of frames of sampled audio content and then processes (102) that plurality of frames using a speech recognition search process that comprises, at least in part, determining whether to search each subword boundary contained within each frame on a frame-by-frame basis. These teachings will also readily accommodate determining whether to search each word boundary contained within each frame on a frame-by-frame basis.