Abstract:
The invention relates to a method and system for automatically identifying and presenting video clips or other media to a user at a client device. One embodiment of the invention provides a method for updating a user profile or other persistent data store based on user feedback to improve the identification of video clips or other media content responsive to the user's profile. Embodiments of the invention also provide methods for processing user feedback. Related architectures are also disclosed.
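The feedback-driven profile update described above can be illustrated with a minimal sketch. The abstract does not say how the profile is represented or updated, so everything here is an assumption made for illustration: the profile is a dictionary of per-tag weights, feedback is +1 (liked) or -1 (disliked), and `update_profile`, `score_clip`, and `rate` are hypothetical names.

```python
def update_profile(profile, clip_tags, feedback, rate=0.5):
    """Return a new profile with each tag of the rated clip nudged by the feedback.

    profile:   dict mapping tag -> weight (assumed representation)
    clip_tags: tags describing the clip the user rated
    feedback:  +1 for positive feedback, -1 for negative (assumed encoding)
    rate:      step size of the adjustment (hypothetical parameter)
    """
    updated = dict(profile)  # copy so the stored profile is not mutated in place
    for tag in clip_tags:
        updated[tag] = updated.get(tag, 0.0) + rate * feedback
    return updated


def score_clip(profile, clip_tags):
    """Score a candidate clip as the sum of the profile weights of its tags."""
    return sum(profile.get(tag, 0.0) for tag in clip_tags)
```

Under these assumptions, negative feedback on a clip tagged "sports" and "news" lowers the score of later clips carrying either tag, so they are less likely to be identified as responsive to the profile.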
Abstract:
A system and method are provided for rapidly generating a new spoken dialog application. In one embodiment, a user experience person labels the transcribed data (e.g., 3000 utterances) using a set of interactive tools. The labeled data is then stored in a processed data database. During the labeling process, the user experience person not only groups utterances into various call type categories, but also flags specific utterances (e.g., 100-200) as positive and negative examples for use in an annotation guide. The labeled data in the processed data database can also be used to generate an initial natural language understanding (NLU) model.
Abstract:
Audiovisual programs are transmitted over a low-bandwidth network in a compressed form in which the number of video frames has been reduced by selecting "significant" frames. The user sees a series of still images accompanying an apparently full audio channel. The frames are downloaded to the user in an order designed to ensure, first, that the next few frames necessary for viewing are present and, second, that the user can scroll throughout the program even as it is being downloaded. This is accomplished by first downloading the full audio for the program portion requested by the user's viewer software, and then downloading the video frames based on requests from the viewer software, which has access to a table indicating which video frames are associated with which audio portions. The software uses the table to download frames needed immediately, as well as to download advance frames based on their significance.
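The download ordering described above can be sketched as follows. This is only an illustration under stated assumptions: the table is modeled as a list of `Frame` records, `significance` is a numeric score, and `lookahead` (the window of frames treated as needed immediately) is a hypothetical parameter; none of these names come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int           # position of the frame in the program
    audio_start: float   # start (seconds) of the audio portion it accompanies
    significance: float  # assumed score; higher means more significant


def download_order(table, playhead, lookahead=10.0):
    """Order frames for download: imminent frames first, then by significance.

    table:     list of Frame records linking frames to audio portions
    playhead:  current playback position in seconds
    lookahead: window (seconds) of frames treated as needed immediately
    """
    upcoming = [f for f in table
                if playhead <= f.audio_start < playhead + lookahead]
    rest = [f for f in table if f not in upcoming]
    # Frames needed for viewing right now go out in playback order.
    upcoming.sort(key=lambda f: f.audio_start)
    # Advance frames go out most-significant first, so scrolling anywhere
    # in the program finds at least a coarse set of images early on.
    rest.sort(key=lambda f: (-f.significance, f.audio_start))
    return upcoming + rest
```

If the user scrolls, the viewer would simply call `download_order` again with the new `playhead`, skipping frames it already holds.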
Abstract:
The invention provides for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. The method embodiment of the invention is a method for the retrieval of information on the basis of its content for real-time incorporation into an electronic presentation. The method comprises receiving from a presenter a content-based request for at least one segment of a first plurality of segments within a media presentation and, while displaying the media presentation to an audience, displaying to the presenter a second plurality of segments in response to the content-based request. The computing device practicing the method receives a selection from the presenter of a segment from the second plurality of segments and displays to the audience the selected segment.
Abstract:
A video is summarized by determining whether a video contains one or more junk frames, modifying one or more boundaries of shots of the video based at least in part on the determination of whether the video contains one or more junk frames, sampling a plurality of the shots of the video into a plurality of subshots, clustering the plurality of subshots with a multiple step k-means clustering, and creating a video summary based at least in part on the clustered plurality of subshots. The video is segmented into a plurality of shots and a keyframe from each of the plurality of shots is extracted. A video summary is created based on a determined importance of the subshots in a clustered plurality of subshots and a time budget. The created video summary is rendered by displaying playback rate information for the rendered video summary, displaying a currently playing subshot marker with the rendered video summary, and displaying an indication of similar content in the rendered video summary.
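The step that builds a summary from subshot importance and a time budget can be sketched as below. The abstract does not state the selection rule, so a simple greedy pick by importance is assumed here; the tuple layout and the name `summarize` are hypothetical, and the junk-frame detection and k-means clustering steps are not shown.

```python
def summarize(subshots, time_budget):
    """Pick subshots for a summary that fits within time_budget seconds.

    subshots:    list of (start, duration, importance) tuples (assumed layout)
    time_budget: maximum total duration of the summary, in seconds
    Greedy rule (an assumption): take the most important subshots that
    still fit, then restore chronological order for playback.
    """
    chosen = []
    remaining = time_budget
    for shot in sorted(subshots, key=lambda s: s[2], reverse=True):
        duration = shot[1]
        if duration <= remaining:
            chosen.append(shot)
            remaining -= duration
    return sorted(chosen, key=lambda s: s[0])
```

A greedy pick is not guaranteed to make the best use of the budget; a knapsack-style selection would be a natural alternative if that matters.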
Abstract:
A method is disclosed that includes receiving a multimedia data stream comprising audio data, video data, and text data at a first electronic device of a plurality of electronic devices responsive to a network. A content structure of the multimedia data stream is automatically determined at least partially based on the text data. A portion of the multimedia data stream is stored in a local media database and the associated content structure is stored in a local content index. A network index alert is generated to update a centralized content index of available media content via the network.
Abstract:
The invention provides for a system, method, and computer readable medium storing instructions related to controlling a presentation in a multimodal system. The method embodiment of the invention is a method for the retrieval of information on the basis of its content for incorporation into an electronic presentation. The method comprises receiving from a user a content-based request for at least one segment from a first plurality of segments within a media presentation preprocessed to enable natural language content searchability; in response to the request, presenting a subset of the first plurality of segments to the user; receiving a selection indication from the user associated with at least one segment of the subset of the first plurality of segments; and adding the selected at least one segment to a deck for use in a presentation.
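The request, subset, selection, and deck steps of this method can be sketched in a few lines. Plain keyword overlap stands in here for the natural language search, which the abstract does not specify; the segment dictionaries and the names `find_segments` and `add_to_deck` are hypothetical.

```python
def find_segments(segments, request):
    """Return the segments whose transcript shares a word with the request.

    segments: list of dicts with "id" and "transcript" keys (assumed layout)
    request:  free-text, content-based request from the user
    """
    terms = set(request.lower().split())
    return [s for s in segments
            if terms & set(s["transcript"].lower().split())]


def add_to_deck(deck, subset, selected_id):
    """Append the segment the user selected from the presented subset."""
    for segment in subset:
        if segment["id"] == selected_id:
            deck.append(segment)
            return True
    return False  # the selection was not part of the presented subset
```

Restricting `add_to_deck` to the presented subset mirrors the claim language, in which the selection is made from the subset rather than from the full set of segments.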
Abstract:
In an embodiment, a method of providing an on demand translation service is provided. A subscriber may be charged a reduced fee or no fee for use of the on demand translation service in exchange for displaying commercial messages to the subscriber, the commercial messages being selected based on subscriber information. A multimedia signal including information in a source language may be received. The information may be obtained as text in the source language from the multimedia signal. The text may be translated from the source language to a target language. Translated information, based on the translated text, may be transmitted to a processing device for presentation to the subscriber. The received multimedia signal may be sent to a multimedia device for viewing.
Abstract:
The invention provides a system and method for automatically indexing and retrieving multimedia content. The method may include separating a multimedia data stream into audio, visual and text components, segmenting the audio, visual and text components based on semantic differences, identifying at least one target speaker using the audio and visual components, identifying a topic of the multimedia event using the segmented text and topic category models, generating a summary of the multimedia event based on the audio, visual and text components, the identified topic and the identified target speaker, and generating a multimedia description of the multimedia event based on the identified target speaker, the identified topic, and the generated summary.