摘要:
System and method for partitioning a video into a series of semantic units where each semantic unit relates to a generally complete thematic topic. A computer implemented method for partitioning a video into a series of semantic units wherein each semantic unit relates to a theme or a topic, comprises dividing a video into a plurality of homogeneous segments, analyzing audio and visual content of the video, extracting a plurality of keywords from the speech content of each of the plurality of homogeneous segments of the video, and detecting and merging a plurality of groups of semantically related and temporally adjacent homogeneous segments into a series of semantic units in accordance with the results of both the audio and visual analysis and the keyword extraction. The present invention can be applied to generate important table-of-contents as well as index tables for videos to facilitate efficient video topic searching and browsing.
摘要:
System and method for partitioning a video into a series of semantic units where each semantic unit relates to a generally complete thematic topic. A computer implemented method for partitioning a video into a series of semantic units wherein each semantic unit relates to a theme or a topic, comprises dividing a video into a plurality of homogeneous segments, analyzing audio and visual content of the video, extracting a plurality of keywords from the speech content of each of the plurality of homogeneous segments of the video, and detecting and merging a plurality of groups of semantically related and temporally adjacent homogeneous segments into a series of semantic units in accordance with the results of both the audio and visual analysis and the keyword extraction. The present invention can be applied to generate important table-of-contents as well as index tables for videos to facilitate efficient video topic searching and browsing.
摘要:
System and method for partitioning a video into a series of semantic units where each semantic unit relates to a generally complete thematic topic. A computer implemented method for partitioning a video into a series of semantic units wherein each semantic unit relates to a theme or a topic, comprises dividing a video into a plurality of homogeneous segments, analyzing audio and visual content of the video, extracting a plurality of keywords from the speech content of each of the plurality of homogeneous segments of the video, and detecting and merging a plurality of groups of semantically related and temporally adjacent homogeneous segments into a series of semantic units in accordance with the results of both the audio and visual analysis and the keyword extraction. The present invention can be applied to generate important table-of-contents as well as index tables for videos to facilitate efficient video topic searching and browsing.
摘要:
System and method for partitioning a video into a series of semantic units where each semantic unit relates to a generally complete thematic topic. A computer implemented method for partitioning a video into a series of semantic units wherein each semantic unit relates to a theme or a topic, comprises dividing a video into a plurality of homogeneous segments, analyzing audio and visual content of the video, extracting a plurality of keywords from the speech content of each of the plurality of homogeneous segments of the video, and detecting and merging a plurality of groups of semantically related and temporally adjacent homogeneous segments into a series of semantic units in accordance with the results of both the audio and visual analysis and the keyword extraction. The present invention can be applied to generate important table-of-contents as well as index tables for videos to facilitate efficient video topic searching and browsing.
摘要:
Computer implemented method, system and computer usable program code for detecting topic shift boundaries in a multimedia stream. A computer implemented method for detecting topic shift boundaries in a multimedia stream includes receiving a multimedia stream, and performing multimodal analysis on the multimedia stream to locate a plurality of temporal positions within the multimedia stream at which topic changes have an increased likelihood of occurring to provide a sequence of multimedia portions. Characteristics for a sliding window for each multimedia portion in the sequence of multimedia portions are automatically determined, and topic shift boundaries are detected in each multimedia portion by applying a text-based topic shift detector over the media stream's text transcript using a sliding window, wherein the sliding window used with each multimedia portion has the characteristics determined from its respective multimedia portion.
摘要:
A system, method and computer program product for generating personal profiles for subjects appearing in a video media source. The method includes extracting audiovisual-related personal information related to a subject appearing in the video media source; extracting text-related personal information that are related to the subject in the video source; correlating the extracted audiovisual-related personal information and the extracted text-related personal information related to the subject; and assembling a personal profile data structure for the subject, the personal profile data structure comprising the text-related personal information and audiovisual-related personal information related to the subject. The text-related personal information forms the name identity of the subject, while the audiovisual-related personal information includes audiovisual-related features including information forming one or more of: a visual identity, a kinematic identity and, a voice identity of the subject. In an alternate embodiment, in an iterative manner, the correlated extracted audiovisual-related personal information and extracted text-related personal information may be fed back and utilized for performing an additional search from external information sources, via a search engine, to obtain additional texts relating to the subject or obtain additional video media sources having the subject. There is further enabled the updating of an assembled personal profile of a subject as a new video media source having said subject becomes available.
摘要:
Computer implemented method, system and computer program product for extracting salient keywords for videos. A computer implemented method for extracting salient keywords for videos includes extracting a set of candidate keywords from a text source of a video, assigning a salience value to each candidate keyword based on statistical information to provide a set of statistically significant keywords, exploiting additional cues that are available to the video and that can be used to further measure the significance of existing keywords or to extract new keywords, and selecting a set of salient keywords for the video based on the set of statistically significant keywords and the additional cues.
摘要:
Techniques for creating training sets for predictive modeling are provided. In one aspect, a method for generating training data from an unlabeled data set is provided which includes the following steps. A small initial set of data is selected from the unlabeled data set. Labels are acquired for the initial set of data selected from the unlabeled data set resulting in labeled data. The data in the unlabeled data set is clustered using a semi-supervised clustering process along with the labeled data to produce data clusters. Data samples are chosen from each of the clusters to use as the training data. The selecting, presenting, clustering and choosing steps are repeated with one or more additional sets of data selected from the unlabeled data set until a desired amount of training data has been obtained, wherein at each iteration an amount of the labeled data is increased.
摘要:
A system and method for automatic call segmentation including steps and means for automatically detecting boundaries between utterances in the call transcripts; automatically classifying utterances into target call sections; automatically partitioning the call transcript into call segments; and outputting a segmented call transcript. A training method and apparatus for training the system to perform automatic call segmentation includes steps and means for providing at least one training transcript with annotated call sections; normalizing the at least one training transcript; and performing statistical analysis on the at least one training transcript.
摘要:
A system, method, and computer program are disclosed for recognizing one or more words not listed in a dictionary database. One or more sequences of characters in the word are checked to determine a probability that the word is valid. A prefix removal process removes any prefixes from a word, and obtains information about the removed prefix. A suffix removal process removes any suffixes from the word, and obtains information about the removed suffix. A root process obtains information about a root word from the dictionary database. A combination process then determines if the prefix, the root, and the suffix can be combined into a valid word as defined by one or more combination rules, obtains one or more of the possible parts of speech of the valid word, and stores the parts of speech with the valid word in the dictionary database.