Abstract:
Systems and methods to automatically edit a video to generate a video summary are described. In one aspect, sub-shots are extracted from the video. Importance measures are calculated for at least a portion of the extracted sub-shots. Relative distributions are then determined for the sub-shots whose importance measures are higher than those of the other sub-shots. Based on the determined relative distributions, those higher-importance sub-shots that are not uniformly distributed with respect to the others are dropped. The remaining sub-shots are connected with respective transitions to generate the video summary.
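The selection steps in this abstract can be illustrated with a minimal sketch. The abstract does not say how importance is scored or how uniformity is tested, so everything here is an assumption: the `SubShot` type, the `summarize` function, the `keep_ratio` cut-off, and the rule of keeping at most `per_bin` sub-shots per equal-width slice of the timeline are all hypothetical stand-ins. Transition rendering is omitted.

```python
from dataclasses import dataclass


@dataclass
class SubShot:
    start: float       # start time in seconds
    end: float         # end time in seconds
    importance: float  # precomputed importance measure (scoring is assumed)


def summarize(sub_shots, total_duration, keep_ratio=0.5, bins=4, per_bin=1):
    """Pick high-importance sub-shots, then thin out temporal clusters."""
    # Step 1: keep the sub-shots with relatively higher importance.
    ranked = sorted(sub_shots, key=lambda s: s.importance, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_ratio))
    candidates = ranked[:n_keep]

    # Step 2 (assumed uniformity rule): split the timeline into equal bins
    # and keep at most `per_bin` candidates per bin, most important first.
    bin_width = total_duration / bins
    counts = {}
    kept = []
    for shot in candidates:
        b = min(int(shot.start // bin_width), bins - 1)
        if counts.get(b, 0) < per_bin:
            counts[b] = counts.get(b, 0) + 1
            kept.append(shot)

    # Step 3: restore temporal order so the summary plays chronologically.
    return sorted(kept, key=lambda s: s.start)
```

With this rule, two important sub-shots that fall in the same slice of the video compete for one slot, so the summary is spread across the timeline rather than concentrated in one scene.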
Abstract:
Systems and methods for learning-based automatic commercial content detection are described. In one aspect, the systems and methods include a training component and an analyzing component. The training component trains a commercial content classification model using a kernel support vector machine. The analyzing component analyzes program data such as video and audio data using the commercial content classification model and one or more of single-side left neighborhood(s) and right neighborhood(s) of program data segments. Based on this analysis, each of the program data segments is classified as a commercial or non-commercial segment.
Abstract:
A “music video parser” automatically detects and segments music videos in a combined audio-video media stream. Automatic detection and segmentation are achieved by integrating shot boundary detection, video text detection and audio analysis to automatically detect temporal boundaries of each music video in the media stream. In one embodiment, song identification information, such as, for example, a song name, artist name, album name, etc., is automatically extracted from the media stream using video optical character recognition (OCR). This information is then used in alternate embodiments for cataloging, indexing and selecting particular music videos, and in maintaining statistics such as the times particular music videos were played, and the number of times each music video was played.
Abstract:
Systems and methods for learning-based automatic commercial content detection are described. In one aspect, program data is divided into multiple segments. The segments are analyzed to determine visual, audio, and context-based feature sets that differentiate commercial content from non-commercial content. The context-based features are a function of single-side left and/or right neighborhoods of segments of the multiple segments.
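To make the idea of single-side neighborhoods concrete, here is a minimal sketch. It assumes each segment already has one numeric per-segment feature, and it uses the mean over a fixed-size window as the context feature; the function name `context_features`, the `radius` parameter, and the choice of the mean are illustrative assumptions, not details given in the abstract.

```python
def context_features(values, index, radius):
    """Return (left_context, right_context) for the segment at `index`.

    `values` holds one per-segment feature value. The left neighborhood is
    the `radius` segments before `index`; the right neighborhood is the
    `radius` segments after it. Each side is summarized separately so the
    two single-side contexts stay distinct.
    """
    left = values[max(0, index - radius):index]
    right = values[index + 1:index + 1 + radius]

    def mean(xs):
        # An empty neighborhood (at the start or end of the program)
        # contributes 0.0 in this sketch.
        return sum(xs) / len(xs) if xs else 0.0

    return mean(left), mean(right)
```

Keeping the two sides separate matters at a boundary between program and commercial content, where the left and right neighborhoods of a segment can look very different.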
Abstract:
Methods and apparatuses are provided for automatically generating video data based on still image data. Certain aspects of the video may also be configured to correspond to audio features identified within associated audio data.
Abstract:
Systems and methods are described that implement personalized karaoke, wherein a user's personal home video and photographs are used to form a background for the lyrics during a karaoke performance. An exemplary karaoke apparatus is configured to segment visual content to produce a plurality of sub-shots and to segment music to produce a plurality of music sub-clips. Having produced the visual content sub-shots and music sub-clips, the exemplary karaoke apparatus shortens some of the plurality of sub-shots to a length of a corresponding music sub-clip from within the plurality of music sub-clips. The plurality of sub-shots is then displayed as a background to lyrics associated with the music, thereby adding interest to a karaoke performance.
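The shortening step can be sketched as follows. This is a simplified illustration under stated assumptions: sub-shots and music sub-clips are paired in order, a sub-shot is trimmed from its end, and sub-shots already shorter than their sub-clip are left unchanged. The function name `fit_sub_shots` and these pairing and trimming rules are hypothetical; the abstract does not specify them.

```python
def fit_sub_shots(sub_shots, clip_lengths):
    """Trim each (start, end) sub-shot to its music sub-clip's length.

    `sub_shots` is a list of (start, end) times in seconds and
    `clip_lengths` is a list of music sub-clip durations in seconds.
    Pairs are matched by position; extra items on either side are ignored.
    """
    fitted = []
    for (start, end), clip_len in zip(sub_shots, clip_lengths):
        if end - start > clip_len:
            # Assumed rule: keep the beginning of the sub-shot.
            end = start + clip_len
        fitted.append((start, end))
    return fitted
```

For example, a 5-second sub-shot paired with a 3-second music sub-clip is cut to 3 seconds, while a 2-second sub-shot paired with a 4-second sub-clip is returned as-is.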
Abstract:
Technologies are described for generating a boosted tag ranking for a media instance. The boosted tag ranking is based on probabilistic relevance estimation computed by a probabilistic relevance estimator and on tag correlation refining performed by a tag correlation refiner. Such boosted tag rankings may be used for search result ranking, tag recommendation, and group recommendation.
Abstract:
Multi-label active learning may entail training a classifier with a set of training samples having multiple labels per sample. In an example embodiment, a method includes accepting a set of training samples, with the set of training samples having multiple respective samples that are each respectively associated with multiple labels. The set of training samples is analyzed to select a sample-label pair responsive to at least one error parameter. The selected sample-label pair is then submitted to an oracle for labeling.
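A rough sketch of the selection step follows. The abstract only says that the pair is chosen "responsive to at least one error parameter"; as a stand-in, this example uses plain uncertainty sampling and picks the sample-label pair whose predicted probability is closest to 0.5. The functions `select_pair` and `query_oracle`, and the probability dictionary they operate on, are hypothetical.

```python
def select_pair(probabilities):
    """Pick the (sample_id, label) pair the classifier is least sure about.

    `probabilities` maps (sample_id, label) to the predicted probability
    that the label applies to the sample. Closeness to 0.5 is used here as
    a simple proxy for the error parameter named in the abstract.
    """
    return min(probabilities, key=lambda pair: abs(probabilities[pair] - 0.5))


def query_oracle(pair, oracle):
    """Ask the oracle (for example, a human annotator) to label one pair."""
    sample_id, label = pair
    return oracle(sample_id, label)
```

Selecting one sample-label pair at a time, rather than a whole sample, means the oracle confirms or rejects a single label instead of annotating every label of a sample.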
Abstract:
Kernelized spatial-contextual image classification is disclosed. One embodiment comprises generating a first spatial-contextual model to represent a first image, the first spatial-contextual model having a plurality of interconnected nodes arranged in a first pattern of connections with each node connected to at least one other node, generating a second spatial-contextual model to represent a second image using the first pattern of connections, and estimating the distance between corresponding nodes in the first spatial-contextual model and the second spatial-contextual model based on a relationship with adjacent connected nodes to determine a distance between the first image and the second image.