Abstract:
A method for obtaining and automatically classifying images into events comprising the steps of: (a) obtaining a group of images from a digital source, wherein the images are in chronological order; (b) transferring the group of images to a computer system; said computer system (c) clustering the images into smaller groups based on chronological image similarity of nearby images by computing histograms of the images and comparing histogram intersection values obtained therefrom with one or more thresholds, whereby the clustering based on chronological image similarity is done in at least one stage by comparing each image with its direct neighboring images; and (d) evaluating the clustered images against a final condition related to at least one of a predetermined group maximum for the number of smaller groups and a predetermined maximum number of isolated pictures, whereby the smaller groups are classified as events if the final condition is met.
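As a rough illustration of step (c) and step (d), the sketch below clusters a chronological list of images by comparing each image's histogram with that of its direct predecessor. It is a minimal sketch and not the claimed method: the bin count, the single threshold, the flat grayscale pixel lists, and the function names are all assumptions made for the example, and the final condition here requires both limits to hold although the abstract only requires at least one.

```python
def histogram(pixels, bins=8, max_value=256):
    """Normalized intensity histogram of a flat list of pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def cluster_by_neighbor_similarity(images, threshold=0.5):
    """Start a new group whenever an image differs too much from its predecessor."""
    if not images:
        return []
    hists = [histogram(img) for img in images]
    groups = [[0]]
    for i in range(1, len(images)):
        if intersection(hists[i - 1], hists[i]) >= threshold:
            groups[-1].append(i)   # similar to the previous image: same group
        else:
            groups.append([i])     # dissimilar: open a new group
    return groups


def meets_final_condition(groups, max_groups, max_isolated):
    """Assumed final condition: few enough groups and few enough single-image groups."""
    isolated = sum(1 for g in groups if len(g) == 1)
    return len(groups) <= max_groups and isolated <= max_isolated


dark, dark2, bright = [10] * 16, [12] * 16, [250] * 16
groups = cluster_by_neighbor_similarity([dark, dark2, bright, bright])
```

Here the two dark images and the two bright images fall into separate groups, since the histogram intersection across the dark-to-bright transition is zero.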
Abstract:
A method for automatically classifying images into events for composing and authoring a multimedia image program on a recordable optical disc comprises the steps of: (a) receiving a plurality of images having a date and/or time of image capture; (b) determining one or more largest time differences of the plurality of images based on clustering of the images; (c) separating the plurality of images into events with one or more boundaries between events, wherein the one or more boundaries correspond to the one or more largest time differences; (d) specifying at least one multimedia feature that is related to each event; (e) encoding the images between event boundaries and the at least one multimedia feature associated therewith into an event bitstream; and (f) writing each event bitstream to the recordable optical disc, whereby each event is authored into a separate section of the recordable optical disc.
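Steps (b) and (c) can be sketched as follows. This is a simplified illustration under stated assumptions: the abstract derives the largest time differences from a clustering step, whereas this example simply takes a caller-supplied number of largest gaps. The function name `split_into_events` and the parameter `num_boundaries` are hypothetical.

```python
from datetime import datetime


def split_into_events(capture_times, num_boundaries):
    """Split capture times into events at the largest gaps between neighbors."""
    times = sorted(capture_times)
    if len(times) < 2 or num_boundaries < 1:
        return [times] if times else []
    # Pair each gap with the index of the image that follows it.
    gaps = [(times[i + 1] - times[i], i + 1) for i in range(len(times) - 1)]
    largest = sorted(gaps, key=lambda g: g[0], reverse=True)[:num_boundaries]
    starts = sorted(idx for _, idx in largest)  # indices where a new event begins
    events, prev = [], 0
    for s in starts:
        events.append(times[prev:s])
        prev = s
    events.append(times[prev:])
    return events


times = [
    datetime(2020, 1, 1, 10, 0),
    datetime(2020, 1, 1, 10, 5),
    datetime(2020, 1, 1, 15, 0),
    datetime(2020, 1, 1, 15, 2),
    datetime(2020, 1, 3, 9, 0),
]
events = split_into_events(times, num_boundaries=2)
```

With two boundaries, the five capture times split into a morning event, an afternoon event, and a single image two days later.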
Abstract:
A method for identifying a set of key video frames from a video sequence comprising extracting feature vectors for each video frame and applying a group sparsity algorithm to represent the feature vector for a particular video frame as a group sparse combination of the feature vectors for the other video frames. Weighting coefficients associated with the group sparse combination are analyzed to determine video frame clusters of temporally-contiguous, similar video frames. The video sequence is segmented into scenes by identifying scene boundaries based on the determined video frame clusters.
Abstract:
A method for determining a scene boundary location dividing a first scene and a second scene in an input video sequence. The scene boundary location is determined responsive to a merit function value, which is a function of the candidate scene boundary location. The merit function value for a particular candidate scene boundary location is determined by representing the dynamic scene content for the input video frames before and after the candidate scene boundary using sparse combinations of a set of basis functions, wherein the sparse combinations of the basis functions are determined by finding a sparse vector of weighting coefficients for each of the basis functions. The weighting coefficients determined for each of the input video frames are combined to determine the merit function value. The candidate scene boundary providing the smallest merit function value is designated to be the scene boundary location.
Abstract:
Computing a scale factor to insert a first set of shapes into a second set of shapes to form a combined image includes receiving the two sets of shapes, using a processor to convert the first set of shapes into a set of rectangles and the second set of shapes into a set of intervals and computing the scale factor for either the set of intervals or the set of rectangles to generate the combined image by iteratively inserting the set of rectangles into the set of intervals and updating the scale factor in response to a residual area or an overflow area until all the rectangles in the set of rectangles have been inserted into the set of intervals and the residual area in the set of intervals is below a threshold, and storing the combined image in memory.
Abstract:
Generating a tag layout from a set of tags and an ordering of the set of tags, wherein each tag includes a text label and a size for the text label, is disclosed. The method further includes receiving at least one closed shape corresponding to a space for the tag layout. A processor computes a scale factor for at least one of the closed shape or the size of the text labels in the set of tags to generate the tag layout of the set of tags within the closed shape such that all the tags in the set of tags fit within the closed shape and the tags are placed in the space based at least upon the ordering of the tags in the set of tags.
Abstract:
A method of identifying groups of related digital images in a digital image collection, comprising: analyzing each of the digital images to generate associated feature descriptors related to image content or image capture conditions; storing the feature descriptors associated with the digital images in a metadata database; automatically analyzing the metadata database to identify a plurality of frequent itemsets, wherein each of the frequent itemsets is a co-occurring feature descriptor group that occurs in at least a predefined fraction of the digital images; determining a probability of occurrence for each of the identified frequent itemsets; determining a quality score for each of the identified frequent itemsets responsive to the determined probability of occurrence; ranking the frequent itemsets based at least on the determined quality scores; and identifying one or more groups of related digital images corresponding to one or more of the top ranked frequent itemsets.
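The frequent-itemset steps can be illustrated with a small brute-force sketch. Several choices here are assumptions and not taken from the abstract: each image's descriptors are modeled as a Python set, the probability of occurrence is taken to be the product of the individual descriptor frequencies (an independence assumption), and the quality score is the ratio of observed support to that probability. All function names are hypothetical.

```python
from itertools import combinations
from math import prod


def frequent_itemsets(records, min_fraction, max_size=3):
    """Return {itemset: support} for descriptor groups in >= min_fraction of images."""
    n = len(records)
    items = sorted(set().union(*records))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            support = sum(1 for r in records if r.issuperset(combo)) / n
            if support >= min_fraction:
                result[frozenset(combo)] = support
    return result


def rank_itemsets(records, min_fraction):
    """Rank multi-descriptor itemsets by an assumed quality score, best first."""
    found = frequent_itemsets(records, min_fraction)
    scored = []
    for itemset, support in found.items():
        if len(itemset) < 2:
            continue
        # Assumed probability of occurrence if the descriptors were independent.
        expected = prod(found[frozenset([d])] for d in itemset)
        scored.append((support / expected, sorted(itemset)))
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored


def images_for(records, itemset):
    """Indices of the images whose descriptors contain every item in the itemset."""
    return [i for i, r in enumerate(records) if r.issuperset(itemset)]


records = [{"beach", "sunny"}, {"beach", "sunny"}, {"sunny", "park"}, {"indoor"}]
ranked = rank_itemsets(records, min_fraction=0.5)
group = images_for(records, ranked[0][1])
```

The exhaustive enumeration is only practical for a handful of descriptors; a real collection would call for an Apriori-style or similar mining algorithm.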
Abstract:
A method for producing an audio-visual slideshow for a video sequence having an audio soundtrack and a corresponding video track including a time sequence of image frames, comprising: segmenting the audio soundtrack into a plurality of audio segments; subdividing the audio segments into a sequence of audio frames; determining a corresponding audio classification for each audio frame; automatically selecting a subset of the audio segments responsive to the audio classification for the corresponding audio frames; for each of the selected audio segments automatically analyzing the corresponding image frames to select one or more key image frames; merging the selected audio segments to form an audio summary; forming an audio-visual slideshow by combining the selected key frames with the audio summary, wherein the selected key frames are displayed synchronously with their corresponding audio segment; and storing the audio-visual slideshow in a processor-accessible storage memory.
Abstract:
A method for identifying high saliency regions in a digital image, comprising: segmenting the digital image into a plurality of segmented regions; determining a saliency value for each segmented region; merging neighboring segmented regions that share a common boundary in response to determining that one or more specified merging criteria are satisfied; and designating one or more of the segmented regions to be high saliency regions. The determination of the saliency value for a segmented region includes: determining a surround region including a set of image pixels surrounding the segmented region; analyzing the image pixels in the segmented region to determine one or more segmented region attributes; analyzing the image pixels in the surround region to determine one or more corresponding surround region attributes; and determining a region saliency value responsive to differences between the one or more segmented region attributes and the corresponding surround region attributes.
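The region-versus-surround comparison can be sketched as below. This is a minimal illustration under assumptions of its own: the image is a grayscale grid stored as a list of lists, the only attribute is mean intensity, and the surround is the region's bounding box grown by a hypothetical `margin` parameter. The segmentation and merging steps are not shown.

```python
def mean_intensity(image, pixels):
    """Mean gray level over a collection of (row, col) pixel coordinates."""
    return sum(image[r][c] for r, c in pixels) / len(pixels)


def surround_region(image, region, margin=1):
    """Pixels within `margin` of the region's bounding box, excluding the region."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    r0 = max(min(rows) - margin, 0)
    r1 = min(max(rows) + margin, len(image) - 1)
    c0 = max(min(cols) - margin, 0)
    c1 = min(max(cols) + margin, len(image[0]) - 1)
    box = {(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)}
    return box - set(region)


def region_saliency(image, region, margin=1):
    """Absolute difference between the region's and the surround's mean intensity."""
    surround = surround_region(image, region, margin)
    if not surround:
        return 0.0
    return abs(mean_intensity(image, region) - mean_intensity(image, surround))


image = [
    [0, 0, 0, 0],
    [0, 200, 200, 0],
    [0, 200, 200, 0],
    [0, 0, 0, 0],
]
center = {(1, 1), (1, 2), (2, 1), (2, 2)}
corner = {(0, 0)}
center_saliency = region_saliency(image, center)
corner_saliency = region_saliency(image, corner)
```

The bright central block scores higher than the dark corner pixel, so under this simplified attribute it would be the candidate for a high saliency region.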