Abstract:
A method and apparatus for summarizing video data are disclosed. In one embodiment, a method includes accessing face information associated with at least one identified person within frames of the video data, examining user specifications for selecting portions of the video data, comparing the user specifications with the face information to determine indicia of interest related to each of the frames, identifying at least one frame that accords with the user specifications based on the indicia of interest, and forming summary video data using the at least one identified frame.
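The selection step can be illustrated with a minimal sketch. It assumes face information is already available as a set of person names per frame, and that the "indicia of interest" reduce to a single score (the fraction of requested people visible). The `Frame` class, `interest_score`, `summarize`, and the threshold are hypothetical names chosen for illustration, not terms from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    index: int
    # Names of identified people whose faces appear in this frame (hypothetical field).
    faces: set[str] = field(default_factory=set)


def interest_score(frame: Frame, wanted_people: set[str]) -> float:
    """Fraction of the requested people visible in the frame (assumed indicium of interest)."""
    if not wanted_people:
        return 0.0
    return len(frame.faces & wanted_people) / len(wanted_people)


def summarize(frames: list[Frame], wanted_people: set[str], min_score: float = 0.5) -> list[int]:
    """Return indices of frames whose interest score meets the user's threshold."""
    return [f.index for f in frames if interest_score(f, wanted_people) >= min_score]


frames = [
    Frame(0, {"alice"}),
    Frame(1, set()),
    Frame(2, {"alice", "bob"}),
    Frame(3, {"carol"}),
]
summary = summarize(frames, wanted_people={"alice", "bob"}, min_score=0.5)
print(summary)  # [0, 2]
```

The returned indices identify the frames that would be concatenated into the summary video.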
Abstract:
Generating smart tags that allow a user to locate any portion of image content without viewing the image content is disclosed. Image-based processing is performed on image content to find an event of interest that is an occurrence captured by the image content. Thus, metadata is derived from analyzing the image content. The metadata is then analyzed. Different types of characteristics associated with portions of the image content as indicated by the metadata are detected. Responsive to this, tags are created, and different types of tags are applied to the portions of image content to categorize the portions into classes. Thus, a tag is associated with each portion of the image content including the event of interest. The tag describes a characteristic of that portion of the image content. Display of the different types of tags is initiated for selective viewing of the portions of the image content.
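As a rough illustration of deriving tags from metadata, the sketch below assumes the image analysis has already produced a small metadata record per portion of content. The field names (`face_count`, `blur`, `motion`), the tag types, and the thresholds are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical per-segment metadata, as if produced by earlier image analysis.
segments = [
    {"id": "seg-0", "face_count": 2, "blur": 0.1, "motion": 0.2},
    {"id": "seg-1", "face_count": 0, "blur": 0.8, "motion": 0.1},
    {"id": "seg-2", "face_count": 0, "blur": 0.2, "motion": 0.9},
]

# Each rule maps a tag type to a predicate over the metadata (thresholds are assumptions).
TAG_RULES = {
    "faces": lambda m: m["face_count"] > 0,
    "blurry": lambda m: m["blur"] > 0.5,
    "high_motion": lambda m: m["motion"] > 0.7,
}


def tag_segments(segments):
    """Group segment ids by every tag type whose rule matches their metadata."""
    tagged = defaultdict(list)
    for meta in segments:
        for tag, rule in TAG_RULES.items():
            if rule(meta):
                tagged[tag].append(meta["id"])
    return dict(tagged)


tags = tag_segments(segments)
print(tags)  # {'faces': ['seg-0'], 'blurry': ['seg-1'], 'high_motion': ['seg-2']}
```

Grouping by tag type is what lets a viewer jump to, say, every blurry portion without playing the content.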
Abstract:
Embodiments herein include presenting smart tags describing characteristics of image content in a hierarchy, and performing operations on the hierarchy to find particular image content within a larger amount of image content. Image content and corresponding tags are maintained. The corresponding tags associated with the image content are presented in a hierarchy. Each tag type in the hierarchy represents a characteristic associated with the image content. Each tag in the hierarchy is derived based on image-based processing applied to the image content. In response to receiving a selection of at least one tag in the hierarchy, display of the image content associated with the at least one tag is initiated. A user is able to quickly and easily find desired image content by using the hierarchy to look at tags, select a type of tag from the hierarchy, and thereafter view any content tagged with the selected tag type.
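A tag hierarchy of this kind might be modelled as a nested mapping. The sketch below is only an assumption about one possible layout (category, then tag type, then content ids). Combining several selected tags by intersection is also a design choice of this example, not something the abstract specifies.

```python
# Hypothetical hierarchy: tag category -> tag type -> ids of tagged content.
hierarchy = {
    "quality": {"blurry": ["img-3"], "sharp": ["img-1", "img-2"]},
    "people": {"one_face": ["img-1"], "group": ["img-2", "img-3"]},
}


def show(hierarchy):
    """Render the hierarchy as indented lines, with a content count per tag."""
    lines = []
    for category, tags in hierarchy.items():
        lines.append(category)
        for tag, items in tags.items():
            lines.append(f"  {tag} ({len(items)})")
    return lines


def select(hierarchy, *paths):
    """Return content ids tagged with every selected (category, tag) pair."""
    results = [set(hierarchy[category][tag]) for category, tag in paths]
    return sorted(set.intersection(*results)) if results else []


print("\n".join(show(hierarchy)))
print(select(hierarchy, ("quality", "sharp"), ("people", "group")))  # ['img-2']
```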
Abstract:
Systems and methods for identifying, tracking, and using objects in a video or similar electronic content, including methods for tracking one or more moving objects in a video. This can involve tracking one or more feature points within a video scene and separating those feature points into multiple layers based on motion paths. Each such motion layer can be further divided into different clusters, for example, based on distances between points. These clusters can then be used as an estimate to define the boundaries of the objects in video. Objects can also be compared with one another in cases in which identified objects should be combined and considered a single object. For example, if two objects in the first two frames have significantly overlapping areas, they may be considered the same object. Objects in each frame can further be compared to determine the life of the objects across the frames.
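The distance-based clustering and the overlap test can be sketched as follows. This assumes the feature points of one motion layer are given as (x, y) tuples, uses single-link clustering with a hypothetical `max_gap` threshold, and measures overlap against the smaller of two bounding boxes. None of these specifics come from the abstract.

```python
from math import dist


def cluster_points(points, max_gap):
    """Single-link clustering: points within max_gap of each other share a cluster."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= max_gap:
                parent[find(i)] = find(j)

    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())


def bounding_box(cluster):
    """Axis-aligned box (x_min, y_min, x_max, y_max) estimating the object boundary."""
    xs, ys = zip(*cluster)
    return min(xs), min(ys), max(xs), max(ys)


def overlap_ratio(a, b):
    """Intersection area divided by the smaller box's area."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    smaller = min((a[2] - a[0]) * (a[3] - a[1]), (b[2] - b[0]) * (b[3] - b[1]))
    return (w * h) / smaller if smaller > 0 else 0.0


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
boxes = sorted(bounding_box(c) for c in cluster_points(points, max_gap=1.5))
print(boxes)  # [(0, 0, 1, 1), (10, 10, 11, 10)]
print(overlap_ratio((0, 0, 4, 4), (2, 2, 6, 6)))  # 0.25
```

Two objects whose `overlap_ratio` exceeds a chosen threshold could then be treated as the same object.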
Abstract:
A method and apparatus for tracking objects within a video frame sequence. In one embodiment, a method for tracking an object within a video frame sequence is disclosed. The method includes processing each jth frame of the video frame sequence to determine a motion vector defining motion between a prior jth frame and a current jth frame. The method includes creating an object descriptor for an object being tracked followed by generating a document object model comprising motion information and the object descriptor.
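A simplified sketch of these steps is shown below. It assumes the tracked object's centroid is known in every frame, takes the motion vector to be the centroid displacement between consecutive sampled frames, and uses an XML element tree as a stand-in for the document object model. The element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET


def motion_vectors(centroids, j):
    """Displacement of the object centroid between consecutive sampled frames.

    centroids[k] is the (x, y) object position in frame k; only every jth frame is used.
    """
    sampled = list(range(0, len(centroids), j))
    vectors = []
    for prev, cur in zip(sampled, sampled[1:]):
        dx = centroids[cur][0] - centroids[prev][0]
        dy = centroids[cur][1] - centroids[prev][1]
        vectors.append((cur, dx, dy))
    return vectors


def build_document(descriptor, vectors):
    """Build an XML tree holding the object descriptor and its motion information."""
    root = ET.Element("object", descriptor)
    for frame, dx, dy in vectors:
        ET.SubElement(root, "motion", frame=str(frame), dx=str(dx), dy=str(dy))
    return root


centroids = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2)]
vectors = motion_vectors(centroids, j=2)
doc = build_document({"id": "obj-1", "label": "ball"}, vectors)
print(vectors)  # [(2, 2, 1), (4, 2, 1)]
print(ET.tostring(doc, encoding="unicode"))
```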
Abstract:
Methods and apparatus provide for a Scene Detector to optimize the location of scene breaks in a set of video frames. Specifically, the Scene Detector receives a set of video frames and a corresponding content model for each video frame. As the Scene Detector identifies a scene in the set of video frames, the Scene Detector updates statistical predictors with respect to that scene's characteristics. The Scene Detector then uses the updated statistical predictors to identify a video frame that may be the next scene break. The Scene Detector analyzes video frames with respect to the possible next scene break in order to identify the actual second scene break that occurs after the previously identified scene break.
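The predict-then-refine loop might look like the sketch below. It makes several assumptions: the content model is reduced to one dissimilarity score per frame, the only statistical predictor is the running mean scene length, and the `window` and `min_diff` parameters are hypothetical tuning values.

```python
from statistics import mean


def detect_breaks(diffs, first_break, window=2, min_diff=0.5):
    """Locate scene breaks given per-frame difference scores.

    diffs[k] is a (hypothetical) dissimilarity between frame k-1 and frame k.
    The running mean scene length predicts the next break; the actual break is
    the largest difference within `window` frames of that prediction.
    """
    breaks = [0, first_break]
    while True:
        lengths = [b - a for a, b in zip(breaks, breaks[1:])]
        predicted = breaks[-1] + round(mean(lengths))  # updated predictor
        lo = max(breaks[-1] + 1, predicted - window)
        hi = min(len(diffs) - 1, predicted + window)
        if lo > hi:
            return breaks[1:]
        actual = max(range(lo, hi + 1), key=lambda k: diffs[k])
        if diffs[actual] < min_diff:
            return breaks[1:]  # nothing near the prediction looks like a break
        breaks.append(actual)


diffs = [0.1] * 20
for k in (5, 11, 16):
    diffs[k] = 0.9
found = detect_breaks(diffs, first_break=5)
print(found)  # [5, 11, 16]
```

Each confirmed break feeds back into the mean scene length, so the search window for the next break moves with the observed pacing of the video.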
Abstract:
Methods and apparatus provide for a clip-beat aligner that identifies musical beats in an audio file. An editing mode is provided to associate the audio file with a media segment according to a timeline. The clip-beat aligner aligns a boundary of the media segment with a musical beat on the timeline. When an editing operation is performed, the clip-beat aligner keeps the boundary of the media segment aligned with one of the musical beats. To align a boundary of each media segment with a musical beat, the clip-beat aligner identifies a musical beat that is proximate to the position of the media segment's boundary. The clip-beat aligner then aligns the media segment's boundary with the proximate musical beat by, if necessary, automatically trimming the media segment's duration such that the media segment's boundary occurs at the same moment in time as the proximate musical beat.
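The trimming step can be sketched under a few assumptions: beat times have already been detected and are given as a sorted list of seconds, only the end boundary of a segment is aligned, and the "proximate" beat is taken to be the latest beat at or before that boundary, so that trimming alone is sufficient. The function name and parameters are hypothetical.

```python
from bisect import bisect_right


def align_clip_end(start, duration, beats):
    """Trim a clip so its end boundary falls on a beat.

    beats must be sorted ascending. Returns the new duration; the clip is left
    unchanged when no beat lies strictly inside it or at its end.
    """
    end = start + duration
    i = bisect_right(beats, end)  # beats[:i] are all <= end
    if i == 0 or beats[i - 1] <= start:
        return duration
    return beats[i - 1] - start


beats = [0.0, 0.5, 1.0, 1.5, 2.0]
print(align_clip_end(start=0.5, duration=1.3, beats=beats))  # 1.0
```

Here the clip originally ends at 1.8 s, so it is trimmed to end on the beat at 1.5 s.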