摘要:
Techniques for generating a storyboard are disclosed. In one embodiment of the invention the storyboard is comprised of videos from one or more cameras based on the identification of activity in the video. Various embodiments of the invention include an assessment of the importance of the activity, the creation of a storyboard presentation based on importance and interaction techniques for seeing more details or alternate views of the video. In one embodiment, motion detection is used to determine activity in one or more synchronized video streams. Periods of activity are recognized and assigned importance assessments based on the activity, important locations in the video streams, and events from other sensors. In different embodiments, the interface consists of a storyboard and a map.
摘要:
Techniques for generating action keyframes for a fixed-position camera based on the identification of activity in the video, an assessment of the importance of the activity, object recognition in the video, and interaction techniques for seeing more details of the video are presented. In different embodiments of the invention, the importance of activity is determined based on the amount of activity, important locations in the video streams, detected features such as faces, and events from other sensors.
摘要:
Techniques for generating timelines and event logs from one or more fixed-position cameras based on the identification of activity in the video are presented. Various embodiments of the invention include an assessment of the importance of the activity, the creation of a timeline identifying events of interest, and interaction techniques for seeing more details of an event or alternate views of the video. In one embodiment, motion detection is used to determine activity in one or more synchronized video streams. In another embodiment, events are determined based on periods of activity and assigned importance assessments based on the activity, important locations in the video streams, and events from other sensors. In different embodiments, the interface consists of a timeline, event log, and map.
摘要:
A method for exchanging information in a shared interactive environment, comprising selecting a first physical device in a first live video image wherein the first physical device has information associated with it, causing the information to be transferred to a second physical device in a second live video image wherein the transfer is brought about by manipulating a visual representation of the information, wherein the manipulation includes interacting with the first live video image and the second live video image, wherein the first physical device and the second physical device are part of the shared interactive environment, and wherein the first physical device and the second physical device are not the same.
摘要:
An adaptive, interactive visual workspace for viewing groups of files based on their relationships. Relationships of files are visualized using iterative refinement of categories through a direct-manipulation graph-based layout. The visual workspace starts with a fully connected graph linking thumbnail images of related files that is then partitioned into neighborhoods in response to a user creating file stacks corresponding to different categories. Normalized spring lengths improve the overall quality of the layout. Different modes for membership in neighborhoods avoid confusing motion of files and help a user to manually organize the workspace. Additionally, retrieved files can be added without having to significantly move the previous files. Different visualization techniques indicate which files are related to each other. Different zoom rates are used for file location, and surrogate sizes allow users to increase the separation between files while still increasing the surrogate sizes.
摘要:
A method, information system, and computer-readable medium is provided for segmenting a plurality of data, such as multimedia data, and in particular an image document stream. Segment boundary points may be used for retrieving and/or browsing the plurality of data. Similarly, segment boundary points may be used to summarize the plurality of data. Examples of image document streams include video, PowerPoint slides, and NoteLook pages. A genetic method having a fitness or evaluation function using information retrieval concepts, such as importance and precedence, is used to obtain segment boundary points. The genetic method is able to evaluate a large amount of data in a cost effective manner. The genetic method is also able to run incrementally on streaming video and adapt to usage patterns by considering frequently accessed images.
摘要:
Methods for interactive selecting video queries consisting of training images from a video for a video similarity search and for displaying the results of the similarity search are disclosed. The user selects a time interval in the video as a query definition of training images for training an image class statistical model. Time intervals can be as short as one frame or consist of disjoint segments or shots. A statistical model of the image class defined by the training images is calculated on-the-fly from feature vectors extracted from transforms of the training images. For each frame in the video, a feature vector is extracted from the transform of the frame, and a similarity measure is calculated using the feature vector and the image class statistical model. The similarity measure is derived from the likelihood of a Gaussian model producing the frame. The similarity is then presented graphically, which allows the time structure of the video to be visualized and browsed. Similarity can be rapidly calculated for other video files as well, which enables content-based retrieval by example. A content-aware video browser featuring interactive similarity measurement is presented. A method for selecting training segments involves mouse click-and-drag operations over a time bar representing the duration of the video; similarity results are displayed as shades in the time bar. Another method involves selecting periodic frames of the video as endpoints for the training segment.
摘要:
The invention displays video search results in a form that makes it easy for users to determine which results are truly relevant. Each story returned as a search result is visualized as a collage of keyframes from the story's shots. The selected keyframes and their sizes depend on the corresponding shots' respective relevance. Shot relevance depends on the search retrieval score of the shot and, in some embodiments, also depends on the search retrieval score of the shot's parent story. Once areas have been determined, the keyframes are scaled and/or cropped to fit into the area. In one embodiment, users can mark one or more shots as being relevant to the search. In one embodiment, a timeline is created and displayed with one or more neighbor stories that are each part of the video and which are closest in time of creation to the selected story.
摘要:
In one aspect, the present invention is directed to a method and an apparatus for organizing digital media, particularly digital photos, using face recognition. According to a first aspect of the present invention, a computer-based method for organizing digital photos comprises: extracting objects of interest from a plurality of photographs; cropping said plurality of photographs to generate images of isolated objects of interest; applying a recognition algorithm to determine the similarity of isolated objects of interest with a reference; displaying a plurality of objects arranged as a function of the determined similarity; and receiving user input to associate said objects with a particular classification.
摘要:
Method for interactive selecting video consisting of training images from a video for a video similarity search and for displaying the results of the similarity search are disclosed. The user selects a time interval in the video as a query definition of training images for training an image class statistical model. Time intervals can be as short as one frame or consist of disjoint segments or shots. A statistical model of the image class defined by the training images is calculated on-the-fly from feature vectors extracted from transforms of the training images. For each frame in the video, a feature vector is extracted from the transform of the frame, and a similarity measure is calculated using the feature vector and the image class statistical model. The similarity measure is derived from the likelihood of a Gaussian model producing the frame. The similarity is then presented graphically, which allows the time structure of the video to be visualized and browsed. Similarity can be rapidly calculated for other video files as well, which enables content-based retrieval by example. A content-aware video browser featuring interactive similarity measurement is presented. A method for selecting training segments involves mouse click-and-drag operations over a time bar representing the duration of the video; similarity results are displayed as shades in the time bar. Another method involves selecting periodic frames of the video as endpoints for the training segment.