Abstract:
Embodiments of the present invention introduce a novel technique to analyze and monitor video streams captured from multiple cameras. It highlights the foreground region of the video streams via local alpha blending and displays the videos in an immersive 3-D environment. The spatial arrangement of the displays can be generated by multi-dimensional scaling of the amount of simultaneous motion across different video streams. This description is not intended to be a complete description of, or limit the scope of, the invention. Other features, aspects, and objects of the invention can be obtained from a review of the specification, the figures, and the claims.
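The foreground highlighting mentioned above can be illustrated with a per-pixel blend. This is only a minimal sketch: the abstract does not say how the alpha values are obtained or what the frame is blended against, so the function name `highlight_foreground`, the dimmed backdrop, and the alpha mask below are all assumptions made for illustration.

```python
def highlight_foreground(frame, backdrop, alpha_mask):
    """Blend per pixel: out = alpha * frame + (1 - alpha) * backdrop.

    All three arguments are equally sized 2-D lists of floats.
    alpha_mask holds values in [0, 1]; here we assume a value near 1
    marks a pixel that is likely foreground (hypothetical convention).
    """
    return [
        [a * f + (1.0 - a) * b for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(frame, backdrop, alpha_mask)
    ]


# Toy 2x2 grayscale frame, an all-black backdrop, and a local alpha mask.
frame = [[200.0, 50.0], [120.0, 80.0]]
backdrop = [[0.0, 0.0], [0.0, 0.0]]
alpha = [[1.0, 0.25], [0.5, 0.0]]

blended = highlight_foreground(frame, backdrop, alpha)
```

Pixels with alpha close to 1 keep their original intensity, while pixels with alpha close to 0 fade toward the backdrop, which is what makes the foreground stand out.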
Abstract:
Embodiments of the present invention introduce a user navigation interface that allows a user to monitor/navigate video streams captured from multiple cameras. It integrates video streams from multiple cameras with the semantic layout into a 3-D immersive environment and renders the video streams in multiple displays on a user navigation interface. It conveys the spatial distribution of the cameras as well as their fields of view and allows a user to navigate freely or switch among preset views. This description is not intended to be a complete description of, or limit the scope of, the invention. Other features, aspects, and objects of the invention can be obtained from a review of the specification, the figures, and the claims.
Abstract:
A method for generating content links between a first digital file and a second digital file by detecting a content feature of a first digital file segment of the first digital file during playback of that segment, searching an index of content features for a plurality of segments including a second digital file segment of the second digital file, and dynamically generating a link between the first digital file segment and the second digital file segment when the content feature of the first digital file segment is related to the content feature of the second digital file segment.
Abstract:
Techniques for reducing the computational complexity of conventional similarity-based approaches for temporal event clustering of digital photograph collections include one or more approaches to select boundaries based on dynamic programming and the Bayesian information criterion. Each method performs competitively with conventional approaches and offers significant computational savings.
Abstract:
A method of extracting audio excerpts comprises: segmenting audio data into a plurality of audio data segments; setting fitness criteria for the plurality of audio data segments; analyzing the plurality of audio data segments based on the fitness criteria; and selecting one of the plurality of audio data segments that satisfies the fitness criteria. In various exemplary embodiments, the method of extracting audio excerpts further comprises associating the selected one of the plurality of audio data segments with video data. In such embodiments, associating the selected one of the plurality of audio data segments with video data may comprise associating the selected one of the plurality of audio data segments with a keyframe.
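The segment, analyze, and select steps above can be sketched as follows. The abstract does not name the fitness criteria, so the fixed-length segmentation, the minimum-length rule, and the mean-energy score used here are assumptions chosen only to make the example concrete.

```python
def segment(samples, length):
    """Split a list of samples into consecutive chunks of `length` samples."""
    return [samples[i:i + length] for i in range(0, len(samples), length)]


def mean_energy(seg):
    """Hypothetical fitness score: average squared amplitude of the segment."""
    return sum(x * x for x in seg) / len(seg)


def select_excerpt(samples, length, min_len, fitness=mean_energy):
    """Return the best-scoring segment among those meeting the length rule.

    Returns None when no segment is long enough.
    """
    candidates = [s for s in segment(samples, length) if len(s) >= min_len]
    return max(candidates, key=fitness) if candidates else None


samples = [0.1, 0.1, 0.9, -0.8, 0.2, 0.0, 0.3]
excerpt = select_excerpt(samples, length=2, min_len=2)
```

In this toy input the trailing one-sample chunk fails the length rule, and the loudest remaining segment, `[0.9, -0.8]`, is selected. A different `fitness` callable can be passed in to model other criteria.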
Abstract:
A method of identifying a location of a mobile device in a building includes identifying non-overlapping regions in a building. A server collects base station signal strength measurements at a plurality of distinct points in the building, with at least one point in each region. The server trains a region classifier for each region. Each region classifier is configured to compute a probability estimate that a test point is inside the region, using inputs that are signal strength differences. The server receives base station signal strength measurements taken by a mobile device at an unknown point. The server computes differences in signal strengths between pairs of base stations, and applies the region classifiers to the signal strength differences, thereby estimating the region where the mobile device is located. The server then transmits the estimated region to a user.
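The inference step described above (pairwise signal strength differences fed to one classifier per region) might look like the sketch below. The abstract does not specify the classifier type or how it is trained, so the logistic form, the hand-picked weights, and the names `ap1`, `lobby`, and `lab` are purely hypothetical.

```python
import math
from itertools import combinations


def pairwise_differences(rssi):
    """Differences in signal strength for every pair of base stations.

    `rssi` maps a base station id to a measured strength in dBm.
    Ids are sorted so the feature order is stable across readings.
    """
    ids = sorted(rssi)
    return [rssi[a] - rssi[b] for a, b in combinations(ids, 2)]


def logistic(weights, bias):
    """Build a stand-in region classifier with fixed (untrained) weights."""
    def classify(features):
        z = bias + sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z))
    return classify


def estimate_region(rssi, classifiers):
    """Apply every region classifier and return the most probable region."""
    features = pairwise_differences(rssi)
    probs = {region: clf(features) for region, clf in classifiers.items()}
    return max(probs, key=probs.get), probs


# Hypothetical classifiers; real ones would be trained on survey data.
classifiers = {
    "lobby": logistic([0.2, 0.0, 0.0], 0.0),
    "lab": logistic([-0.2, 0.0, 0.0], 0.0),
}
reading = {"ap1": -40.0, "ap2": -70.0, "ap3": -60.0}
region, probs = estimate_region(reading, classifiers)
```

Working with differences rather than raw strengths means a constant offset added to every measurement, for example from a less sensitive receiver, leaves the features unchanged.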
Abstract:
Described is a system for automatic digital photo orientation detection. We leverage online public photos with great content variation to extract effective features with layout information. Classification proceeds using an approximate nearest neighbors approach, which scales well to massive training sets with little loss of efficiency. We have tested the method successfully on the largest data set to date of nearly 30,000 Flickr photos as well as both difficult and typical consumer usage scenarios. Though limited data are available for comparison across different systems, the proposed system significantly outperforms a state-of-the-art system on a common data set.
Abstract:
Embodiments of the present invention provide a system and method for discriminatively selecting keyframes that are representative of segments of a source digital media and at the same time distinguishable from other keyframes representing other segments of the digital media. In one embodiment, the method and system include pre-processing the source digital media to obtain feature vectors for frames of the media, and discriminatively selecting a keyframe as a representative for each segment of the source digital media. The discriminative selection includes determining a similarity measure and a dissimilarity measure for each candidate keyframe, and selecting the keyframe with the highest goodness value computed from the similarity and dissimilarity measures.
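One way to picture the discriminative selection above is the sketch below. The abstract does not define the similarity measure, the dissimilarity measure, or how they combine into a goodness value, so the choices here (negative mean Euclidean distance to the candidate's own segment, mean distance to all other segments, and a plain sum) are assumptions for illustration only. At least two segments are required.

```python
import math


def mean_distance(vec, others):
    """Average Euclidean distance from `vec` to each vector in `others`."""
    return sum(math.dist(vec, o) for o in others) / len(others)


def select_keyframes(segments):
    """Return, for each segment, the index of its best candidate frame.

    `segments` is a list of segments; each segment is a list of
    feature vectors (one per frame).
    """
    keyframes = []
    for i, seg in enumerate(segments):
        # Frames belonging to every other segment.
        rest = [f for j, s in enumerate(segments) if j != i for f in s]

        def goodness(frame):
            similarity = -mean_distance(frame, seg)      # close to own segment
            dissimilarity = mean_distance(frame, rest)   # far from the others
            return similarity + dissimilarity

        keyframes.append(max(range(len(seg)), key=lambda k: goodness(seg[k])))
    return keyframes


# Two toy segments with 2-D feature vectors.
segments = [
    [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)],
    [(5.0, 0.0), (9.0, 0.0), (10.0, 0.0)],
]
chosen = select_keyframes(segments)
```

In the first toy segment the frame at `(1.0, 0.0)` is the most central, yet the frame at `(0.0, 0.0)` is chosen because it lies farther from the second segment. That trade-off between representing a segment and standing apart from the others is the point of the discriminative criterion.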
Abstract:
The present invention provides a system and method for automatically combining image and audio data to create a multimedia presentation. In one embodiment, audio and image data are received by the system. The audio data includes a list of events that correspond to points of interest in an audio file. The audio data may also include an audio file or audio stream. The received images are then matched to the audio file or stream by time. In one embodiment, the events represent times within the audio file or stream at which there is a certain feature or characteristic in the audio file. The audio events list may be processed to remove, sort, predict, or otherwise generate audio events. Image processing may also occur, and may include image analysis to determine image matching to the event list, deleting images, and processing images to incorporate effects. Image effects may include cropping, panning, zooming, and other visual effects.
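The time-based matching of images to audio events described above can be sketched as a simple scheduling step. The abstract does not give the matching rule, so the policy below (sort the events, discard those outside the audio, and show one image per interval between consecutive events) is an assumption, as are the function name `schedule_images` and the file names.

```python
def schedule_images(images, event_times, audio_length):
    """Assign each image a (start, end) display interval from audio events.

    Events outside [0, audio_length) are dropped and the rest are sorted.
    The i-th image is shown from the i-th event until the next event,
    or until the end of the audio for the last event. Surplus images or
    surplus events are left unused.
    """
    times = sorted(t for t in event_times if 0.0 <= t < audio_length)
    bounds = times + [audio_length]
    return [
        (image, start, end)
        for image, start, end in zip(images, bounds, bounds[1:])
    ]


# Hypothetical inputs: three images and an unsorted event list in seconds,
# one of which (99.0) falls beyond the 20-second audio file.
slides = schedule_images(
    ["a.jpg", "b.jpg", "c.jpg"],
    [12.5, 0.0, 4.0, 99.0],
    audio_length=20.0,
)
```

Each resulting tuple could then drive an effect such as a pan or zoom whose duration equals `end - start`.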