Abstract:
A method and apparatus for reducing the number of Intra-coded pictures (I-Pictures) without any quality degradation. In one embodiment, the method takes advantage of characteristics of a heterogeneous network, such as Digital Subscriber Line (DSL).
Abstract:
A method and apparatus for reconstructing digital video is provided. In one embodiment, a method for transmitting an original video stream including at least one frame includes segmenting the frame into a plurality of regions, encoding each region in accordance with the one of a plurality of available interpolation algorithms that provides minimal distortion for the region, and providing a signal containing information that enables a decoder to identify the interpolation algorithm corresponding to each of the regions. This information enables a decoder to enhance a base layer video stream while minimizing the amount of information that must be provided to the decoder in order to perform the enhancement.
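The per-region selection described above can be illustrated with a minimal sketch. It is not the patented encoder: it assumes a one-dimensional signal, a base layer formed by keeping every second sample, and two hypothetical candidate interpolators (`nearest` and `linear`). The names `INTERPOLATORS`, `sse`, and `choose_interpolators` are introduced here for illustration only.

```python
def nearest(base, n):
    """Upsample by 2 via sample replication (hypothetical candidate 0)."""
    return [base[i // 2] for i in range(n)]


def linear(base, n):
    """Upsample by 2 via linear interpolation (hypothetical candidate 1)."""
    out = []
    for i in range(n):
        lo = base[i // 2]
        hi = base[min(i // 2 + 1, len(base) - 1)]
        out.append(lo if i % 2 == 0 else (lo + hi) / 2)
    return out


# Assumed set of available interpolation algorithms, indexed by position.
INTERPOLATORS = [nearest, linear]


def sse(a, b):
    """Sum of squared errors, used here as the distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def choose_interpolators(original, region_size):
    """Return one interpolator index per region: the side information
    a decoder would need to enhance the base layer."""
    signal = []
    for start in range(0, len(original), region_size):
        region = original[start:start + region_size]
        base = region[::2]  # assumed base layer: every second sample
        errors = [sse(f(base, len(region)), region) for f in INTERPOLATORS]
        signal.append(errors.index(min(errors)))
    return signal
```

Under these assumptions the encoder transmits only one small index per region rather than any residual data, which is the sense in which the side information stays small.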
Abstract:
A method for encoding a frame of visual data which includes the steps of encoding an original full resolution frame, storing coded data for the encoded full resolution frame, reconstructing and storing the encoded full resolution frame, downsampling the original full resolution frame to render it a reduced spatial resolution frame, encoding the reduced spatial resolution frame, storing coded data for the reduced spatial resolution frame, reconstructing and storing the reduced spatial resolution frame, upsampling and storing the reconstructed reduced spatial resolution frame, comparing a characteristic in the reconstructed full resolution frame with said characteristic in the original full resolution frame to determine the deviation of the reconstructed full resolution frame from the original full resolution frame with respect to said characteristic, comparing said characteristic in the upsampled reconstructed reduced spatial resolution frame with said characteristic in the original full resolution frame to determine the deviation of the upsampled reconstructed reduced spatial resolution frame from the original full resolution frame with respect to said characteristic, selecting the frame with the lesser deviation from the original full resolution frame with respect to said characteristic, and outputting the coded data corresponding to the frame with the lesser deviation from the original full resolution frame with respect to said characteristic to the bitstream.
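The selection between the two coding paths can be sketched as follows. This is a toy illustration under stated assumptions, not the claimed encoder: uniform quantization stands in for "encode and reconstruct", downsampling is 2x2 averaging, upsampling is pixel replication, and the compared characteristic is mean squared error. All function names and the step-size parameters are hypothetical.

```python
def quantize(frame, step):
    """Stand-in for encode + reconstruct: uniform quantization of samples."""
    return [[round(v / step) * step for v in row] for row in frame]


def downsample(frame):
    """Halve resolution by averaging each 2x2 block (even dimensions assumed)."""
    return [[(frame[y][x] + frame[y][x + 1]
              + frame[y + 1][x] + frame[y + 1][x + 1]) / 4
             for x in range(0, len(frame[0]), 2)]
            for y in range(0, len(frame), 2)]


def upsample(frame):
    """Double resolution by pixel replication."""
    out = []
    for row in frame:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def mse(a, b):
    """Mean squared error between two equally sized frames."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def select_mode(original, full_step, reduced_step):
    """Reconstruct both candidates and keep the one closer to the original."""
    full_recon = quantize(original, full_step)
    reduced_recon = upsample(quantize(downsample(original), reduced_step))
    dev_full = mse(full_recon, original)
    dev_reduced = mse(reduced_recon, original)
    if dev_full <= dev_reduced:
        return "full", dev_full
    return "reduced", dev_reduced
```

In this sketch a flat frame coded coarsely at full resolution can lose to a finely coded reduced-resolution version, while a frame with fine detail favours the full-resolution path, since downsampling destroys that detail.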
Abstract:
A computer implemented method for deriving an attribute entity network (AEN) from video data is disclosed, comprising the steps of: extracting at least two entities from the video data; tracking the trajectories of the at least two entities to form at least two tracks; deriving at least one association between at least two entities by detecting at least one event involving the at least two entities, said detecting of at least one event being based on detecting at least one spatio-temporal motion correlation between the at least two entities; and constructing the AEN by creating a graph wherein the at least two entities form at least two nodes and the at least one association forms a link between the at least two nodes.
Abstract:
A computer implemented method for deriving an attribute entity network (AEN) from video data is disclosed, comprising the steps of extracting at least two entities from the video data, tracking the trajectories of the at least two entities to form at least two tracks, deriving at least one association between at least two entities by detecting at least one event involving the at least two entities, where the detecting of at least one event is based on detecting at least one spatiotemporal motion correlation between the at least two entities, and constructing the AEN by creating a graph wherein the at least two entities form at least two nodes and the at least one association forms a link between the at least two nodes.
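A minimal sketch of the graph construction described above follows. It assumes that tracks are already available as per-frame (x, y) positions, and it uses one hypothetical event detector: two entities are linked when their frame-to-frame displacements point in nearly the same direction and they stay close together. The thresholds `min_corr` and `max_dist`, the event label, and all function names are assumptions made for illustration.

```python
import math
from itertools import combinations


def velocities(track):
    """Frame-to-frame displacement vectors of a track."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(track, track[1:])]


def motion_correlation(track_a, track_b):
    """Mean cosine similarity of displacements (0 for stationary steps)."""
    sims = []
    for (ax, ay), (bx, by) in zip(velocities(track_a), velocities(track_b)):
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)
        sims.append((ax * bx + ay * by) / (na * nb) if na and nb else 0.0)
    return sum(sims) / len(sims) if sims else 0.0


def mean_distance(track_a, track_b):
    """Average distance between two entities over their common frames."""
    pairs = list(zip(track_a, track_b))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def build_aen(tracks, min_corr=0.9, max_dist=5.0):
    """Entities become nodes; each detected event becomes a link."""
    nodes = set(tracks)
    links = {}
    for a, b in combinations(sorted(tracks), 2):
        corr = motion_correlation(tracks[a], tracks[b])
        if corr >= min_corr and mean_distance(tracks[a], tracks[b]) <= max_dist:
            links[(a, b)] = {"event": "moving_together", "correlation": corr}
    return nodes, links
```

For example, two people walking side by side would be linked, while a vehicle moving in a perpendicular direction some distance away would remain an unconnected node.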
Abstract:
The present invention relates to a method and apparatus for detecting and tracking vehicles. One embodiment of a system for detecting and tracking an object (e.g., vehicle) in a field of view includes a moving object indication stage for detecting a candidate object in a series of input video frames depicting the field of view and a track association stage that uses a joint probabilistic graph matching framework to associate an existing track with the candidate object.
Abstract:
A method and system for performing automated training environment monitoring and evaluation. The training environment may include mixed reality elements to enhance a training experience.
Abstract:
A method for transforming Video-To-Text is disclosed that automatically generates text descriptions of the content of a video. The present invention first segments an input video sequence according to predefined semantic classes using a Mixture-of-Experts blob segmentation algorithm. The resulting segmentation is coerced into a semantic concept graph based on domain knowledge and a semantic concept hierarchy. Then, the initial semantic concept graph is summarized and pruned. Finally, according to the summarized semantic concept graph and its changes over time, text and/or speech descriptions are automatically generated using one of three description schemes: key-frame, key-object, and key-change descriptions.
Abstract:
A method and system for creating a histogram of oriented occurrences (HO2) is disclosed. A plurality of entities in at least one image are detected and tracked. One of the plurality of entities is designated as a reference entity. A local 2-dimensional ground plane coordinate system centered on and oriented with respect to the reference entity is defined. The 2-dimensional ground plane is partitioned into a plurality of non-overlapping bins, the bins forming a histogram, a bin tracking a number of occurrences of an entity class. An occurrence of at least one other entity of the plurality of entities located in the at least one image may be associated with one of the plurality of non-overlapping bins. A number of occurrences of entities of at least one entity class in at least one bin may be assembled into a vector to define an HO2 feature.
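The binning described above can be sketched as follows. The abstract does not specify the shape of the bins, so this sketch assumes a polar layout (angular sectors crossed with distance rings) around the reference entity. The parameters `n_sectors` and `ring_edges`, the tuple layouts, and the function name `ho2_feature` are hypothetical.

```python
import math


def ho2_feature(reference, others, classes, n_sectors=4, ring_edges=(5.0, 15.0)):
    """Count occurrences per entity class in bins around a reference entity.

    reference: (x, y, heading_in_radians) on the ground plane
    others:    iterable of (x, y, entity_class)
    Returns one flat vector: the bins of each class, concatenated in the
    order given by `classes`.
    """
    rx, ry, heading = reference
    n_bins = n_sectors * len(ring_edges)
    counts = {c: [0] * n_bins for c in classes}
    for x, y, cls in others:
        dx, dy = x - rx, y - ry
        dist = math.hypot(dx, dy)
        # First ring whose outer edge exceeds the distance, if any.
        ring = next((i for i, edge in enumerate(ring_edges) if dist < edge), None)
        if ring is None or cls not in counts:
            continue  # outside the partitioned area or an untracked class
        # Angle measured relative to the reference entity's heading.
        angle = (math.atan2(dy, dx) - heading) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi / n_sectors)), n_sectors - 1)
        counts[cls][ring * n_sectors + sector] += 1
    return [n for c in classes for n in counts[c]]
```

Because the angle is measured relative to the reference entity's heading, the resulting vector does not change when the whole scene is rotated or translated on the ground plane.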
Abstract:
A computer implemented method for retrieving video clips from a database is disclosed. The method may include retrieving in an initial query from a video collection based on a search term; receiving a user selection of at least one video clip from a first set of video clips corresponding to the search term; associating at least one visual attribute of the selected video clip with the search term; receiving the at least one search term from a user in a subsequent query; determining a set of physical concepts based on the at least one search term; mapping the set of physical concepts to a plurality of visual attributes; searching the database for at least one video clip corresponding to the plurality of visual attributes; identifying at least one video clip in the database having the plurality of visual attributes; and returning a second set of video clips having the set of visual attributes to the user, the second set including the at least one video clip.