Abstract:
This invention relates to methods of feature extraction from MPEG-2 and MPEG-4 compressed video sequences. The spatio-temporal compression complexity of a video sequence is evaluated by inspecting the compressed bitstream, and this complexity is used as a descriptor of the spatio-temporal characteristics of the video sequence. The spatio-temporal compression complexity measure is used as a matching criterion and can also be used for absolute indexing. Feature extraction can be accomplished in conjunction with scene change detection techniques. The combination has reasonable accuracy and the advantage of high simplicity, since it is based on entropy decoding of signals in compressed form and does not require the computationally expensive inverse Discrete Cosine Transform (DCT).
Abstract:
A method describes motion activity in a video sequence. A motion activity matrix is determined for the video sequence. A threshold for the motion activity matrix is determined. Connected regions of motion vectors at least equal to the threshold are identified and measured for size. A histogram of the distribution of the sizes of the connected regions is constructed for the entire video sequence. The histogram is normalized to characterize the spatial distribution of motion in the video sequence in a motion activity descriptor.
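The steps above can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: the function name `region_size_histogram`, the use of 4-connectivity, the caller-supplied `threshold`, and the `bin_edges` layout are all assumptions, since the abstract does not specify them.

```python
from collections import deque


def region_size_histogram(activity, threshold, bin_edges):
    """Normalized histogram of connected-region sizes (hypothetical helper).

    activity  : 2D list of motion-vector magnitudes, one per block.
    threshold : blocks with magnitude >= threshold count as active.
    bin_edges : ascending size boundaries; a region of size s falls in
                bin k, where k is the number of edges <= s (assumed layout).
    """
    rows, cols = len(activity), len(activity[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or activity[r][c] < threshold:
                continue
            # Breadth-first flood fill over 4-connected neighbours (assumption).
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and activity[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)

    counts = [0] * (len(bin_edges) + 1)
    for s in sizes:
        counts[sum(1 for edge in bin_edges if s >= edge)] += 1
    total = sum(counts)
    # Normalize so the descriptor does not depend on the number of regions.
    return [n / total for n in counts] if total else [0.0] * len(counts)


example = [
    [5, 5, 0, 0],
    [5, 0, 0, 9],
    [0, 0, 0, 9],
    [0, 7, 0, 0],
]
descriptor = region_size_histogram(example, threshold=5, bin_edges=[2, 3])
```

In the example, three regions of sizes 3, 2, and 1 are found, so each of the three bins receives an equal share.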
Abstract:
A method analyzes a high-level syntax and structure of a continuous compressed video according to a plurality of states. First, a set of hidden Markov models for each of the states is trained with a training video segmented into known states. Then, a set of domain-specific features is extracted from a fixed-length sliding window of the continuous compressed video, and a set of maximum likelihoods is determined for each set of domain-specific features using the sets of trained hidden Markov models. Finally, dynamic programming is applied to each set of maximum likelihoods to determine a specific state for each fixed-length sliding window of frames of the compressed video.
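The final dynamic programming step might look like the sketch below. It assumes the per-window log-likelihoods under each state's trained models are already available, and it introduces a state-to-state transition table `log_trans` that the abstract does not mention; the function name `decode_states` is likewise hypothetical. The recursion is a standard Viterbi-style best-path search, used here as a stand-in for whatever dynamic program the method actually applies.

```python
import math


def decode_states(window_loglik, log_trans):
    """Pick one state per sliding window by dynamic programming.

    window_loglik : list over windows; each entry lists the maximum
                    log-likelihood of that window under each state's models.
    log_trans     : log_trans[p][s] is an assumed log-probability of moving
                    from state p to state s between consecutive windows.
    """
    n_states = len(log_trans)
    score = list(window_loglik[0])
    back_pointers = []
    for loglik in window_loglik[1:]:
        best_prev, new_score = [], []
        for s in range(n_states):
            # Best predecessor for state s at this window.
            p = max(range(n_states), key=lambda q: score[q] + log_trans[q][s])
            best_prev.append(p)
            new_score.append(score[p] + log_trans[p][s] + loglik[s])
        back_pointers.append(best_prev)
        score = new_score

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for best_prev in reversed(back_pointers):
        state = best_prev[state]
        path.append(state)
    path.reverse()
    return path


# Two states that tend to persist from one window to the next (assumed values).
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
window_loglik = [[-1, -5], [-1, -5], [-3, -2.5], [-1, -5], [-5, -1], [-5, -1]]
states = decode_states(window_loglik, log_trans)
```

The third window slightly favours state 1 on its own, but the transition cost keeps the decoded path in state 0 until the evidence for state 1 becomes consistent.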
Abstract:
A method extracts high-level features from a video including a sequence of frames. Low-level features are extracted from each frame of the video. Each frame of the video is labeled according to the extracted low-level features to generate sequences of labels. Each sequence of labels is associated with one of the extracted low-level features. The sequences of labels are analyzed using machine learning techniques to extract high-level features of the video.
Abstract:
This invention relates to methods of abrupt scene change detection and fade detection for indexing of MPEG-2 and MPEG-4 compressed video sequences. Abrupt scene change and fade detection techniques applied to signals in compressed form have reasonable accuracy and the advantage of high simplicity, since they are based on entropy decoding and do not require the computationally expensive inverse Discrete Cosine Transform (DCT).
Abstract:
A method describes activity in a video sequence. The method measures intensity, direction, spatial, and temporal attributes in the video sequence, and the measured attributes are combined in a digital descriptor of the activity of the video sequence.
Abstract:
A method extracts an intensity of motion activity from shots in a compressed video. The method then uses the intensity of motion activity to segment the video into segments that are easy to summarize and segments that are difficult to summarize. Easy-to-summarize segments are represented by any frames selected from those segments, while a color-based summarization process generates sequences of frames from each difficult-to-summarize segment. The selected and generated frames of each segment in each shot are combined to form the summary of the compressed video.
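A minimal sketch of this segmentation is shown below. It assumes that low motion activity marks a segment as easy to summarize, which the abstract does not state explicitly, and that a fixed `threshold` separates the two classes. The names `segment_by_activity` and `summarize`, and the `color_summarizer` callback standing in for the color-based process, are hypothetical.

```python
from itertools import groupby


def segment_by_activity(intensities, threshold):
    """Split per-frame activity values into runs labelled easy or difficult.

    Returns (label, start, end) tuples with end exclusive. Treating values
    above the threshold as difficult is an assumption.
    """
    segments = []
    index = 0
    for is_difficult, run in groupby(intensities, key=lambda v: v > threshold):
        length = len(list(run))
        label = "difficult" if is_difficult else "easy"
        segments.append((label, index, index + length))
        index += length
    return segments


def summarize(intensities, threshold, color_summarizer):
    """Combine selected and generated frame indices into one summary."""
    summary = []
    for label, start, end in segment_by_activity(intensities, threshold):
        if label == "easy":
            # Any frame may represent an easy segment; take the first one.
            summary.append(start)
        else:
            # Delegate to the caller-supplied color-based process.
            summary.extend(color_summarizer(start, end))
    return summary


activity = [1, 1, 2, 8, 9, 7, 1, 1]
segments = segment_by_activity(activity, threshold=5)
# Placeholder summarizer: keep the first and last frame of the segment.
frames = summarize(activity, 5, lambda start, end: [start, end - 1])
```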
Abstract:
A method for transcoding a compressed video partitions the compressed video into hierarchical levels, and extracts features from each of the hierarchical levels. One of a number of conversion modes of a transcoder is selected dependent on the features extracted from the hierarchical levels. The compressed video is then transcoded according to the selected conversion mode.
Abstract:
A system and method temporally process an input video including input frames. Each input frame has an associated input frame play time, and the input video has a total input video play time that is the sum of the input frame play times of all of the input frames. Each of the input frames is classified according to a content characteristic of the frame. An output frame play time, based on the classified content characteristic, is allocated to each of the input frames to generate a plurality of output frames that form an output video.
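One way to picture the allocation is the sketch below. The per-class multipliers in `class_weights`, the optional rescaling to a `target_total`, and the function name `allocate_play_times` are illustrative assumptions; the abstract only says that the output play time depends on the classified content characteristic.

```python
def allocate_play_times(play_times, classes, class_weights, target_total=None):
    """Assign an output play time to every input frame.

    play_times    : input frame play times, in seconds.
    classes       : content class of each frame (same length as play_times).
    class_weights : assumed multiplier per class; values below 1 shorten
                    the time a frame of that class is shown.
    target_total  : if given, rescale so the output times sum to this value.
    """
    if len(play_times) != len(classes):
        raise ValueError("play_times and classes must have the same length")
    output = [t * class_weights[c] for t, c in zip(play_times, classes)]
    if target_total is not None:
        total = sum(output)
        if total > 0:
            output = [t * target_total / total for t in output]
    return output


times = [2.0, 2.0, 2.0, 2.0]
labels = ["low", "high", "high", "low"]
weights = {"low": 0.5, "high": 1.0}  # hypothetical class weights
weighted = allocate_play_times(times, labels, weights)
fitted = allocate_play_times(times, labels, weights, target_total=3.0)
```

With these weights, frames labelled "low" play for half as long as the others, and the second call additionally compresses the whole output to three seconds.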
Abstract:
A compressed bit-stream represents a corresponding sequence having intra-coded frames and inter-coded frames. The compressed bit-stream includes bits associated with each of the inter-coded frames representing a displacement from the associated inter-coded frame to a closest matching of the intra-coded frames. A magnitude of the displacement of a first of the inter-coded frames is determined based on the bits in the compressed bit-stream associated with that inter-coded frame. The inter-coded frame is then identified based on the determined displacement magnitude. The inter-coded frame includes macro-blocks. Each macro-block is associated with a respective portion of the inter-coded frame bits which represent the displacement from that macro-block to the closest matching intra-coded frame. The displacement magnitude is an average of the displacement magnitudes of all the macro-blocks associated with the inter-coded frame. The displacement magnitudes of those macro-blocks which are less than the average displacement magnitude are set to zero. The number of run lengths of the zero magnitude macro-blocks is determined and also used to identify the first inter-coded frame.
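The per-frame computation described above can be sketched as follows, assuming the displacement of each macro-block has already been decoded into a `(dx, dy)` pair and that macro-blocks are listed in scan order. The function name `frame_motion_features` and the use of the Euclidean norm as the magnitude are assumptions.

```python
import math
from itertools import groupby


def frame_motion_features(motion_vectors):
    """Average displacement magnitude and zero-run lengths for one frame.

    motion_vectors : list of (dx, dy) displacements, one per macro-block,
                     in scan order (assumed).
    """
    if not motion_vectors:
        raise ValueError("frame has no macro-blocks")
    magnitudes = [math.hypot(dx, dy) for dx, dy in motion_vectors]
    average = sum(magnitudes) / len(magnitudes)
    # Magnitudes below the frame average are set to zero.
    thresholded = [m if m >= average else 0.0 for m in magnitudes]
    # Lengths of consecutive runs of zero-magnitude macro-blocks.
    zero_runs = [
        len(list(run))
        for is_zero, run in groupby(thresholded, key=lambda m: m == 0.0)
        if is_zero
    ]
    return average, thresholded, zero_runs


vectors = [(3, 4), (0, 0), (0, 1), (6, 8), (0, 0), (3, 4)]
average, thresholded, zero_runs = frame_motion_features(vectors)
run_count = len(zero_runs)
```

Here `average` and `run_count` play the role of the two quantities the abstract uses to identify the inter-coded frame.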