Abstract:
A sequence of intensity measures is determined, one for each frame in a sequence of video frames. These measures can be used to select videos according to user preferences and to provide highlights from a video sequence. The measures are based on low-level video characteristics that may be related to the arousal and valence affects conveyed to a viewer watching the video recording.
Abstract:
A method of segmenting a sequence of video images according to scene activity, the method comprising: defining a first series of nodes in a first multi-dimensional space, each node corresponding to an image of the sequence of video images; defining a transformation function that maps each of the first series of nodes to a corresponding node in a second multi-dimensional space having a lower dimensionality than the first multi-dimensional space; applying said transformation function to each of the first series of nodes to define a second series of respective nodes in the second multi-dimensional space; applying a data clustering algorithm to the second series of nodes to identify clusters of nodes within the second multi-dimensional space, the data clustering algorithm being constrained by a measure of feature distance between a pair of clusters of nodes and a measure of temporal distance between the pair of clusters of nodes; determining a representative image from each cluster of nodes and plotting each representative image with respect to a measure of the elapsed time of the sequence of video images to form a scene density curve indicating the underlying scene change activity; and segmenting the sequence of video images in accordance with local minima and/or maxima of the scene density curve.
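The first steps of the method can be illustrated with a minimal sketch under strong assumptions. PCA stands in for the unspecified transformation function, and a greedy pass that only merges consecutive frames stands in for the constrained clustering algorithm. The function names, the distance threshold, and the window length are all hypothetical, and the final segmentation at local minima or maxima is not shown.

```python
import numpy as np

def reduce_dimension(nodes, n_components):
    """Project frame feature vectors onto their top principal components.
    PCA is an assumed stand-in for the abstract's transformation function."""
    centered = nodes - nodes.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def cluster_in_time(nodes, max_feature_dist):
    """Greedy clustering over frames in temporal order. A frame joins the
    current cluster only if it lies within max_feature_dist of the cluster
    centroid, so every cluster is a run of consecutive frames."""
    clusters = [[0]]
    for i in range(1, len(nodes)):
        centroid = nodes[clusters[-1]].mean(axis=0)
        if np.linalg.norm(nodes[i] - centroid) <= max_feature_dist:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters

def representatives(nodes, clusters):
    """Pick the frame index closest to each cluster centroid."""
    reps = []
    for members in clusters:
        centroid = nodes[members].mean(axis=0)
        dists = np.linalg.norm(nodes[members] - centroid, axis=1)
        reps.append(members[int(np.argmin(dists))])
    return reps

def scene_density(reps, n_frames, window):
    """Moving average of representative-frame positions along the time axis."""
    marks = np.zeros(n_frames)
    marks[reps] = 1.0
    return np.convolve(marks, np.ones(window) / window, mode="same")
```

In this sketch a high density value means many short clusters close together in time, which would correspond to rapid scene change.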
Abstract:
A distance detection system and method for use in a hand-held electronic device are described. The distance detection system comprises an input unit for inputting at least one instruction; at least one processing device having a timing element and a calculating element; at least one signal device; and an air inspecting unit. The signal device transmits a signal to an object being detected in response to the instruction, and receives a response signal from that object. The timing element measures the interval between transmitting the signal and receiving the response signal. The air inspecting unit detects a condition of the air between the hand-held electronic device and the object. The detected air condition can be converted into a parameter, and distance data corresponding to the distance to the object is calculated from both the parameter and the measured interval.
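The distance calculation can be illustrated with a minimal sketch. It assumes the signal is an ultrasonic pulse and the detected air condition is temperature; the abstract specifies neither. The parameter is then the speed of sound, taken here from the common linear approximation for dry air, and the function names are hypothetical.

```python
def speed_of_sound(temp_celsius: float) -> float:
    """Approximate speed of sound in dry air, in m/s.
    Linear approximation that is reasonable near room temperature."""
    return 331.3 + 0.606 * temp_celsius

def distance_from_echo(round_trip_seconds: float, temp_celsius: float) -> float:
    """One-way distance in metres from the measured round-trip interval.
    The signal travels to the object and back, hence the division by two."""
    if round_trip_seconds < 0:
        raise ValueError("interval must be non-negative")
    return speed_of_sound(temp_celsius) * round_trip_seconds / 2.0
```

Under these assumptions, the same 10 ms interval gives a slightly larger distance in warm air than in cold air, which is the correction the air inspecting unit makes possible.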
Abstract:
The luminance and chrominance values of pixels encoded according to MPEG or JPEG encoding are extracted using a forward discrete cosine transform (DCT). The forward DCT used in MPEG encoding is analyzed to derive a set of equations which directly relate a pixel value in an encoded image to one or more of the DCT coefficients obtained via the forward DCT in the usual image encoding process. These predetermined equations allow extremely fast and computationally efficient extraction of pixel values directly from the DCT coefficients of an encoded pixel block, without having to perform an inverse DCT. Original images may thus be extracted from their MPEG and JPEG encoded versions in a fast and efficient manner.
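The idea of relating one pixel directly to the DCT coefficients can be illustrated with a minimal sketch. It assumes an 8x8 block of already dequantized coefficients and ignores level shifting and entropy decoding. The function names are hypothetical, and the equations are the standard orthonormal DCT relations rather than the specific set derived in the source.

```python
import numpy as np

N = 8  # block size used by JPEG and MPEG

def dct_matrix(n: int = N) -> np.ndarray:
    """Orthonormal DCT-II matrix D, so that coeffs = D @ block @ D.T."""
    u = np.arange(n).reshape(-1, 1)  # frequency index
    x = np.arange(n).reshape(1, -1)  # sample index
    d = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

D = dct_matrix()

def forward_dct(block: np.ndarray) -> np.ndarray:
    """Forward 2-D DCT of one block, as done during encoding."""
    return D @ block @ D.T

def pixel_from_coefficients(coeffs: np.ndarray, row: int, col: int) -> float:
    """Recover a single pixel as a weighted sum of the coefficients,
    without computing the inverse DCT of the whole block."""
    return float(D[:, row] @ coeffs @ D[:, col])

def block_mean_from_dc(coeffs: np.ndarray) -> float:
    """The DC coefficient alone gives the block average."""
    return float(coeffs[0, 0] / N)
```

The weights `D[:, row]` and `D[:, col]` depend only on the pixel position, so they can be precomputed once and reused for every block.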
Abstract:
A video signal is analysed by deriving for each frame a plurality of parameters, said parameters including (a) at least one parameter that is a function of the difference between picture elements of that frame and correspondingly positioned picture elements of a reference frame; (b) at least one parameter that is a function of the difference between picture elements of that frame and correspondingly positioned picture elements of a previous frame; and (c) at least one parameter that is a function of the difference between estimated velocities of picture elements of that frame and the estimated velocities of the correspondingly positioned elements of an earlier frame. Based on these parameters, each frame is assigned one or more of a plurality of predetermined classifications. Scene changes may then be identified as points at which changes occur in these classification assignments.
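The analysis above can be sketched as follows, under several assumptions. Each parameter is taken to be a mean absolute difference, the velocity fields are assumed to be supplied by some separate estimator, and the single-threshold classification rule is purely illustrative. None of the names or values come from the source.

```python
import numpy as np

def mean_abs_diff(a, b) -> float:
    """Mean absolute difference between correspondingly positioned elements."""
    a = np.asarray(a, dtype=np.float64)  # cast first so uint8 cannot wrap around
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean(np.abs(a - b)))

def frame_parameters(frame, previous, reference, velocity, earlier_velocity):
    """Parameters (a), (b) and (c) for one frame."""
    return (
        mean_abs_diff(frame, reference),            # (a) versus reference frame
        mean_abs_diff(frame, previous),             # (b) versus previous frame
        mean_abs_diff(velocity, earlier_velocity),  # (c) change in estimated velocity
    )

def classify(params, threshold: float = 10.0):
    """Hypothetical classification: which parameters exceed a threshold."""
    return tuple(p > threshold for p in params)

def scene_changes(labels):
    """Frame indices at which the classification differs from the previous frame."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
```

A real classifier would likely use several predetermined classes with separate rules per parameter; the point here is only that scene changes are read off from changes in the label sequence.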
Abstract:
A video surveillance system (10) comprises a camera (25), a personal computer (PC) (27) and a video monitor (29). Video processing software is provided on the hard disk drive of the PC (27). The software is arranged to perform a number of processing operations on video data received from the camera, the video data representing individual frames of captured video. In particular, the software is arranged to identify one or more foreground blobs in a current frame, to match the or each blob with an object identified in one or more previous frames, and to track the motion of the or each object as more frames are received. In order to maintain the identity of objects during an occlusion event, an appearance model is generated for blobs that are close to one another in terms of image position. Once occlusion takes place, the respective appearance models are used to segment the resulting group blob into regions which are classified as representing one or other of the merged objects.
Abstract:
This invention provides an object tracking method and system for tracking objects in video frames that takes into account the scaling and variance of each matching feature. This allows some latitude in the choice of matching features while ensuring that as many matching features as possible can be used to determine matches between objects, which increases the accuracy of the matches so determined. A parallel matching approach is used, and heuristic rules are employed to account for occlusions between objects.
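One way to take the scaling and variance of each feature into account is a variance-normalized distance, sketched below. This is an assumed illustration rather than the matching measure of the invention; the function names and feature choices are hypothetical, and the parallel matching and occlusion heuristics are not shown.

```python
import numpy as np

def scaled_distance(object_features, blob_features, variances, eps=1e-9) -> float:
    """Distance in which each feature difference is divided by that
    feature's variance, so features on large scales do not dominate."""
    diff = np.asarray(object_features, float) - np.asarray(blob_features, float)
    var = np.asarray(variances, float) + eps  # eps guards against zero variance
    return float(np.sqrt(np.sum(diff ** 2 / var)))

def best_match(object_features, candidates, variances) -> int:
    """Index of the candidate blob closest to the tracked object."""
    distances = [scaled_distance(object_features, c, variances) for c in candidates]
    return int(np.argmin(distances))
```

With features such as position (which varies a lot between frames) and a stable appearance measure (which varies little), a 10 pixel shift can then count for less than a small change in the stable feature, which a plain Euclidean distance would get the other way round.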
Abstract:
The invention relates to estimating the global motion between frames of a motion-compensated inter-frame encoded video sequence, directly from the motion vectors encoded within the frames. For any particular frame, the motion vectors are first decoded, and a finite number of sets of vectors are selected. An affine or other geometrical transform is then used to generate a motion estimation for each set, and the least median squared error present in each motion estimation is calculated. The motion estimation with the smallest least median squared error is then selected as being representative of the global motion in the image of the frame. A panoramic image generating method and system which makes use of the global motion estimations thus obtained is also described.
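The selection procedure can be sketched as follows, assuming an affine model, randomly drawn sets of three vectors, and an error measured as the median of the squared residuals over all decoded vectors. The function names, the number of sets, and the random sampling are assumptions for illustration; the abstract does not say how the sets are chosen.

```python
import numpy as np

def fit_affine(points, vectors):
    """Least-squares affine motion model: vector ~ [x, y, 1] @ params,
    where params has shape (3, 2)."""
    design = np.column_stack([points, np.ones(len(points))])
    params, *_ = np.linalg.lstsq(design, vectors, rcond=None)
    return params

def median_squared_error(params, points, vectors) -> float:
    """Median over all vectors of the squared residual under the model."""
    design = np.column_stack([points, np.ones(len(points))])
    residuals = vectors - design @ params
    return float(np.median(np.sum(residuals ** 2, axis=1)))

def estimate_global_motion(points, vectors, n_sets=50, set_size=3, seed=0):
    """Fit one model per set and keep the one with the smallest median error."""
    rng = np.random.default_rng(seed)
    best_params, best_error = None, np.inf
    for _ in range(n_sets):
        idx = rng.choice(len(points), size=set_size, replace=False)
        params = fit_affine(points[idx], vectors[idx])
        error = median_squared_error(params, points, vectors)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error
```

Because the median ignores the worst half of the residuals, vectors belonging to independently moving foreground objects do not pull the estimate away from the background motion, provided they are a minority.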
Abstract:
The invention relates to estimating the global motion between frames of a motion-compensated inter-frame encoded video sequence, directly from the motion vectors encoded within the frames. For any particular frame, a motion estimation is determined from the motion vectors that lead directly from the frame's anchor frame to the frame in question. This motion estimation is then checked against pre-determined criteria, and where the criteria are not met, re-estimation along a different route is performed, using the bi-directional motion vectors contained within B-frames. A panoramic image generating method and system which makes use of the global motion estimations thus obtained is also described.
Abstract:
Audio/visual data is classified into semantic classes such as news, sports, music video or the like by providing class models for each class and comparing input audio/visual data to the models. The class models are generated by extracting feature vectors from training samples, and then subjecting the feature vectors to kernel discriminant analysis or principal component analysis to give discriminatory basis vectors. These basis vectors are then used to obtain further feature vectors of much lower dimension than the original feature vectors, which may then be used directly as a class model, or used to train a Gaussian Mixture Model or the like. During classification of unknown input data, the same feature extraction and analysis steps are performed to obtain the low-dimensional feature vectors, which are then fed into the previously created class models to identify the data genre.
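The training and classification flow can be sketched as follows, using principal component analysis for the basis vectors. As a simplification, each class model here is just the mean of that class's low-dimensional vectors, with nearest-mean classification standing in for a Gaussian Mixture Model. The function names are hypothetical, and feature extraction from the audio/visual data is assumed to have been done already.

```python
import numpy as np

def pca_basis(features, n_components):
    """Mean and top principal directions of the training feature vectors."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(features, mean, basis):
    """Map feature vectors to the low-dimensional space."""
    return (features - mean) @ basis.T

def train_class_models(features, labels, n_components):
    """Build one low-dimensional model (a class mean) per class label."""
    mean, basis = pca_basis(features, n_components)
    low = project(features, mean, basis)
    models = {c: low[labels == c].mean(axis=0) for c in np.unique(labels)}
    return mean, basis, models

def classify(sample, mean, basis, models):
    """Apply the same projection to unknown input, then pick the nearest model."""
    low = project(sample, mean, basis)
    return min(models, key=lambda c: np.linalg.norm(low - models[c]))
```

The important property carried over from the abstract is that the unknown input goes through exactly the same projection as the training data before it is compared with the class models.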