Abstract:
To improve a cropping system so that it covers a wide range of content for the smaller displays of handheld devices, a method starts from a metadata aggregation and the corresponding video, e.g. in post-production, program exchange and archiving, wherein: (a) the video is passed to a video analysis stage, which processes it, e.g. by motion detection, morphology filters, edge detection, etc., and delivers the analysed video; (b) the separated video and metadata are combined to extract important features in context, wherein important information from the metadata is categorized and used to initialize a dynamically fitted chain of feature extraction steps adapted to the delivered video content; (c) the extracted important features are combined to define regions of interest (ROIs), which are searched for in consecutive video frames by object tracking; the object tracking identifies the new position and deformation of each initialized ROI in consecutive video frames and returns this information to the feature extraction, thereby maintaining permanent communication between the feature extraction and the object tracking; (d) one or more ROIs are extracted and input, video frame by video frame, into a cropping step; (e) based on weighting information, a well-composed image part is cropped by classifying the supplied ROIs by importance; and (f) the cropped image area(s) are scaled to the desired small screen size.
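The cropping and scaling steps at the end of this method can be illustrated with a minimal sketch. It is not the claimed method: the abstract does not say how the weighting information is turned into a crop, so the sketch simply assumes that each ROI carries a numeric importance weight and that the crop window is centred on the weighted centroid of the ROI centres. The names `ROI` and `crop_window` and all parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ROI:
    """Axis-aligned region of interest with an assumed importance weight."""
    x: float       # left edge in pixels
    y: float       # top edge in pixels
    w: float       # width in pixels
    h: float       # height in pixels
    weight: float  # importance (assumption: larger means more important)


def crop_window(rois, frame_w, frame_h, out_w, out_h):
    """Return (left, top, width, height, scale) for one video frame.

    The window has the aspect ratio of the target screen, is as large as
    the frame allows, and is centred on the weighted centroid of the ROI
    centres, clamped so that it stays inside the frame.
    """
    aspect = out_w / out_h
    # Largest window with the target aspect ratio that fits in the frame.
    crop_h = min(frame_h, frame_w / aspect)
    crop_w = crop_h * aspect

    total = sum(r.weight for r in rois)
    if total > 0:
        cx = sum((r.x + r.w / 2) * r.weight for r in rois) / total
        cy = sum((r.y + r.h / 2) * r.weight for r in rois) / total
    else:
        # No usable ROI: fall back to a centred crop.
        cx, cy = frame_w / 2, frame_h / 2

    # Clamp the window to the frame borders.
    max_left = max(0.0, frame_w - crop_w)
    max_top = max(0.0, frame_h - crop_h)
    left = min(max(cx - crop_w / 2, 0.0), max_left)
    top = min(max(cy - crop_h / 2, 0.0), max_top)

    # Factor by which the cropped area is scaled to the small screen.
    scale = out_w / crop_w
    return left, top, crop_w, crop_h, scale
```

Called once per frame with the tracked ROIs, the function pulls the window toward the heavier ROIs. A real system would also need temporal smoothing of the window position, which is omitted here.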
Abstract:
In order to convert multi-channel sound formats, in particular five-channel sound formats with the sound channels left channel (L), right channel (R), centre channel (C), rear left channel (Ls) and rear right channel (Rs), into downward-compatible sound formats, in particular into two-channel sound formats with a right channel and a left channel, the following steps are proposed according to ITU-R BS.775: the level of the centre channel (C) is lowered (for example by -3 dB); the centre channel (C), the level of which has been lowered, is distributed to the left channel (L) so as to form a first sum signal (L'); the level of the rear left channel (Ls) is lowered (for example by -3 dB); the rear left channel (Ls), the level of which has been lowered, is distributed to the first sum signal so as to form a third sum signal, which corresponds to the left channel (L_IRT) of the two-channel sound format; the centre channel (C), the level of which has been lowered, is distributed to the right channel (R) so as to form a second sum signal (R'); the level of the rear right channel (Rs) is lowered (for example by -3 dB); and the rear right channel (Rs), the level of which has been lowered, is distributed to the second sum signal so as to form a fourth sum signal, which corresponds to the right channel (R_IRT) of the two-channel sound format.
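The static downmix steps listed above can be sketched as follows. This is a minimal time-domain illustration that assumes each channel is a list of samples of equal length and that "lowering by -3 dB" means multiplying by the amplitude gain 10^(-3/20). The function and parameter names are hypothetical, and the sketch covers only the plain summation, not any dynamic correction of spectral values.

```python
def db_to_gain(level_db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def downmix_5_to_2(left, right, centre, rear_left, rear_right, level_db=-3.0):
    """Fold five channels down to two by attenuated summation.

    Returns (out_left, out_right), i.e. the third and fourth sum signals.
    """
    g = db_to_gain(level_db)

    # First and second sum signals: front channel plus attenuated centre.
    first_sum = [l + g * c for l, c in zip(left, centre)]
    second_sum = [r + g * c for r, c in zip(right, centre)]

    # Third and fourth sum signals: add the attenuated rear channels.
    out_left = [s + g * ls for s, ls in zip(first_sum, rear_left)]
    out_right = [s + g * rs for s, rs in zip(second_sum, rear_right)]
    return out_left, out_right
```

Because the centre channel feeds both outputs with the same gain, a signal present only in C appears at equal level left and right. Coherent components shared between channels add in amplitude and incoherent ones in power, which is one reason a fixed-gain summation of this kind can change level relationships.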
In order to largely compensate for a shift in the phantom sound sources, a change in the level difference between coherent and incoherent signal components, and timbre changes, the invention provides that, when the first sum signal (L') and the second sum signal (R') are formed, the spectral values of overlapping time windows, each of k samples, of the left channel (L) and right channel (R) are dynamically corrected; that, when the third and fourth sum signals are formed, the spectral values of overlapping time windows, each of k samples, of the first sum signal (L') and the second sum signal (R') are dynamically corrected; and that each sum of the spectral values is compared with a desired value (A_soll, where A_soll ∈ ℝ) before each dynamic correction of spectral values of the left channel (L) and right channel (R).