Abstract:
To transmit digitized stereophonic audio signals over channels subject to interference, in particular radio channels, transmission errors are detected at the decoder end and, where necessary, corrected or concealed. The calculations required for correcting or concealing the errors are carried out largely at the encoder end. The results of these calculations are transmitted in coded form, as supplementary information, to the decoder, which uses them to correct or conceal the errors.
Abstract:
To determine the global masking threshold (6) during bit-rate-reducing source coding of digitized audio signals (1), a requantization and coding rule for the time-domain or spectral samples (2) of the audio signal is derived from the masking effect of all relevant maskers and noise maskers (100, 200, 300) and from the threshold in quiet (400). To this end, the masking slopes (101, 102, 201, 202, 301, 302) of the maskers, selected where applicable, are segmented and approximated in the individual segments by low-order polynomials, whose coefficients are calculated. The coefficients of the low-order polynomials are calculated from the masker intensities converted into logarithmic levels. The global masking threshold (6) is determined step by step, that is, masker by masker, at individual support points, selected where applicable, from the polynomials describing the slopes of the selected maskers.
Abstract:
In order to convert multi-channel sound formats, in particular five-channel sound formats with the sound channels left (L), right (R), centre (C), rear left (Ls) and rear right (Rs), into downward-compatible sound formats, in particular two-channel sound formats with a left channel and a right channel, the following steps are proposed in accordance with ITU-R BS.775: the level of the centre channel (C) is lowered (for example by 3 dB); the level-reduced centre channel (C) is added to the left channel (L) to form a first sum signal (L'); the level of the rear left channel (Ls) is lowered (for example by 3 dB); the level-reduced rear left channel (Ls) is added to the first sum signal (L') to form a third sum signal, which corresponds to the left channel (L_IRT) of the two-channel sound format; the level-reduced centre channel (C) is added to the right channel (R) to form a second sum signal (R'); the level of the rear right channel (Rs) is lowered (for example by 3 dB); and the level-reduced rear right channel (Rs) is added to the second sum signal (R') to form a fourth sum signal, which corresponds to the right channel (R_IRT) of the two-channel sound format.
In order to largely compensate for a shift in the phantom sound sources, a change in the level difference between coherent and incoherent signal components, and timbre changes, the invention provides that, when the first sum signal (L') and the second sum signal (R') are formed, the spectral values of overlapping time windows, each comprising k samples, of the left channel (L) and right channel (R) are dynamically corrected; that, when the third and fourth sum signals are formed, the spectral values of overlapping time windows, each comprising k samples, of the first sum signal (L') and the second sum signal (R') are dynamically corrected; and that each sum of the spectral values is compared with a desired value (A_soll, where A_soll ∈ R) before each dynamic correction of the spectral values of the left channel (L) and right channel (R).
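The static part of the downmix described above can be sketched as follows. This is a minimal illustration, assuming equal-length sample lists per channel and the example attenuation of 3 dB. The function names `db_to_gain` and `downmix_5_to_2` are hypothetical, and the dynamic spectral correction against A_soll is not modelled because the abstract does not specify how it is computed.

```python
def db_to_gain(level_db):
    """Convert a level change in dB into a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def downmix_5_to_2(left, right, centre, rear_left, rear_right, atten_db=-3.0):
    """Static 5-to-2 downmix following the steps listed in the abstract.

    All inputs are equal-length sequences of samples (an assumption).
    Returns the two output channels (third and fourth sum signals).
    """
    g = db_to_gain(atten_db)
    # First and second sum signals: L' = L + g*C, R' = R + g*C
    left_sum = [l + g * c for l, c in zip(left, centre)]
    right_sum = [r + g * c for r, c in zip(right, centre)]
    # Third and fourth sum signals: add the attenuated rear channels
    left_out = [s + g * ls for s, ls in zip(left_sum, rear_left)]
    right_out = [s + g * rs for s, rs in zip(right_sum, rear_right)]
    return left_out, right_out
```

With the default of -3 dB the gain is roughly 0.708, so a centre signal that is coherent with L and R raises the level of the sum. That is the kind of level and timbre change the dynamic correction in the abstract is meant to compensate.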
Abstract:
To improve a cropping system by obtaining coverage of a wide range of contents for smaller sized displays of handheld devices, a method starts from a metadata aggregation and the corresponding video, e.g. in post-production, programme exchange and archiving, wherein (a) the video is passed to a video analysis that delivers video, e.g. by use of motion detection, morphology filters, edge detection, etc.; (b) the separated video and metadata are combined to extract important features in a context, wherein important information from the metadata is categorized and used to initialize a dynamically fitted chain of feature extraction steps adapted to the delivered video content; (c) the extracted important features are combined to define regions of interest (ROI), which are searched for in consecutive video frames by object tracking, the object tracking identifying a new position and deformation of each initialized ROI in consecutive video frames and returning this information to the feature extraction, thereby obtaining a permanent communication between the feature extraction and the object tracking; (d) one or more ROIs are extracted and input, video frame by video frame, into a cropping step; (e) based on weighting information, a well-composed image part is cropped by classifying the supplied ROIs by importance; and (f) the cropped image area(s) are scaled to the desired small screen size.
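Steps (e) and (f) can be illustrated with a small sketch. It assumes each ROI already carries an importance weight and places a fixed-size crop window on the weighted centroid of the ROIs. That placement rule, and the names `ROI`, `crop_window` and `scale_factor`, are assumptions made for illustration, not the classification scheme of the method itself.

```python
from dataclasses import dataclass


@dataclass
class ROI:
    x: float       # left edge in pixels
    y: float       # top edge in pixels
    w: float       # width in pixels
    h: float       # height in pixels
    weight: float  # importance assigned by the feature extraction


def crop_window(rois, frame_w, frame_h, crop_w, crop_h):
    """Centre a crop window on the importance-weighted centroid of the ROIs.

    Assumes the crop window fits inside the frame. Returns (left, top, w, h).
    """
    total = sum(r.weight for r in rois)
    if total > 0:
        cx = sum((r.x + r.w / 2) * r.weight for r in rois) / total
        cy = sum((r.y + r.h / 2) * r.weight for r in rois) / total
    else:
        cx, cy = frame_w / 2, frame_h / 2  # no usable ROI: centre crop
    # Clamp so that the window stays inside the frame
    left = min(max(cx - crop_w / 2, 0), frame_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), frame_h - crop_h)
    return left, top, crop_w, crop_h


def scale_factor(crop_w, target_w):
    """Uniform scale that maps the cropped area to the small screen width."""
    return target_w / crop_w
```

Running `crop_window` once per frame on the tracked ROIs yields a window that follows the important content. A real system would also smooth the window position over time to avoid jitter.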
Abstract:
In order to transmit digitized stereophonic sound signals over channels subject to interference, in particular radio channels, the invention proposes that transmission errors are detected at the decoder end and, if necessary, corrected or concealed. Most of the calculations necessary for correcting or concealing the errors are carried out at the encoder end. The result of these calculations is transmitted in coded form as supplementary information to the decoder, which uses it to correct or conceal the errors.
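The division of labour between encoder and decoder can be sketched as follows. The sketch assumes a CRC-32 checksum for error detection and a choice between two concealment strategies as the supplementary information. Both are illustrative assumptions, as are the names `choose_hint`, `encode` and `decode`; the abstract does not state which calculations are actually moved to the encoder.

```python
import zlib


def choose_hint(frame, prev_frame, other_channel):
    """Encoder side: decide which substitute would best replace this frame."""
    err_prev = sum((a - b) ** 2 for a, b in zip(frame, prev_frame))
    err_other = sum((a - b) ** 2 for a, b in zip(frame, other_channel))
    return "repeat_previous" if err_prev <= err_other else "copy_other_channel"


def encode(frame, prev_frame, other_channel):
    """Pack one frame of 8-bit samples with checksum and concealment hint."""
    payload = bytes(frame)
    return {
        "payload": payload,
        "crc": zlib.crc32(payload),
        "hint": choose_hint(frame, prev_frame, other_channel),
    }


def decode(packet, prev_frame, other_channel):
    """Decoder side: detect errors, then apply the precomputed hint."""
    if zlib.crc32(packet["payload"]) == packet["crc"]:
        return list(packet["payload"])  # frame arrived intact
    if packet["hint"] == "repeat_previous":
        return list(prev_frame)
    return list(other_channel)
```

The decoder only checks the CRC and follows the hint; the comparison of candidate substitutes, which needs the undisturbed signal, happens where that signal is available. The sketch assumes the hint and checksum themselves arrive intact, which a real system would have to protect separately.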
Abstract:
In order to find the global masking threshold (6) during bit-rate-reducing source coding of digitized audio signals (1), a requantization and coding rule for the time-domain or spectral samples (2) of the audio signal is derived from the masking effect of all relevant maskers and noise maskers (100, 200, 300) and from the threshold in quiet (400). To this end, the masking slopes (101, 102, 201, 202, 301, 302) of the optionally selected maskers are segmented and approximated in the individual segments by low-order polynomials, whose coefficients are calculated. The masker intensities converted into logarithmic levels are used to calculate the coefficients of the low-order polynomials. The global masking threshold (6) is found step by step, masker by masker, at individual, optionally selected support points from the polynomials describing the masking slopes of the optionally selected maskers.
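A minimal sketch of this procedure is given below. It assumes first-order (linear) polynomials per segment, a frequency axis in arbitrary units such as Bark, and accumulation of the masker contributions by adding intensities. The slope values in `LOWER` and `UPPER`, the intensity addition rule, and the function names are illustrative assumptions and are not taken from the source.

```python
import math


def slope_level_db(masker_db, distance, segments):
    """Level of one masking slope at a given distance from the masker.

    segments: list of (max_distance, coeffs); coeffs are polynomial
    coefficients (lowest order first) giving the attenuation in dB.
    Returns None beyond the last segment (no masking contribution).
    """
    for max_distance, coeffs in segments:
        if distance <= max_distance:
            attenuation = sum(c * distance ** k for k, c in enumerate(coeffs))
            return masker_db - attenuation
    return None


def global_threshold_db(maskers, quiet_db, points, lower, upper):
    """Accumulate the threshold masker by masker at the support points."""
    # Start from the threshold in quiet, converted to intensity
    intensity = [10.0 ** (q / 10.0) for q in quiet_db]
    for position, level_db in maskers:
        for i, z in enumerate(points):
            d = z - position
            level = slope_level_db(level_db, abs(d), upper if d >= 0 else lower)
            if level is not None:
                intensity[i] += 10.0 ** (level / 10.0)
    return [10.0 * math.log10(v) for v in intensity]


# Illustrative slopes: one steep segment below the masker, two above it
LOWER = [(4.0, [0.0, 27.0])]
UPPER = [(1.0, [0.0, 10.0]), (8.0, [-5.0, 15.0])]
```

Because the polynomials operate on logarithmic levels, evaluating a slope costs a few multiplications per support point, and the conversion to intensity happens only once per contribution when it is added to the running threshold.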
Abstract:
In order to transmit or store digitized multi-channel audio signals that are digitally represented by a plurality of spectral subband signals, the subband signals of different audio channels that have the same frequency position are combined across channels at the encoder according to a dynamic control signal. This control signal is derived from several audio channels by an audio signal analysis based on a psychoacoustic binaural model. At the decoder, the subband signals of several audio channels that have the same frequency position are separated across channels according to a control value that is derived from the dynamic control signal and transmitted or stored along with the subband signals.
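The combine-and-separate idea can be illustrated for two channels and one subband. The sketch below uses an intensity-stereo-style scheme: when the control signal allows it, the two subband signals are replaced by their mean plus one gain per channel. This particular scheme, the boolean `combine` flag standing in for the output of the binaural model, and the function names are assumptions for illustration only.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)


def encode_subband(left, right, combine):
    """Combine one subband across two channels if the control signal says so."""
    if not combine:
        return {"combined": False, "left": list(left), "right": list(right)}
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    e_mono = energy(mono)
    if e_mono > 0.0:
        # Per-channel gains that restore each channel's original energy
        gains = (math.sqrt(energy(left) / e_mono),
                 math.sqrt(energy(right) / e_mono))
    else:
        gains = (0.0, 0.0)
    return {"combined": True, "mono": mono, "gains": gains}


def decode_subband(frame):
    """Separate the combined subband again using the transmitted gains."""
    if not frame["combined"]:
        return frame["left"], frame["right"]
    g_left, g_right = frame["gains"]
    mono = frame["mono"]
    return [g_left * m for m in mono], [g_right * m for m in mono]
```

In this sketch the gains play the role of the control value transmitted alongside the data. Reconstruction is exact only when the two channels are scaled copies of each other in that subband; otherwise only the energy per channel is preserved, which is why the decision to combine is left to a psychoacoustic analysis.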