Abstract:
A method for encoding frames of input video signals, including the following steps: implementing a learning/configuring stage that includes the following steps: providing frames of training video signals; determining training statistical parameters for groups of pixels of the frames of training video signals, and also encoding the frames of training video signals to obtain training modes; configuring a decision tree in response to the training statistical parameters and the training modes; and implementing an operating/encoding stage that includes the following steps: determining operating statistical parameters for groups of pixels of the frames of input video signals, and applying the operating statistical parameters to the configured decision tree to obtain operating modes; and encoding the frames of input video signals using the frames of input video signals and the operating modes.
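The two stages above can be sketched in a few lines. This is only an illustration under stated assumptions, not the claimed method: the statistic (pixel variance of a block), the mode labels "16x16" and "4x4", and the helper names `variance`, `fit_stump`, and `predict` are all hypothetical, and the tree is reduced to a single split (a decision stump) to keep the example short.

```python
def variance(block):
    """Statistical parameter for one group of pixels (assumed: variance)."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n


def fit_stump(stats, modes):
    """Learning stage: choose the threshold and the two leaf modes that
    minimize mismatches against the modes chosen by a full encoder."""
    labels = sorted(set(modes))
    values = sorted(set(stats))
    if len(values) < 2:
        raise ValueError("need at least two distinct statistic values")
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        for below in labels:
            for above in labels:
                errors = sum(
                    (below if s <= threshold else above) != m
                    for s, m in zip(stats, modes)
                )
                if best is None or errors < best[0]:
                    best = (errors, threshold, below, above)
    return best[1:]


def predict(stump, stat):
    """Operating stage: map a statistic to a mode without a full mode search."""
    threshold, below, above = stump
    return below if stat <= threshold else above


# Training blocks and the modes a (hypothetical) full encoder picked for them.
train_blocks = [[10, 10, 10, 10], [12, 11, 12, 11],
                [0, 255, 0, 255], [30, 200, 60, 180]]
train_modes = ["16x16", "16x16", "4x4", "4x4"]

stump = fit_stump([variance(b) for b in train_blocks], train_modes)
flat_mode = predict(stump, variance([100, 101, 99, 100]))
busy_mode = predict(stump, variance([0, 250, 5, 240]))
```

In this sketch the expensive mode search runs only during training; at encoding time each block costs one variance computation and one comparison.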
Abstract:
A video coding for machines (VCM) encoder includes a first video encoder, the first video encoder configured to encode an input video into a bitstream. The VCM encoder includes a feature extractor, the feature extractor configured to detect at least a feature in the input video. The VCM encoder includes a second encoder, the second encoder configured to encode a feature bitstream as a function of the input video and at least a feature.
Abstract:
File format systems and methods are disclosed that provide a framework integrating concepts such as object-based audio-visual representation, metadata, and object-oriented programming, to achieve a flexible and generic representation of audiovisual information and of the associated methods that operate on it. A system and method are disclosed for storing data processed from presentation data. The data is stored according to a method comprising coding input presentation data by identifying objects from within the presentation data, coding each object individually, and organizing the coded data into access layer data units. The access layer data units are stored throughout a plurality of segments, each segment comprising a segment table in a header portion thereof and those access layer data units that are members of the respective segment, there being one entry in the segment table for each access layer data unit therein. A plurality of extended segments are also stored, each of the extended segments further comprising one or more of the access layer data units that include protocol-specific data, the extended segments each represented by an extended segment header. The data of an accessible object is also stored, including an accessible object header and identifiers of the plurality of extended segments, each of the extended segments being a member of the same object.
Abstract:
In an interactive communication system based on MPEG-4, Command descriptors along with Command Route nodes or Server Routes in the scene description can be used to support application-specific interactivity. Content selection can be supported by specifying the presentation in command parameters, with the command ID indicating that the command is a content selection command. An initial scene can be created with several images and with text that describes a presentation associated with an image. Associated with each image and the corresponding text is a content selection descriptor. When a user clicks on an image, the client transmits the command containing the selected presentation and the server starts a new presentation. The technique can be used in any application context, as generally as HTTP and CGI can be used to implement any server-based application functionality.
Abstract:
As information to be processed at an object-based video or audio-visual (AV) terminal, an object-oriented bitstream includes objects, composition information, and scene demarcation information. Such bitstream structure allows on-line editing, e.g. cut and paste, insertion/deletion, grouping, and special effects. In the interest of ease of editing, AV objects and their composition information are transmitted or accessed on separate logical channels (LCs). Objects which have a lifetime in the decoder beyond their initial presentation time are cached for reuse until a selected expiration time. The system includes a de-multiplexer, a controller which controls the operation of the AV terminal, input buffers, AV objects decoders, buffers for decoded data, a composer, a display, and an object cache.
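The object cache described above might look roughly like the following. This is a minimal sketch: the class name `ObjectCache`, its methods, and the use of plain numeric timestamps passed in by the caller are assumptions made for illustration and are not taken from the abstract.

```python
class ObjectCache:
    """Hypothetical cache of decoded AV objects keyed by object ID."""

    def __init__(self):
        self._entries = {}  # object_id -> (decoded_data, expiration_time)

    def put(self, object_id, decoded_data, expiration_time):
        """Keep a decoded object for reuse until its expiration time."""
        self._entries[object_id] = (decoded_data, expiration_time)

    def get(self, object_id, now):
        """Return the cached object, or None if it is absent or expired."""
        entry = self._entries.get(object_id)
        if entry is None:
            return None
        decoded_data, expiration_time = entry
        if now >= expiration_time:
            del self._entries[object_id]  # evict on first access after expiry
            return None
        return decoded_data

    def __len__(self):
        return len(self._entries)


cache = ObjectCache()
cache.put("logo", b"decoded-pixels", expiration_time=30.0)
hit = cache.get("logo", now=10.0)   # still within its lifetime
miss = cache.get("logo", now=30.0)  # expired, so it is evicted
```

Passing `now` explicitly keeps the sketch deterministic. A real terminal would presumably take the time from its own presentation clock.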
Abstract:
A method for compressing and transmitting a sequence of video frames represented by arrays of digital pixel values includes the following steps: transmitting a representation of a first frame (I1) of the sequence; deriving a sorting permutation P1 of the first frame; using the sorting permutation of the first frame, P1, to approximately sort a second frame (I2) of the sequence, to obtain approximately sorted frame P1(I2); and compressing and transmitting the approximately sorted frame P1(I2).
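The idea can be illustrated with a small sketch, assuming frames are flattened to one-dimensional lists of pixel values. The function names are hypothetical, and the sum of absolute neighbor differences is used only as a rough stand-in for how compressible a signal is. The abstract does not specify the actual compressor.

```python
def sorting_permutation(frame):
    """Indices that sort the flattened frame in ascending order (P1)."""
    return sorted(range(len(frame)), key=lambda i: frame[i])


def apply_permutation(perm, frame):
    """Reorder a frame with a permutation, giving P1(I2) when applied to I2."""
    return [frame[i] for i in perm]


def invert_permutation(perm, permuted):
    """Undo apply_permutation, as a receiver that also knows P1 would."""
    out = [0] * len(perm)
    for dst, src in enumerate(perm):
        out[src] = permuted[dst]
    return out


def total_variation(seq):
    """Sum of absolute neighbor differences (assumed smoothness measure)."""
    return sum(abs(b - a) for a, b in zip(seq, seq[1:]))


i1 = [52, 10, 200, 97, 10, 180]   # first frame, flattened
i2 = [54, 12, 198, 95, 11, 183]   # second frame, similar to the first

p1 = sorting_permutation(i1)
sorted_i2 = apply_permutation(p1, i2)      # approximately sorted
restored_i2 = invert_permutation(p1, sorted_i2)
```

Because the receiver already has the first frame, it can derive P1 itself, so in this sketch the permutation never needs to be transmitted. Since consecutive frames are similar, `sorted_i2` is nearly monotonic and much smoother than `i2`.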
Abstract:
A method for receiving input video having a sequence of input video frames, and producing a compact video signature as an identifier of the input video, includes the following steps: generating a processed video tomograph using an arrangement of corresponding lines of pixels from the respective frames of the sequence of video frames; measuring characteristics of the processed video tomograph; and producing the video signature from the measured characteristics.
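A toy version of these steps is sketched below. The abstract does not say which lines are used, how the tomograph is processed, or which characteristics are measured, so everything specific here is an assumption: one fixed row is taken from each frame, the "characteristic" is a count of strong horizontal transitions, and the counts are grouped into a few position bands to form the signature.

```python
def tomograph(frames, row):
    """Stack the same row from every frame into a time-by-width image."""
    return [list(frame[row]) for frame in frames]


def signature(tomo, threshold=16, bins=4):
    """Count strong transitions per horizontal band (hypothetical measure)."""
    width = len(tomo[0])
    counts = [0] * bins
    for line in tomo:
        for x in range(width - 1):
            if abs(line[x + 1] - line[x]) >= threshold:
                counts[x * bins // (width - 1)] += 1
    return tuple(counts)


def make_frame(t, height=3, width=8):
    """Synthetic frame: a vertical edge that moves right by one pixel per frame."""
    row = [0] * (t + 2) + [100] * (width - t - 2)
    return [row[:] for _ in range(height)]


frames = [make_frame(t) for t in range(3)]
tomo = tomograph(frames, row=1)   # center row of each frame
sig = signature(tomo)
```

The signature is a short tuple regardless of the frame height, which is what makes it usable as a compact identifier in this sketch.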
Abstract:
A video coding for machines (VCM) encoder for combined lossless and lossy encoding includes a feature encoder, the feature encoder configured to encode a sub-picture containing a feature in an input video and provide an indication of the sub-picture, and a video encoder, the video encoder configured to receive an indication of the sub-picture from the feature encoder and encode the sub-picture using a lossy encoding protocol.
Abstract:
An interactive video/multimedia application (IVM application) may specify one or more media assets for playback. The IVM application may define the rendering, composition, and interactivity of one or more of the assets, such as video. Video multimedia application data (IVMA data) may be used to define the behavior of the IVM application. The IVMA data may be embodied as a standalone file in a text or binary, compressed format. Alternatively, the IVMA data may be embedded within other media content. A video asset used in the IVM application may include embedded, content-aware metadata that is tightly coupled to the asset. The IVM application may reference the content-aware metadata embedded within the asset to define the rendering and composition of application display elements and user-interactivity features. The interactive video/multimedia application (defined by the video and multimedia application data) may be presented to a viewer in a player application.
Abstract:
A method for receiving encoded H.264 video signals and transcoding the received encoded signals to encoded MPEG-2 video signals, including the following steps: decoding the encoded H.264 video signals to obtain uncompressed video signals and to also obtain H.264 feature signals; deriving MPEG-2 feature signals from the H.264 feature signals; and producing the encoded MPEG-2 video signals using the uncompressed video signals and the MPEG-2 feature signals. The H.264 feature signals include H.264 macroblock modes and H.264 motion vectors.
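One way the derivation step could work for motion vectors is sketched below. H.264 luma motion vectors have quarter-sample precision and may be attached to sub-macroblock partitions, while MPEG-2 motion vectors have half-sample precision. The mapping rule itself is an assumption made for illustration, not something the abstract states: it takes an area-weighted average over the partitions and rounds to half-sample units. The function name is likewise hypothetical.

```python
import math


def derive_mpeg2_mv(partitions):
    """Derive one MPEG-2 motion vector for a 16x16 macroblock.

    partitions: list of (width, height, mvx, mvy), with the H.264 vectors
    in quarter-sample units, together covering the whole macroblock.
    Returns (mvx, mvy) in half-sample units.
    """
    total_area = sum(w * h for w, h, _, _ in partitions)
    if total_area != 16 * 16:
        raise ValueError("partitions must cover a 16x16 macroblock")
    avg_x = sum(w * h * mvx for w, h, mvx, _ in partitions) / total_area
    avg_y = sum(w * h * mvy for w, h, _, mvy in partitions) / total_area
    # Quarter-sample to half-sample: divide by 2, rounding halves upward.
    return (math.floor(avg_x / 2 + 0.5), math.floor(avg_y / 2 + 0.5))


# A macroblock coded in H.264 as two 16x8 partitions.
h264_partitions = [(16, 8, 4, 0), (16, 8, 8, -4)]
mpeg2_mv = derive_mpeg2_mv(h264_partitions)
```

A derived vector like this would typically serve as a starting point for the MPEG-2 encoder, which saves it from running a full motion search on the uncompressed frames.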