Abstract:
There is provided herein a system for video encoding and decoding that uses short-term and long-term buffers. Reconstruction of each block within an image may be performed with reference to one of the buffers, so that different portions of an image, or different images in a sequence, may be reconstructed using different buffers. There are also provided herein systems for signaling, between an encoder and a decoder, the use of the above buffers and related address information. The encoder may, for example, transmit information identifying video data as corresponding to a particular one of the buffers, and the decoder may transmit information relating to the sizes of the short-term and long-term buffers. The buffer sizes may be changed during transmission of video data by including buffer allocation information in the video data. Also disclosed herein are methods and apparatuses according to the above.
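The two-buffer scheme above can be sketched as follows. This is a hypothetical Python illustration (the abstract specifies no API; all names are invented), assuming a fixed-capacity short-term buffer of recent frames and a separately managed long-term buffer, with the encoder signaling per block which buffer and entry to reference:

```python
from collections import deque

class ReferenceBuffers:
    def __init__(self, short_capacity=4, long_capacity=2):
        self.short_term = deque(maxlen=short_capacity)  # evicts oldest frame automatically
        self.long_term = {}                             # index -> explicitly pinned frame
        self.long_capacity = long_capacity

    def add_decoded_frame(self, frame):
        # Every newly reconstructed frame enters the short-term buffer.
        self.short_term.append(frame)

    def pin_long_term(self, index, frame):
        # Long-term entries persist until explicitly replaced or reallocated.
        if len(self.long_term) >= self.long_capacity and index not in self.long_term:
            raise ValueError("long-term buffer full; signal a reallocation first")
        self.long_term[index] = frame

    def reference_for_block(self, use_long_term, index):
        # The encoder signals, per block, which buffer (and which entry) to use.
        return self.long_term[index] if use_long_term else self.short_term[index]

bufs = ReferenceBuffers()
bufs.add_decoded_frame("frame0")
bufs.add_decoded_frame("frame1")
bufs.pin_long_term(0, "scene_background")
assert bufs.reference_for_block(False, -1) == "frame1"        # most recent short-term frame
assert bufs.reference_for_block(True, 0) == "scene_background"
```

Changing the two capacities mid-stream corresponds to the buffer allocation information the abstract describes being carried in the video data.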
Abstract:
A remote view station (26) is communicatively coupled to an image server (24) and receives a compressed version of source medical images. The remote view station (26) decompresses and displays the received medical image. A medical professional, such as a pathologist, can select a region of the displayed medical image. Region information is transmitted back to the image server, which applies image analysis operations to the region of the source medical image that corresponds to the selected region of the compressed medical image. In this manner, the data loss that occurs during image compression does not affect the image analysis operations. As such, the image analysis operations produce more accurate results than if the operations were applied by the remote view station to the decompressed image.
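The step of relaying region information back to the server can be illustrated as a coordinate mapping. This is a minimal sketch, assuming the remote view is a uniformly scaled version of the source image (the abstract does not state how regions are encoded):

```python
def map_region_to_source(region, view_size, source_size):
    """Map a rectangle (x, y, w, h) selected on the remote view back to
    coordinates in the full-resolution source image held by the server."""
    x, y, w, h = region
    sx = source_size[0] / view_size[0]   # horizontal scale factor
    sy = source_size[1] / view_size[1]   # vertical scale factor
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))

# A 100x100 box selected on a 1024x768 view of a 4096x3072 source:
assert map_region_to_source((256, 192, 100, 100), (1024, 768), (4096, 3072)) == (1024, 768, 400, 400)
```

The server then runs its analysis on the mapped rectangle of the lossless source, not on the lossy copy the view station holds.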
Abstract:
During video coding, a transform such as a discrete cosine transform (DCT) is applied to blocks of image data (e.g., motion-compensated interframe pixel differences), and the resulting transform coefficients for each block are quantized at a specified quantization level. Even though some coefficients are quantized to non-zero values, at least one non-zero quantized coefficient is treated as if it had a value of zero for purposes of further processing (e.g., run-length encoding (RLE) of the quantized data). When segmentation analysis is performed to identify two or more different regions of interest in each frame, the number of coefficients that are treated as having a value of zero for RLE differs between regions of interest (e.g., more coefficients for less-important regions). In this way, the number of bits used to encode image data is reduced to satisfy bit-rate requirements without having to drop frames adaptively, while conforming to constraints that may be imposed on the magnitude of change in quantization level from frame to frame.
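The quantize-then-zero step can be sketched as follows, a hypothetical Python illustration (a raster scan stands in for the usual zig-zag scan, and a tiny 2x2 block stands in for an 8x8 block, for brevity):

```python
import numpy as np

def quantize_and_truncate(dct_block, qstep, keep):
    # Quantize, then treat all but the first `keep` coefficients as zero,
    # even if some of the discarded coefficients quantized to non-zero values.
    q = np.round(dct_block / qstep).astype(int).flatten()
    q[keep:] = 0
    return q

def run_length_encode(coeffs):
    # Emit (run-of-preceding-zeros, value) pairs; trailing zeros are dropped.
    pairs, run = [], 0
    for v in coeffs:
        if v == 0:
            run += 1
        else:
            pairs.append((run, int(v)))
            run = 0
    return pairs

block = np.array([[80.0, 24.0], [12.0, 6.0]])
important = run_length_encode(quantize_and_truncate(block, qstep=8.0, keep=4))
background = run_length_encode(quantize_and_truncate(block, qstep=8.0, keep=1))
assert important == [(0, 10), (0, 3), (0, 2), (0, 1)]
assert background == [(0, 10)]   # fewer coefficients kept -> fewer RLE symbols -> fewer bits
```

Varying `keep` per region, rather than varying `qstep`, is what lets the bit rate drop without violating a limit on how much the quantization level may change between frames.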
Abstract:
A method and apparatus for encoding (622) digital image data wherein a region of interest (606) can be specified either before or during the encoding process, such that the priority (616) of the encoder outputs is modified so as to place more emphasis on the region of interest, thereby increasing the speed and/or the fidelity of the reconstructed region of interest. The system therefore enables more effective reconstruction of digital images over communication lines.
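One simple way to realize the priority modification described above is to reorder encoder outputs so that blocks intersecting the region of interest are transmitted first. This is a hypothetical sketch (the abstract does not specify the prioritization mechanism):

```python
def order_blocks_for_transmission(blocks, roi):
    """Transmit blocks inside the region of interest (x, y, w, h) first, so
    the ROI is reconstructed sooner and, under a bit budget, at higher fidelity."""
    def in_roi(block):
        bx, by = block["pos"]
        rx, ry, rw, rh = roi
        return rx <= bx < rx + rw and ry <= by < ry + rh
    # Sorting on (not in_roi, position) puts ROI blocks ahead of all others.
    return sorted(blocks, key=lambda b: (not in_roi(b), b["pos"]))

blocks = [{"pos": (0, 0)}, {"pos": (2, 2)}, {"pos": (3, 3)}]
ordered = order_blocks_for_transmission(blocks, roi=(2, 2, 2, 2))
assert [b["pos"] for b in ordered] == [(2, 2), (3, 3), (0, 0)]
```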
Abstract:
In order to control the quantity of encoding in accordance with the processing conditions of a terminal when a plurality of videos and audios are decoded and synthesized simultaneously, a picture and sound decoding device is provided with a reception managing section (11) which receives information, a separating section (12) which analyzes and separates received information, a priority determining section (14) which determines the processing priority of the pictures separated by the separating section (12), a picture expanding section (18) which expands the pictures in accordance with the priority determined by the determining section (14), a picture synthesizing section (19) which performs picture synthesis based on the expanded pictures, a synthesized result storing section (22) which stores the synthesized pictures, a reproducing time managing section (23) which manages the start time of reproduction, and an output section (24) which outputs synthesis results in accordance with the information from the managing section (23).
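The priority-driven expansion step can be sketched as decoding streams in priority order until a processing budget is exhausted. This is a hypothetical illustration (costs, priorities, and names are invented; the abstract gives no concrete scheduling rule):

```python
def decode_by_priority(streams, budget):
    """Expand (decode) streams in priority order (0 = most important) until
    the per-interval processing budget runs out; the rest are skipped."""
    decoded = []
    for s in sorted(streams, key=lambda s: s["priority"]):
        if s["cost"] > budget:
            continue          # not enough headroom left for this stream
        budget -= s["cost"]
        decoded.append(s["name"])
    return decoded

streams = [
    {"name": "speech", "priority": 0, "cost": 2},
    {"name": "main_video", "priority": 1, "cost": 5},
    {"name": "overlay", "priority": 2, "cost": 4},
]
# With a budget of 8 units, the low-priority overlay is dropped:
assert decode_by_priority(streams, 8) == ["speech", "main_video"]
```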
Abstract:
A method of processing an input image for storage includes decomposing (50) the input image into a number of images at various resolutions, subdividing (65) at least some of these images into tiles (rectangular arrays) and storing (75) a block (referred to as the "tile block") representing each of the tiles, along with an index that specifies the respective locations of the tile blocks. In specific embodiments, the tiles are 64 x 64 pixels or 128 x 128 pixels. The representations of the tiles are typically compressed versions (78) of the tiles. In a specific embodiment, JPEG compression is used. In a specific embodiment, an operand image is recursively decomposed to produce a reduced image and a set of additional (or complementary) pixel data. At the first stage, the operand image is normally the input image, and, for each subsequent stage, the operand image is the reduced image from the previous stage. At each stage, the reduced image is at a characteristic resolution that is lower than the resolution of the operand image. The processing is typically carried out until the resulting reduced image is of a desired small size (55).
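The recursive decomposition and tiling above can be sketched as follows, a minimal Python illustration (2x2 averaging stands in for whatever reduction filter an implementation would use, and compression of the tile blocks is omitted):

```python
import numpy as np

def reduce_by_half(img):
    # One decomposition stage: average each 2x2 neighborhood to halve resolution.
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def tile_image(img, tile=64):
    # Subdivide into tiles, indexed by their (row, col) pixel position.
    index = {}
    for r in range(0, img.shape[0], tile):
        for c in range(0, img.shape[1], tile):
            index[(r, c)] = img[r:r + tile, c:c + tile]
    return index

def build_tiled_pyramid(img, tile=64, min_size=64):
    # The operand image at the first stage is the input image; at each later
    # stage it is the reduced image from the previous stage.
    levels = []
    while True:
        levels.append(tile_image(img, tile))
        if min(img.shape) <= min_size:
            break
        img = reduce_by_half(img)
    return levels

pyramid = build_tiled_pyramid(np.zeros((256, 256)), tile=64, min_size=64)
# 256 -> 128 -> 64: three levels with 16, 4, and 1 tiles respectively
assert [len(level) for level in pyramid] == [16, 4, 1]
```

The tile index returned per level plays the role of the index that specifies the respective locations of the tile blocks.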
Abstract:
The electric power consumed by a terminal used for communication of multimedia information is controlled by changing the quality of transmitted information. The terminal is provided with input means (101, 102, 106 and 107) through which information such as images and sounds is input, channel control sections (123 and 124) which output the input information to channels and receive information from the channels, output means (103, 104, 108, 109 and 105) which output the information received from the channels in the form of images, sounds, etc., a codec means (110) which is provided between the input and output means and the control sections, encodes the input information in one of multiple encoding modes that consume different amounts of electric power, and decodes the information received from the channels, and a control section (133) which controls the selection of the encoding mode. This terminal can continue information communication for a required period of time at minimum power consumption at the sacrifice of the quality of transmitted information. Power consumption and information quality can therefore be traded off appropriately for each transmission.
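The mode-selection logic of the control section can be sketched as follows. This is a hypothetical illustration (the modes, wattages, and quality scores are invented for the example; the abstract defines no concrete policy):

```python
def pick_encoding_mode(modes, battery_wh, required_hours):
    """Choose the highest-quality encoding mode whose power draw still lets
    the terminal communicate for the required duration on the remaining battery."""
    power_budget_w = battery_wh / required_hours
    feasible = [m for m in modes if m["watts"] <= power_budget_w]
    if not feasible:
        # Best effort: fall back to the lowest-power mode available.
        return min(modes, key=lambda m: m["watts"])
    return max(feasible, key=lambda m: m["quality"])

modes = [
    {"name": "full_motion", "watts": 4.0, "quality": 3},
    {"name": "reduced_rate", "watts": 2.5, "quality": 2},
    {"name": "still_only", "watts": 1.0, "quality": 1},
]
# 10 Wh of battery, 4 hours required -> 2.5 W budget -> reduced-rate mode:
assert pick_encoding_mode(modes, 10.0, 4.0)["name"] == "reduced_rate"
```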
Abstract:
There are disclosed various methods, apparatuses and computer program products for image compression. In accordance with an embodiment, a method comprises receiving an image represented by luma and chroma components; determining at least two different areas in the image; and encoding the chroma component of each of the at least two areas differently.
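One concrete way to encode the chroma of different areas differently is per-area chroma subsampling. This is a hypothetical sketch, assuming rectangular areas and simple block averaging (the abstract does not specify the encoding difference):

```python
import numpy as np

def subsample_chroma(chroma, factor):
    # Average factor x factor blocks; factor 1 keeps full chroma resolution.
    if factor == 1:
        return chroma.copy()
    h = (chroma.shape[0] // factor) * factor
    w = (chroma.shape[1] // factor) * factor
    c = chroma[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return c.mean(axis=(1, 3))

def encode_chroma_by_area(chroma, areas):
    """areas: list of (row_slice, col_slice, factor). Each area's chroma is
    subsampled at its own factor (e.g., important areas kept sharp)."""
    return [(rs, cs, subsample_chroma(chroma[rs, cs], f)) for rs, cs, f in areas]

chroma = np.arange(64.0).reshape(8, 8)
regions = encode_chroma_by_area(chroma, [
    (slice(0, 4), slice(0, 8), 1),   # important area: full chroma resolution
    (slice(4, 8), slice(0, 8), 2),   # background: 2x2 subsampled chroma
])
assert regions[0][2].shape == (4, 8)
assert regions[1][2].shape == (2, 4)
```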
Abstract:
A video processing method includes receiving omnidirectional content corresponding to a sphere, generating a projection-based frame according to the omnidirectional content and a pyramid projection layout, and encoding, by a video encoder, the projection-based frame to generate a part of a bitstream. The projection-based frame has 360-degree content represented by a base projection face and a plurality of lateral projection faces packed in the pyramid projection layout. The base projection face and the lateral projection faces are obtained according to at least a projection relationship between a pyramid and the sphere.
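The packing step can be sketched as assembling one rectangular frame from the faces. This is a hypothetical layout (the abstract does not fix the geometry): the square base face on the left, and the four lateral faces, assumed already resampled from their triangular shape to half-size squares, in a 2x2 grid on the right:

```python
import numpy as np

def pack_pyramid_layout(base, laterals):
    """base: NxN array; laterals: four (N/2 x N/2) arrays. Returns an
    N x 2N projection-based frame with all five faces packed together."""
    n = base.shape[0]
    h = n // 2
    frame = np.zeros((n, 2 * n), dtype=base.dtype)
    frame[:, :n] = base                              # base face on the left
    for i, face in enumerate(laterals):              # 2x2 grid on the right
        r, c = divmod(i, 2)
        frame[r * h:(r + 1) * h, n + c * h:n + (c + 1) * h] = face
    return frame

base = np.ones((4, 4))
laterals = [np.full((2, 2), i + 2.0) for i in range(4)]
frame = pack_pyramid_layout(base, laterals)
assert frame.shape == (4, 8)
assert frame[0, 4] == 2.0 and frame[0, 6] == 3.0
assert frame[2, 4] == 4.0 and frame[2, 6] == 5.0
```

The full-resolution base face naturally carries the viewing direction's detail, while the smaller lateral faces carry the rest of the sphere at reduced fidelity.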
Abstract:
Techniques and systems are provided for processing video data. For example, 360-degree video data can be obtained for processing by an encoding device or a decoding device. The 360-degree video data includes pictures divided into motion-constrained tiles. The 360-degree video data can be used to generate a media file including a plurality of tracks. Each of the plurality of tracks contains a set of at least one of the motion-constrained tiles. The set of at least one of the motion-constrained tiles corresponds to at least one of a plurality of viewports of the 360-degree video data. A first tile representation can be generated for the media file. The first tile representation encapsulates a first track of the plurality of tracks, and the first track includes a first set of at least one of the motion-constrained tiles at a first tile location in the pictures of the 360-degree video data. The first set of at least one of the motion-constrained tiles corresponds to a viewport of the 360-degree video data.
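The track-per-viewport organization can be sketched with a small data structure. This is a hypothetical illustration (the classes and fields are invented; the abstract describes the file organization, not an API):

```python
from dataclasses import dataclass, field

@dataclass
class TileTrack:
    track_id: int
    tile_locations: list   # positions of this track's motion-constrained tiles
    viewport: str          # the viewport this tile set covers

@dataclass
class MediaFile:
    tracks: list = field(default_factory=list)

    def tracks_for_viewport(self, viewport):
        # A player fetches only the tracks covering the current viewport;
        # motion constraints make each tile set independently decodable.
        return [t for t in self.tracks if t.viewport == viewport]

media = MediaFile()
media.tracks.append(TileTrack(1, [(0, 0)], viewport="front"))
media.tracks.append(TileTrack(2, [(0, 1)], viewport="right"))
assert [t.track_id for t in media.tracks_for_viewport("front")] == [1]
```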