Abstract:
Embodiments described herein provide a system of concurrent compute queues that enable the scheduling of a large number of compute contexts simultaneously on graphics processor hardware. One embodiment provides an apparatus comprising a system interface and a general-purpose graphics processor coupled with the system interface. The general-purpose graphics processor comprises a plurality of graphics processor hardware resources configured to be partitioned into a plurality of isolated partitions, each of the plurality of isolated partitions including a first command streamer, a second command streamer, and circuitry configured to schedule general-purpose graphics compute workloads submitted to a first plurality of command queues associated with the first command streamer and a second plurality of command queues associated with the second command streamer.
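The queue arrangement described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class names `CommandStreamer` and `Partition`, the round-robin policy, and the string workloads are all hypothetical, and the abstract does not say how the scheduling circuitry actually arbitrates between queues.

```python
from collections import deque


class CommandStreamer:
    """Hypothetical model of one command streamer owning several command queues."""

    def __init__(self, num_queues):
        self.queues = [deque() for _ in range(num_queues)]
        self._next = 0  # index of the queue to inspect first on the next pop

    def submit(self, queue_index, workload):
        self.queues[queue_index].append(workload)

    def pop_next(self):
        """Return the next workload in round-robin order, or None if all queues are empty."""
        n = len(self.queues)
        for step in range(n):
            i = (self._next + step) % n
            if self.queues[i]:
                self._next = (i + 1) % n
                return self.queues[i].popleft()
        return None


class Partition:
    """Hypothetical isolated partition with a first and a second command streamer."""

    def __init__(self, queues_per_streamer):
        self.streamers = [
            CommandStreamer(queues_per_streamer),
            CommandStreamer(queues_per_streamer),
        ]

    def schedule(self):
        """Drain both streamers, alternating between them (assumed policy)."""
        order = []
        while True:
            progressed = False
            for streamer in self.streamers:
                workload = streamer.pop_next()
                if workload is not None:
                    order.append(workload)
                    progressed = True
            if not progressed:
                return order
```

Round-robin is used here only because it is the simplest fair policy to show; real hardware may weigh priorities or context state instead.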
Abstract:
Method and apparatus for deriving a motion vector at a video decoder. A block-based motion vector may be produced at the video decoder by utilizing motion estimation among available pixels relative to blocks in one or more reference frames. The available pixels could be, for example, spatially neighboring blocks in the sequential scan coding order of a current frame, blocks in a previously decoded frame, or blocks in a downsampled frame in a lower pyramid when layered coding has been used.
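As a rough illustration of decoder-side motion estimation, the sketch below searches a reference frame for the position that best matches a set of already-available pixels. The function names, the sum-of-absolute-differences (SAD) cost, and the exhaustive search window are assumptions chosen for clarity; the abstract does not specify the matching criterion or search strategy.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(x - y)
        for row_a, row_b in zip(block_a, block_b)
        for x, y in zip(row_a, row_b)
    )


def crop(frame, top, left, height, width):
    """Extract a height x width block whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def derive_motion_vector(ref, available, top, left, search):
    """Return (dy, dx) minimizing SAD between `available` and the reference frame.

    `available` holds pixels the decoder already has (for example, decoded
    neighboring blocks); (top, left) is their position in the current frame.
    Candidates that fall outside the reference frame are skipped.
    """
    height, width = len(available), len(available[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + height > len(ref) or l + width > len(ref[0]):
                continue
            cost = sad(available, crop(ref, t, l, height, width))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return (0, 0) if best is None else (best[1], best[2])
```

Because the search uses only pixels the decoder already holds, the encoder would not need to transmit the resulting vector in this sketch.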
Abstract:
Methods, systems, and computer program products for the generation of multiple layers of scaled encoded video data compatible with the HEVC standard. Residue from prediction processing may be transformed into coefficients in the frequency domain. The coefficients may then be sampled to create a layer of encoded data. The coefficients may be sampled in different ways to create multiple respective layers. The layers may then be multiplexed and sent to a decoder. There, one or more of the layers may be chosen. The choice of certain layer(s) may be dependent on the desired attributes of the resulting video. A certain level of video quality, frame rate, resolution, and/or bit depth may be desired, for example. The coefficients in the chosen layers may then be assembled to create a version of the residue to be used in video decoding.
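The layering and reassembly steps can be sketched as follows. This is a minimal illustration that assumes each layer is defined by a set of coefficient positions and that positions missing from the chosen layers are filled with zeros; the function names and the position-mask representation are hypothetical and are not taken from the HEVC standard or from the abstract.

```python
def split_layers(coeffs, masks):
    """Sample the coefficient list into layers.

    Each mask is an iterable of positions; the resulting layer maps each
    sampled position to its coefficient value.
    """
    return [{i: coeffs[i] for i in sorted(mask)} for mask in masks]


def assemble(layers, chosen, length):
    """Rebuild a coefficient list from the chosen layers.

    Positions not covered by any chosen layer are left at zero (assumption).
    """
    out = [0] * length
    for k in chosen:
        for position, value in layers[k].items():
            out[position] = value
    return out
```

For example, with `coeffs = [50, -12, 7, 3, -1, 0, 2, 1]` and masks `range(0, 2)`, `range(2, 5)`, and `range(5, 8)`, choosing only layer 0 keeps the two lowest-frequency coefficients, while choosing all three layers recovers the original list.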
Abstract:
Reconstructed picture quality for a video codec system may be improved by categorizing reconstructed pixels into different histogram bins with histogram segmentation and then applying different filters to different bins. Histogram segmentation may be performed by uniformly dividing the histogram into M bins or by adaptively dividing the histogram into N bins based on the histogram characteristics. Here, M and N may each be a predefined, fixed, non-negative integer value or a value adaptively generated at the encoder side, and may be sent to the decoder through the coded bitstream.
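A minimal sketch of the equal-width case is shown below. It assumes 8-bit pixel values, M of at least 1, and a per-bin additive offset standing in for the per-bin filter; the abstract does not say what the filters are, so the offsets and the function names are hypothetical.

```python
def uniform_bin(value, m, max_value=255):
    """Map a pixel value to one of m equal-width histogram bins (m >= 1)."""
    return value * m // (max_value + 1)


def filter_by_bin(pixels, m, offsets, max_value=255):
    """Apply a different correction to each bin.

    `offsets[b]` is a stand-in for the filter of bin b (assumption); results
    are clamped to the valid pixel range.
    """
    out = []
    for p in pixels:
        b = uniform_bin(p, m, max_value)
        out.append(max(0, min(max_value, p + offsets[b])))
    return out
```

With M = 4, the bins cover 0 to 63, 64 to 127, 128 to 191, and 192 to 255. An adaptive variant would replace `uniform_bin` with boundaries derived from the histogram itself.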
Abstract:
Technologies are presented that optimize data processing cost and efficiency. A computing system may comprise at least one processing element; a memory communicatively coupled to the at least one processing element; at least one compressor-decompressor communicatively coupled to the at least one processing element, and communicatively coupled to the memory through a memory interface; and a cache fabric comprising a plurality of distributed cache banks communicatively coupled to each other, to the at least one processing element, and to the at least one compressor-decompressor via a plurality of nodes. In this system, the at least one compressor-decompressor and the cache fabric are configured to manage and track uncompressed data of variable length for data requests by the processing element(s), allowing usage of compressed data in the memory.
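The data flow can be modeled in software as below. This is a loose sketch, not the described hardware: it assumes a write-through policy, picks a cache bank by address modulo the bank count, and uses Python's `zlib` as a stand-in compressor-decompressor. The class and method names are hypothetical.

```python
import zlib


class CompressedMemorySystem:
    """Toy model: memory holds compressed data, cache banks hold uncompressed data."""

    def __init__(self, num_banks):
        self.memory = {}  # address -> compressed bytes
        self.banks = [dict() for _ in range(num_banks)]  # address -> uncompressed bytes
        self.decompressions = 0  # counts trips through the decompressor

    def _bank(self, address):
        # Assumed distribution policy: address modulo number of banks.
        return self.banks[address % len(self.banks)]

    def write(self, address, data):
        """Keep the uncompressed copy in a bank and write compressed data to memory."""
        self._bank(address)[address] = data
        self.memory[address] = zlib.compress(data)

    def evict(self, address):
        """Drop the uncompressed copy; the compressed copy stays in memory."""
        self._bank(address).pop(address, None)

    def read(self, address):
        """Serve from the cache bank, decompressing from memory on a miss."""
        bank = self._bank(address)
        if address not in bank:
            bank[address] = zlib.decompress(self.memory[address])
            self.decompressions += 1
        return bank[address]
```

In this model the processing element only ever sees uncompressed, variable-length data, while memory stores only the compressed form.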
Abstract:
Systems and methods are described including dynamically configuring a shared buffer to support processing of at least two video read streams associated with different video codec formats. The methods may include determining a buffer write address within the shared buffer in response to a memory request associated with one read stream, and determining a different buffer write address within the shared buffer in response to a memory request associated with the other read stream.
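One way to picture the address determination is a buffer split into per-stream regions, each with its own wrapping write pointer. The sketch below assumes that layout; the class name, the region sizes, and the stream labels in the usage note are hypothetical and are not specified in the abstract.

```python
class SharedBuffer:
    """Toy model of one buffer shared by several read streams."""

    def __init__(self, size):
        self.size = size
        self.regions = {}  # stream_id -> [base, region_size, write_offset]

    def configure(self, layout):
        """(Re)partition the buffer; `layout` maps stream_id -> region size."""
        if sum(layout.values()) > self.size:
            raise ValueError("layout exceeds buffer size")
        self.regions = {}
        base = 0
        for stream_id, region_size in layout.items():
            self.regions[stream_id] = [base, region_size, 0]
            base += region_size

    def write_address(self, stream_id, length):
        """Return the buffer write address for a memory request of `length` bytes."""
        base, region_size, offset = self.regions[stream_id]
        if length > region_size:
            raise ValueError("request larger than stream region")
        if offset + length > region_size:
            offset = 0  # wrap within the stream's own region
        self.regions[stream_id][2] = offset + length
        return base + offset
```

For example, after `configure({"stream_a": 512, "stream_b": 256})` on a 1024-byte buffer, a request from `stream_a` resolves to address 0 and a request from `stream_b` to address 512. Calling `configure` again with a different layout models the dynamic reconfiguration.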