Hierarchical tiled caching
    22.
    Granted invention patent

    Publication number: US09779533B2

    Publication date: 2017-10-03

    Application number: US14164441

    Filing date: 2014-01-27

    CPC classification number: G06T15/005 G06T1/60 G06T11/40

    Abstract: One embodiment of the present invention includes a method for processing graphics objects. The method includes receiving a first draw-call and a second draw-call. The method also includes dividing the first draw-call into a first set of sub-draw-calls and the second draw-call into a second set of sub-draw-calls. The method further includes identifying a first screen tile. The method also includes identifying a first group of sub-draw-calls included in the first set of sub-draw-calls that overlap the first screen tile and a second group of sub-draw-calls included in the second set of sub-draw-calls that overlap the first screen tile. The method further includes causing the first group of sub-draw-calls and the second group of sub-draw-calls to be processed together.
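The grouping step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `SubDrawCall` type, the one-bounding-box-per-sub-draw-call split, and the rectangle overlap test are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Half-open pixel rectangle (x0, y0, x1, y1). Hypothetical representation.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class SubDrawCall:
    draw_id: int  # which draw-call this piece came from
    box: Box      # screen-space bounding box of the piece


def split_draw_call(draw_id: int, boxes: List[Box]) -> List[SubDrawCall]:
    """Assumed split policy: one sub-draw-call per bounding box."""
    return [SubDrawCall(draw_id, b) for b in boxes]


def overlaps(a: Box, b: Box) -> bool:
    """True when two half-open rectangles share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def group_for_tile(subs: List[SubDrawCall], tile: Box) -> List[SubDrawCall]:
    """Select the sub-draw-calls that overlap the given screen tile."""
    return [s for s in subs if overlaps(s.box, tile)]


first = split_draw_call(0, [(0, 0, 10, 10), (20, 20, 30, 30)])
second = split_draw_call(1, [(5, 5, 18, 18), (40, 40, 50, 50)])
tile = (0, 0, 16, 16)

# Sub-draw-calls from both draw-calls that touch the tile are batched
# so they can be processed together while the tile's data is cached.
batch = group_for_tile(first, tile) + group_for_tile(second, tile)
```

Here `batch` holds one sub-draw-call from each draw-call, since only one piece of each overlaps the 16x16 tile.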

    Adaptive multilevel binning to improve hierarchical caching

    Publication number: US09720842B2

    Publication date: 2017-08-01

    Application number: US13772160

    Filing date: 2013-02-20

    Abstract: A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy.
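As a rough illustration of deriving a different tile size per cache level, the sketch below picks the largest power-of-two square tile whose pixels fit in a fraction of a cache's capacity. The abstract does not specify the formula or the "additional characteristics"; `bytes_per_pixel`, `usable_fraction`, and the power-of-two rule are assumptions.

```python
def tile_side_for_cache(capacity_bytes: int,
                        bytes_per_pixel: int,
                        usable_fraction: float = 0.5) -> int:
    """Return the side length (in pixels) of a square tile.

    Assumed policy: the tile's pixel data must fit in `usable_fraction`
    of the cache, and the side is restricted to powers of two.
    """
    budget = int(capacity_bytes * usable_fraction)
    side = 1
    # Double the side while the next larger tile still fits the budget.
    while (2 * side) ** 2 * bytes_per_pixel <= budget:
        side *= 2
    return side


# Hypothetical two-level hierarchy: 64 KiB first level, 2 MiB second level.
l1_tile = tile_side_for_cache(64 * 1024, bytes_per_pixel=4)
l2_tile = tile_side_for_cache(2 * 1024 * 1024, bytes_per_pixel=4)
```

With these assumed capacities the first-level tile is 64 pixels on a side and the second-level tile is 512, so each binning stage would work with a tile sized to its own cache.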

    Rendering using multiple render target sample masks
    25.
    Granted invention patent
    Status: In force

    Publication number: US09396515B2

    Publication date: 2016-07-19

    Application number: US13969408

    Filing date: 2013-08-16

    CPC classification number: G06T1/60 G06T1/20 G06T15/503

    Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity. Advantageously, by increasing the effective resolution using render target sample masks, the quality of the rendered image is improved without incurring the performance degradation associated with processing each sample individually.
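The sample selection described in this abstract can be sketched with bitmasks. This is a simplified model, not the hardware behavior: it assumes each render target writes into the same per-pixel sample storage, that coverage and sample masks are integers with one bit per sample, and that colors are plain integers.

```python
from typing import List, Tuple


def rop_update(samples: List[int],
               coverage_mask: int,
               rt_sample_mask: int,
               color: int) -> List[int]:
    """Write `color` to samples that are both covered and selected.

    Bit i of each mask refers to sample i of the pixel.
    """
    selected = coverage_mask & rt_sample_mask
    return [color if (selected >> i) & 1 else old
            for i, old in enumerate(samples)]


# One pixel with four samples, all initially 0.
pixel = [0, 0, 0, 0]
coverage = 0b0111  # the primitive covers samples 0, 1 and 2

# Hypothetical render targets as (sample mask, shader output color) pairs.
render_targets: List[Tuple[int, int]] = [(0b0011, 10), (0b1100, 20)]

for mask, color in render_targets:
    pixel = rop_update(pixel, coverage, mask, color)
```

After both updates `pixel` is `[10, 10, 20, 0]`: the shader ran once for the pixel, yet covered samples received different colors, and the uncovered sample 3 kept its old value.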

    Architecture and algorithms for data compression

    Publication number: US10338820B2

    Publication date: 2019-07-02

    Application number: US15176082

    Filing date: 2016-06-07

    Abstract: A system architecture conserves memory bandwidth by including a compression utility to process data transfers from the cache into external memory. The cache decompresses transfers from external memory and transfers full format data to naive clients that lack decompression capability and directly transfers compressed data to savvy clients that include decompression capability. An improved compression algorithm includes software that computes the difference between the current data word and each of a number of prior data words. Software selects the prior data word with the smallest difference as the nearest match and encodes the bit width of the difference to this data word. Software then encodes the difference between the current stride and the closest previous stride. Software combines the stride, bit width, and difference to yield the final encoded data word. Software may encode the stride of one data word as a value relative to the stride of a previous data word.
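The encoding steps in this abstract can be sketched as a nearest-match delta coder. The abstract does not define the bit layout, so this sketch emits tuples instead of packed bits. The `window` size, the meaning of stride as "how many words back the match is", and the stride-0 convention for the first word are assumptions.

```python
from typing import List, Tuple

# (stride relative to previous stride, bit width of difference, difference)
Token = Tuple[int, int, int]


def encode(words: List[int], window: int = 4) -> List[Token]:
    tokens: List[Token] = []
    prev_stride = 0
    for i, w in enumerate(words):
        if i == 0:
            # Assumed convention: stride 0 means "difference from zero".
            stride, diff = 0, w
        else:
            # Compare against each of the last `window` words and keep
            # the one with the smallest absolute difference.
            candidates = range(1, min(window, i) + 1)
            stride = min(candidates, key=lambda s: abs(w - words[i - s]))
            diff = w - words[i - stride]
        width = abs(diff).bit_length()
        tokens.append((stride - prev_stride, width, diff))
        prev_stride = stride
    return tokens


def decode(tokens: List[Token]) -> List[int]:
    words: List[int] = []
    prev_stride = 0
    for stride_delta, _width, diff in tokens:
        stride = prev_stride + stride_delta
        base = words[-stride] if stride else 0
        words.append(base + diff)
        prev_stride = stride
    return words


# Two interleaved streams: the nearest match is two words back.
data = [100, 200, 101, 201, 102, 202]
tokens = encode(data)
restored = decode(tokens)
```

For this interleaved input the coder settles on a stride of 2 after the first two words, so later tokens carry a stride delta of 0 and a 1-bit difference, which is where the savings would come from in a packed format.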
