GRAPHICS PROCESSING UNIT WITH DEFERRED VERTEX SHADING
    1.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2010138870A3

    Publication date: 2012-04-12

    Application No.: PCT/US2010036661

    Filing date: 2010-05-28

    CPC classification number: G06T15/40 G06T1/20 G06T15/005

    Abstract: Techniques are described for processing graphics images with a graphics processing unit (GPU) using deferred vertex shading. An example method includes the following: generating, within a processing pipeline of a graphics processing unit (GPU), vertex coordinates for vertices of each primitive within an image geometry, wherein the vertex coordinates comprise a location and a perspective parameter for each one of the vertices, and wherein the image geometry represents a graphics image; identifying, within the processing pipeline of the GPU, visible primitives within the image geometry based upon the vertex coordinates; and, responsive to identifying the visible primitives, generating, within the processing pipeline of the GPU, vertex attributes only for the vertices of the visible primitives in order to determine surface properties of the graphics image.
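
    The two-pass flow in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Vertex` type, the `is_visible` test (back-face plus bounding-box check) and the `shade_attributes` callback are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Vertex:
    x: float
    y: float
    z: float
    w: float  # perspective parameter (assumed > 0 in this sketch)


def is_visible(tri):
    """Toy visibility test (an assumption, not the patent's method).

    Keeps triangles that are counter-clockwise in screen space and whose
    bounding box overlaps the [-1, 1] x [-1, 1] viewport.
    """
    pts = [(v.x / v.w, v.y / v.w) for v in tri]  # perspective divide
    (x0, y0), (x1, y1), (x2, y2) = pts
    signed_area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if signed_area <= 0:
        return False  # back-facing or degenerate
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return max(xs) >= -1 and min(xs) <= 1 and max(ys) >= -1 and min(ys) <= 1


def deferred_vertex_shading(coords, triangles, shade_attributes):
    """Pass 1 uses coordinates only; pass 2 shades attributes for visible vertices."""
    visible = [t for t in triangles if is_visible([coords[i] for i in t])]
    needed = sorted({i for t in visible for i in t})
    attributes = {i: shade_attributes(i) for i in needed}
    return visible, attributes
```

    In this sketch the attribute callback runs only for vertices referenced by a surviving primitive, which is the saving the abstract describes.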

    OUT-OF-ORDER COMMAND EXECUTION IN A MULTIMEDIA PROCESSOR
    2.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2012009701A1

    Publication date: 2012-01-19

    Application No.: PCT/US2011/044285

    Filing date: 2011-07-15

    CPC classification number: G06F9/3885 G06F9/3838 G06T1/20

    Abstract: Techniques are described for reordering commands to improve the speed at which at least one command stream may execute. Prior to distributing commands in the at least one command stream to multiple pipelines, a multimedia processor analyzes any inter-pipeline dependencies and determines the current execution state of the pipelines. The processor may, based on this information, reorder the at least one command stream by prioritizing commands that lack any current dependencies and therefore may be executed immediately by the appropriate pipeline. Such out-of-order execution of commands in the at least one command stream may increase the throughput of the multimedia processor by increasing the rate at which the command stream is executed.
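
    The reordering idea can be sketched as a stable partition of the command stream. The `Command` record, the `completed` set and the `busy_pipelines` set are hypothetical names introduced for illustration; the abstract does not specify data structures.

```python
from collections import namedtuple

# Hypothetical command record: a name, its target pipeline, and the names
# of earlier commands it depends on.
Command = namedtuple("Command", ["name", "pipeline", "depends_on"])


def is_ready(cmd, completed, busy_pipelines):
    """A command can run now if its dependencies are done and its pipeline is idle."""
    deps_done = all(dep in completed for dep in cmd.depends_on)
    return deps_done and cmd.pipeline not in busy_pipelines


def reorder(commands, completed, busy_pipelines):
    """Stable reorder: ready commands first, blocked commands after, original order kept."""
    ready = [c for c in commands if is_ready(c, completed, busy_pipelines)]
    blocked = [c for c in commands if not is_ready(c, completed, busy_pipelines)]
    return ready + blocked
```

    Because only commands with no outstanding dependencies move forward, this sketch never places a command ahead of something it is still waiting on.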

    PROGRAMMABLE STREAMING PROCESSOR WITH MIXED PRECISION INSTRUCTION EXECUTION
    3.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2009132013A1

    Publication date: 2009-10-29

    Application No.: PCT/US2009/041268

    Filing date: 2009-04-21

    CPC classification number: G06T15/005 G06F8/47

    Abstract: The disclosure relates to a programmable streaming processor that is capable of executing mixed-precision (e.g., full-precision, half-precision) instructions using different execution units. The various execution units are each capable of using graphics data to execute instructions at a particular precision level. An exemplary programmable shader processor includes a controller and multiple execution units. The controller is configured to receive an instruction for execution and to receive an indication of a data precision for execution of the instruction. The controller is also configured to receive a separate conversion instruction that, when executed, converts graphics data associated with the instruction to the indicated data precision. When operable, the controller selects one of the execution units based on the indicated data precision. The controller then causes the selected execution unit to execute the instruction with the indicated data precision using the graphics data associated with the instruction.
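
    A rough software model of the controller's precision-based dispatch is given below. It assumes "half" means IEEE 754 binary16 and "full" means binary32, and it emulates rounding with Python's `struct` module; the class names, unit names and the two supported operations are invented for this sketch.

```python
import struct


def to_half(x):
    """Round a Python float to IEEE 754 half precision (binary16)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_full(x):
    """Round a Python float to IEEE 754 single precision (binary32)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


class ExecutionUnit:
    """Hypothetical execution unit that works at one fixed precision."""

    def __init__(self, name, convert):
        self.name = name
        self.convert = convert

    def execute(self, op, a, b):
        if op == "add":
            result = a + b
        elif op == "mul":
            result = a * b
        else:
            raise ValueError(f"unsupported op: {op}")
        return self.convert(result)  # result is rounded to the unit's precision


class Controller:
    """Selects an execution unit from the indicated data precision."""

    def __init__(self):
        self.units = {
            "half": ExecutionUnit("half_alu", to_half),
            "full": ExecutionUnit("full_alu", to_full),
        }

    def convert(self, value, precision):
        """Models the separate conversion instruction."""
        return self.units[precision].convert(value)

    def dispatch(self, op, a, b, precision):
        unit = self.units[precision]
        return unit.name, unit.execute(op, a, b)
```

    With these assumptions, adding 0.0001 to 1.0 on the half-precision unit returns 1.0, because binary16 cannot represent the small increment near 1.0, while the full-precision unit keeps it.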

    EFFICIENT 2-D AND 3-D GRAPHICS PROCESSING
    4.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2008101210A3

    Publication date: 2009-10-22

    Application No.: PCT/US2008054162

    Filing date: 2008-02-15

    CPC classification number: G06T15/005 G06T11/40 G09G5/363

    Abstract: Techniques for supporting both 2-D and 3-D graphics are described. A graphics processing unit (GPU) may perform 3-D graphics processing in accordance with a 3-D graphics pipeline to render 3-D images and may also perform 2-D graphics processing in accordance with a 2-D graphics pipeline to render 2-D images. Each stage of the 2-D graphics pipeline may be mapped to at least one stage of the 3-D graphics pipeline. For example, a clipping, masking and scissoring stage in 2-D graphics may be mapped to a depth test stage in 3-D graphics. Coverage values for pixels within paths in 2-D graphics may be determined using rasterization and depth test stages in 3-D graphics. A paint generation stage and an image interpolation stage in 2-D graphics may be mapped to a fragment shader stage in 3-D graphics. A blending stage in 2-D graphics may be mapped to a blending stage in 3-D graphics.

    FRAGMENT SHADER BYPASS IN A GRAPHICS PROCESSING UNIT, AND APPARATUS AND METHOD THEREOF
    5.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2009036314A3

    Publication date: 2009-07-09

    Application No.: PCT/US2008076227

    Filing date: 2008-09-12

    CPC classification number: G06T15/005

    Abstract: Configuration information is used to make a determination to bypass fragment shading by a shader unit of a graphics processing unit, the shader unit being capable of performing both vertex shading and fragment shading. Based on the determination, the shader unit performs vertex shading and bypasses fragment shading. A processing element other than the shader unit, such as a pixel blender, can be used to perform some fragment shading. Power is managed to 'turn off' power to unused components in a case that fragment shading is bypassed. For example, power can be turned off to a number of arithmetic logic units, the shader unit using the reduced number of arithmetic logic units to perform vertex shading. At least one register bank of the shader unit can be used as a FIFO buffer storing pixel attribute data for use, with texture data, in fragment shading operations by another processing element.

    DEMAND-BASED POWER CONTROL IN A GRAPHICS PROCESSING UNIT
    6.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2009049255A2

    Publication date: 2009-04-16

    Application No.: PCT/US2008079644

    Filing date: 2008-10-10

    CPC classification number: G06F1/3203 G06F1/3287 G06T1/20 Y02D10/171 Y02D50/20

    Abstract: Disclosed herein is a power controller for use with a graphics processing unit. The power controller monitors, manages and controls power supplied to components of a pipeline of the graphics processing unit. The power controller determines whether and to what extent power is to be supplied to a pipeline component based on status information received by the power controller in connection with the pipeline component. The power controller is capable of identifying a trend using the received status information, and determining whether and to what extent power is to be supplied to a pipeline component based on the identified trend.
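
    One way to picture the trend-based decision is the sketch below. Everything concrete in it is an assumption: status information is modelled as recent utilization samples in the range 0 to 1, the trend is the average change per sample, and the thresholds and three power levels are arbitrary illustrative values.

```python
def identify_trend(samples):
    """Average change per step across the utilization samples (0.0 if too few)."""
    if len(samples) < 2:
        return 0.0
    return (samples[-1] - samples[0]) / (len(samples) - 1)


def decide_power(samples, idle_threshold=0.05, rising_threshold=0.02):
    """Return the fraction of full power to supply to one pipeline component.

    1.0 = full power, 0.5 = reduced power, 0.0 = powered off.
    """
    if not samples:
        return 0.0  # no status information: assume the component is unused
    if identify_trend(samples) > rising_threshold:
        return 1.0  # demand is rising, so power up ahead of need
    if samples[-1] < idle_threshold:
        return 0.0  # idle and not trending upward
    return 0.5  # steady or falling demand
```

    The return value covers both questions the abstract raises: whether power is supplied (zero or non-zero) and to what extent (the fraction).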

    EFFICIENT SCISSORING FOR GRAPHICS APPLICATION
    7.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2008064225A3

    Publication date: 2008-10-02

    Application No.: PCT/US2007085241

    Filing date: 2007-11-20

    CPC classification number: G06T15/30 G06T2200/28

    Abstract: Scissoring for any number of scissoring regions is performed in a sequential order by drawing one scissoring region at a time on a drawing surface and updating scissor values for pixels within each scissoring region. A scissor value for a pixel may indicate the number of scissoring regions covering the pixel and may be incremented for each scissoring region covering the pixel. A scissor value for a pixel may also be a bitmap, and a bit for a scissoring region may be set to one if the pixel is within the scissoring region. Pixels within a region of interest are passed and rendered, and pixels outside of the region are discarded. This region may be defined by a reference value, which may be set to (a) one for the union of all scissoring regions, for a scissoring UNION operation, or (b) larger than one for the intersection of multiple (e.g., all) scissoring regions, for a scissoring AND operation.
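
    The counter variant of this scheme can be sketched as below. Rectangular, half-open regions and the "count is at least the reference value" comparison are assumptions made for this example; the abstract states only how the reference value is chosen.

```python
def scissor_counts(width, height, regions):
    """Draw one scissoring region at a time and count, per pixel, how many cover it.

    Each region is a half-open rectangle (x0, y0, x1, y1), clipped to the surface.
    """
    counts = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in regions:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                counts[y][x] += 1
    return counts


def passing_pixels(counts, reference):
    """Pixels whose scissor value reaches the reference value are passed."""
    return [
        (x, y)
        for y, row in enumerate(counts)
        for x, value in enumerate(row)
        if value >= reference
    ]
```

    With a reference of 1, a pixel inside any region passes (the UNION case); with a reference equal to the number of regions, only pixels inside every region pass (the AND case).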

    GRAPHICS PROCESSING UNIT WITH UNIFIED VERTEX CACHE AND SHADER REGISTER FILE
    8.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2008039950A1

    Publication date: 2008-04-03

    Application No.: PCT/US2007/079784

    Filing date: 2007-09-27

    CPC classification number: G06T15/005

    Abstract: Techniques are described for processing computerized images with a graphics processing unit (GPU) using a unified vertex cache and shader register file. The techniques include creating a shared shader coupled to the GPU pipeline and a unified vertex cache and shader register file coupled to the shared shader to substantially eliminate data movement within the GPU pipeline. The GPU pipeline sends image geometry information based on an image geometry for an image to the shared shader. The shared shader performs vertex shading to generate vertex coordinates and attributes of vertices in the image. The shared shader then stores the vertex attributes in the unified vertex cache and shader register file, and sends only the vertex coordinates of the vertices back to the GPU pipeline. The GPU pipeline processes the image based on the vertex coordinates, and the shared shader processes the image based on the vertex attributes.

    GRAPHICS PROCESSING UNIT WITH EXTENDED VERTEX CACHE
    9.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2008019261A2

    Publication date: 2008-02-14

    Application No.: PCT/US2007074882

    Filing date: 2007-07-31

    CPC classification number: G06T15/005

    Abstract: Techniques are described for processing computerized images with a graphics processing unit (GPU) using an extended vertex cache. The techniques include creating an extended vertex cache coupled to a GPU pipeline to reduce an amount of data passing through the GPU pipeline. The GPU pipeline receives an image geometry for an image, and stores attributes for vertices within the image geometry in the extended vertex cache. The GPU pipeline only passes vertex coordinates that identify the vertices and vertex cache index values that indicate storage locations of the attributes for each of the vertices in the extended vertex cache to other processing stages along the GPU pipeline. The techniques described herein defer the setup of attribute gradients to just before attribute interpolation in the GPU pipeline. The vertex attributes may be retrieved from the extended vertex cache for attribute gradient setup just before attribute interpolation in the GPU pipeline.
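
    The data flow described here, in which only coordinates and a cache index travel down the pipeline, can be modelled roughly as follows. The class and function names and the dictionary layout of a vertex are hypothetical, and a real cache would be fixed-size hardware storage rather than a growing list.

```python
class ExtendedVertexCache:
    """Toy stand-in for the extended vertex cache: stores attributes by index."""

    def __init__(self):
        self._slots = []

    def store(self, attributes):
        """Store one vertex's attributes and return its cache index."""
        self._slots.append(attributes)
        return len(self._slots) - 1

    def fetch(self, index):
        return self._slots[index]


def submit_vertices(vertices, cache):
    """Return the lightweight (coordinates, cache_index) records sent down the pipeline."""
    return [(v["position"], cache.store(v["attributes"])) for v in vertices]


def attribute_gradient_setup(records, cache):
    """Deferred step: fetch attributes only when gradients are about to be set up."""
    return [cache.fetch(index) for _position, index in records]
```

    Intermediate stages in this model see only the small records, so the attribute payload is read once, just before interpolation.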

    GRAPHICS PROCESSOR WITH ARITHMETIC AND ELEMENTARY FUNCTION UNITS
    10.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2007140338A2

    Publication date: 2007-12-06

    Application No.: PCT/US2007/069803

    Filing date: 2007-05-25

    CPC classification number: G06T1/20 G06F9/30167 G06F9/383 G06F9/3851 G06F9/3885

    Abstract: A graphics processor capable of efficiently performing arithmetic operations and computing elementary functions is described. The graphics processor has at least one arithmetic logic unit (ALU) that can perform arithmetic operations and at least one elementary function unit that can compute elementary functions. The ALU(s) and elementary function unit(s) may be arranged such that they can operate in parallel to improve throughput. The graphics processor may also include fewer elementary function units than ALUs, e.g., four ALUs and a single elementary function unit. The four ALUs may perform an arithmetic operation on (1) four components of an attribute for one pixel or (2) one component of an attribute for four pixels. The single elementary function unit may operate on one component of one pixel at a time. The use of a single elementary function unit may reduce cost while still providing good performance.
