GRAPHICS PROCESSING HARDWARE FOR USING COMPUTE SHADERS AS FRONT END FOR VERTEX SHADERS
    31.
    Invention application
    Status: under examination (published)

    Publication No.: US20140362102A1

    Publication date: 2014-12-11

    Application No.: US14297290

    Filing date: 2014-06-05

    CPC classification number: G06T1/20 G06T1/60 G06T15/005

    Abstract: A GPU is configured to read and process data produced by a compute shader via one or more ring buffers and pass the resulting processed data to a vertex shader as input. The GPU is further configured to allow the compute shader and vertex shader to write through a cache. Each ring buffer is configured to synchronize the compute shader and the vertex shader so that processed data the compute shader writes to a particular ring buffer is not overwritten before the vertex shader accesses it.
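The synchronization property named in this abstract can be illustrated with a CPU-side analogy. The sketch below is a hypothetical Python model, not the patented hardware: `RingBuffer`, `try_write`, `try_read`, and `run_pipeline` are illustrative names, and the read/write counters merely stand in for whatever mechanism the GPU uses. It shows one thing only: the producer is refused when writing would overwrite a slot the consumer has not yet read.

```python
class RingBuffer:
    """Fixed-capacity ring buffer; the producer is refused when it is full."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.write_count = 0  # total items written by the producer
        self.read_count = 0   # total items consumed by the consumer

    def try_write(self, item):
        # Full: writing now would overwrite data not yet consumed.
        if self.write_count - self.read_count == self.capacity:
            return False
        self.slots[self.write_count % self.capacity] = item
        self.write_count += 1
        return True

    def try_read(self):
        # Empty: nothing has been produced for the consumer yet.
        if self.read_count == self.write_count:
            return None
        item = self.slots[self.read_count % self.capacity]
        self.read_count += 1
        return item


def run_pipeline(inputs, capacity=2):
    """Stand-in 'compute shader' doubles each input; 'vertex shader' collects."""
    ring = RingBuffer(capacity)
    pending = list(inputs)
    consumed = []
    stalls = 0
    while pending or ring.read_count < ring.write_count:
        # Producer fills the ring until it is full or out of work.
        while pending:
            if ring.try_write(pending[0] * 2):
                pending.pop(0)
            else:
                stalls += 1
                break
        # Consumer drains one item, which frees one slot.
        item = ring.try_read()
        if item is not None:
            consumed.append(item)
    return consumed, stalls
```

Calling `run_pipeline([1, 2, 3, 4, 5], capacity=2)` delivers every doubled value to the consumer in order, and the stall count records how often the producer had to wait for a free slot.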


    CONFIGURABLE MULTIPLE-DIE GRAPHICS PROCESSING UNIT

    Publication No.: US20240193844A1

    Publication date: 2024-06-13

    Application No.: US18077424

    Filing date: 2022-12-08

    CPC classification number: G06T15/005 G06F9/3802

    Abstract: A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology.

    Processing unit with small footprint arithmetic logic unit

    Publication No.: US11720328B2

    Publication date: 2023-08-08

    Application No.: US17029836

    Filing date: 2020-09-23

    CPC classification number: G06F7/57 G06F17/16 G06N3/08

    Abstract: A parallel processing unit employs an arithmetic logic unit (ALU) having a relatively small footprint, thereby reducing the overall power consumption and circuit area of the processing unit. To support the smaller footprint, the ALU includes multiple stages to execute operations corresponding to a received instruction. The ALU executes at least one operation at a precision indicated by the received instruction, and then reduces the resulting data of the at least one operation to a smaller size before providing the results to another stage of the ALU to continue execution of the instruction.
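As a rough numerical illustration of shrinking intermediate data between ALU stages, the sketch below computes products at a wider precision, rounds them to 16 bits, and then accumulates on the narrower values. Everything specific here is an assumption made for illustration: the abstract does not name the formats, the two-stage split, or a dot-product workload, and `to_half`, `multiply_stage`, `reduce_stage`, `accumulate_stage`, and `dot_reduced` are hypothetical names.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def multiply_stage(a, b):
    # Stage 1: products computed at the wider precision (Python float, binary64).
    return [x * y for x, y in zip(a, b)]


def reduce_stage(products):
    # Between stages: shrink each intermediate result to 16 bits.
    return [to_half(p) for p in products]


def accumulate_stage(values):
    # Stage 2: continue the instruction on the narrower data.
    total = 0.0
    for v in values:
        total = to_half(total + v)
    return total


def dot_reduced(a, b):
    """Dot product whose intermediates are narrowed between the two stages."""
    return accumulate_stage(reduce_stage(multiply_stage(a, b)))
```

The trade-off in this toy model is that the second stage only has to handle 16-bit values, at the cost of rounding error whenever a product is not exactly representable in half precision.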

    Dual vector arithmetic logic unit
    37.
    Invention grant

    Publication No.: US11675568B2

    Publication date: 2023-06-13

    Application No.: US17121354

    Filing date: 2020-12-14

    CPC classification number: G06F7/57 G06F9/3867 G06F17/16 G06T1/20 G06F15/8015

    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general-purpose register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.

    Prefetch kernels on data-parallel processors

    Publication No.: US11500778B2

    Publication date: 2022-11-15

    Application No.: US16813075

    Filing date: 2020-03-09

    Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.
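The idea of a prefetch kernel that mirrors the processing kernel's memory operations can be sketched with a toy cache model. This is a sequential Python simulation under stated assumptions, not GPU code: the `Cache` class has unbounded capacity, the two kernels run one after the other rather than overlapping, and all names (`Cache`, `prefetch_kernel`, `processing_kernel`, `run`) are hypothetical.

```python
class Cache:
    """Toy cache with unbounded capacity: records which addresses are resident."""

    def __init__(self, memory):
        self.memory = memory
        self.resident = set()
        self.misses = 0

    def load(self, addr):
        if addr not in self.resident:
            self.misses += 1
            self.resident.add(addr)
        return self.memory[addr]


def prefetch_kernel(cache, addrs):
    # Issues the same loads as the processing kernel (or a subset of them)
    # but discards the values, so it keeps no intermediate state.
    for addr in addrs:
        cache.load(addr)


def processing_kernel(cache, addrs):
    # The actual work: load each element and sum the squares.
    return sum(cache.load(addr) ** 2 for addr in addrs)


def run(memory, addrs, prefetch_addrs=None):
    """Return (result, misses seen by the processing kernel)."""
    cache = Cache(memory)
    if prefetch_addrs is not None:
        prefetch_kernel(cache, prefetch_addrs)
    before = cache.misses
    result = processing_kernel(cache, addrs)
    return result, cache.misses - before
```

In this model, prefetching a subset of the addresses removes exactly that many misses from the processing kernel without changing its result.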

    Spatial partitioning in a multi-tenancy graphics processing unit

    Publication No.: US11295507B2

    Publication date: 2022-04-05

    Application No.: US17091957

    Filing date: 2020-11-06

    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
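The two scheduling modes in this abstract amount to a mapping from shader engines to front-end circuits. The sketch below is a simplified, hypothetical model: it assumes a single second FE circuit (the abstract allows one or more), a contiguous split of engines, and the labels `"FE0"` and `"FE1"`; the function name `assign_front_ends` and its parameters are illustrative only.

```python
def assign_front_ends(num_engines, mode, first_subset_size=None):
    """Map each shader engine index to the FE circuit that schedules its geometry work.

    mode "first":  FE0 schedules every engine.
    mode "second": FE0 keeps engines [0, first_subset_size); FE1 takes the rest.
    """
    if mode == "first":
        return {engine: "FE0" for engine in range(num_engines)}
    if mode == "second":
        if first_subset_size is None or not 0 < first_subset_size < num_engines:
            raise ValueError("second mode needs a split strictly inside the engine range")
        return {
            engine: "FE0" if engine < first_subset_size else "FE1"
            for engine in range(num_engines)
        }
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `assign_front_ends(4, "second", first_subset_size=2)` leaves engines 0 and 1 with `FE0` and hands engines 2 and 3 to `FE1`, which plays the role the abstract gives to the partition switch.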
