Spatial partitioning in a multi-tenancy graphics processing unit

    Publication Number: US12205218B2

    Publication Date: 2025-01-21

    Application Number: US17706811

    Filing Date: 2022-03-29

    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
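
    To make the mode switching concrete, here is a minimal Python sketch of the arrangement the abstract describes: a partition switch hands every shader engine to the first front end (FE) circuit in the first mode, and splits the engines between the first and second FE circuits in the second mode. The names (FrontEnd, PartitionSwitch, set_mode) and the even split of engines are illustrative assumptions, not details taken from the patent.

```python
# Minimal sketch; class and method names are hypothetical, not from the patent.

class FrontEnd:
    """A front end (FE) circuit that schedules geometry workloads."""
    def __init__(self, name):
        self.name = name
        self.engines = []                     # shader engines currently assigned

    def schedule(self, workload):
        # Distribute the workload across whatever engines are assigned.
        return {engine: workload for engine in self.engines}


class PartitionSwitch:
    """Connects the second subset of shader engines to either the first FE
    circuit (first mode) or the second FE circuits (second mode)."""
    def __init__(self, first_fe, second_fes, engines):
        self.first_fe, self.second_fes, self.engines = first_fe, second_fes, engines

    def set_mode(self, mode):
        half = len(self.engines) // 2
        first_subset, second_subset = self.engines[:half], self.engines[half:]
        if mode == "first":
            # First mode: the first FE circuit drives every shader engine.
            self.first_fe.engines = list(self.engines)
            for fe in self.second_fes:
                fe.engines = []
        else:
            # Second mode: the first FE keeps the first subset; the second
            # subset is divided among the second FE circuits.
            self.first_fe.engines = first_subset
            chunk = max(1, len(second_subset) // len(self.second_fes))
            for i, fe in enumerate(self.second_fes):
                fe.engines = second_subset[i * chunk:(i + 1) * chunk]


fe0, fe1 = FrontEnd("FE0"), FrontEnd("FE1")
switch = PartitionSwitch(fe0, [fe1], ["SE0", "SE1", "SE2", "SE3"])
switch.set_mode("first")
print(fe0.schedule("geometry_A"))             # FE0 -> SE0..SE3
switch.set_mode("second")
print(fe0.schedule("geometry_B"))             # FE0 -> SE0, SE1
print(fe1.schedule("geometry_C"))             # FE1 -> SE2, SE3
```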

    DUAL VECTOR ARITHMETIC LOGIC UNIT

    Publication Number: US20220188076A1

    Publication Date: 2022-06-16

    Application Number: US17121354

    Filing Date: 2020-12-14

    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general-purpose register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefront operands supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
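
    The sketch below models, in Python, the operand-staging idea from this abstract: operands read from the VGPR banks accumulate in a small cache until a wavefront is ready, at which point two ALU pipelines can each issue a ready wavefront in the same cycle without widening the VGPR read bandwidth. The names (OperandCache, dual_issue) and the bandwidth numbers are assumptions chosen for illustration.

```python
# Rough sketch under assumed names; the real hardware details are not in the abstract.

class OperandCache:
    """Stages operands read from the VGPR banks until a wavefront's
    instruction has everything it needs."""
    def __init__(self):
        self.staged = {}                      # wavefront id -> operands collected

    def collect(self, wave, needed, vgpr_reads_per_cycle=2):
        got = self.staged.setdefault(wave, [])
        # Spend this cycle's limited VGPR read bandwidth staging more operands.
        while len(got) < needed and vgpr_reads_per_cycle > 0:
            got.append(f"{wave}_src{len(got)}")
            vgpr_reads_per_cycle -= 1
        return len(got) >= needed             # True once the wavefront is ready


def dual_issue(ready_waves):
    """Issue up to two ready wavefronts, one per ALU pipeline, in one cycle."""
    return {f"ALU{i}": wave for i, wave in enumerate(ready_waves[:2])}


cache, ready = OperandCache(), []
for cycle in range(2):                        # operands trickle in over two cycles
    for wave in ("wave0", "wave1"):
        if cache.collect(wave, needed=3) and wave not in ready:
            ready.append(wave)
print(dual_issue(ready))                      # {'ALU0': 'wave0', 'ALU1': 'wave1'}
```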

    Policies for shader resource allocation in a shader core

    Publication Number: US10579388B2

    Publication Date: 2020-03-03

    Application Number: US16040224

    Filing Date: 2018-07-19

    Abstract: A method for use in a processor for arbitrating between multiple processes to select wavefronts for execution on a shader core is provided. The processor includes a compute pipeline configured to issue wavefronts to the shader core for execution, a hardware queue descriptor associated with the compute pipeline, and the shader core. The shader core is configured to execute work for the compute pipeline corresponding to a first memory queue descriptor, using data for the first memory queue descriptor that is loaded into a first hardware queue descriptor. The processor is configured to detect a context switch condition, and, responsive to the context switch condition, perform a context switch operation including loading data for a second memory queue descriptor into the first hardware queue descriptor. The shader core is configured to execute work corresponding to the second memory queue descriptor that is loaded into the first hardware queue descriptor.
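
    The following Python sketch illustrates the context-switch flow the abstract describes: a single hardware queue descriptor slot is re-loaded from a different memory queue descriptor when a context switch condition is detected, so the shader core then executes work for the newly loaded queue. The class and function names (MemoryQueueDescriptor, HardwareQueueDescriptor, context_switch) are hypothetical.

```python
# Minimal sketch with assumed names; only the save/re-load flow follows the abstract.

class MemoryQueueDescriptor:
    """Per-process queue state kept in memory."""
    def __init__(self, name):
        self.name = name
        self.read_ptr = 0                     # example of state that survives a switch


class HardwareQueueDescriptor:
    """The hardware slot the compute pipeline issues wavefronts from."""
    def __init__(self):
        self.loaded = None

    def load(self, mqd):
        self.loaded = mqd                     # map the memory descriptor into hardware

    def save(self):
        return self.loaded                    # write live state back to memory


def context_switch(hqd, next_mqd):
    """On a detected context switch condition, save the current descriptor's
    state and load the next memory queue descriptor into the same slot."""
    hqd.save()
    hqd.load(next_mqd)


hqd = HardwareQueueDescriptor()
mqd_a, mqd_b = MemoryQueueDescriptor("process_A"), MemoryQueueDescriptor("process_B")
hqd.load(mqd_a)                               # shader core runs work for process A
context_switch(hqd, mqd_b)                    # e.g. time quantum expired
print(hqd.loaded.name)                        # process_B
```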

    CONFIGURABLE MULTIPLE-DIE GRAPHICS PROCESSING UNIT

    Publication Number: US20240193844A1

    Publication Date: 2024-06-13

    Application Number: US18077424

    Filing Date: 2022-12-08

    CPC classification number: G06T15/005 G06F9/3802

    Abstract: A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology.
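
    As a toy illustration of the two operating modes, the short Python sketch below shows the same set of GPU chiplets being exposed to an application either as a single logical GPU or as multiple logical GPUs. The function name and mode labels are assumptions for illustration only.

```python
# Toy sketch; names and mode labels are hypothetical.

def enumerate_logical_gpus(chiplets, mode):
    """Return the logical GPUs an application would enumerate for a mode."""
    if mode == "single":
        # First mode: all chiplets collectively interface as one GPU.
        return [tuple(chiplets)]
    # Second mode: each chiplet is exposed to the application as its own GPU.
    return [(c,) for c in chiplets]


chiplets = ["chiplet0", "chiplet1", "chiplet2", "chiplet3"]
print(enumerate_logical_gpus(chiplets, "single"))    # one four-chiplet GPU
print(enumerate_logical_gpus(chiplets, "multi"))     # four single-chiplet GPUs
```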

    Dual vector arithmetic logic unit

    Publication Number: US11675568B2

    Publication Date: 2023-06-13

    Application Number: US17121354

    Filing Date: 2020-12-14

    CPC classification number: G06F7/57 G06F9/3867 G06F17/16 G06T1/20 G06F15/8015

    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general-purpose register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefront operands supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.

    Spatial partitioning in a multi-tenancy graphics processing unit

    Publication Number: US11295507B2

    Publication Date: 2022-04-05

    Application Number: US17091957

    Filing Date: 2020-11-06

    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.

    POLICIES FOR SHADER RESOURCE ALLOCATION IN A SHADER CORE

    Publication Number: US20180321946A1

    Publication Date: 2018-11-08

    Application Number: US16040224

    Filing Date: 2018-07-19

    CPC classification number: G06F9/3851 G06F9/3879 G06F9/4881 G06T1/20

    Abstract: A method for use in a processor for arbitrating between multiple processes to select wavefronts for execution on a shader core is provided. The processor includes a compute pipeline configured to issue wavefronts to the shader core for execution, a hardware queue descriptor associated with the compute pipeline, and the shader core. The shader core is configured to execute work for the compute pipeline corresponding to a first memory queue descriptor, using data for the first memory queue descriptor that is loaded into a first hardware queue descriptor. The processor is configured to detect a context switch condition, and, responsive to the context switch condition, perform a context switch operation including loading data for a second memory queue descriptor into the first hardware queue descriptor. The shader core is configured to execute work corresponding to the second memory queue descriptor that is loaded into the first hardware queue descriptor.
