Prefetch kernels on data-parallel processors

    公开(公告)号:US11500778B2

    公开(公告)日:2022-11-15

    申请号:US16813075

    申请日:2020-03-09

    Abstract: Embodiments include methods, systems and non-transitory computer-readable computer readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.

    Spatial partitioning in a multi-tenancy graphics processing unit

    公开(公告)号:US11295507B2

    公开(公告)日:2022-04-05

    申请号:US17091957

    申请日:2020-11-06

    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.

    Single pass flexible screen/scale rasterization

    公开(公告)号:US10546365B2

    公开(公告)日:2020-01-28

    申请号:US15843968

    申请日:2017-12-15

    Abstract: An apparatus, such as a head mounted device (HMD), includes one or more processors configured to implement a graphics pipeline that renders pixels in window space with a nonuniform pixel spacing. The apparatus also includes a first distortion function that maps the non-uniformly spaced pixels in window space to uniformly spaced pixels in raster space. The apparatus further includes a scan converter configured to sample the pixels in window space through the first distortion function. The scan converter is configured to render display pixels used to generate an image for display to a user based on the uniformly spaced pixels in raster space. In some cases, the pixels in the window space are rendered such that a pixel density per subtended area is constant across the user's field of view.

    RECONFIGURABLE VIRTUAL GRAPHICS AND COMPUTE PROCESSOR PIPELINE

    公开(公告)号:US20180114290A1

    公开(公告)日:2018-04-26

    申请号:US15331278

    申请日:2016-10-21

    Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.

    Method and system for synchronization of workitems with divergent control flow
    29.
    发明授权
    Method and system for synchronization of workitems with divergent control flow 有权
    工作单元与发散控制流同步的方法和系统

    公开(公告)号:US09424099B2

    公开(公告)日:2016-08-23

    申请号:US13672291

    申请日:2012-11-08

    CPC classification number: G06F9/52 G06F9/522

    Abstract: Disclosed methods, systems, and computer program products embodiments include synchronizing a group of workitems on a processor by storing a respective program counter associated with each of the workitems, selecting at least one first workitem from the group for execution, and executing the selected at least one first workitem on the processor. The selecting is based upon the respective stored program counter associated with the at least one first workitem.

    Abstract translation: 公开的方法,系统和计算机程序产品实施例包括通过存储与每个工作项相关联的相应程序计数器来同步处理器上的一组工作项,从组中选择至少一个第一工作以供执行,以及执行所选择的至少 处理器上的第一个工作。 所述选择基于与所述至少一个第一工作项相关联的相应存储的程序计数器。

    Spatial partitioning in a multi-tenancy graphics processing unit

    公开(公告)号:US12205218B2

    公开(公告)日:2025-01-21

    申请号:US17706811

    申请日:2022-03-29

    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.

Patent Agency Ranking