VMID as a GPU task container for virtualization

    Publication No.: US12153958B2

    Publication date: 2024-11-26

    Application No.: US18045128

    Filing date: 2022-10-07

    Abstract: Systems, apparatuses, and methods for abstracting tasks in virtual memory identifier (VMID) containers are disclosed. A processor coupled to a memory executes a plurality of concurrent tasks including a first task. Responsive to detecting one or more instructions of the first task which correspond to a first operation, the processor retrieves a first identifier (ID) which is used to uniquely identify the first task, wherein the first ID is transparent to the first task. Then, the processor maps the first ID to a second ID and/or a third ID. The processor completes the first operation by using the second ID and/or the third ID to identify the first task to at least a first data structure. In one implementation, the first operation is a memory access operation and the first data structure is a set of page tables. Also, in one implementation, the second ID identifies a first application of the first task and the third ID identifies a first operating system (OS) of the first task.
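The ID mapping described in this abstract can be pictured with a small lookup table. The following is only a minimal sketch under stated assumptions: the names `ContainerIdMapper`, `bind`, `resolve`, and `translate` are hypothetical, and the page tables are modeled as plain dictionaries keyed by (OS ID, application ID) rather than as any real hardware structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerMapping:
    app_id: int  # the "second ID": identifies the task's application
    os_id: int   # the "third ID": identifies the task's operating system


class ContainerIdMapper:
    """Maps a task's container ID (the "first ID") to app and OS IDs.

    Hypothetical helper; the task itself never sees these IDs.
    """

    def __init__(self) -> None:
        self._table: dict[int, ContainerMapping] = {}

    def bind(self, container_id: int, app_id: int, os_id: int) -> None:
        self._table[container_id] = ContainerMapping(app_id, os_id)

    def resolve(self, container_id: int) -> ContainerMapping:
        return self._table[container_id]


def translate(mapper, page_tables, container_id, virtual_page):
    """Complete a memory access by identifying the task to the page tables."""
    mapping = mapper.resolve(container_id)
    # Assumption: one page table per (OS, application) pair.
    return page_tables[(mapping.os_id, mapping.app_id)][virtual_page]
```

For example, after `mapper.bind(7, app_id=3, os_id=1)`, an access by container 7 is looked up in the page table stored under the key `(1, 3)`.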

    Wait instruction for preventing execution of one or more instructions until a load counter or store counter reaches a specified value

    Publication No.: US11074075B2

    Publication date: 2021-07-27

    Application No.: US15442412

    Filing date: 2017-02-24

    Abstract: Systems, apparatuses, and methods for maintaining separate pending load and store counters are disclosed herein. In one embodiment, a system includes at least one execution unit, a memory subsystem, and a pair of counters for each thread of execution. In one embodiment, the system implements a software based approach for managing dependencies between instructions. In one embodiment, the execution unit(s) maintains counters to support the software-based approach for managing dependencies between instructions. The execution unit(s) are configured to execute instructions that are used to manage the dependencies during run-time. In one embodiment, the execution unit(s) execute wait instructions to wait until a given counter is equal to a specified value before continuing to execute the instruction sequence.
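A rough software model of the per-thread counter pair and the wait check might look like the sketch below. It is an illustration only: the class `ThreadCounters` and its methods are hypothetical names, and a real execution unit would stall the instruction stream in hardware rather than return a boolean.

```python
class ThreadCounters:
    """Separate pending-load and pending-store counters for one thread."""

    def __init__(self) -> None:
        self.pending = {"load": 0, "store": 0}

    def issue(self, kind: str) -> None:
        # A load or store was sent to the memory subsystem.
        self.pending[kind] += 1

    def complete(self, kind: str) -> None:
        # The memory subsystem finished one outstanding operation.
        if self.pending[kind] == 0:
            raise RuntimeError(f"no pending {kind} to complete")
        self.pending[kind] -= 1

    def wait_satisfied(self, kind: str, value: int) -> bool:
        # Models a wait instruction: execution may continue only once
        # the chosen counter equals the specified value.
        return self.pending[kind] == value
```

Keeping the two counters separate lets software wait on outstanding loads without also waiting on stores, and vice versa.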

    High-speed selective cache invalidates and write-backs on GPUs

    Publication No.: US10540280B2

    Publication date: 2020-01-21

    Application No.: US15390080

    Filing date: 2016-12-23

    Abstract: Techniques for performing cache invalidates and write-backs in an accelerated processing device (e.g., a graphics processing device that renders three-dimensional graphics) are disclosed. The techniques involve receiving requests from a “master” (e.g., the central processing unit). The techniques involve invalidating virtual-to-physical address translations in an address translation request. The techniques include splitting up the requests based on whether the requests target virtually or physically tagged caches. Addresses for the portions of a request that target physically tagged caches are translated using invalidated virtual-to-physical address translations for speed. The split up request is processed to generate micro-transactions for individual caches targeted by the request. Micro-transactions for physically and virtually tagged caches are processed in parallel. Once all micro-transactions for a request have been processed, the unit that made the request is notified.
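The request-splitting step can be sketched as follows. This is a simplified illustration, not the patented design: `Cache`, `MicroTransaction`, `split_request`, and the `translate` callback are hypothetical names, the address range is assumed to map contiguously, and parallel processing and the final notification to the requester are not modeled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Cache:
    name: str
    virtually_tagged: bool


@dataclass(frozen=True)
class MicroTransaction:
    cache: str
    start: int
    end: int


def split_request(
    caches: list[Cache],
    virt_start: int,
    virt_end: int,
    translate: Callable[[int], int],
) -> list[MicroTransaction]:
    """Split one invalidate/write-back request into per-cache micro-transactions."""
    micro = []
    for cache in caches:
        if cache.virtually_tagged:
            # Virtually tagged caches can use the virtual range directly.
            micro.append(MicroTransaction(cache.name, virt_start, virt_end))
        else:
            # Physically tagged caches need translated addresses first.
            micro.append(
                MicroTransaction(
                    cache.name, translate(virt_start), translate(virt_end)
                )
            )
    return micro
```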

    SPLIT STORAGE OF ANTI-ALIASED SAMPLES

    Type: Patent application (under examination, published)

    Publication No.: US20170018053A1

    Publication date: 2017-01-19

    Application No.: US15282336

    Filing date: 2016-09-30

    Inventor: Mark Fowler

    CPC classification number: G06T1/60 G06T1/20 G06T11/40 G06T2200/12 G06T2200/28

    Abstract: Embodiments of the present invention are directed to improving the performance of anti-aliased image rendering. One embodiment is a method of rendering a pixel from an anti-aliased image. The method includes: storing a first set and a second set of samples from a plurality of anti-aliased samples of the pixel respectively in a first memory and a second memory; and rendering a determined number of said samples from one of only the first set or the first and second sets. Corresponding system and computer program product embodiments are also disclosed.
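One way to picture the split storage is the sketch below. The function names `split_samples` and `resolve_pixel` are hypothetical, the two memories are modeled as Python lists, and averaging the samples (a box-filter resolve) is an assumption that the abstract does not specify.

```python
def split_samples(samples, first_count):
    """Store the first `first_count` samples in one memory, the rest in another."""
    return list(samples[:first_count]), list(samples[first_count:])


def resolve_pixel(first_mem, second_mem, num_samples):
    """Render a pixel from `num_samples` samples (must be at least 1).

    Only the first memory is read when it already holds enough samples;
    the second memory is touched only when more samples are requested.
    """
    if num_samples <= len(first_mem):
        used = first_mem[:num_samples]
    else:
        used = first_mem + second_mem[: num_samples - len(first_mem)]
    # Assumption: the pixel value is the plain average of the samples used.
    return sum(used) / len(used)
```

With samples `[10, 20, 30, 60]` split two and two, a two-sample render reads only the first memory, while a four-sample render reads both.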


    ADDRESS REMAPPING OF DISCARDED SURFACES

    Publication No.: US20250117330A1

    Publication date: 2025-04-10

    Application No.: US18617092

    Filing date: 2024-03-26

    Abstract: As part of rendering a scene including at least one graphics object in a display space, the display space is divided into a plurality of tiles. A determination is made that contents of at least two of the plurality of tiles are no longer used after a current render pass. A write back memory address associated with a second tile is changed to match a write back memory address associated with a first tile. As a result, data is overwritten on a same physical page.
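The remapping idea can be illustrated with a short sketch. The function `remap_discarded` and its dictionary-based address table are hypothetical, and how tiles are determined to be unused after the current render pass is taken as an input rather than modeled.

```python
def remap_discarded(writeback_addr, discarded):
    """Point every discarded tile's write-back address at the first one's.

    writeback_addr: dict mapping tile index -> write-back memory address
    discarded: tile indices whose contents are unused after the current pass
    """
    remapped = dict(writeback_addr)  # leave the caller's table untouched
    if len(discarded) >= 2:
        target = writeback_addr[discarded[0]]
        for tile in discarded[1:]:
            # Later discarded tiles now overwrite the same physical page.
            remapped[tile] = target
    return remapped
```

Because the discarded contents are never read again, letting their write-backs overwrite one another on a single page loses nothing.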
