Graphics discard engine
    Granted invention patent

    Publication No.: US12236529B2

    Publication date: 2025-02-25

    Application No.: US17562653

    Filing date: 2021-12-27

    Abstract: Systems, apparatuses, and methods for implementing a discard engine in a graphics pipeline are disclosed. A system includes a graphics pipeline with a geometry engine launching shaders that generate attribute data for vertices of each primitive of a set of primitives. The attribute data is consumed by pixel shaders, with each pixel shader generating a deallocation message when the pixel shader no longer needs the attribute data. A discard engine gathers deallocations from multiple pixel shaders and determines when the attribute data is no longer needed. Once a block of attributes has been consumed by all potential pixel shader consumers, the discard engine deallocates the given block of attributes. The discard engine sends a discard command to the caches so that the attribute data can be invalidated and not written back to memory.
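
    The bookkeeping described in this abstract can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the patented hardware design: the class name `DiscardEngine`, its methods, and the block and consumer identifiers are all hypothetical, and the "discard command" is modeled as appending the block id to a list.

```python
from dataclasses import dataclass, field


@dataclass
class DiscardEngine:
    """Toy model (hypothetical names): tracks which pixel shader
    consumers still need each block of attribute data."""
    pending: dict = field(default_factory=dict)    # block id -> set of consumer ids
    discarded: list = field(default_factory=list)  # block ids discarded to the caches

    def allocate(self, block_id, consumers):
        # Record every potential pixel shader consumer of the block.
        self.pending[block_id] = set(consumers)

    def deallocate(self, block_id, consumer):
        # A pixel shader reports that it no longer needs the block.
        waiting = self.pending[block_id]
        waiting.discard(consumer)
        if not waiting:
            # Every consumer is done: free the block and record a discard,
            # standing in for "invalidate, do not write back to memory".
            del self.pending[block_id]
            self.discarded.append(block_id)
            return True
        return False


engine = DiscardEngine()
engine.allocate("blk0", ["ps0", "ps1"])
engine.deallocate("blk0", "ps0")   # one consumer still pending
engine.deallocate("blk0", "ps1")   # last consumer: block is discarded
```

    The block is released only after the last pending consumer reports in, which is the gathering behavior the abstract attributes to the discard engine.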

    Performance and memory access tracking

    Publication No.: US12182396B2

    Publication date: 2024-12-31

    Application No.: US18192694

    Filing date: 2023-03-30

    Abstract: Techniques for performing memory operations are disclosed herein. The techniques include generating a plurality of performance log entries based on observed operations; and generating a plurality of memory access log entries based on the observed operations, wherein each performance log entry of the plurality of performance log entries is associated with one or more memory access log entries of the plurality of memory access log entries, wherein each performance log entry is associated with an epoch, and wherein each memory access log entry is associated with an epoch and a memory address range.

    SORT-TOP RASTERIZATION AND TILE RENDERING USING AN ACCELERATION STRUCTURE

    Publication No.: US20240404176A1

    Publication date: 2024-12-05

    Application No.: US18205407

    Filing date: 2023-06-02

    Abstract: To render a scene in a display space, a processor is configured to perform sort-top tiled rendering. To this end, the processor is configured to divide a display space into two or more tiles and assign each tile to a respective graphics core of the processor. Further, the processor is configured to divide a viewport of the scene into corresponding frustums each representing a portion of the viewport in a respective tile. Using a corresponding frustum associated with an assigned tile, each graphics core performs one or more frustum queries to determine one or more graphics objects in a tile to rasterize, one or more draw calls to perform for a tile, or both.
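
    A minimal Python sketch of the tile split and per-tile query follows. It rests on several assumptions that are not in the abstract: tiles are assigned to cores round-robin, each tile's frustum is reduced to its 2D bounds in normalized device coordinates, and the frustum query is a linear overlap test standing in for a traversal of the acceleration structure named in the title. The function names `make_tiles` and `frustum_query` are hypothetical.

```python
def make_tiles(width, height, cols, rows, num_cores):
    """Split a width x height display into cols x rows tiles and give
    each tile a core (round-robin, an assumption) and NDC bounds."""
    tiles = []
    for r in range(rows):
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            y0, y1 = r * height // rows, (r + 1) * height // rows
            tiles.append({
                "core": (r * cols + c) % num_cores,
                "pixels": (x0, y0, x1, y1),
                # 2D footprint of the tile's sub-frustum in [-1, 1] NDC.
                "ndc": (2 * x0 / width - 1, 2 * y0 / height - 1,
                        2 * x1 / width - 1, 2 * y1 / height - 1),
            })
    return tiles


def frustum_query(tile, objects):
    """Return names of objects whose NDC bounding box overlaps the tile.
    A linear scan replaces the acceleration-structure traversal."""
    tx0, ty0, tx1, ty1 = tile["ndc"]
    return [name for name, (x0, y0, x1, y1) in objects.items()
            if x0 < tx1 and x1 > tx0 and y0 < ty1 and y1 > ty0]


tiles = make_tiles(width=8, height=4, cols=2, rows=1, num_cores=2)
objects = {
    "a": (-0.9, -0.5, -0.1, 0.5),   # left half only
    "b": (-0.2, -0.2, 0.3, 0.2),    # straddles the tile boundary
    "c": (0.5, 0.5, 0.9, 0.9),      # right half only
}
per_core = {t["core"]: frustum_query(t, objects) for t in tiles}
```

    An object that straddles a tile boundary is returned for both tiles, so each core rasterizes only what falls in its own part of the display space.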

    LOAD INSTRUCTION FOR MULTI SAMPLE ANTI-ALIASING

    Publication No.: US20210407182A1

    Publication date: 2021-12-30

    Application No.: US17028811

    Filing date: 2020-09-22

    Abstract: Techniques for performing multi-sample anti-aliasing operations are provided. The techniques include detecting an instruction for a multi-sample anti-aliasing load operation; determining a sampling rate of source data for the load operation, a data storage format of the source data, and a loading mode indicating whether the load operation requests same or different color components, or depth data; and, based on the determined sampling rate, data storage format, and loading mode, loading data from a multi-sample source into a register.

    Automatic configuration of knobs to optimize performance of a graphics pipeline

    Publication No.: US11004251B2

    Publication date: 2021-05-11

    Application No.: US16201879

    Filing date: 2018-11-27

    Abstract: A knob has a plurality of settings that configure a graphics pipeline. A first setting is associated with a first state of the graphics pipeline. The first setting is associated with the first state based on a measure of performance of the graphics pipeline while configured according to the first setting. The graphics pipeline is configured according to the first setting in response to the first state of the graphics pipeline matching a current state of the graphics pipeline. The graphics pipeline processes graphics according to the first setting. In some cases, the first setting is associated with the first state of the graphics pipeline by dithering or toggling the knob between the settings once per frame for a predetermined number of frames. The first setting achieves better performance than other ones of the plurality of settings during the predetermined number of frames.
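
    The toggle-and-measure loop in this abstract can be sketched in Python as follows. The sketch assumes that the performance measure is a frame time (lower is better) and that the pipeline state can be used as a dictionary key. The names `train_knob`, `select_setting`, `measure_frame`, and the example settings are hypothetical.

```python
from collections import defaultdict


def train_knob(state, settings, measure_frame, num_frames, table):
    """Toggle the knob once per frame for num_frames frames, then
    associate the best-performing setting with the given state."""
    samples = defaultdict(list)
    for frame in range(num_frames):
        setting = settings[frame % len(settings)]   # one toggle per frame
        samples[setting].append(measure_frame(setting))
    # Assumption: the measure is a frame time, so the lowest mean wins.
    best = min(samples, key=lambda s: sum(samples[s]) / len(samples[s]))
    table[state] = best
    return best


def select_setting(current_state, table, default):
    """Use the learned setting when the current state matches a known one."""
    return table.get(current_state, default)


# Hypothetical frame times in milliseconds for two knob settings.
frame_times = {"binning_on": 10.0, "binning_off": 12.0}
table = {}
best = train_knob("state_a", list(frame_times), frame_times.get, 6, table)
chosen = select_setting("state_a", table, default="binning_off")
```

    A state that was never trained falls back to the default setting, so the table only overrides the knob where a measurement exists.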

    METHOD AND SYSTEM FOR DEPTH PRE-PROCESSING AND GEOMETRY SORTING USING BINNING HARDWARE

    Publication No.: US20200098169A1

    Publication date: 2020-03-26

    Application No.: US16137830

    Filing date: 2018-09-21

    Abstract: Described herein are techniques for improving the effectiveness of depth culling. In a first technique, a binner is used to sort primitives into depth bins. Each depth bin covers a range of depths. The binner transmits the depth bins to the screen space pipeline for processing in near-to-far order. Processing the near bins first results in the depth buffer being updated, allowing fragments for the primitives in the farther bins to be culled more aggressively than if the depth binning did not occur. In a second technique, a buffer is used to initiate two-pass processing through the screen space pipeline. In the first pass, primitives are sent down to update the depth block and are then culled. The fragments are processed normally in the second pass, with the benefit of the updated depth values.
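
    The first technique, depth binning, can be sketched in Python as follows. The sketch assumes each primitive is summarized by a single representative depth in a fixed [z_near, z_far] range and that bins are of equal width. Real primitives span a range of depths, and the abstract does not say how bins are sized. The function name `bin_by_depth` is hypothetical.

```python
def bin_by_depth(primitives, num_bins, z_near=0.0, z_far=1.0):
    """Sort (name, depth) pairs into equal-width depth bins.
    Bin 0 is nearest; depths outside the range are clamped."""
    bins = [[] for _ in range(num_bins)]
    span = (z_far - z_near) / num_bins
    for name, depth in primitives:
        index = int((depth - z_near) / span)
        index = max(0, min(index, num_bins - 1))
        bins[index].append(name)
    return bins


primitives = [("far", 0.9), ("near", 0.1), ("mid", 0.5), ("near2", 0.2)]
bins = bin_by_depth(primitives, num_bins=4)

# Emit bins in near-to-far order so that near primitives update the
# depth buffer before farther ones are depth-tested.
submission_order = [name for depth_bin in bins for name in depth_bin]
```

    Submission order within a bin is preserved, so the binner reorders work only at bin granularity.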

    Graphics primitives and positions through memory buffers

    Publication No.: US12169896B2

    Publication date: 2024-12-17

    Application No.: US17489105

    Filing date: 2021-09-29

    Abstract: Systems, apparatuses, and methods for preemptively reserving buffer space for primitives and positions in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with any number of geometry engines coupled to corresponding shader engines. Each geometry engine launches shader wavefronts to execute on a corresponding shader engine. The geometry engine preemptively reserves buffer space for each wavefront prior to the wavefront being launched on the shader engine. When the shader engine executes a wavefront, the shader engine exports primitive and position data to the reserved buffer space. Multiple scan converters will consume the primitive and position data, with each scan converter consuming primitive and position data based on the screen coverage of the scan converter. After consuming the primitive and position data, the scan converters mark the buffer space as freed so that the geometry engine can then allocate the freed buffer space to subsequent shader wavefronts.

    Hierarchical work scheduling
    Granted invention patent

    Publication No.: US12153957B2

    Publication date: 2024-11-26

    Application No.: US17957714

    Filing date: 2022-09-30

    Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.
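
    The keep-local-then-escalate policy can be sketched in Python with a two-level hierarchy. The sketch makes assumptions the abstract does not state: "overload" is modeled as a fixed queue capacity, the higher-level scheduler fills sibling domains in order, and work that fits nowhere is parked in a backlog. The class `SchedulingDomain` and its methods are hypothetical.

```python
class SchedulingDomain:
    """Toy scheduling domain with a capacity-limited local queue."""

    def __init__(self, name, capacity, parent=None):
        self.name = name
        self.capacity = capacity
        self.parent = parent
        self.children = []
        self.local_queue = []
        self.backlog = []
        if parent is not None:
            parent.children.append(self)

    def free_slots(self):
        return max(self.capacity - len(self.local_queue), 0)

    def produce(self, new_items):
        """Keep what fits locally; send the rest up one level."""
        keep = new_items[: self.free_slots()]
        overflow = new_items[len(keep):]
        self.local_queue.extend(keep)
        if overflow:
            if self.parent is not None:
                self.parent.redistribute(overflow, source=self)
            else:
                self.backlog.extend(overflow)
        return keep

    def redistribute(self, items, source):
        """Hand overflow to other child domains with free capacity."""
        remaining = list(items)
        for child in self.children:
            if child is source or not remaining:
                continue
            take = remaining[: child.free_slots()]
            child.local_queue.extend(take)
            remaining = remaining[len(take):]
        self.backlog.extend(remaining)


root = SchedulingDomain("root", capacity=0)
a = SchedulingDomain("a", capacity=2, parent=root)
b = SchedulingDomain("b", capacity=3, parent=root)
a.produce(["w1", "w2", "w3", "w4"])   # a keeps two, b receives two
```

    Only the overflow travels up the hierarchy, so a domain with spare capacity never involves the higher-level scheduler.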

    Load instruction for multi sample anti-aliasing

    Publication No.: US12141915B2

    Publication date: 2024-11-12

    Application No.: US17028811

    Filing date: 2020-09-22

    Abstract: Techniques for performing multi-sample anti-aliasing operations are provided. The techniques include detecting an instruction for a multi-sample anti-aliasing load operation; determining a sampling rate of source data for the load operation, a data storage format of the source data, and a loading mode indicating whether the load operation requests same or different color components, or depth data; and, based on the determined sampling rate, data storage format, and loading mode, loading data from a multi-sample source into a register.
