-
Publication No.: US12236529B2
Publication Date: 2025-02-25
Application No.: US17562653
Filing Date: 2021-12-27
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Christopher J. Brennan , Randy Wayne Ramsey , Nishank Pathak , Ricky Wai Yeung Iu , Jimshed Mirza , Anthony Chan
Abstract: Systems, apparatuses, and methods for implementing a discard engine in a graphics pipeline are disclosed. A system includes a graphics pipeline with a geometry engine launching shaders that generate attribute data for vertices of each primitive of a set of primitives. The attribute data is consumed by pixel shaders, with each pixel shader generating a deallocation message when the pixel shader no longer needs the attribute data. A discard engine gathers deallocations from multiple pixel shaders and determines when the attribute data is no longer needed. Once a block of attributes has been consumed by all potential pixel shader consumers, the discard engine deallocates the given block of attributes. The discard engine sends a discard command to the caches so that the attribute data can be invalidated and not written back to memory.
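The bookkeeping described in this abstract can be pictured as a per-block set of outstanding consumers. The sketch below is only an illustration under stated assumptions: the class name `DiscardEngine`, its methods, and the integer block and shader IDs are hypothetical and are not taken from the patent, and the cache discard command is modelled as an append to a list.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DiscardEngine:
    """Toy model (hypothetical names): tracks which pixel shaders still need a block."""

    pending: dict[int, set[int]] = field(default_factory=dict)
    discarded: list[int] = field(default_factory=list)

    def allocate(self, block_id: int, consumers: set[int]) -> None:
        # Record every pixel shader that may still read this attribute block.
        self.pending[block_id] = set(consumers)

    def deallocate(self, block_id: int, shader_id: int) -> bool:
        # A pixel shader reports that it no longer needs the block.
        waiting = self.pending[block_id]
        waiting.discard(shader_id)
        if waiting:
            return False  # other consumers are still outstanding
        # All consumers are done: free the block and record a discard so the
        # data would be invalidated instead of written back (modelled as a list).
        del self.pending[block_id]
        self.discarded.append(block_id)
        return True
```

For example, after `allocate(0, {1, 2})`, the call `deallocate(0, 1)` returns `False` and `deallocate(0, 2)` returns `True`, at which point block 0 appears in `discarded`.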
-
Publication No.: US12182396B2
Publication Date: 2024-12-31
Application No.: US18192694
Filing Date: 2023-03-30
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Christopher J. Brennan , Akshay Lahiry , Guennadi Riguer
IPC: G06F3/06
Abstract: Techniques for performing memory operations are disclosed herein. The techniques include generating a plurality of performance log entries based on observed operations; generating a plurality of memory access log entries based on the observed operations, wherein each performance log entry of the plurality of performance log entries is associated with one or more memory access log entries of the plurality of memory access log entries, wherein each performance log entry is associated with an epoch; and wherein each memory access log entry is associated with an epoch and a memory address range.
-
Publication No.: US20240404176A1
Publication Date: 2024-12-05
Application No.: US18205407
Filing Date: 2023-06-02
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Christopher J. Brennan
Abstract: To render a scene in a display space, a processor is configured to perform sort-top tiled rendering. To this end, the processor is configured to divide a display space into two or more tiles and assign each tile to a respective graphics core of the processor. Further, the processor is configured to divide a viewport of the scene into corresponding frustums each representing a portion of the viewport in a respective tile. Using a corresponding frustum associated with an assigned tile, each graphics core performs one or more frustum queries to determine one or more graphics objects in a tile to rasterize, one or more draw calls to perform for a tile, or both.
-
Publication No.: US20220012933A1
Publication Date: 2022-01-13
Application No.: US17483678
Filing Date: 2021-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Skyler Jonathon Saleh , Vineet Goel , Pazhani Pillai , Ruijin Wu , Christopher J. Brennan , Andrew S. Pomianowski
Abstract: Techniques for performing shader operations are provided. The techniques include performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, and updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data.
-
Publication No.: US20210407182A1
Publication Date: 2021-12-30
Application No.: US17028811
Filing Date: 2020-09-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Christopher J. Brennan , Fataneh F. Ghodrat , Tien E. Wei
Abstract: Techniques for performing multi-sample anti-aliasing operations are provided. The techniques include detecting an instruction for a multi-sample anti-aliasing load operation; determining a sampling rate of source data for the load operation, data storage format of the source data, and loading mode indicating whether the load operation requests same or different color components, or depth data; and based on the determined sampling rate, data storage format, and loading mode, loading data from a multi-sample source into a register.
-
Publication No.: US11004251B2
Publication Date: 2021-05-11
Application No.: US16201879
Filing Date: 2018-11-27
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Christopher J. Brennan
Abstract: A knob has a plurality of settings that configure a graphics pipeline. A first setting is associated with a first state of the graphics pipeline. The first setting is associated with the first state based on a measure of performance of the graphics pipeline while configured according to the first setting. The graphics pipeline is configured according to the first setting in response to the first state of the graphics pipeline matching a current state of the graphics pipeline. The graphics pipeline processes graphics according to the first setting. In some cases, the first setting is associated with the first state of the graphics pipeline by dithering or toggling the knob between the settings once per frame for a predetermined number of frames. The first setting achieves better performance than other ones of the plurality of settings during the predetermined number of frames.
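The per-frame toggling described in this abstract can be sketched as a small measurement loop. This is a minimal illustration, not the patented method: the function name `pick_setting`, the `measure` callback, and the use of a mean score per setting are all assumptions made for the example.

```python
def pick_setting(settings, measure, frames):
    """Toggle through `settings` once per frame and return the best performer.

    `measure(setting, frame)` is a hypothetical callback returning a score
    where higher means better performance.
    """
    if frames < 1:
        raise ValueError("need at least one frame to measure")
    totals = {s: 0.0 for s in settings}
    counts = {s: 0 for s in settings}
    for frame in range(frames):
        setting = settings[frame % len(settings)]  # switch the knob each frame
        totals[setting] += measure(setting, frame)
        counts[setting] += 1
    # Only settings that were actually tried can be compared.
    tried = [s for s in settings if counts[s] > 0]
    return max(tried, key=lambda s: totals[s] / counts[s])
```

A caller could store the returned setting in a table keyed by pipeline state and reuse it whenever the current state matches, which is the association step the abstract describes.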
-
Publication No.: US20200098169A1
Publication Date: 2020-03-26
Application No.: US16137830
Filing Date: 2018-09-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Ruijin Wu , Young In Yeo , Sagar S. Bhandare , Vineet Goel , Martin G. Sarov , Christopher J. Brennan
Abstract: Described herein are techniques for improving the effectiveness of depth culling. In a first technique, a binner is used to sort primitives into depth bins. Each depth bin covers a range of depths. The binner transmits the depth bins to the screen space pipeline for processing in near-to-far order. Processing the near bins first results in the depth buffer being updated, allowing fragments for the primitives in the farther bins to be culled more aggressively than if the depth binning did not occur. In a second technique, a buffer is used to initiate two-pass processing through the screen space pipeline. In the first pass, primitives are sent down to update the depth block and are then culled. The fragments are processed normally in the second pass, with the benefit of the updated depth values.
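The first technique in this abstract, sorting primitives into depth bins and submitting the near bins first, can be sketched as follows. The function `bin_by_depth`, the equal-width bins, and the `(name, depth)` tuples are assumptions chosen for illustration; the abstract does not specify how bin ranges are selected.

```python
def bin_by_depth(primitives, num_bins, z_near, z_far):
    """Group (name, depth) pairs into equal-width depth bins, nearest bin first."""
    bins = [[] for _ in range(num_bins)]
    width = (z_far - z_near) / num_bins
    for name, depth in primitives:
        index = int((depth - z_near) / width)
        # Clamp so depths on or outside the range land in the end bins.
        index = min(max(index, 0), num_bins - 1)
        bins[index].append(name)
    return bins


def near_to_far(bins):
    """Flatten the bins in submission order: nearest bin first."""
    return [name for depth_bin in bins for name in depth_bin]
```

Submitting in this order means the depth buffer already holds the near values by the time the far bins arrive, so more of their fragments can be rejected by the depth test.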
-
Publication No.: US12169896B2
Publication Date: 2024-12-17
Application No.: US17489105
Filing Date: 2021-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Todd Martin , Tad Robert Litwiller , Nishank Pathak , Randy Wayne Ramsey , Michael J. Mantor , Christopher J. Brennan , Mark M. Leather , Ryan James Cash
Abstract: Systems, apparatuses, and methods for preemptively reserving buffer space for primitives and positions in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with any number of geometry engines coupled to corresponding shader engines. Each geometry engine launches shader wavefronts to execute on a corresponding shader engine. The geometry engine preemptively reserves buffer space for each wavefront prior to the wavefront being launched on the shader engine. When the shader engine executes a wavefront, the shader engine exports primitive and position data to the reserved buffer space. Multiple scan converters will consume the primitive and position data, with each scan converter consuming primitive and position data based on the screen coverage of the scan converter. After consuming the primitive and position data, the scan converters mark the buffer space as freed so that the geometry engine can then allocate the freed buffer space to subsequent shader wavefronts.
-
Publication No.: US12153957B2
Publication Date: 2024-11-26
Application No.: US17957714
Filing Date: 2022-09-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Matthaeus G. Chajdas , Christopher J. Brennan , Michael Mantor , Robert W. Martin , Nicolai Haehnle
IPC: G06F9/48
Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.
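The keep-local-or-escalate decision in this abstract can be illustrated with a two-level toy hierarchy. Everything named here is hypothetical: the `Scheduler` class, the fixed `capacity` used as the overload test, and the simplification that the parent merely queues escalated items rather than redistributing them to sibling domains.

```python
class Scheduler:
    """Toy scheduling domain (hypothetical): keeps work up to `capacity`."""

    def __init__(self, name, capacity, parent=None):
        self.name = name
        self.capacity = capacity
        self.parent = parent
        self.queue = []

    def submit(self, items):
        # Keep as many new work items locally as free capacity allows.
        free = max(self.capacity - len(self.queue), 0)
        local, overflow = items[:free], items[free:]
        self.queue.extend(local)
        if overflow and self.parent is not None:
            # Items that would overload this domain go one level up.
            self.parent.submit(overflow)
        elif overflow:
            # Top of the hierarchy: nowhere to escalate, so queue them here.
            self.queue.extend(overflow)
        return local, overflow
```

With a root of capacity 100 and a child of capacity 2, submitting three items to the child keeps two locally and passes the third to the root.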
-
Publication No.: US12141915B2
Publication Date: 2024-11-12
Application No.: US17028811
Filing Date: 2020-09-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Christopher J. Brennan , Fataneh F. Ghodrat , Tien E. Wei
Abstract: Techniques for performing multi-sample anti-aliasing operations are provided. The techniques include detecting an instruction for a multi-sample anti-aliasing load operation; determining a sampling rate of source data for the load operation, data storage format of the source data, and loading mode indicating whether the load operation requests same or different color components, or depth data; and based on the determined sampling rate, data storage format, and loading mode, loading data from a multi-sample source into a register.
-