-
公开(公告)号:US20240330199A1
公开(公告)日:2024-10-03
申请号:US18517513
申请日:2023-11-22
Applicant: ADVANCED MICRO DEVICES, INC. , ATI TECHNOLOGIES ULC
Inventor: Anthony ASARO , Jeffrey G. CHENG , Anirudh R. ACHARYA
IPC: G06F12/1009 , G06F9/455
CPC classification number: G06F12/1009 , G06F9/45558 , G06F2009/45583 , G06F2212/657
Abstract: A processor supports secure memory access in a virtualized computing environment by employing requestor identifiers at bus devices (such as a graphics processing unit) to identify the virtual machine associated with each memory access request. The virtualized computing environment uses the requestor identifiers to control access to different regions of system memory, ensuring that each VM accesses only those regions of memory that the VM is allowed to access. The virtualized computing environment thereby supports efficient memory access by the bus devices while ensuring that the different regions of memory are protected from unauthorized access.
-
公开(公告)号:US20210398242A1
公开(公告)日:2021-12-23
申请号:US17128388
申请日:2020-12-21
Applicant: ADVANCED MICRO DEVICES, INC. , ATI TECHNOLOGIES ULC
IPC: G06T1/20 , G06T1/60 , G06F12/0806 , G06F12/0888
Abstract: A graphics pipeline includes a cache having cache lines that are configured to store data used to process frames in a graphics pipeline. The graphics pipeline is implemented using a processor that processes frames for the graphics pipeline using data stored in the cache. The processor processes a first frame and writes back a dirty cache line from the cache to a memory concurrently with processing of the first frame. The dirty cache line is retained in the cache and marked as clean subsequent to being written back to the memory. In some cases, the processor generates a hint that indicates a priority for writing back the dirty cache line based on a read command occupancy at a system memory controller.
-
公开(公告)号:US20220156874A1
公开(公告)日:2022-05-19
申请号:US17231425
申请日:2021-04-15
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Anirudh R. ACHARYA , Ruijin WU , Young In YEO
Abstract: Systems and methods related to priority-based and performance-based selection of a render mode, such as a two-level binning mode, in which to execute workloads with a graphics processing unit (GPU) of a system are provided. A user mode driver (UMD) or kernel mode driver (KMD) executed at a central processing unit (CPU) configures low and medium priority workloads to be executed in a two-level binning mode and selects a binning mode for high priority workloads based on whether performance heuristics indicate that one or more binning conditions or override conditions have been met. High priority workloads are maintained in a high priority queue, while low and medium priority workloads are maintained in a low/medium priority queue, such that execution of low and medium priority workloads at the GPU can be preempted in favor of executing high priority workloads.
-
公开(公告)号:US20190163527A1
公开(公告)日:2019-05-30
申请号:US15828059
申请日:2017-11-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Anirudh R. ACHARYA , Michael MANTOR
Abstract: A first workload is executed in a first subset of pipelines of a processing unit. A second workload is executed in a second subset of the pipelines of the processing unit. The second workload is dependent upon the first workload. The first and second workloads are suspended and state information for the first and second workloads is stored in a first memory in response to suspending the first and second workloads. In some cases, a third workload executes in a third subset of the pipelines of the processing unit concurrently with executing the first and second workloads. In some cases, a fourth workload is executed in the first and second pipelines after suspending the first and second workloads. The first and second pipelines are resumed on the basis of the stored state information in response to completion or suspension of the fourth workload.
-
公开(公告)号:US20230096002A1
公开(公告)日:2023-03-30
申请号:US17890520
申请日:2022-08-18
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Ranjith Kumar SAJJA , Sreekanth GODEY , Anirudh R. ACHARYA
Abstract: Systems and methods related to controlling clock signals for clocking shader engines modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK and output a clock signal CLKA to the SEs and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal frequency for either the SEs or the nSEs is reduced when the corresponding sets of performance counter data indicates a comparatively lower processing workload for the SEs or for the nSEs.
-
公开(公告)号:US20210287418A1
公开(公告)日:2021-09-16
申请号:US17008292
申请日:2020-08-31
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Anirudh R. ACHARYA , Ruijin WU , Young In YEO , Mika TUOMI , Kiia KALLIO
Abstract: A processor dynamically selects a render mode for each render pass of a frame based on the characteristics of the render pass. A software driver of the processor receives graphics operations from an application executing at the processor and converts the graphics operations into a command stream that is provided to the graphics pipeline. As the driver converts the graphics operations into the command stream, the driver analyzes each render pass of the frame to determine characteristics of the render passes, and selects a render mode for each render pass based on the characteristics of the render pass.
-
公开(公告)号:US20190005604A1
公开(公告)日:2019-01-03
申请号:US15639980
申请日:2017-06-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Anirudh R. ACHARYA , Michael MANTOR , Vineet GOEL , Swapnil SAKHARSHETE
Abstract: A stage of a graphics pipeline in a graphics processing unit (GPU) detects an interrupt concurrently with the stage processing primitives in a first bin that represents a first portion of a first frame generated by a first application. The stage forwards a completed portion of the primitives to a subsequent stage of the graphics pipeline in response to the interrupt. The stage diverts a second bin that represents a second portion of the first frame from the stage to a memory in response to the interrupt. The stage processes primitives in a third bin that represents a portion of a second frame generated by a second application subsequent to diverting the second bin to the memory. The stage can then retrieve the second bin from the memory in response to the stage completing processing of the primitives in the third bin for additional processing.
-
公开(公告)号:US20240345617A1
公开(公告)日:2024-10-17
申请号:US18603883
申请日:2024-03-13
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Ranjith Kumar SAJJA , Sreekanth GODEY , Anirudh R. ACHARYA
CPC classification number: G06F1/08 , G06F1/12 , G06F9/505 , G06F11/3409
Abstract: Systems and methods related to controlling clock signals for clocking shader engines modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK and output a clock signal CLKA to the SEs and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal frequency for either the SEs or the nSEs is reduced when the corresponding sets of performance counter data indicates a comparatively lower processing workload for the SEs or for the nSEs.
-
公开(公告)号:US20210278873A1
公开(公告)日:2021-09-09
申请号:US17032701
申请日:2020-09-25
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Ranjith Kumar SAJJA , Sreekanth GODEY , Anirudh R. ACHARYA
Abstract: Systems and methods related to controlling clock signals for clocking shader engines modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK and output a clock signal CLKA to the SEs and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal frequency for either the SEs or the nSEs is reduced when the corresponding sets of performance counter data indicates a comparatively lower processing workload for the SEs or for the nSEs.
-
公开(公告)号:US20190164328A1
公开(公告)日:2019-05-30
申请号:US16238727
申请日:2019-01-03
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Anirudh R. ACHARYA , Swapnil SAKHARSHETE , Michael MANTOR , Mangesh P. NIJASURE , Todd MARTIN , Vineet GOEL
Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.
-
-
-
-
-
-
-
-
-