-
Publication number: US11669457B2
Publication date: 2023-06-06
Application number: US17459100
Filing date: 2021-08-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt
IPC: G06F12/0891, G06F9/30, G06F9/38, G06F12/0811
CPC classification number: G06F12/0891, G06F9/3009, G06F9/3816, G06F12/0811
Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.
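The counting rule in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `WriteBandwidthMonitor`, its methods, the 64-byte line size, and the conversion to bytes per second are all hypothetical choices made for illustration.

```python
from collections import defaultdict

CACHE_LINE_BYTES = 64  # assumed line size; the abstract does not specify one


class WriteBandwidthMonitor:
    """Hypothetical model: counts first-time modifications of cache lines."""

    def __init__(self):
        self.dirty_lines = set()          # lines modified since entering the hierarchy
        self.counters = defaultdict(int)  # thread class -> number of first writes

    def on_write(self, line_addr, thread_class=0):
        # Only the first write to a not-yet-modified line increments the counter.
        if line_addr not in self.dirty_lines:
            self.dirty_lines.add(line_addr)
            self.counters[thread_class] += 1

    def on_evict(self, line_addr):
        # The line leaves the hierarchy; if it is refilled later it starts clean.
        self.dirty_lines.discard(line_addr)

    def bandwidth_bytes_per_sec(self, thread_class, interval_sec):
        # Each counted first write stands in for one eventual line write to memory.
        return self.counters[thread_class] * CACHE_LINE_BYTES / interval_sec


monitor = WriteBandwidthMonitor()
for addr in (0x100, 0x100, 0x140):       # the second write to 0x100 is not counted
    monitor.on_write(addr, thread_class=1)
monitor.on_evict(0x100)
monitor.on_write(0x100, thread_class=1)  # counted again after re-entering
print(monitor.counters[1])               # 3
```

Keeping one counter per thread class mirrors the last sentence of the abstract; how a real control unit tracks the modified state of a line is not described here and is replaced by a plain set.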
-
Publication number: US20220058025A1
Publication date: 2022-02-24
Application number: US17519902
Filing date: 2021-11-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt, Kai Troester
IPC: G06F9/38
Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.
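The feedback loop described in this abstract might be sketched as follows. The abstract only says that throttling becomes more restrictive by reducing the selection rate and/or increasing the time period, and less restrictive otherwise; the halving and doubling steps, the bounds, and the name `ThrottleController` are assumptions made for this example.

```python
MIN_RATE = 0.125    # assumed lower bound on the selection rate
BASE_PERIOD = 100   # assumed initial time period, in cycles
MAX_PERIOD = 1600   # assumed upper bound on the time period


class ThrottleController:
    """Hypothetical model of a thread throttling unit for one thread."""

    def __init__(self, miss_threshold):
        self.miss_threshold = miss_threshold
        self.selection_rate = 1.0   # fraction of cycles the thread may be selected
        self.period = BASE_PERIOD   # cycles until the next evaluation

    def end_of_period(self, cache_misses):
        if cache_misses > self.miss_threshold:
            # Still above the threshold: throttle more restrictively.
            self.selection_rate = max(MIN_RATE, self.selection_rate / 2)
            self.period = min(MAX_PERIOD, self.period * 2)
        else:
            # At or below the threshold: throttle less restrictively.
            self.selection_rate = min(1.0, self.selection_rate * 2)
            self.period = max(BASE_PERIOD, self.period // 2)


ctrl = ThrottleController(miss_threshold=50)
for misses in (80, 90, 20):   # two contended periods, then one quiet period
    ctrl.end_of_period(misses)
print(ctrl.selection_rate, ctrl.period)   # 0.5 200
```

In this model the "upstream computation unit" is reduced to the two numbers it would receive, the selection rate and the period.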
-
Publication number: US20210390057A1
Publication date: 2021-12-16
Application number: US17459100
Filing date: 2021-08-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt
IPC: G06F12/0891, G06F9/30, G06F9/38, G06F12/0811
Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.
-
Publication number: US11294710B2
Publication date: 2022-04-05
Application number: US15809432
Filing date: 2017-11-10
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Douglas Benson Hunt
IPC: G06F9/00, G06F9/48, G06F12/0875, G06F9/46, G06F12/0806, G06F12/0897, G06F12/0866
Abstract: A processing system suspends execution of a program thread based on an access latency required for a program thread to access memory. The processing system employs different memory modules having different memory technologies, located at different points in the processing system, and the like, or a combination thereof. The different memory modules therefore have different access latencies for memory transactions (e.g., memory reads and writes). When a program thread issues a memory transaction that results in an access to a memory module having a relatively long access latency (referred to as “slow” memory), the processor suspends execution of the program thread and releases processor resources used by the program thread. When the processor receives a response to the memory transaction from the memory module, the processor resumes execution of the suspended program thread.
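A toy cycle-based simulation can make the suspend and resume behavior concrete. It is a sketch under several assumptions: a fast access completes within the cycle in which it issues, a slow access takes `SLOW_LATENCY` cycles, one thread issues per cycle, and the function `run` and its log format are invented for this example.

```python
from collections import deque

SLOW_LATENCY = 3  # assumed latency of "slow" memory, in cycles


def run(threads):
    """threads maps a thread name to the memory kinds it accesses, in order."""
    pending = {name: deque(ops) for name, ops in threads.items()}
    ready = deque(name for name, ops in threads.items() if ops)
    suspended = {}   # thread name -> cycle at which the memory response arrives
    log = []
    cycle = 0
    while ready or suspended:
        # Resume every suspended thread whose response has arrived.
        for name in [n for n, wake in suspended.items() if wake <= cycle]:
            del suspended[name]
            log.append((cycle, name, "resume"))
            if pending[name]:
                ready.append(name)
        if ready:
            name = ready.popleft()
            kind = pending[name].popleft()
            if kind == "slow":
                # Suspend the thread; the core is free for other threads meanwhile.
                suspended[name] = cycle + SLOW_LATENCY
                log.append((cycle, name, "suspend"))
            else:
                log.append((cycle, name, "run"))
                if pending[name]:
                    ready.append(name)
        cycle += 1
    return log


log = run({"A": ["slow", "fast"], "B": ["fast", "fast"]})
for entry in log:
    print(entry)
```

Thread B uses the cycles that thread A would otherwise spend waiting, which is the benefit of releasing processor resources on a slow access.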
-
Publication number: US11169812B2
Publication date: 2021-11-09
Application number: US16584701
Filing date: 2019-09-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt, Kai Troester
IPC: G06F9/38
Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.
-
Publication number: US11106594B2
Publication date: 2021-08-31
Application number: US16562128
Filing date: 2019-09-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt
IPC: G06F12/0891, G06F9/30, G06F9/38, G06F12/0811
Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.
-
Publication number: US10938559B2
Publication date: 2021-03-02
Application number: US15838826
Filing date: 2017-12-12
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Douglas Benson Hunt
Abstract: Security key identifier remapping includes associating a system-level security key identifier to a local-level identifier requiring fewer bits of storage space. The remapped security key identifiers are used to receive, at a first compute complex of a processing system, a memory access request including a memory address value and a system-level security key identifier. The compute complex responds to the memory access request based on a determination of whether a security key identifier map of the first compute complex includes a mapping of the system-level security key identifier to a local-level security key identifier. In response to determining that the security key identifier map of the first compute complex does not include a mapping of the system-level security key identifier to the local-level security key identifier, a cache miss message may be returned without probing caches of the first compute complex.
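The lookup path in this abstract can be modeled roughly as below. Everything beyond the abstract is an assumption: the class `ComputeComplex`, the 2-bit local identifier width, the sequential allocation of local identifiers, raising an error when the map is full, and the `probes` counter used to show that an unmapped key skips the cache probe.

```python
class ComputeComplex:
    """Hypothetical model of one compute complex with a key identifier map."""

    def __init__(self, local_bits=2):
        self.capacity = 1 << local_bits  # local IDs need fewer bits than system IDs
        self.key_map = {}                # system-level key ID -> local-level key ID
        self.cache = {}                  # (local key ID, address) -> data
        self.probes = 0                  # number of cache probes performed

    def map_key(self, system_id):
        if system_id not in self.key_map:
            if len(self.key_map) >= self.capacity:
                # Replacement of mappings is outside the abstract; not modeled.
                raise RuntimeError("no free local-level key identifier")
            self.key_map[system_id] = len(self.key_map)
        return self.key_map[system_id]

    def fill(self, address, system_id, data):
        self.cache[(self.map_key(system_id), address)] = data

    def handle_request(self, address, system_id):
        local_id = self.key_map.get(system_id)
        if local_id is None:
            return None                  # cache miss returned without probing
        self.probes += 1
        return self.cache.get((local_id, address))


cc = ComputeComplex()
cc.fill(0x1000, system_id=0x3A7, data="payload")
print(cc.handle_request(0x1000, system_id=0x3A7))  # payload
print(cc.handle_request(0x1000, system_id=0x111))  # None
print(cc.probes)                                   # 1
```

The second request uses a system-level identifier with no local mapping, so the model answers with a miss and the probe count stays at 1.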
-
Publication number: US10700954B2
Publication date: 2020-06-30
Application number: US15849266
Filing date: 2017-12-20
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Douglas Benson Hunt, Jay Fleischman
Abstract: A system includes a multi-core processor that includes a scheduler. The multi-core processor communicates with a system memory and an operating system, and executes a first process and a second process. The system uses the scheduler to control the second process's use of memory bandwidth until one of two conditions is met. When the first setpoint of use for the first process is at or below a latency sensitive (LS) floor, the condition is that the first process's current use in a control cycle meets the first setpoint. When the first setpoint exceeds the LS floor, the condition is that the first process's current use in the control cycle exceeds the LS floor.
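The stopping condition in this abstract has two branches, which the following sketch spells out. The function names, the unitless bandwidth values, and the fixed-step reduction of the second process's limit are assumptions for illustration; the abstract does not say how the scheduler adjusts the second process.

```python
def target_reached(first_use, first_setpoint, ls_floor):
    """True once the first process's bandwidth use satisfies the abstract's condition."""
    if first_setpoint <= ls_floor:
        return first_use >= first_setpoint   # use meets the setpoint
    return first_use > ls_floor              # use exceeds the LS floor


def throttle_second_process(first_use_samples, first_setpoint, ls_floor, limit, step):
    """Lower the second process's bandwidth limit each control cycle until done."""
    for first_use in first_use_samples:
        if target_reached(first_use, first_setpoint, ls_floor):
            break
        limit = max(0, limit - step)         # assumed fixed-step reduction
    return limit


samples = [10, 20, 35, 50]   # first process's use in successive control cycles
print(throttle_second_process(samples, first_setpoint=30, ls_floor=40, limit=100, step=10))  # 80
print(throttle_second_process(samples, first_setpoint=60, ls_floor=40, limit=100, step=10))  # 70
```

With a setpoint of 30 (below the floor) throttling stops once use reaches 30; with a setpoint of 60 (above the floor) it stops only once use exceeds the floor of 40.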
-
Publication number: US20190294546A1
Publication date: 2019-09-26
Application number: US15925859
Filing date: 2018-03-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Tanuj Kumar Agarwal, Anasua Bhowmik, Douglas Benson Hunt
IPC: G06F12/0862, G06F3/06, G06F12/0897
Abstract: A method includes monitoring a request rate of speculative memory read requests from a penultimate-level cache to a main memory. The speculative memory read requests correspond to data read requests that missed in the penultimate-level cache. A hit rate of searches of a last-level cache for data requested by the data read requests is monitored. Core demand speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding core demand data read request based on the request rate and the hit rate. Prefetch speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding prefetch data read request based on the request rate and the hit rate.
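One way to picture the selective enabling is a policy that takes the two monitored quantities and returns a separate decision for demand and prefetch requests. The abstract states only that both decisions depend on the request rate and the hit rate; the threshold comparison, its direction, the field names, and the numeric values below are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SpeculationPolicy:
    """Hypothetical thresholds; none of these values come from the source."""
    max_request_rate: float       # above this, memory is assumed too busy to speculate
    demand_max_hit_rate: float    # above this, last-level cache hits make demand speculation wasteful
    prefetch_max_hit_rate: float  # stricter bound assumed for prefetch speculation

    def decide(self, request_rate, llc_hit_rate):
        bandwidth_ok = request_rate <= self.max_request_rate
        return {
            "demand": bandwidth_ok and llc_hit_rate <= self.demand_max_hit_rate,
            "prefetch": bandwidth_ok and llc_hit_rate <= self.prefetch_max_hit_rate,
        }


policy = SpeculationPolicy(max_request_rate=0.5,
                           demand_max_hit_rate=0.6,
                           prefetch_max_hit_rate=0.3)
print(policy.decide(request_rate=0.2, llc_hit_rate=0.4))
print(policy.decide(request_rate=0.8, llc_hit_rate=0.1))
```

Giving prefetch requests a stricter bound than core demand requests is a design choice of this sketch, reflecting that a wasted prefetch read costs bandwidth without shortening a pending load.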
-
Publication number: US12032965B2
Publication date: 2024-07-09
Application number: US17519902
Filing date: 2021-11-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt, Kai Troester
IPC: G06F9/38
CPC classification number: G06F9/3856, G06F9/384, G06F9/3885
Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.