-
Publication No.: US10366027B2
Publication Date: 2019-07-30
Application No.: US15826065
Filing Date: 2017-11-29
Applicant: ADVANCED MICRO DEVICES, INC. , ATI TECHNOLOGIES ULC
Inventor: Eric Christopher Morton , Elizabeth Cooper , William L. Walker , Douglas Benson Hunt , Richard Martin Born , Richard H. Lee , Paul C. Miranda , Philip Ng , Paul Moyer
IPC: G06F13/28 , G06F12/0815 , G06F12/0862 , G06F12/0891 , G06F12/0893
Abstract: A method for steering data for an I/O write operation includes, in response to receiving the I/O write operation, identifying, at an interconnect fabric, a cache of one of a plurality of compute complexes as a target cache for steering the data based on at least one of: a software-provided steering indicator, a steering configuration implemented at boot initialization, and coherency information for a cacheline associated with the data. The method further includes directing, via the interconnect fabric, the identified target cache to cache the data from the I/O write operation. The data is temporarily buffered at the interconnect fabric, and if the target cache attempts to fetch the data while the data is still buffered at the interconnect fabric, the interconnect fabric provides a copy of the buffered data in response to the fetch operation instead of initiating a memory access operation to obtain the data from memory.
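Below is a minimal Python sketch of the steering flow this abstract describes, assuming a synchronous fetch and a fixed priority order (software hint, then boot-time configuration, then coherency owner). The class and method names (InterconnectFabric, ComplexCache, io_write, fetch) are illustrative, not from the patent.
```python
class ComplexCache:
    """Stand-in for one compute complex's cache (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.lines = {}

    def fill(self, addr, data):
        self.lines[addr] = data


class InterconnectFabric:
    def __init__(self, caches, memory, boot_steering_target=None):
        self.caches = caches                    # target id -> ComplexCache
        self.memory = memory                    # backing store (dict)
        self.boot_steering_target = boot_steering_target
        self.write_buffer = {}                  # cacheline address -> buffered data

    def _pick_target(self, sw_hint, coherency_owner):
        # Assumed priority order: software hint, boot-time config, coherency owner.
        for candidate in (sw_hint, self.boot_steering_target, coherency_owner):
            if candidate is not None:
                return candidate
        return None

    def io_write(self, addr, data, sw_hint=None, coherency_owner=None):
        target = self._pick_target(sw_hint, coherency_owner)
        if target is None:
            self.memory[addr] = data            # no steering: write goes to memory
            return
        self.write_buffer[addr] = data          # buffer while the target fetches
        self.caches[target].fill(addr, self.fetch(addr))

    def fetch(self, addr):
        # Serve the buffered copy if it is still present, avoiding a memory access.
        if addr in self.write_buffer:
            return self.write_buffer.pop(addr)
        return self.memory.get(addr)


caches = {"ccx0": ComplexCache("ccx0"), "ccx1": ComplexCache("ccx1")}
fabric = InterconnectFabric(caches, memory={}, boot_steering_target="ccx1")
fabric.io_write(0x1000, b"packet", sw_hint="ccx0")   # software hint wins
print(caches["ccx0"].lines)                          # {4096: b'packet'}
```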
-
Publication No.: US11281280B2
Publication Date: 2022-03-22
Application No.: US16876325
Filing Date: 2020-05-18
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Benjamin Tsien , Michael J. Tresidder , Ivan Yanfeng Wang , Kevin M. Lepak , Ann Ling , Richard M. Born , John P. Petry , Bryan P. Broussard , Eric Christopher Morton
IPC: G06F1/32 , G06F1/3206 , G06F1/3287 , G06F1/3234
Abstract: Systems, apparatuses, and methods for reducing chiplet interrupt latency are disclosed. A system includes one or more processing nodes, one or more memory devices, a communication fabric coupled to the processing node(s) and memory device(s) via link interfaces, and a power management unit. The power management unit manages the power states of the various components and the link interfaces of the system. If the power management unit detects a request to wake up a given component, and the link interface to the given component is powered down, then the power management unit sends an out-of-band signal to wake up the given component in parallel with powering up the link interface. Also, when multiple link interfaces need to be powered up, the power management unit powers up the multiple link interfaces in an order which complies with voltage regulator load-step requirements while minimizing the latency of pending operations.
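The link power-up ordering can be pictured with a small sketch. The greedy grouping below, the inrush_current and pending_priority fields, and the 2.0 A load-step limit are all assumptions chosen for illustration; the patent abstract does not specify this policy.
```python
def schedule_link_powerup(links, max_load_step):
    """Group link power-ups into steps whose combined current stays within the
    voltage regulator's load-step limit, waking the links with the most urgent
    pending operations first to minimize their latency."""
    ordered = sorted(links, key=lambda l: l["pending_priority"], reverse=True)
    steps, current_step, step_load = [], [], 0.0
    for link in ordered:
        if step_load + link["inrush_current"] > max_load_step and current_step:
            steps.append(current_step)          # close this load step, start the next
            current_step, step_load = [], 0.0
        current_step.append(link["name"])
        step_load += link["inrush_current"]
    if current_step:
        steps.append(current_step)
    return steps


# Example: three links, regulator tolerates a 2.0 A load step at a time.
links = [
    {"name": "link0", "inrush_current": 1.5, "pending_priority": 3},
    {"name": "link1", "inrush_current": 1.0, "pending_priority": 1},
    {"name": "link2", "inrush_current": 0.8, "pending_priority": 2},
]
print(schedule_link_powerup(links, max_load_step=2.0))
# -> [['link0'], ['link2', 'link1']]
```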
-
Publication No.: US10656696B1
Publication Date: 2020-05-19
Application No.: US15907719
Filing Date: 2018-02-28
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Benjamin Tsien , Michael J. Tresidder , Ivan Yanfeng Wang , Kevin M. Lepak , Ann Ling , Richard M. Born , John P. Petry , Bryan P. Broussard , Eric Christopher Morton
IPC: G06F1/32 , G06F1/3206 , G06F1/3287 , G06F1/3234
Abstract: Systems, apparatuses, and methods for reducing chiplet interrupt latency are disclosed. A system includes one or more processing nodes, one or more memory devices, a communication fabric coupled to the processing node(s) and memory device(s) via link interfaces, and a power management unit. The power management unit manages the power states of the various components and the link interfaces of the system. If the power management unit detects a request to wake up a given component, and the link interface to the given component is powered down, then the power management unit sends an out-of-band signal to wake up the given component in parallel with powering up the link interface. Also, when multiple link interfaces need to be powered up, the power management unit powers up the multiple link interfaces in an order which complies with voltage regulator load-step requirements while minimizing the latency of pending operations.
-
Publication No.: US20220114123A1
Publication Date: 2022-04-14
Application No.: US17068660
Filing Date: 2020-10-12
Applicant: Advanced Micro Devices, Inc.
Inventor: Bryan P Broussard , Paul Moyer , Eric Christopher Morton , Pravesh Gupta
IPC: G06F13/26 , G06F13/40 , G06F12/0875
Abstract: A method of operating a processing unit includes storing a first copy of a first interrupt control value in a cache device of the processing unit, receiving from an interrupt controller a first interrupt message transmitted via an interconnect fabric, where the first interrupt message includes a second copy of the first interrupt control value, and if the first copy matches the second copy, servicing an interrupt specified in the first interrupt message.
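A minimal sketch of the match-before-service check described above, assuming the control value rides in the interrupt message as a plain field; the class, field, and method names are hypothetical, and the mismatch fallback is an assumption since the abstract only covers the matching case.
```python
class CoreInterruptHandler:
    """Hypothetical per-core handler; names are illustrative, not from the patent."""
    def __init__(self, control_value):
        self.cached_control_value = control_value   # first copy, cached at the core

    def on_interrupt_message(self, message):
        # The message carries the interrupt controller's copy of the control value.
        if message["control_value"] == self.cached_control_value:
            self.service(message["vector"])
        else:
            # Assumed fallback: treat the cached copy as stale and refresh it.
            self.refresh_control_value(message["control_value"])

    def service(self, vector):
        print(f"servicing interrupt vector {vector}")

    def refresh_control_value(self, new_value):
        self.cached_control_value = new_value


handler = CoreInterruptHandler(control_value=0x2A)
handler.on_interrupt_message({"control_value": 0x2A, "vector": 33})   # match: serviced
```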
-
Publication No.: US20210406180A1
Publication Date: 2021-12-30
Application No.: US17472977
Filing Date: 2021-09-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam , Kevin M. Lepak , Amit P. Apte , Ganesh Balakrishnan , Eric Christopher Morton , Elizabeth M. Cooper , Ravindra N. Bhargava
IPC: G06F12/0817 , G06F12/128 , G06F12/0811 , G06F12/0871 , G06F12/0831
Abstract: Systems, apparatuses, and methods for maintaining a region-based cache directory are disclosed. A system includes multiple processing nodes, with each processing node including a cache subsystem. The system also includes a cache directory to help manage cache coherency among the different cache subsystems of the system. In order to reduce the number of entries in the cache directory, the cache directory tracks coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Accordingly, the system includes a region-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the system. The cache directory includes a reference count in each entry to track the aggregate number of cache lines that are cached per region. If a reference count of a given entry goes to zero, the cache directory reclaims the given entry.
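The reference-count bookkeeping can be sketched as follows; the 4 KiB region size, dictionary layout, and method names are assumptions chosen for illustration, and per-node sharer bookkeeping is simplified.
```python
REGION_SIZE = 4096          # assumed region size (e.g. 64 cache lines of 64 bytes)


class RegionDirectory:
    def __init__(self):
        self.entries = {}   # region base address -> {"ref_count": int, "sharers": set}

    def _region_base(self, line_addr):
        return line_addr & ~(REGION_SIZE - 1)

    def on_line_cached(self, line_addr, node):
        base = self._region_base(line_addr)
        entry = self.entries.setdefault(base, {"ref_count": 0, "sharers": set()})
        entry["ref_count"] += 1             # aggregate count of cached lines in the region
        entry["sharers"].add(node)

    def on_line_evicted(self, line_addr, node):
        base = self._region_base(line_addr)
        entry = self.entries[base]
        entry["ref_count"] -= 1
        if entry["ref_count"] == 0:
            del self.entries[base]          # reclaim the entry when the count hits zero


directory = RegionDirectory()
directory.on_line_cached(0x10040, node=0)
directory.on_line_cached(0x10080, node=1)   # second line in the same 4 KiB region
print(len(directory.entries))               # 1: one entry covers both lines
directory.on_line_evicted(0x10040, node=0)
directory.on_line_evicted(0x10080, node=1)
print(len(directory.entries))               # 0: entry reclaimed
```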
-
Publication No.: US20190140954A1
Publication Date: 2019-05-09
Application No.: US15805762
Filing Date: 2017-11-07
Applicant: Advanced Micro Devices, Inc.
Inventor: Alan Dodson Smith , Chintan S. Patel , Eric Christopher Morton , Vydhyanathan Kalyanasundharam , Narendra Kamat
IPC: H04L12/803 , H04L12/935 , H04L12/825
Abstract: A system for implementing load balancing schemes includes one or more processing units, a memory, and a communication fabric with a plurality of switches coupled to the processing unit(s) and the memory. A switch of the fabric determines a first number of streams on a first input port that are targeting a first output port. The switch also determines a second number of streams, from all input ports, that are targeting the first output port. Then, the switch calculates a throttle factor for the first input port by dividing the first number of streams by the second number of streams. The switch applies the throttle factor to regulate bandwidth on the first input port for requestors targeting the first output port. The switch also calculates throttle factors for the other input ports and applies those factors when regulating bandwidth on those ports.
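A worked example of the throttle-factor arithmetic, assuming the per-port stream counts are available as a nested mapping; the function name and data layout are illustrative, not taken from the patent.
```python
def throttle_factors(streams, output_port):
    """streams[input_port][output_port] -> number of streams on that input port
    targeting output_port. Each input port's factor is its share of the total
    demand on the output port."""
    total = sum(per_input.get(output_port, 0) for per_input in streams.values())
    if total == 0:
        return {port: 0.0 for port in streams}
    return {port: per_input.get(output_port, 0) / total
            for port, per_input in streams.items()}


streams = {0: {2: 4}, 1: {2: 1}, 3: {2: 5}}      # ten streams total target output port 2
print(throttle_factors(streams, output_port=2))  # {0: 0.4, 1: 0.1, 3: 0.5}
```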
-
Publication No.: US20170371787A1
Publication Date: 2017-12-28
Application No.: US15192734
Filing Date: 2016-06-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam , Eric Christopher Morton , Amit P. Apte , Elizabeth M. Cooper
IPC: G06F12/0815 , G06F12/0813
CPC classification number: G06F12/0815 , G06F9/526 , G06F12/0813 , G06F12/0828 , G06F12/0842 , G06F2209/522 , G06F2209/523 , G06F2212/1024 , G06F2212/1041 , G06F2212/154 , G06F2212/284 , G06F2212/608 , G06F2212/62
Abstract: A system and method for network traffic management between multiple nodes are described. A computing system includes multiple nodes connected to one another. When a home node determines that the number of nodes requesting read access for a given data block assigned to the home node exceeds a threshold and that a copy of the given data block is already stored at a first node of the multiple nodes in the system, the home node sends a command to the first node. The command directs the first node to forward a copy of the given data block to the home node. The home node then maintains a copy of the given data block and forwards copies of the given data block to other requesting nodes until the home node detects a write request or a lock release request for the given data block.
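A rough sketch of the home-node behavior, assuming an arbitrary sharing threshold of four requesters and a synchronous forward_copy call; both are illustrative stand-ins for details the abstract leaves open.
```python
READ_SHARING_THRESHOLD = 4      # assumed threshold; the abstract does not give one


class OwnerNode:
    """Stand-in for the first node that already holds a copy of the block."""
    def __init__(self, data):
        self.data = data

    def forward_copy(self):
        return self.data


class HomeNode:
    def __init__(self, owner):
        self.owner = owner
        self.cached_copy = None
        self.readers = set()

    def on_read_request(self, requester):
        self.readers.add(requester)
        if self.cached_copy is not None:
            return self.cached_copy                     # serve readers from the home node
        if len(self.readers) > READ_SHARING_THRESHOLD:
            # Heavily shared: pull a copy to the home node and serve from there.
            self.cached_copy = self.owner.forward_copy()
            return self.cached_copy
        # Lightly shared: normal path, the owner forwards to the requester.
        return self.owner.forward_copy()

    def on_write_or_lock_release(self):
        # Stop serving local copies once the block may be modified.
        self.cached_copy = None
        self.readers.clear()


home = HomeNode(OwnerNode(b"hot block"))
for node_id in range(6):
    home.on_read_request(node_id)
print(home.cached_copy)          # b'hot block' once sharing crossed the threshold
```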
-
Publication No.: US20240220405A1
Publication Date: 2024-07-04
Application No.: US18091329
Filing Date: 2022-12-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Amit P. Apte , Eric Christopher Morton
CPC classification number: G06F12/0238 , G06F13/1647 , G06F13/1694 , G06F2212/7201
Abstract: The disclosed computing device can include at least one memory of a particular type having a plurality of memory channels, and at least one memory of at least one other type having a plurality of links. The computing device can also include remapping circuitry configured to homogeneously interleave the plurality of memory channels with the at least one memory of the at least one other type. Various other methods, systems, and computer-readable media are also disclosed.
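One way to picture homogeneous interleaving across heterogeneous targets is an address remap like the sketch below; the 256-byte interleave granule, the target counts, and the return format are assumptions, not details from the patent.
```python
INTERLEAVE_GRANULE = 256    # assumed interleave granule in bytes


def remap(addr, num_dram_channels, num_other_links):
    """Spread consecutive granules evenly over all targets, treating DRAM
    channels and the other memory's links as one homogeneous pool."""
    total_targets = num_dram_channels + num_other_links
    granule_index = addr // INTERLEAVE_GRANULE
    target = granule_index % total_targets
    offset = ((granule_index // total_targets) * INTERLEAVE_GRANULE
              + addr % INTERLEAVE_GRANULE)
    if target < num_dram_channels:
        return "dram_channel", target, offset
    return "other_link", target - num_dram_channels, offset


print(remap(0x1040, num_dram_channels=4, num_other_links=2))
# -> ('other_link', 0, 576)
```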
-
Publication No.: US10608943B2
Publication Date: 2020-03-31
Application No.: US15796528
Filing Date: 2017-10-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Alan Dodson Smith , Chintan S. Patel , Eric Christopher Morton , Vydhyanathan Kalyanasundharam , Narendra Kamat
IPC: G01R31/08 , G06F11/00 , G08C15/00 , H04J1/16 , H04J3/14 , H04L1/00 , H04L12/26 , H04L12/803 , G06F9/50 , G06F13/36 , H04L12/863
Abstract: Systems, apparatuses, and methods for dynamic buffer management in multi-client token flow control routers are disclosed. A system includes one or more processing units, a memory, and a communication fabric with a plurality of routers coupled to the processing unit(s) and the memory. A router servicing multiple active clients allocates a first number of tokens to each active client. The first number of tokens is less than a second number of tokens needed to saturate the bandwidth of each client to the router. The router also allocates a third number of tokens to a free pool, with tokens from the free pool being dynamically allocated to different clients. The third number of tokens is equal to the difference between the second number of tokens and the first number of tokens. An advantage of this approach is reducing the amount of buffer space needed at the router.
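The static-plus-shared token split can be illustrated with a short sketch; the allocate_tokens helper, the client names, and the specific token counts are hypothetical.
```python
def allocate_tokens(active_clients, saturation_tokens, static_tokens):
    """Give each active client a static allocation smaller than what it needs to
    saturate its bandwidth, and place the remainder in a shared free pool that
    can be handed out dynamically."""
    assert static_tokens < saturation_tokens
    free_pool = saturation_tokens - static_tokens
    allocation = {client: static_tokens for client in active_clients}
    return allocation, free_pool


allocation, free_pool = allocate_tokens(["cpu", "gpu", "dma"],
                                        saturation_tokens=16, static_tokens=6)
print(allocation, free_pool)   # each client holds 6 tokens; 10 sit in the free pool
```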
-
Publication No.: US10558591B2
Publication Date: 2020-02-11
Application No.: US15728191
Filing Date: 2017-10-09
Applicant: Advanced Micro Devices, Inc.
IPC: G06F13/16 , G06F13/364 , G06F13/42 , G06F13/40
Abstract: Systems, apparatuses, and methods for implementing priority adjustment forwarding are disclosed. A system includes one or more processing units, a memory, and a communication fabric coupled to the processing unit(s) and the memory. The communication fabric includes a plurality of arbitration points. When a client determines that its bandwidth requirements are not being met, the client generates and sends an in-band priority adjustment request to the nearest arbitration point. This arbitration point receives the in-band priority adjustment request and then identifies any pending requests buffered at the arbitration point that meet the criteria specified by the in-band priority adjustment request. The arbitration point adjusts the priority of any identified requests, and then forwards the in-band priority adjustment request on the fabric to the next upstream arbitration point, which processes it in the same manner.
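A simplified sketch of the hop-by-hop priority bump, assuming requests are matched by client identifier and arbitration points are chained by an upstream pointer; both assumptions stand in for the criteria and topology the abstract leaves unspecified.
```python
class ArbitrationPoint:
    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream    # next arbitration point toward the target
        self.pending = []           # buffered requests: {"client": ..., "priority": ...}

    def on_priority_adjust(self, client_id, new_priority):
        # Raise the priority of any buffered requests that match the criterion.
        for req in self.pending:
            if req["client"] == client_id and req["priority"] < new_priority:
                req["priority"] = new_priority
        # Forward the in-band request to the next upstream arbitration point.
        if self.upstream is not None:
            self.upstream.on_priority_adjust(client_id, new_priority)


# Usage: a two-hop fabric path from a leaf arbitration point to the root.
root = ArbitrationPoint("root")
leaf = ArbitrationPoint("leaf", upstream=root)
leaf.pending.append({"client": "gpu", "priority": 1})
root.pending.append({"client": "gpu", "priority": 1})
leaf.on_priority_adjust("gpu", new_priority=5)
print(leaf.pending[0]["priority"], root.pending[0]["priority"])   # 5 5
```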