-
Publication No.: US20220100391A1
Publication Date: 2022-03-31
Application No.: US17033170
Filing Date: 2020-09-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael W. LeBeane, Khaled Hamidouche, Hari S. Thangirala, Brandon Keith Potter
IPC: G06F3/06, G06F12/02, G06F12/0802
Abstract: A framework disclosed herein extends a relaxed, scoped memory model to a system that includes nodes across a commodity network and maintains coherency across the system. A new scope, cluster scope, is defined, that allows for memory accesses at scopes less than cluster scope to operate on locally cached versions of remote data from across the commodity network without having to issue expensive network operations. Cluster scope operations generate network commands that are used to synchronize memory across the commodity network.
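Illustrative sketch (not taken from the patent text): the abstract's distinction between cluster scope and narrower scopes can be modeled with a toy cache. All names here (`Scope`, `RemoteMemory`, `Node`, `network_ops`) are hypothetical, and the assumption that a first-time miss fetches over the network is ours, not the abstract's.

```python
from enum import IntEnum


class Scope(IntEnum):
    # Hypothetical scope ladder; only "cluster scope" is named in the abstract.
    WORKGROUP = 1
    DEVICE = 2
    SYSTEM = 3
    CLUSTER = 4


class RemoteMemory:
    """Stands in for memory on another node, reachable only over the network."""

    def __init__(self):
        self.data = {}
        self.network_ops = 0  # counts simulated network commands

    def get(self, addr):
        self.network_ops += 1
        return self.data.get(addr, 0)

    def put(self, addr, value):
        self.network_ops += 1
        self.data[addr] = value


class Node:
    """A node that keeps locally cached copies of remote data."""

    def __init__(self, remote):
        self.remote = remote
        self.cache = {}
        self.dirty = set()

    def load(self, addr):
        # Assumption: a cold miss fetches once; later loads hit the local copy.
        if addr not in self.cache:
            self.cache[addr] = self.remote.get(addr)
        return self.cache[addr]

    def store(self, addr, value):
        # Stores only touch the local copy until a cluster-scope release.
        self.cache[addr] = value
        self.dirty.add(addr)

    def release(self, scope):
        # Only cluster scope generates network commands to publish writes.
        if scope >= Scope.CLUSTER:
            for addr in sorted(self.dirty):
                self.remote.put(addr, self.cache[addr])
            self.dirty.clear()

    def acquire(self, scope):
        # Only cluster scope drops clean cached copies so later loads refetch.
        if scope >= Scope.CLUSTER:
            self.cache = {a: v for a, v in self.cache.items() if a in self.dirty}
```

In this model a release or acquire at any scope below `Scope.CLUSTER` leaves `network_ops` unchanged, which is the cost saving the abstract describes.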
-
Publication No.: US20240311182A1
Publication Date: 2024-09-19
Application No.: US18185641
Filing Date: 2023-03-17
Applicant: Advanced Micro Devices, Inc.
Inventor: Kishore Punniyamurthy, Sagnik Basu, Khaled Hamidouche, Brandon Keith Potter
IPC: G06F9/48
CPC classification number: G06F9/4881
Abstract: A device includes a communication scheduler to generate schedule trees for scheduling data communication among a plurality of nodes configured to perform a collective operation using data contributed from the plurality of nodes. The device includes data reduction logic to: identify one or more skewed nodes among the plurality of nodes, perform, according to a first set of schedule trees, a first operation to generate partial results based on data contributed from non-skewed nodes, and perform, according to a second set of schedule trees, a second operation to generate final results based on the partial results and data contributed from the one or more skewed nodes.
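Illustrative sketch (not taken from the patent text): the two-phase reduction in the abstract can be shown with a sum reduction. The function names, the arrival-time threshold used to flag skewed nodes, and the plain pairwise tree standing in for a "schedule tree" are all assumptions made for the example.

```python
def identify_skewed(arrival_times, threshold):
    """Assumed criterion: a node is skewed if its data arrives after `threshold`."""
    return {node for node, t in arrival_times.items() if t > threshold}


def tree_reduce(values):
    """Pairwise (binary-tree) sum, standing in for one schedule tree."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element is carried up unchanged
        level = nxt
    return level[0]


def skew_aware_reduce(contributions, arrival_times, threshold):
    skewed = identify_skewed(arrival_times, threshold)
    # First operation: partial result from the non-skewed nodes only.
    partial = tree_reduce(
        v for node, v in sorted(contributions.items()) if node not in skewed
    )
    # Second operation: fold in the late contributions to get the final result.
    final = tree_reduce([partial] + [contributions[n] for n in sorted(skewed)])
    return skewed, partial, final
```

With contributions `{0: 1, 1: 2, 2: 3, 3: 4}`, node 2 arriving late, the partial result covers nodes 0, 1, and 3, and the final result adds node 2's value once it is available.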
-
Publication No.: US20240211399A1
Publication Date: 2024-06-27
Application No.: US18089480
Filing Date: 2022-12-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Kishore Punniyamurthy, Khaled Hamidouche, Brandon Keith Potter
IPC: G06F12/0813, G06N20/00
CPC classification number: G06F12/0813, G06N20/00
Abstract: A distributed cache network used for machine learning is provided which comprises a network fabric having file systems which store data and a plurality of processing devices, each comprising cache memory and a processor configured to execute a training of a machine learning model and selectively cache portions of the data based on a frequency with which the data is accessed by the processor. Each processing device stores metadata identifying portions of data which are cached in the cache memory and other portions of the data which are cached in other processing devices of the network. When requested data is not cached in another processing device, the portion of requested data is accessed from a network file system via a client to server channel and is accessed from another processing device via a client to client channel when the requested data is cached in the other processing device.
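Illustrative sketch (not taken from the patent text): the lookup path in the abstract can be modeled as local cache, then peer cache, then file system. The `Device` class, the `hot_threshold` parameter, and the single shared `directory` dict (a simplification of the per-device metadata the abstract describes) are hypothetical.

```python
from collections import Counter


class Device:
    """One processing device with a cache and access-frequency counters."""

    def __init__(self, name, file_system, directory, hot_threshold=2):
        self.name = name
        self.fs = file_system        # dict standing in for the network file system
        self.directory = directory   # key -> Device caching it (simplified metadata)
        self.hot_threshold = hot_threshold
        self.cache = {}
        self.hits = Counter()
        self.log = []                # records which channel served each read

    def read(self, key):
        self.hits[key] += 1
        if key in self.cache:
            self.log.append("local")
            return self.cache[key]
        owner = self.directory.get(key)
        if owner is not None and owner is not self:
            # Cached on a peer: use the client-to-client channel.
            value = owner.cache[key]
            self.log.append("client-to-client")
        else:
            # Not cached anywhere: use the client-to-server channel.
            value = self.fs[key]
            self.log.append("client-to-server")
        # Selective caching: keep only data this device reads frequently.
        if self.hits[key] >= self.hot_threshold and key not in self.directory:
            self.cache[key] = value
            self.directory[key] = self
        return value
```

In this model a device starts caching a key only after reading it `hot_threshold` times, and other devices then fetch that key from the peer instead of the file system.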
-
Publication No.: US20250077409A1
Publication Date: 2025-03-06
Application No.: US18240640
Filing Date: 2023-08-31
Applicant: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventor: Kishore Punniyamurthy, Richard David Sodke, Furkan Eris, Sergey Blagodurov, Bradford Michael Beckmann, Brandon Keith Potter, Khaled Hamidouche
Abstract: A device includes a plurality of processing elements (PEs). A symmetric memory is allocated in each of the plurality of PEs. The device includes a switch connected to the plurality of PEs. The switch is to: receive, from a first processing element (PE) of the plurality of PEs, a message that includes a buffer offset, compute, based on the buffer offset, a first memory address of a first buffer in a first symmetric memory of the first PE and a second memory address of a second buffer in a second symmetric memory of a second PE of the plurality of PEs, and initiate, based on the first memory address and the second memory address, a direct memory access operation to access the first buffer and the second buffer.
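Illustrative sketch (not taken from the patent text): because the memory is symmetric, one buffer offset identifies the same buffer in every PE, so a switch can derive both addresses from it. The `Switch` class, the per-PE base-address table, the flat `bytearray` address space, and the extra message fields (destination PE, length) are assumptions for the example.

```python
class Switch:
    """Toy switch that resolves symmetric-memory offsets and copies data."""

    def __init__(self, heap_bases, memory):
        self.heap_bases = heap_bases  # PE id -> base address of its symmetric memory
        self.memory = memory          # bytearray modeling one flat address space

    def resolve(self, pe, offset):
        # Same offset, different base: this is what symmetry buys.
        return self.heap_bases[pe] + offset

    def handle_message(self, src_pe, dst_pe, offset, length):
        src = self.resolve(src_pe, offset)
        dst = self.resolve(dst_pe, offset)
        # Stand-in for the direct memory access operation between the two buffers.
        self.memory[dst:dst + length] = self.memory[src:src + length]
        return src, dst
```

For example, with PE 0 based at address 0 and PE 1 at address 32, a message carrying offset 8 resolves to addresses 8 and 40, and the copy moves the bytes between them without either PE computing the remote address.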
-
Publication No.: US12086422B2
Publication Date: 2024-09-10
Application No.: US18320819
Filing Date: 2023-05-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael W. LeBeane, Khaled Hamidouche, Hari S. Thangirala, Brandon Keith Potter
IPC: G06F3/06, G06F12/02, G06F12/0802
CPC classification number: G06F3/0619, G06F3/0656, G06F3/067, G06F12/0223, G06F12/0802, G06F2212/152
Abstract: A framework disclosed herein extends a relaxed, scoped memory model to a system that includes nodes across a commodity network and maintains coherency across the system. A new scope, cluster scope, is defined, that allows for memory accesses at scopes less than cluster scope to operate on locally cached versions of remote data from across the commodity network without having to issue expensive network operations. Cluster scope operations generate network commands that are used to synchronize memory across the commodity network.
-
Publication No.: US20230289070A1
Publication Date: 2023-09-14
Application No.: US18320819
Filing Date: 2023-05-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael W. LeBeane, Khaled Hamidouche, Hari S. Thangirala, Brandon Keith Potter
IPC: G06F3/06, G06F12/02, G06F12/0802
CPC classification number: G06F3/0619, G06F12/0223, G06F3/0656, G06F3/067, G06F12/0802, G06F2212/152
Abstract: A framework disclosed herein extends a relaxed, scoped memory model to a system that includes nodes across a commodity network and maintains coherency across the system. A new scope, cluster scope, is defined, that allows for memory accesses at scopes less than cluster scope to operate on locally cached versions of remote data from across the commodity network without having to issue expensive network operations. Cluster scope operations generate network commands that are used to synchronize memory across the commodity network.
-
Publication No.: US11714559B2
Publication Date: 2023-08-01
Application No.: US17033170
Filing Date: 2020-09-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael W. LeBeane, Khaled Hamidouche, Hari S. Thangirala, Brandon Keith Potter
IPC: G06F3/06, G06F12/02, G06F12/0802
CPC classification number: G06F3/0619, G06F3/067, G06F3/0656, G06F12/0223, G06F12/0802, G06F2212/152
Abstract: A framework disclosed herein extends a relaxed, scoped memory model to a system that includes nodes across a commodity network and maintains coherency across the system. A new scope, cluster scope, is defined, that allows for memory accesses at scopes less than cluster scope to operate on locally cached versions of remote data from across the commodity network without having to issue expensive network operations. Cluster scope operations generate network commands that are used to synchronize memory across the commodity network.