-
Publication number: US20220066662A1
Publication date: 2022-03-03
Application number: US17006646
Filing date: 2020-08-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Mahzabeen Islam, Shaizeen Aga, Nuwan Jayasena, Jagadish B. Kotra
IPC: G06F3/06
Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.
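The interleaving idea behind IBFS can be illustrated with a small sketch. It assumes each "frame" is an equal-length list of data elements and that a bank row holds `row_elems` elements split evenly across the frames; the function name `ibfs_rows` and these granularity choices are hypothetical and not taken from the patent.

```python
def ibfs_rows(frames, row_elems):
    """Interleave corresponding chunks of each frame into one bank row.

    Assumption: every row is divided evenly among the frames, so elements
    that are accessed together (same index range in each frame) share a row.
    """
    k = len(frames)
    if k == 0 or row_elems % k != 0:
        raise ValueError("row_elems must be a positive multiple of the frame count")
    n = len(frames[0])
    if any(len(f) != n for f in frames):
        raise ValueError("all frames must have the same length")

    chunk = row_elems // k  # elements each frame contributes per row
    rows = []
    for start in range(0, n, chunk):
        row = []
        for frame in frames:
            row.extend(frame[start:start + chunk])
        rows.append(row)
    return rows


a = [0, 1, 2, 3]
b = [10, 11, 12, 13]
layout = ibfs_rows([a, b], row_elems=4)
```

With this layout, an operation that reads `a[i]` and `b[i]` together touches a single row instead of two rows of the same bank, which is the row-conflict reduction the abstract describes.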
-
Publication number: US20210373805A1
Publication date: 2021-12-02
Application number: US16885677
Filing date: 2020-05-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan Alsop, Shaizeen Aga, Nuwan Jayasena
IPC: G06F3/06
Abstract: An approach is provided for reducing command bus traffic between memory controllers and PIM-enabled memory modules using special PIM commands. The term “special PIM command” is used herein to describe embodiments and refers to a PIM command for which the corresponding module-specific command information is provided to memory modules via a non-command bus data path. A memory controller generates and issues a special PIM command to multiple PIM-enabled memory modules via a command bus and provides module-specific command information (e.g., address information) for the special PIM command to the PIM-enabled memory modules via the non-command bus data path that is shared by the PIM-enabled memory modules and the memory controller.
-
Publication number: US11119923B2
Publication date: 2021-09-14
Application number: US15440979
Filing date: 2017-02-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Amin Farmahini Farahani, Nuwan Jayasena
IPC: G06F3/06, G06F12/00, G06F13/00, G06F12/0813, G06F12/0837, G06F12/0811
Abstract: A cache coherence technique for operating a multi-processor system including shared memory includes allocating a cache line of a cache memory of a processor to a memory address in the shared memory in response to execution of an instruction of a program executing on the processor. The technique includes encoding a shared information state of the cache line to indicate whether the memory address is a shared memory address shared by the processor and a second processor, or a private memory address private to the processor, in response to whether the instruction is included in a critical section of the program, the critical section being a portion of the program that confines access to shared, writeable data.
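A minimal sketch of the state encoding follows. It assumes that a line allocated by an instruction inside a critical section is marked shared and any other line is marked private; that mapping, and the names `Sharing` and `allocate_line`, are illustrative assumptions rather than the claimed implementation.

```python
from enum import Enum


class Sharing(Enum):
    PRIVATE = 0  # address treated as private to this processor
    SHARED = 1   # address treated as shared with another processor


def allocate_line(cache, address, in_critical_section):
    """Allocate a cache line and encode its shared-information state.

    Assumption: critical sections confine access to shared, writeable data,
    so allocations made inside one are tagged SHARED.
    """
    state = Sharing.SHARED if in_critical_section else Sharing.PRIVATE
    cache[address] = state
    return state


cache = {}
allocate_line(cache, 0x1000, in_critical_section=True)
allocate_line(cache, 0x2000, in_critical_section=False)
```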
-
Publication number: US20210279065A1
Publication date: 2021-09-09
Application number: US16808346
Filing date: 2020-03-03
Applicant: Advanced Micro Devices, Inc.
Inventor: Nuwan Jayasena, Shaizeen Aga, Anirban Nag
IPC: G06F9/38
Abstract: Techniques are provided for performing memory operations. The techniques include issuing, by a processor, a fence primitive to a memory system, the fence primitive issued in a manner that indicates a program order of memory operation execution.
-
Publication number: US11030117B2
Publication date: 2021-06-08
Application number: US15650252
Filing date: 2017-07-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Nuwan Jayasena, Brandon K. Potter, Andrew G. Kegel
Abstract: A host processor receives an address translation request from an accelerator, which may be trusted or un-trusted. The address translation request includes a virtual address in a virtual address space that is shared by the host processor and the accelerator. The host processor encrypts a physical address in a host memory indicated by the virtual address in response to the accelerator being permitted to access the physical address. The host processor then provides the encrypted physical address to the accelerator. The accelerator provides memory access requests including the encrypted physical address to the host processor, which decrypts the physical address and selectively accesses a location in the host memory indicated by the decrypted physical address depending upon whether the accelerator is permitted to access the location indicated by the decrypted physical address.
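The request flow in this abstract can be sketched as follows. The XOR with a secret key is only a toy stand-in for the encryption step, which the abstract does not specify, and the class `Host` with its methods `translate` and `access` is hypothetical.

```python
import secrets


class Host:
    """Toy host that hands out encrypted physical addresses to an accelerator."""

    def __init__(self, page_table, permitted):
        self._key = secrets.randbits(64)     # secret known only to the host
        self._page_table = dict(page_table)  # virtual address -> physical address
        self._permitted = set(permitted)     # physical addresses the accelerator may use
        self.memory = {}                     # physical address -> value

    def translate(self, vaddr):
        """Handle an address translation request from the accelerator."""
        paddr = self._page_table[vaddr]
        if paddr not in self._permitted:
            raise PermissionError("accelerator may not access this address")
        return paddr ^ self._key  # toy cipher (assumption), not real encryption

    def access(self, encrypted_paddr):
        """Handle a memory access request carrying an encrypted physical address."""
        paddr = encrypted_paddr ^ self._key
        if paddr not in self._permitted:
            raise PermissionError("decrypted address is not permitted")
        return self.memory.get(paddr)


host = Host(page_table={0x1000: 0x8000}, permitted={0x8000})
host.memory[0x8000] = 42
token = host.translate(0x1000)
value = host.access(token)
```

Because the accelerator only ever holds the encrypted form, a tampered token decrypts to an address outside the permitted set and the access is refused.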
-
Publication number: US10817422B2
Publication date: 2020-10-27
Application number: US16104567
Filing date: 2018-08-17
Applicant: Advanced Micro Devices, Inc.
Inventor: Nuwan Jayasena, Amin Farmahini Farahani, Michael Ignatowski
IPC: G06F13/00, G06F12/0804, G06F12/0862, G06F13/16
Abstract: In one form, a data processing system includes a host integrated circuit having a memory controller, a memory bus coupled to the memory controller, and a memory module. The memory module includes a bulk memory and a memory module scratchpad coupled to the bulk memory, wherein the memory module scratchpad has a lower access overhead than the bulk memory. The memory controller selectively provides predetermined commands over the memory bus to cause the memory module to copy data between the bulk memory and the memory module scratchpad without conducting data on the memory bus in response to a data movement decision.
-
Publication number: US20190220426A1
Publication date: 2019-07-18
Application number: US15872943
Filing date: 2018-01-16
Applicant: Advanced Micro Devices, Inc.
Inventor: Nuwan Jayasena, Michael Ignatowski
CPC classification number: G06F13/1694, G06F3/0631, G06F3/065, G06F3/068
Abstract: A configurable computing system which uses near-memory and in-memory hardened logic blocks is described herein. The hardened logic blocks are incorporated into memory modules. The memory modules include an interface or communication logic to communicate between the configurable computing substrate and the memory module. In an implementation, the memory modules can include an on-die memory or other forms of non-configurable logic to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include an on-die memory and a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations.
-
Publication number: US20190188577A1
Publication date: 2019-06-20
Application number: US15849633
Filing date: 2017-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas Malaya, Nuwan Jayasena
Abstract: A system assigns experts of a mixture-of-experts artificial intelligence model to processing devices in an automated manner. The system includes an orchestrator component that maintains priority data that stores, for each of a set of experts, and for each of a set of execution parameters, ranking information that ranks different processing devices for the particular execution parameter. In one example, for the execution parameter of execution speed, and for a first expert, the priority data indicates that a central processing unit (“CPU”) executes the first expert faster than a graphics processing unit (“GPU”). In this example, for the execution parameter of power consumption, and for the first expert, the priority data indicates that a GPU uses less power than a CPU. The priority data stores such information for one or more processing devices, one or more experts, and one or more execution characteristics.
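One way to picture the priority data is as a nested mapping from expert to execution parameter to a ranked list of devices. The sketch below assumes that structure and simply picks the top-ranked device; `assign_experts`, the key names, and the entries for `expert_2` are hypothetical, while the `expert_1` rankings mirror the example in the abstract.

```python
def assign_experts(priority_data, parameter):
    """Assign each expert to its top-ranked device for one execution parameter."""
    assignment = {}
    for expert, rankings in priority_data.items():
        ranked_devices = rankings[parameter]  # best device first
        assignment[expert] = ranked_devices[0]
    return assignment


priority_data = {
    # From the abstract: the CPU runs the first expert faster, the GPU uses less power.
    "expert_1": {"speed": ["CPU", "GPU"], "power": ["GPU", "CPU"]},
    # Hypothetical second expert, added only to make the example non-trivial.
    "expert_2": {"speed": ["GPU", "CPU"], "power": ["GPU", "CPU"]},
}

by_speed = assign_experts(priority_data, "speed")
by_power = assign_experts(priority_data, "power")
```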
-
Publication number: US10198369B2
Publication date: 2019-02-05
Application number: US15469071
Filing date: 2017-03-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Yasuko Eckert, Reena Panda, Nuwan Jayasena
Abstract: A data processing system includes a memory that includes a first memory bank and a second memory bank. The data processing system also includes a conflict detector connected to the memory and adapted to receive memory access information. The conflict detector tracks memory access statistics of the first memory bank, and determines if the first memory bank contains frequent row conflicts. The conflict detector also remaps a frequent row conflict in the first memory bank to the second memory bank. An indirection table is connected to the conflict detector and adapted to receive a memory access request, and redirects an address into a dynamically selected physical memory address in response to a remapping of the frequent row conflict to the second memory bank.
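The interaction between the conflict detector and the indirection table can be sketched as below. The sketch assumes two banks (0 and 1), a fixed conflict threshold, and a pool of spare rows in bank 1 starting at `spare_base`; the class `ConflictRemapper` and all of these parameters are illustrative assumptions, not details from the patent.

```python
from collections import Counter


class ConflictRemapper:
    """Toy conflict detector plus indirection table for a two-bank memory."""

    def __init__(self, threshold, spare_base=1000):
        self.threshold = threshold
        self.open_row = {}          # bank -> row currently open in that bank
        self.conflicts = Counter()  # (bank, row) -> observed row conflicts
        self.indirection = {}       # requested (bank, row) -> remapped (bank, row)
        self._next_spare = spare_base  # next free row in bank 1 (assumption)

    def access(self, bank, row):
        """Return the physical (bank, row) that serves this request."""
        target = self.indirection.get((bank, row), (bank, row))
        tbank, trow = target
        opened = self.open_row.get(tbank)
        if opened is not None and opened != trow:
            # A different row is open in the target bank: a row conflict.
            self.conflicts[(bank, row)] += 1
            if tbank == 0 and self.conflicts[(bank, row)] >= self.threshold:
                # Frequent conflict: remap this row from bank 0 to bank 1.
                target = (1, self._next_spare)
                self._next_spare += 1
                self.indirection[(bank, row)] = target
                tbank, trow = target
        self.open_row[tbank] = trow
        return target


remapper = ConflictRemapper(threshold=2)
trace = [remapper.access(0, r) for r in (5, 7, 5, 7, 5, 7)]
```

In this trace, rows 5 and 7 of bank 0 keep evicting each other until row 7 reaches the threshold and moves to bank 1, after which both rows stay open in separate banks.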
-
Publication number: US20180276150A1
Publication date: 2018-09-27
Application number: US15469071
Filing date: 2017-03-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Yasuko Eckert, Reena Panda, Nuwan Jayasena
CPC classification number: G06F13/1642, G06F3/0619, G06F3/065, G06F3/0653, G06F3/0673, G06F13/1673, G06F13/4068, Y02D10/14, Y02D10/151
Abstract: A data processing system includes a memory that includes a first memory bank and a second memory bank. The data processing system also includes a conflict detector connected to the memory and adapted to receive memory access information. The conflict detector tracks memory access statistics of the first memory bank, and determines if the first memory bank contains frequent row conflicts. The conflict detector also remaps a frequent row conflict in the first memory bank to the second memory bank. An indirection table is connected to the conflict detector and adapted to receive a memory access request, and redirects an address into a dynamically selected physical memory address in response to a remapping of the frequent row conflict to the second memory bank.