Patent search: ap:("Advanced Micro Devices, Inc.") AND inv:"Johnathan Alsop"

11.

Invention application
APPROACH FOR ENFORCING ORDERING BETWEEN MEMORY-CENTRIC AND CORE-CENTRIC MEMORY OPERATIONS (In force)

Publication (announcement) No.: US20220317926A1

Publication (announcement) date: 2022-10-06

Application No.: US17219446

Application date: 2021-03-31

Applicant: Advanced Micro Devices, Inc.

Inventor: Shaizeen Aga, Nuwan Jayasena, Johnathan Alsop

IPC: G06F3/06

Abstract: Ordering between memory-centric memory operations, referred to hereinafter as “MC-Mem-Ops,” and core-centric memory operations, referred to hereinafter as “CC-Mem-Ops,” is enforced using inter-centric fences, referred to hereinafter as “IC-fences.” IC-fences are implemented by an ordering primitive or ordering instruction that causes a memory controller, a cache controller, etc., to enforce ordering of MC-Mem-Ops and CC-Mem-Ops throughout the memory pipeline and at the memory controller by not reordering MC-Mem-Ops (or sometimes CC-Mem-Ops) that arrive before the IC-fence to after the IC-fence. Processing of an IC-fence also causes the memory controller to issue an ordering acknowledgment to the thread that issued the IC-fence instruction. IC-fences are tracked at the core and designated as complete when the ordering acknowledgment is received. Embodiments include a completion level-specific cache flush operation which, when used with an IC-fence, provides proper ordering between cached CC-Mem-Ops and MC-Mem-Ops with reduced data transfer and completion times.
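The ordering rule in this abstract can be illustrated with a toy memory-controller queue. This is a minimal sketch, not the patented design: the class name, the "MC"/"CC" tags, and the scheduling policy (MC operations issued first within a fence-delimited window) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemoryController:
    """Hypothetical controller: reorders freely, but never across an IC-fence."""
    window: list = field(default_factory=list)  # ops received since the last fence
    issued: list = field(default_factory=list)  # final issue order
    acks: list = field(default_factory=list)    # ordering acks sent back to threads

    def receive(self, kind, name, thread=None):
        if kind == "FENCE":
            self._drain()             # everything before the fence issues first
            self.acks.append(thread)  # ordering acknowledgment to the issuing thread
        else:
            self.window.append((kind, name))

    def _drain(self):
        # Assumed policy: inside a window, schedule MC ops ahead of CC ops.
        # sorted() is stable, so arrival order is kept within each kind.
        self.issued += sorted(self.window, key=lambda op: op[0] != "MC")
        self.window.clear()

    def finish(self):
        self._drain()
        return [name for _, name in self.issued]


mc = ToyMemoryController()
mc.receive("CC", "store X")
mc.receive("MC", "pim_add Y")
mc.receive("FENCE", "ic_fence", thread=0)
mc.receive("MC", "pim_add X")
mc.receive("CC", "load Y")
order = mc.finish()
```

In this run the controller swaps the two operations that arrived before the fence, but "store X" still issues ahead of "pim_add X" because the fence separates them, and thread 0 receives one acknowledgment.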

12.

Invention application
SYSTEM AND METHOD FOR COALESCED MULTICAST DATA TRANSFERS OVER MEMORY INTERFACES (In force)

Publication (announcement) No.: US20220317876A1

Publication (announcement) date: 2022-10-06

Application No.: US17218700

Application date: 2021-03-31

Applicant: Advanced Micro Devices, Inc.

Inventor: Johnathan Alsop, Nuwan Jayasena, Shaizeen Aga, Andrew McCrabb

IPC: G06F3/06

Abstract: Methods and apparatuses to control digital data transfer via a memory channel between a memory module and a processor are disclosed. At least one of the memory module or the processor coalesces a plurality of short data words into multicast coalesced block data comprising a single data block for transfer via the memory channel. Each of the plurality of short data words pertains to one of at least two partitioned memory submodules in the memory module. The multicast coalesced block data is communicated over the memory channel.
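The coalescing step described above can be pictured as packing several short words, one per memory submodule, into a single block. The sketch below is only an illustration under assumed sizes (64-byte block, 8-byte words) and an assumed slot layout (slot i belongs to submodule i); none of these values come from the patent.

```python
BLOCK_BYTES = 64  # assumed channel transfer size
WORD_BYTES = 8    # assumed short-word size


def coalesce(words):
    """Pack {submodule_id: bytes} into one block; slot i carries submodule i's word."""
    block = bytearray(BLOCK_BYTES)
    for sub, word in words.items():
        if len(word) != WORD_BYTES:
            raise ValueError("each short word must be WORD_BYTES long")
        if not 0 <= sub < BLOCK_BYTES // WORD_BYTES:
            raise ValueError("submodule id out of range")
        block[sub * WORD_BYTES:(sub + 1) * WORD_BYTES] = word
    return bytes(block)


def scatter(block, submodules):
    """Receiver side: each listed submodule picks its own slot out of the block."""
    return {s: block[s * WORD_BYTES:(s + 1) * WORD_BYTES] for s in submodules}


words = {0: b"AAAAAAAA", 3: b"DDDDDDDD", 5: b"FFFFFFFF"}
block = coalesce(words)            # one transfer instead of three
recovered = scatter(block, words)  # iterating the dict yields the submodule ids
```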

13.

Invention grant
Data placement with packet metadata (In force)

Publication (announcement) No.: US12182428B2

Publication (announcement) date: 2024-12-31

Application No.: US17124872

Application date: 2020-12-17

Applicant: Advanced Micro Devices, Inc.

Inventor: Sergey Blagodurov, Johnathan Alsop, SeyedMohammad SeyedzadehDelcheh

IPC: G06F3/06, G06F12/02

Abstract: Systems, apparatuses, and methods for determining data placement based on packet metadata are disclosed. A system includes a traffic analyzer that determines data placement across connected devices based on observed values of the metadata fields in actively exchanged packets across a plurality of protocol types. In one implementation, the protocol that is supported by the system is the compute express link (CXL) protocol. The traffic analyzer performs various actions in response to events observed in a packet stream that match items from a pre-configured list. Data movement is handled underneath the software applications by changing the virtual-to-physical address translation once the data movement is completed. After the data movement is finished, threads will pull in the new host physical address into their translation lookaside buffers (TLBs) via a page table walker or via an address translation service (ATS) request.

14.

Invention grant
Dynamically coalescing atomic memory operations for memory-local computing (In force)

Publication (announcement) No.: US11726918B2

Publication (announcement) date: 2023-08-15

Application No.: US17361145

Application date: 2021-06-28

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Johnathan Alsop, Alexandru Dutu, Shaizeen Aga, Nuwan Jayasena

IPC: G06F12/0871, G06F12/02, G06F12/084, G06F12/0846

CPC classification number: G06F12/0871, G06F12/0238, G06F12/084, G06F12/0846

Abstract: Dynamically coalescing atomic memory operations for memory-local computing is disclosed. In an embodiment, it is determined whether a first atomic memory access and a second atomic memory access are candidates for coalescing. In response to a triggering event, the atomic memory accesses that are candidates for coalescing are coalesced in a cache prior to requesting memory-local processing by a memory-local compute unit. The atomic memory accesses may be coalesced in the same cache line or atomic memory accesses in different cache lines may be coalesced using a multicast memory-local processing command.
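As a rough illustration of same-cache-line coalescing, the sketch below buffers atomic adds and emits one command per cache line when a triggering event occurs. The 64-byte line size, the choice of an explicit flush as the trigger, and the restriction to additions (which can be merged because they commute and return no value here) are assumptions for the example, not details from the patent.

```python
from collections import defaultdict

LINE_BYTES = 64  # assumed cache-line size


class AtomicCoalescer:
    """Hypothetical cache-side buffer for atomic adds awaiting memory-local processing."""

    def __init__(self):
        # cache line index -> byte offset within the line -> summed operand
        self.pending = defaultdict(lambda: defaultdict(int))

    def atomic_add(self, addr, value):
        line, offset = divmod(addr, LINE_BYTES)
        # Adds to the same address merge; adds to the same line share one entry.
        self.pending[line][offset] += value

    def flush(self):
        """Triggering event (assumed: an explicit flush). Emits one command per line."""
        commands = [(line * LINE_BYTES, dict(ops))
                    for line, ops in sorted(self.pending.items())]
        self.pending.clear()
        return commands


c = AtomicCoalescer()
for addr, v in [(0, 1), (8, 2), (0, 4), (64, 7)]:
    c.atomic_add(addr, v)
commands = c.flush()
```

Four atomic accesses become two commands to the memory-local compute unit, one per touched cache line.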

15.

Invention grant
Dynamic multi-bank memory command coalescing (In force)

Publication (announcement) No.: US11681465B2

Publication (announcement) date: 2023-06-20

Application No.: US16900526

Application date: 2020-06-12

Applicant: Advanced Micro Devices, Inc.

Inventor: Johnathan Alsop, Shaizeen Dilawarhusen Aga

IPC: G06F3/06

CPC classification number: G06F3/0659, G06F3/0604, G06F3/0644, G06F3/0673

Abstract: Systems, apparatuses, and methods for dynamically coalescing multi-bank memory commands to improve command throughput are disclosed. A system includes a processor coupled to a memory via a memory controller. The memory also includes processing-in-memory (PIM) elements which are able to perform computations within the memory. The processor generates memory requests targeting the memory which are sent to the memory controller. The memory controller stores commands received from the processor in a queue, and the memory controller determines whether opportunities exist for coalescing multiple commands together into a single multi-bank command. After coalescing multiple commands into a single combined multi-bank command, the memory controller conveys, across the memory bus to multiple separate banks, the single multi-bank command and a multi-bank code specifying which banks are targeted. The memory banks process the command in parallel, and the PIM elements process the data next to each respective bank.
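The queue scan described above might look roughly like the following. It is a simplified sketch: the command tuple layout, the rule that commands are mergeable when they share operation, row, and column, and the bitmask used as the multi-bank code are all assumptions, and the sketch ignores ordering dependencies between unrelated commands.

```python
def coalesce_queue(queue):
    """Merge queued (op, bank, row, col) commands that differ only in bank.

    Returns (op, row, col, bank_mask) tuples, where bit b of bank_mask plays
    the role of the multi-bank code selecting bank b.
    """
    out = []  # each entry is a mutable [op, row, col, mask]
    for op, bank, row, col in queue:
        bit = 1 << bank
        for entry in out:
            # Merge only if the command matches and this bank is not yet targeted.
            if entry[:3] == [op, row, col] and not entry[3] & bit:
                entry[3] |= bit
                break
        else:
            out.append([op, row, col, bit])
    return [tuple(e) for e in out]


queue = [
    ("PIM_ADD", 0, 12, 3),
    ("PIM_ADD", 2, 12, 3),
    ("READ", 1, 7, 0),
    ("PIM_ADD", 5, 12, 3),
]
commands = coalesce_queue(queue)
```

Three PIM_ADD commands collapse into one command whose mask selects banks 0, 2, and 5, so the command bus carries two commands instead of four.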

16.

Invention grant
Limited propagation of unnecessary memory updates (In force)

Publication (announcement) No.: US11526449B2

Publication (announcement) date: 2022-12-13

Application No.: US17007133

Application date: 2020-08-31

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Johnathan Alsop, Pouya Fotouhi, Bradford Beckmann, Sergey Blagodurov

IPC: G06F12/08, G06F12/0891, G06F9/30, G06F12/0882, G06F12/0811

Abstract: A processing system limits the propagation of unnecessary memory updates by bypassing writing back dirty cache lines to other levels of a memory hierarchy in response to receiving an indication from software executing at a processor of the processing system that the value of the dirty cache line is dead (i.e., will not be read again or will not be read until after it has been overwritten). In response to receiving an indication from software that data is dead, a cache controller prevents propagation of the dead data to other levels of memory in response to eviction of the dead data or flushing of the cache at which the dead data is stored.
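The write-back bypass can be sketched with a toy cache in which software marks a line dead before it is evicted. The class, its method names, and the use of a dictionary as the next memory level are hypothetical; the sketch only shows the decision made at eviction time.

```python
class ToyCache:
    """Hypothetical write-back cache whose lines can be marked dead by software."""

    def __init__(self, backing):
        self.backing = backing  # next level of the memory hierarchy (a dict here)
        self.lines = {}         # addr -> [value, dirty, dead]
        self.writebacks = 0

    def write(self, addr, value):
        # A new write makes the line dirty and live again.
        self.lines[addr] = [value, True, False]

    def mark_dead(self, addr):
        """Software hint: this value will not be read before it is overwritten."""
        if addr in self.lines:
            self.lines[addr][2] = True

    def evict(self, addr):
        value, dirty, dead = self.lines.pop(addr)
        if dirty and not dead:  # dead dirty data is dropped, not propagated
            self.backing[addr] = value
            self.writebacks += 1


memory = {}
cache = ToyCache(memory)
cache.write(0x100, "scratch")
cache.write(0x140, "result")
cache.mark_dead(0x100)
cache.evict(0x100)
cache.evict(0x140)
```

Only the live line reaches the backing store; the dead scratch value is discarded without a write-back.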

17.

Invention grant
Detecting execution hazards in offloaded operations (In force)

Publication (announcement) No.: US11188406B1

Publication (announcement) date: 2021-11-30

Application No.: US17218506

Application date: 2021-03-31

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Johnathan Alsop, Shaizeen Aga

IPC: G06F11/07, G06F11/24, G06F11/14

Abstract: Detecting execution hazards in offloaded operations is disclosed. A second offload operation is compared to a first offload operation that precedes the second offload operation. It is determined whether the second offload operation creates an execution hazard on an offload target device based on the comparison of the second offload operation to the first offload operation. If the execution hazard is detected, an error handling operation may be performed. In some examples, the offload operations are processing-in-memory operations.

18.

Invention application
MEMORY REQUEST PRIORITY ASSIGNMENT TECHNIQUES FOR PARALLEL PROCESSORS (In force)

Publication (announcement) No.: US20210173796A1

Publication (announcement) date: 2021-06-10

Application No.: US16706421

Application date: 2019-12-06

Applicant: Advanced Micro Devices, Inc.

Inventor: Sooraj Puthoor, Kishore Punniyamurthy, Onur Kayiran, Xianwei Zhang, Yasuko Eckert, Johnathan Alsop, Bradford Michael Beckmann

IPC: G06F13/18, G06F13/16

Abstract: Systems, apparatuses, and methods for implementing memory request priority assignment techniques for parallel processors are disclosed. A system includes at least a parallel processor coupled to a memory subsystem, where the parallel processor includes at least a plurality of compute units for executing wavefronts in lock-step. The parallel processor assigns priorities to memory requests of wavefronts on a per-work-item basis by indexing into a first priority vector, with the index generated based on lane-specific information. If a given event is detected, a second priority vector is generated by applying a given priority promotion vector to the first priority vector. Then, for subsequent wavefronts, memory requests are assigned priorities by indexing into the second priority vector with lane-specific information. The use of priority vectors to assign priorities to memory requests helps to reduce the memory divergence problem experienced by different work-items of a wavefront.
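The priority-vector lookup and promotion can be sketched as below. The abstract does not say how the index is derived or how the promotion vector is applied, so both choices here are assumptions: the index is the lane id modulo the vector length, and promotion is an element-wise add that saturates at an assumed maximum priority.

```python
MAX_PRIORITY = 3  # assumed highest priority level


def assign_priorities(priority_vector, lane_ids):
    """Per-work-item priority: index the vector with lane-specific information.

    Assumption: the index is simply lane_id modulo the vector length.
    """
    return [priority_vector[lane % len(priority_vector)] for lane in lane_ids]


def promote(priority_vector, promotion_vector):
    """Build the second priority vector.

    Assumption: promotion is an element-wise add, saturating at MAX_PRIORITY.
    """
    return [min(p + d, MAX_PRIORITY)
            for p, d in zip(priority_vector, promotion_vector)]


first = [3, 2, 1, 0]
lanes = range(8)
before = assign_priorities(first, lanes)
second = promote(first, [0, 1, 1, 2])  # applied once the (unspecified) event is detected
after = assign_priorities(second, lanes)
```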

19.

Invention grant
Approach for performing efficient memory operations using near-memory compute elements (In force)

Publication (announcement) No.: US12235756B2

Publication (announcement) date: 2025-02-25

Application No.: US17557568

Application date: 2021-12-21

Applicant: Advanced Micro Devices, Inc.

Inventor: Shaizeen Aga, Johnathan Alsop, Nuwan Jayasena

IPC: G06F12/06

Abstract: Near-memory compute elements perform memory operations and temporarily store at least a portion of address information for the memory operations in local storage. A broadcast memory command is then issued to the near-memory compute elements that causes the near-memory compute elements to perform a subsequent memory operation using their respective address information stored in the local storage. This allows a single broadcast memory command to be used to perform memory operations across multiple memory elements, such as DRAM banks, using bank-specific address information. In one implementation, the approach is used to process workloads with irregular updates to memory while consuming less command bus bandwidth than conventional approaches. Implementations include using conditional flags to selectively designate address information in local storage that is to be processed with the broadcast memory command.
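A toy model of the broadcast step: each near-memory element remembers the address of its last operation, and one broadcast command then updates every flagged element at its own remembered address. The class, the read-then-add sequence, and the single boolean flag are assumptions chosen to keep the illustration small.

```python
class NearMemoryElement:
    """Hypothetical compute element beside one DRAM bank."""

    def __init__(self, size):
        self.bank = [0] * size
        self.saved_addr = None  # local storage for address information
        self.flag = False       # conditional flag: take part in the next broadcast?

    def read(self, addr):
        self.saved_addr = addr  # remember where this bank's operation landed
        self.flag = True
        return self.bank[addr]

    def on_broadcast_add(self, value):
        if self.flag:           # only flagged elements act on the broadcast
            self.bank[self.saved_addr] += value
            self.flag = False


elements = [NearMemoryElement(8) for _ in range(4)]
for bank_id, addr in [(0, 5), (1, 2), (3, 7)]:  # irregular, bank-specific addresses
    elements[bank_id].read(addr)
for e in elements:                              # one broadcast command reaches every bank
    e.on_broadcast_add(10)
```

Banks 0, 1, and 3 each update a different address in response to the same command, while bank 2, whose flag was never set, is left untouched.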

20.

Invention publication
COMMUNICATION REDUCTION TECHNIQUES FOR PARALLEL COMPUTING (Under examination, published)

Publication (announcement) No.: US20240119198A1

Publication (announcement) date: 2024-04-11

Application No.: US17958058

Application date: 2022-09-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Laurent S. White, Johnathan Alsop, Ganesh Dasika

IPC: G06F30/23, G06F30/27

CPC classification number: G06F30/23, G06F30/27, G06F2119/02

Abstract: A physical system is simulated using a model including a plurality of elements in a mesh or grid. The elements are divided into partitions processed by different processing units. For some time steps, state data is transmitted between partitions and used to calculate flux data for updating the state of edge elements of the partitions. Periodically, transmission of state data is suppressed, and flux data is obtained by linear interpolation based on past flux data. Alternatively, flux data is obtained by processing state variables of an edge element and past flux data using a machine learning model, such as a DNN. Whether to suppress transmission of state data may be determined based on one or both of (a) uncertainty in an output of the machine learning model (e.g., Bayesian neural network) and (b) complexity of the model of the physical system (e.g., spatial or temporal gradients).
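The suppression idea can be sketched by replaying a flux history and exchanging data only on some steps. This is a simplified illustration: it treats each exchange as directly yielding the flux value, uses a fixed exchange period, and estimates skipped steps by a straight line through the two most recently exchanged values (strictly an extrapolation). All of these are assumptions for the example.

```python
def extrapolate_flux(t_prev, f_prev, t_last, f_last, t_now):
    """Linear estimate of the flux at t_now from the two most recent exchanged values."""
    slope = (f_last - f_prev) / (t_last - t_prev)
    return f_last + slope * (t_now - t_last)


def run(true_flux, exchange_every):
    """Replay a flux history, exchanging data only every `exchange_every` steps.

    The first two steps are always exchanged so that two past values exist.
    """
    used, sent = [], 0
    history = []  # (step, flux) pairs obtained by a real exchange
    for step, flux in enumerate(true_flux):
        if step < 2 or step % exchange_every == 0:
            history.append((step, flux))
            used.append(flux)
            sent += 1
        else:
            (t0, f0), (t1, f1) = history[-2:]
            used.append(extrapolate_flux(t0, f0, t1, f1, step))
    return used, sent


true_flux = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]  # exactly linear, so estimates are exact
used, sent = run(true_flux, exchange_every=3)
```

With a perfectly linear flux the estimates match the true values while only half of the steps require communication; a real flux history would trade some accuracy for the saved transfers.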
