Patent search ap:("Advanced Micro Devices Page Inc.") AND inv:"Sooraj Puthoor"

11.

Granted invention patent
Dynamic wavefront creation for processing units using a hybrid compactor (Active)

Publication (announcement) number: US09898287B2

Publication (announcement) date: 2018-02-20

Application number: US14682971

Application date: 2015-04-09

Applicant: Advanced Micro Devices, Inc.

Inventor: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov

IPC: G06F9/38, G06F9/30, G06F9/46

CPC classification number: G06F9/30058, G06F9/3804, G06F9/3851, G06F9/3887, G06F9/46

Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.
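As a rough illustration of the repacking step this abstract describes (a sketch under stated assumptions, not the patented implementation), the snippet below regroups threads by control path once every wavefront has reached the compaction or reconvergence point. The names `repack_if_ready`, `arrived`, and `wavefront_size` are hypothetical, and each thread is modeled as a `(thread_id, path)` tuple.

```python
def repack_if_ready(wavefronts, arrived, wavefront_size):
    """Repack threads by control path once all wavefronts have arrived.

    wavefronts: list of wavefronts, each a list of (thread_id, path) tuples.
    arrived: one bool per wavefront, True if it reached the compaction or
             reconvergence point (hypothetical bookkeeping, not in the source).
    """
    if not all(arrived):
        return wavefronts  # some wavefront is still running; leave as-is

    # Group thread ids by the control path they follow.
    by_path = {}
    for wavefront in wavefronts:
        for thread_id, path in wavefront:
            by_path.setdefault(path, []).append(thread_id)

    # Build new wavefronts that each hold threads from a single path.
    repacked = []
    for path, thread_ids in by_path.items():
        for start in range(0, len(thread_ids), wavefront_size):
            chunk = thread_ids[start:start + wavefront_size]
            repacked.append([(tid, path) for tid in chunk])
    return repacked
```

With two four-thread wavefronts that each split evenly at a branch, the result is two wavefronts whose threads all follow one path.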

12.

Granted invention patent
Hardware accelerated dynamic work creation on a graphics processing unit (Active)

Publication (announcement) number: US12131186B2

Publication (announcement) date: 2024-10-29

Application number: US17993490

Application date: 2022-11-23

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Anthony Gutierrez, Sooraj Puthoor

IPC: G06F9/48, G06F9/38, G06F9/54

CPC classification number: G06F9/4881, G06F9/3877, G06F9/542, G06F9/545, G06F9/546

Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.
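The self-enqueue behavior of the future task in this abstract can be sketched in software as follows. This is only an illustrative model under assumptions: `CompletionObject`, `run_future_task`, `drain_continuations`, and the two queues are hypothetical names, and the coprocessor dispatch is reduced to plain function calls.

```python
from collections import deque


class CompletionObject:
    """Hypothetical stand-in for the completion object a future task monitors."""

    def __init__(self):
        self.done = False


def run_future_task(completion, associated_task, ready_queue, continuation_queue):
    """Enqueue the associated task if the completion object is set.

    Otherwise the future task self-enqueues a continuation of itself
    for subsequent execution.
    """
    if completion.done:
        ready_queue.append(associated_task)
    else:
        continuation_queue.append((completion, associated_task))


def drain_continuations(ready_queue, continuation_queue):
    """Run each queued continuation once; unfinished ones re-enqueue themselves."""
    for _ in range(len(continuation_queue)):
        completion, task = continuation_queue.popleft()
        run_future_task(completion, task, ready_queue, continuation_queue)
```

Calling `drain_continuations` repeatedly plays the role of the continuation queue being serviced; the associated task only reaches `ready_queue` after `completion.done` becomes true.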

13.

Published invention application
Scheduling Processing-in-Memory Transactions (Under examination, published)

Publication (announcement) number: US20240220160A1

Publication (announcement) date: 2024-07-04

Application number: US18148000

Application date: 2022-12-29

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexandru Dutu, Sooraj Puthoor

IPC: G06F3/06

CPC classification number: G06F3/0659, G06F3/0607, G06F3/0656, G06F3/0658, G06F3/0679

Abstract: Scheduling processing-in-memory transactions is described. In accordance with the described techniques, a memory controller receives a transaction header from a host, where the transaction header describes a number of operations to be executed by a processing-in-memory component as part of performing the transaction. The memory controller adds the transaction header to a buffer and sends either an acknowledgement message or a negative acknowledgement message to the host, based on a current load of the processing-in-memory component. The acknowledgement message causes the host to send operations of the transaction for execution by the processing-in-memory component and the negative acknowledgement message causes the host to refrain from sending the operations of the transaction for execution by the processing-in-memory component.
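The header-then-ACK/NACK handshake in this abstract can be modeled with a small sketch. It rests on assumptions the abstract does not state: load is measured as a count of admitted operations against a fixed `capacity`, and the class and field names (`TransactionHeader`, `MemoryController`, `receive_header`, `complete`) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TransactionHeader:
    txn_id: int
    num_ops: int  # number of operations the PIM component would execute


@dataclass
class MemoryController:
    capacity: int                 # assumed limit on admitted PIM operations
    load: int = 0                 # operations currently admitted
    buffer: deque = field(default_factory=deque)

    def receive_header(self, header: TransactionHeader) -> str:
        """Buffer the header, then answer ACK or NACK based on current load."""
        self.buffer.append(header)
        if self.load + header.num_ops <= self.capacity:
            self.load += header.num_ops
            return "ACK"   # host may now send the transaction's operations
        return "NACK"      # host refrains from sending the operations

    def complete(self, header: TransactionHeader) -> None:
        """Retire an admitted transaction and free its share of the load."""
        self.buffer.remove(header)
        self.load -= header.num_ops
```

How a NACKed transaction is retried is not covered by the abstract, so the sketch leaves the header in the buffer and does nothing further with it.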

14.

Granted invention patent
Partition and isolation of a processing-in-memory (PIM) device (Active)

Publication (announcement) number: US11934827B2

Publication (announcement) date: 2024-03-19

Application number: US17556291

Application date: 2021-12-20

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Sooraj Puthoor, Muhammad Amber Hassaan, Ashwin Aji, Michael L. Chu, Nuwan Jayasena

IPC: G06F9/30, G06F7/575, G06F9/38

CPC classification number: G06F9/3004, G06F7/575, G06F9/3001, G06F9/3856

Abstract: An apparatus that manages multi-process execution in a processing-in-memory (“PIM”) device includes a gatekeeper configured to: receive an identification of one or more registered PIM processes; receive, from a process, a memory request that includes a PIM command; if the requesting process is a registered PIM process and another registered PIM process is active on the PIM device, perform a context switch of PIM state between the registered PIM processes; and issue the PIM command of the requesting process to the PIM device.
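The gatekeeper logic in this abstract might be sketched as below. This is a simplified software model, not the claimed hardware: the `Gatekeeper` class, its fields, and the dictionary used as "PIM state" are hypothetical, and rejecting unregistered processes is an assumption, since the abstract does not say how they are handled.

```python
class Gatekeeper:
    def __init__(self, registered_pids):
        self.registered = set(registered_pids)  # registered PIM processes
        self.active = None        # registered process whose state is on the device
        self.device_state = {}    # stand-in for PIM device state
        self.saved_state = {}     # pid -> saved PIM state
        self.issued = []          # log of (pid, command) issued to the device
        self.context_switches = 0

    def handle(self, pid, pim_command):
        if pid not in self.registered:
            # Assumption: the abstract does not cover this case; the sketch rejects.
            raise PermissionError(f"process {pid} is not a registered PIM process")
        if self.active is not None and self.active != pid:
            # Another registered process is active: swap PIM state.
            self.saved_state[self.active] = self.device_state
            self.device_state = self.saved_state.pop(pid, {})
            self.context_switches += 1
        self.active = pid
        # Model command execution as a per-process operation counter.
        self.device_state["ops"] = self.device_state.get("ops", 0) + 1
        self.issued.append((pid, pim_command))
```

Each process sees only its own `ops` counter, which is the isolation property the context switch is meant to provide in this toy model.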

15.

Granted invention patent
Implementing heterogeneous wavefronts on a graphics processing unit (GPU) (Active)

Publication (announcement) number: US11875425B2

Publication (announcement) date: 2024-01-16

Application number: US17134904

Application date: 2020-12-28

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Sooraj Puthoor, Bradford Beckmann, Nuwan Jayasena, Anthony Gutierrez

IPC: G06F9/38, G06T1/20, G06F9/30, G06F9/54

CPC classification number: G06T1/20, G06F9/30036, G06F9/3836, G06F9/3877, G06F9/3887, G06F9/545, G06T2210/52

Abstract: Implementing heterogeneous wavefronts on a graphics processing unit (GPU) is disclosed. A scheduler assigns heterogeneous wavefronts for execution on a compute unit of a processing device. The heterogeneous wavefronts include different types of wavefronts such as vector compute wavefronts and service-level wavefronts that vary in resource requirements and instruction sets. As one example, heterogeneous wavefronts may include scalar wavefronts and vector compute wavefronts that execute on scalar units and vector units, respectively. Distinct sets of instructions are executed for the heterogeneous wavefronts on the compute unit. Heterogeneous wavefronts are processed in the same pipeline of the processing device.

16.

Granted invention patent
Scoped persistence barriers for non-volatile memories (Active)

Publication (announcement) number: US11573724B2

Publication (announcement) date: 2023-02-07

Application number: US16432391

Application date: 2019-06-05

Applicant: Advanced Micro Devices, Inc.

Inventor: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor

IPC: G06F3/06, G06F12/02

Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.

17.

Invention application
SYSTEM PERFORMANCE MANAGEMENT USING PRIORITIZED COMPUTE UNITS (Active)

Publication (announcement) number: US20220114097A1

Publication (announcement) date: 2022-04-14

Application number: US17556348

Application date: 2021-12-20

Applicant: Advanced Micro Devices, Inc.

Inventor: Zhe Wang, Sooraj Puthoor, Bradford M. Beckmann

IPC: G06F12/084

Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.

18.

Invention application
CONTINUATION ANALYSIS TASKS FOR GPU TASK SCHEDULING (Under examination, published)

Publication (announcement) number: US20200379802A1

Publication (announcement) date: 2020-12-03

Application number: US16846654

Application date: 2020-04-13

Applicant: Advanced Micro Devices, Inc.

Inventor: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor

IPC: G06F9/48, G06F9/38, G06F9/50, G06F9/52

Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.
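The flow in this abstract (a task completes, enqueues a continuation packet, and an analysis phase then decides which dependent tasks to enqueue) can be approximated in software. The sketch below is an assumption-laden simplification: it uses a single queue and a single agent where the abstract allows several, and `run_with_continuations`, `tasks`, and `deps` are hypothetical names.

```python
from collections import deque


def run_with_continuations(tasks, deps):
    """tasks: name -> callable; deps: name -> set of prerequisite task names."""
    remaining = {name: set(prereqs) for name, prereqs in deps.items()}
    dependents = {}
    for name, prereqs in deps.items():
        for prereq in prereqs:
            dependents.setdefault(prereq, []).append(name)

    # Tasks with no prerequisites are ready immediately.
    queue = deque(("task", name) for name in tasks if not remaining.get(name))
    order = []
    while queue:
        kind, name = queue.popleft()
        if kind == "task":
            tasks[name]()
            order.append(name)
            # On completion, the task enqueues its continuation packet.
            queue.append(("continuation", name))
        else:
            # Analysis phase: find dependents that have just become ready.
            for child in dependents.get(name, []):
                remaining[child].discard(name)
                if not remaining[child]:
                    queue.append(("task", child))
    return order
```

For a graph where `c` depends on `a` and `b`, and `d` depends on `c`, the continuation for `b` is the one that finds `c` ready, and `d` follows once `c` completes.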

19.

Granted invention patent
Scoped persistence barriers for non-volatile memories (Active)

Publication (announcement) number: US10324650B2

Publication (announcement) date: 2019-06-18

Application number: US15274777

Application date: 2016-09-23

Applicant: Advanced Micro Devices, Inc.

Inventor: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor

IPC: G06F3/06, G06F12/02

Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.

20.

Invention application
CONTINUATION ANALYSIS TASKS FOR GPU TASK SCHEDULING (Under examination, published)

Publication (announcement) number: US20180349145A1

Publication (announcement) date: 2018-12-06

Application number: US15607991

Application date: 2017-05-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor

IPC: G06F9/38, G06F9/54

CPC classification number: G06F9/505, G06F9/5066, G06F2209/509

Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.
