Patent search ap:("ADVANCED MICRO DEVICES INC.") AND inv:"Sooraj Puthoor"

11.

Granted invention patent
Dynamic wavefront creation for processing units using a hybrid compactor (status: active)

Publication number: US09898287B2

Publication date: 2018-02-20

Application number: US14682971

Application date: 2015-04-09

Applicant: Advanced Micro Devices, Inc.

Inventor: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov

IPC: G06F9/38, G06F9/30, G06F9/46

CPC classification number: G06F9/30058, G06F9/3804, G06F9/3851, G06F9/3887, G06F9/46

Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.
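
The repacking step described in this abstract can be illustrated with a small Python sketch. It assumes every wavefront has already reached the compaction point, and it models a thread as a `(thread_id, path)` tuple. The names `repack_wavefronts` and `WAVEFRONT_SIZE` are hypothetical and do not come from the patent; a hardware compactor would operate on lane masks and register state rather than Python lists.

```python
from collections import defaultdict

WAVEFRONT_SIZE = 4  # assumed width, chosen small for illustration only


def repack_wavefronts(wavefronts, size=WAVEFRONT_SIZE):
    """Regroup threads so each new wavefront holds threads on one control path.

    `wavefronts` is a list of wavefronts; each wavefront is a list of
    (thread_id, path) tuples, where `path` labels the control path the
    thread follows (for example "taken" or "not_taken").
    """
    # Collect thread ids per control path, preserving arrival order.
    by_path = defaultdict(list)
    for wavefront in wavefronts:
        for thread_id, path in wavefront:
            by_path[path].append(thread_id)

    # Cut each per-path list into wavefront-sized chunks.
    repacked = []
    for path, thread_ids in by_path.items():
        for start in range(0, len(thread_ids), size):
            repacked.append((path, thread_ids[start:start + size]))
    return repacked


if __name__ == "__main__":
    divergent = [
        [(0, "taken"), (1, "not_taken"), (2, "taken"), (3, "not_taken")],
        [(4, "taken"), (5, "taken"), (6, "not_taken"), (7, "taken")],
    ]
    for path, threads in repack_wavefronts(divergent):
        print(path, threads)
```

In this toy input, two divergent wavefronts become three wavefronts that each follow a single control path.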

12.

Granted invention patent
Allocation of resources when processing at memory level through memory request scheduling (status: active)

Publication number: US12204774B2

Publication date: 2025-01-21

Application number: US17986623

Application date: 2022-11-14

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Alexandru Dutu, Nuwan Jayasena, Yasuko Eckert, Niti Madan, Sooraj Puthoor

IPC: G06F3/06

Abstract: An apparatus includes a memory controller that includes logic to receive a first memory request having a first request type and a second memory request having a second request type. The apparatus also includes a scheduling unit that includes logic to schedule an order of the first and second memory requests for execution based upon a first parameter value and a second parameter value. The first parameter value corresponds to a utility and energy cost for the first memory request and the second parameter value corresponds to a utility and energy cost for the second memory request.
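
One way to picture the scheduling idea in this abstract is the Python sketch below. The abstract does not say how utility and energy cost are combined into a parameter value, so the ratio used here is purely an assumption, as are the names `MemoryRequest`, `parameter_value`, and `schedule` and the request-type labels.

```python
from dataclasses import dataclass


@dataclass
class MemoryRequest:
    name: str
    request_type: str   # hypothetical labels such as "host" or "pim"
    utility: float      # assumed benefit of servicing the request
    energy_cost: float  # assumed energy needed; must be positive


def parameter_value(request):
    # Assumed metric: utility gained per unit of energy spent.
    return request.utility / request.energy_cost


def schedule(requests):
    """Order requests so the highest parameter value executes first."""
    return sorted(requests, key=parameter_value, reverse=True)


if __name__ == "__main__":
    pending = [
        MemoryRequest("load", "host", utility=4.0, energy_cost=2.0),
        MemoryRequest("pim_op", "pim", utility=9.0, energy_cost=3.0),
    ]
    for request in schedule(pending):
        print(request.name, parameter_value(request))
```

A real scheduler would also have to respect ordering and timing constraints that this sketch ignores.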

13.

Granted invention patent
Hardware assisted fine-grained data movement (status: active)

Publication number: US11868809B2

Publication date: 2024-01-09

Application number: US18095704

Application date: 2023-01-11

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Muhammad Amber Hassaan, Anirudh Mohan Kaushik, Sooraj Puthoor, Gokul Subramanian Ravi, Bradford Beckmann, Ashwin Aji

IPC: G06F9/46, G06F9/48, G06F9/52, G06F16/901

CPC classification number: G06F9/4881, G06F9/52, G06F16/9024, G06F2209/486

Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.

14.

Granted invention patent
Dynamic kernel memory space allocation (status: active)

Publication number: US11720993B2

Publication date: 2023-08-08

Application number: US16138708

Application date: 2018-09-21

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Anthony Gutierrez, Muhammad Amber Hassaan, Sooraj Puthoor

IPC: G06F12/02, G06T1/60, G06F9/30, G06T1/20

CPC classification number: G06T1/60, G06F9/30098, G06F12/02, G06F12/023, G06T1/20

Abstract: A processing unit includes one or more processor cores and a set of registers to store configuration information for the processing unit. The processing unit also includes a coprocessor configured to receive a request to modify a memory allocation for a kernel concurrently with the kernel executing on the at least one processor core. The coprocessor is configured to modify the memory allocation by modifying the configuration information stored in the set of registers. In some cases, initial configuration information is provided to the set of registers by a different processing unit. The initial configuration information is stored in the set of registers prior to the coprocessor modifying the configuration information.

15.

Published invention application
APPROACH FOR PROVIDING INDIRECT ADDRESSING IN MEMORY MODULES (status: under examination, published)

Publication number: US20230205705A1

Publication date: 2023-06-29

Application number: US17561406

Application date: 2021-12-23

Applicant: Advanced Micro Devices, Inc.

Inventor: Matthew R. Poremba, Alexandru Dutu, Sooraj Puthoor

IPC: G06F12/10

CPC classification number: G06F12/10

Abstract: An approach provides indirect addressing support for PIM. Indirect PIM commands include address translation information that allows memory modules to perform indirect addressing. Processing logic in a memory module processes an indirect PIM command and retrieves, from a first memory location, a virtual address of a second memory location. The processing logic calculates a corresponding physical address for the virtual address using the address translation information included with the indirect PIM command and retrieves, from the second memory location, a virtual address of a third memory location. This process is repeated any number of times until one or more indirection stop criteria are satisfied. The indirection stop criteria stop the process when work has been completed normally or to prevent errors. Implementations include the processing logic in the memory module working in cooperation with a memory controller to perform indirect addressing.
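
The repeated retrieve-and-translate loop in this abstract resembles pointer chasing. The Python sketch below models it under several assumptions: memory is a dictionary from physical address to stored value, the address translation information is a small page table, a stored value of 0 marks normal completion, and a hop limit or an unmapped page stops the walk to prevent errors. None of these names or stop criteria are taken from the patent.

```python
PAGE_SIZE = 16  # assumed tiny page size for illustration


def translate(virtual_addr, page_table):
    """Map a virtual address to a physical one, or None if unmapped."""
    page, offset = divmod(virtual_addr, PAGE_SIZE)
    if page not in page_table:
        return None
    return page_table[page] * PAGE_SIZE + offset


def follow_indirection(memory, page_table, first_phys, max_hops=8, null=0):
    """Return the physical addresses visited while following the chain."""
    visited = [first_phys]
    current = first_phys
    for _ in range(max_hops):                 # hop limit: assumed stop criterion
        next_virtual = memory.get(current, null)
        if next_virtual == null:              # chain ended normally
            break
        next_phys = translate(next_virtual, page_table)
        if next_phys is None:                 # unmapped: stop to prevent an error
            break
        visited.append(next_phys)
        current = next_phys
    return visited


if __name__ == "__main__":
    page_table = {1: 3, 2: 5}                 # virtual page -> physical page
    memory = {0: 18, 50: 35, 83: 0}           # physical address -> stored virtual address
    print(follow_indirection(memory, page_table, first_phys=0))
```

Here virtual address 18 lies in virtual page 1 at offset 2, so it maps to physical address 3 * 16 + 2 = 50, and the walk continues from there.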

16.

Granted invention patent
Continuation analysis tasks for GPU task scheduling (status: active)

Publication number: US11544106B2

Publication date: 2023-01-03

Application number: US16846654

Application date: 2020-04-13

Applicant: Advanced Micro Devices, Inc.

Inventor: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor

IPC: G06F9/48, G06F9/38, G06F9/50, G06F9/52

Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.
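
The task-then-continuation flow in this abstract can be sketched in software with a single queue. This is only a loose model: the abstract describes hardware-accelerated agents and multiple queues, whereas the Python below uses one `deque` and one loop, and the names `run_with_continuations`, `tasks`, and `deps` are hypothetical.

```python
from collections import deque


def run_with_continuations(tasks, deps):
    """Run tasks in dependency order using continuation packets.

    tasks: dict mapping task name -> zero-argument callable.
    deps:  dict mapping task name -> set of prerequisite task names.
    Returns the order in which tasks executed.
    """
    done = set()
    order = []
    # Tasks without prerequisites are ready immediately.
    queue = deque(("task", name) for name in tasks if not deps.get(name))

    while queue:
        kind, name = queue.popleft()
        if kind == "task":
            tasks[name]()
            done.add(name)
            order.append(name)
            # On completion, the task enqueues its continuation packet.
            queue.append(("continuation", name))
        else:
            # Analysis phase: find dependents of `name` that are now ready.
            for candidate, prereqs in deps.items():
                ready = name in prereqs and prereqs <= done
                pending = ("task", candidate) in queue
                if ready and candidate not in done and not pending:
                    queue.append(("task", candidate))
    return order


if __name__ == "__main__":
    log = []
    tasks = {n: (lambda n=n: log.append(n)) for n in ("a", "b", "c", "d")}
    deps = {"c": {"a", "b"}, "d": {"c"}}
    print(run_with_continuations(tasks, deps))
```

The continuation packet keeps dependency analysis out of the tasks themselves: a task only announces that it finished, and the analysis phase decides what to launch next.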

17.

Invention application
FPGA-BASED PROGRAMMABLE DATA ANALYSIS AND COMPRESSION FRONT END FOR GPU (status: active)

Publication number: US20220188493A1

Publication date: 2022-06-16

Application number: US17118442

Application date: 2020-12-10

Applicant: Advanced Micro Devices, Inc.

Inventor: Kevin Y. Cheng, Sooraj Puthoor, Onur Kayiran

IPC: G06F30/331, G06F30/34, G06F9/38

Abstract: Methods, devices, and systems for information communication. Information transmitted from a host to a graphics processing unit (GPU) is received by information analysis circuitry of a field-programmable gate array (FPGA). A pattern in the information is determined by the information analysis circuitry. A predicted information pattern is determined, by the information analysis circuitry, based on the information. An indication of the predicted information pattern is transmitted to the host. Responsive to a signal from the host based on the predicted information pattern, the FPGA is reprogrammed to implement decompression circuitry based on the predicted information pattern. In some implementations, the information includes a plurality of packets. In some implementations, the predicted information pattern includes a pattern in a plurality of packets. In some implementations, the predicted information pattern includes a zero data pattern.

18.

Granted invention patent
Mechanism for mitigating information leak via cache side channels during speculative execution (status: active)

Publication number: US11231931B1

Publication date: 2022-01-25

Application number: US16228187

Application date: 2018-12-20

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Sooraj Puthoor

IPC: G06F9/312, G06F12/0815, G06F12/0897, G06F9/38, G06F12/0875, G06F9/30, G06F12/0817, G06F12/0811

Abstract: A processor includes a first core and a second core to execute computer instructions. Each of the cores includes its own private memory cache and speculative load queue. The speculative load queue stores cachelines for the computer instructions and data when the core is operating in a speculative state with respect to a process or thread. The processor includes a state tracking buffer having a state field to store a speculative exclusive ownership state for each cacheline in the speculative load queue when present therein.

19.

Invention application
MEMORY REQUEST PRIORITY ASSIGNMENT TECHNIQUES FOR PARALLEL PROCESSORS (status: active)

Publication number: US20210173796A1

Publication date: 2021-06-10

Application number: US16706421

Application date: 2019-12-06

Applicant: Advanced Micro Devices, Inc.

Inventor: Sooraj Puthoor, Kishore Punniyamurthy, Onur Kayiran, Xianwei Zhang, Yasuko Eckert, Johnathan Alsop, Bradford Michael Beckmann

IPC: G06F13/18, G06F13/16

Abstract: Systems, apparatuses, and methods for implementing memory request priority assignment techniques for parallel processors are disclosed. A system includes at least a parallel processor coupled to a memory subsystem, where the parallel processor includes at least a plurality of compute units for executing wavefronts in lock-step. The parallel processor assigns priorities to memory requests of wavefronts on a per-work-item basis by indexing into a first priority vector, with the index generated based on lane-specific information. If a given event is detected, a second priority vector is generated by applying a given priority promotion vector to the first priority vector. Then, for subsequent wavefronts, memory requests are assigned priorities by indexing into the second priority vector with lane-specific information. The use of priority vectors to assign priorities to memory requests helps to reduce the memory divergence problem experienced by different work-items of a wavefront.
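
The priority-vector lookup in this abstract can be sketched in a few lines of Python. The abstract does not specify how the index is derived from lane-specific information or how the promotion vector is applied, so this sketch assumes the index is the lane id modulo the vector length and that promotion is an element-wise addition clamped to a maximum priority. The function names and `MAX_PRIORITY` are hypothetical.

```python
MAX_PRIORITY = 3  # assumed highest priority level


def assign_priorities(priority_vector, lane_ids):
    """Give each work-item's memory request a priority via a vector lookup."""
    # Assumed index function: lane id modulo the vector length.
    return [priority_vector[lane % len(priority_vector)] for lane in lane_ids]


def promote(priority_vector, promotion_vector, max_priority=MAX_PRIORITY):
    """Build the second priority vector from the first (assumed: add and clamp)."""
    return [
        min(priority + delta, max_priority)
        for priority, delta in zip(priority_vector, promotion_vector)
    ]


if __name__ == "__main__":
    first = [3, 2, 1, 0]
    lanes = range(8)
    print(assign_priorities(first, lanes))

    # After the triggering event, later wavefronts use the promoted vector.
    second = promote(first, [0, 1, 1, 1])
    print(assign_priorities(second, lanes))
```

Giving lanes of the same wavefront different priorities, and shifting those priorities after an event, is how the abstract proposes to reduce memory divergence among work-items.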

20.

Invention application
SCOPED PERSISTENCE BARRIERS FOR NON-VOLATILE MEMORIES (status: under examination, published)

Publication number: US20180088858A1

Publication date: 2018-03-29

Application number: US15274777

Application date: 2016-09-23

Applicant: Advanced Micro Devices, Inc.

Inventor: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor

IPC: G06F3/06, G06F12/0804

CPC classification number: G06F3/0647, G06F3/0619, G06F3/0659, G06F3/0685, G06F12/0246

Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.
