Patent search ap:("Advanced Micro Devices Page Inc." OR "ATI Technologies ULC") AND inv:"Michael Mantor"

61.

Invention publication
HIERARCHICAL WORK SCHEDULING (Under examination, published)

Publication (announcement) number: US20240111578A1

Publication (announcement) date: 2024-04-04

Application number: US17957714

Application date: 2022-09-30

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Matthaeus G. Chajdas, Christopher J. Brennan, Michael Mantor, Robert W. Martin, Nicolai Haehnle

IPC: G06F9/48

CPC classification number: G06F9/4881

Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.
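
The overflow rule in this abstract can be illustrated with a minimal two-level sketch. It is not the patented circuit: the class names `LocalDomain` and `ParentScheduler`, the `capacity` threshold used as the overload test, and the `fan_out` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParentScheduler:
    """Hypothetical next-higher-level scheduler that rebalances overflow."""
    domains: List["LocalDomain"] = field(default_factory=list)

    def redistribute(self, items, source):
        """Hand escalated items to other domains that still have room."""
        remaining = list(items)
        for domain in self.domains:
            if domain is source or not remaining:
                continue
            room = domain.free_slots()
            domain.queue.extend(remaining[:room])
            remaining = remaining[room:]
        return remaining  # items that no other domain could absorb


@dataclass
class LocalDomain:
    """Hypothetical scheduling domain with a local scheduler and a queue."""
    name: str
    capacity: int  # assumed overload threshold, in work items
    parent: Optional[ParentScheduler] = field(default=None, repr=False, compare=False)
    queue: List[str] = field(default_factory=list)

    def free_slots(self):
        return max(self.capacity - len(self.queue), 0)

    def consume(self, work_item, fan_out):
        """Consume one item, producing fan_out new items (assumed count)."""
        new_items = [f"{work_item}.{i}" for i in range(fan_out)]
        room = self.free_slots()
        self.queue.extend(new_items[:room])   # scheduled locally
        overflow = new_items[room:]           # would overload this domain
        if overflow and self.parent is not None:
            return self.parent.redistribute(overflow, source=self)
        return overflow


root = ParentScheduler()
a = LocalDomain("A", capacity=2, parent=root)
b = LocalDomain("B", capacity=4, parent=root)
root.domains += [a, b]
left_over = a.consume("w0", fan_out=5)
```

Here domain A keeps two of the five new items and escalates the other three, which the parent places on domain B.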

62.

Granted invention patent
Precise suspend and resume of workloads in a processing unit (In force)

Publication (announcement) number: US11609791B2

Publication (announcement) date: 2023-03-21

Application number: US15828059

Application date: 2017-11-30

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Anirudh R. Acharya, Michael Mantor

IPC: G06F9/50, G06T1/20, G06F9/52, G06F9/48

Abstract: A first workload is executed in a first subset of pipelines of a processing unit. A second workload is executed in a second subset of the pipelines of the processing unit. The second workload is dependent upon the first workload. The first and second workloads are suspended and state information for the first and second workloads is stored in a first memory in response to suspending the first and second workloads. In some cases, a third workload executes in a third subset of the pipelines of the processing unit concurrently with executing the first and second workloads. In some cases, a fourth workload is executed in the first and second pipelines after suspending the first and second workloads. The first and second pipelines are resumed on the basis of the stored state information in response to completion or suspension of the fourth workload.

63.

Invention application
PREFETCH KERNELS ON DATA-PARALLEL PROCESSORS (In force)

Publication (announcement) number: US20230076872A1

Publication (announcement) date: 2023-03-09

Application number: US17985674

Application date: 2022-11-11

Applicant: Advanced Micro Devices, Inc.

Inventor: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor

IPC: G06F12/0862, G06F9/52, G06F8/41

Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel that includes memory accesses for prefetching data for a processing kernel into a memory, and, subsequent to executing at least a portion of the prefetch kernel, executing the processing kernel where the processing kernel includes accesses to data that is stored into the memory resulting from execution of the prefetch kernel.
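
The prefetch-then-process ordering described in this abstract can be sketched as a toy simulation. A Python dict stands in for the cache, and the function names `prefetch_kernel` and `processing_kernel` and the squaring workload are assumptions for illustration; real GPU kernels and caches behave very differently.

```python
def prefetch_kernel(indices, memory, cache):
    """Perform only the memory accesses of the processing kernel."""
    for i in indices:
        cache[i] = memory[i]  # load into the cache, no computation


def processing_kernel(indices, memory, cache):
    """Sum of squares; counts how many accesses hit the cache."""
    total, hits = 0, 0
    for i in indices:
        if i in cache:
            hits += 1
            value = cache[i]
        else:
            value = memory[i]  # miss: fetch from memory and fill the cache
            cache[i] = value
        total += value * value
    return total, hits


memory = list(range(8))
cache = {}
prefetch_kernel(range(8), memory, cache)            # runs first
total, hits = processing_kernel(range(8), memory, cache)
```

Because the prefetch kernel touched every index first, all eight accesses in the processing kernel hit the cache.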

64.

Invention application
DUAL VECTOR ARITHMETIC LOGIC UNIT (In force)

Publication (announcement) number: US20220188076A1

Publication (announcement) date: 2022-06-16

Application number: US17121354

Application date: 2020-12-14

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Bin He, Brian Emberling, Mark Leather, Michael Mantor

IPC: G06F7/57, G06F17/16, G06F9/38, G06T1/20

Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general purpose register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.

65.

Granted invention patent
Selective prefetching in multithreaded processing units (In force)

Publication (announcement) number: US11226819B2

Publication (announcement) date: 2022-01-18

Application number: US15818304

Application date: 2017-11-20

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Brian Emberling, Michael Mantor

IPC: G06F9/30, G06F12/0862, G06F12/0811

Abstract: A processing unit includes a plurality of processing elements and one or more caches. A first thread executes a program that includes one or more prefetch instructions to prefetch information into a first cache. Prefetching is selectively enabled when executing the first thread on a first processing element dependent upon whether one or more second threads previously executed the program on the first processing element. The first thread is then dispatched to execute the program on the first processing element. In some cases, a dispatcher receives the first thread for dispatching to the first processing element. The dispatcher modifies the prefetch instruction to disable prefetching into the first cache in response to the one or more second threads having previously executed the program on the first processing element.
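
The dispatcher decision in this abstract can be sketched as follows. The `Dispatcher` class, its per-element history table, and the string program identifiers are assumptions for illustration; the abstract does not say how the hardware tracks prior executions, and this sketch ignores cache eviction.

```python
from collections import defaultdict


class Dispatcher:
    """Hypothetical dispatcher that disables redundant prefetches."""

    def __init__(self):
        # processing element id -> set of program ids already executed there
        self._history = defaultdict(set)

    def dispatch(self, program_id, element_id):
        """Return True if prefetching stays enabled for this dispatch."""
        already_ran = program_id in self._history[element_id]
        self._history[element_id].add(program_id)
        # If an earlier thread ran the same program on this element, its
        # prefetches are assumed to have warmed the cache, so skip them.
        return not already_ran


d = Dispatcher()
first = d.dispatch("shader_a", element_id=0)    # first thread on element 0
second = d.dispatch("shader_a", element_id=0)   # same program, same element
other = d.dispatch("shader_a", element_id=1)    # same program, new element
```

Only the repeat dispatch on element 0 has prefetching disabled; the first dispatch on each element keeps it enabled.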

66.

Granted invention patent
Broadcast synchronization for dynamically adaptable arrays (In force)

Publication (announcement) number: US11200060B1

Publication (announcement) date: 2021-12-14

Application number: US17132002

Application date: 2020-12-23

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Sateesh Lagudu, Arun Vaidyanathan Ananthanarayan, Michael Mantor, Allen H. Rush

IPC: G06F9/30, G06F9/32, G06F15/80

Abstract: An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer receives a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.

67.

Invention application
SYSTEM AND METHOD FOR PROTECTING GPU MEMORY INSTRUCTIONS AGAINST FAULTS (In force)

Publication (announcement) number: US20210117269A1

Publication (announcement) date: 2021-04-22

Application number: US17113815

Application date: 2020-12-07

Applicant: Advanced Micro Devices, Inc.

Inventor: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi

IPC: G06F11/10, G06F12/0866, G06F11/16

Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying the memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to the memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
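
The final comparison step named in this abstract can be sketched in a few lines. The `MemRequest` fields, the `issue_with_check` helper, and the use of a `deque` as the FIFO are assumptions for illustration; the abstract gives no detail on request formats or on what happens after a mismatch.

```python
from collections import deque
from typing import NamedTuple


class MemRequest(NamedTuple):
    """Hypothetical memory request; field names are illustrative only."""
    op: str
    address: int


def issue_with_check(master, slave, fifo):
    """Enter the master request into the FIFO and check it against the slave."""
    fifo.append(master)         # only the master request goes to memory
    slave_register = slave      # the slave copy is held purely for checking
    return fifo[-1] == slave_register  # False signals a detected fault


fifo = deque()
ok = issue_with_check(MemRequest("load", 0x1000), MemRequest("load", 0x1000), fifo)
# Simulate a single-bit fault in the redundantly computed address.
bad = issue_with_check(MemRequest("load", 0x2000), MemRequest("load", 0x2000 ^ 0x4), fifo)
```

A matching pair passes the check, while the bit-flipped slave address makes the comparison fail.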

68.

Granted invention patent
System and method for protecting GPU memory instructions against faults (In force)

Publication (announcement) number: US10860418B2

Publication (announcement) date: 2020-12-08

Application number: US16378287

Application date: 2019-04-08

Applicant: Advanced Micro Devices, Inc.

Inventor: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi

IPC: G06F11/10, G06F11/16, G06F12/0866, G06F11/00, H03M13/00

Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying the memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to the memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.

69.

Invention application
PREFETCH KERNELS ON DATA-PARALLEL PROCESSORS (Under examination, published)

Publication (announcement) number: US20200210341A1

Publication (announcement) date: 2020-07-02

Application number: US16813075

Application date: 2020-03-09

Applicant: Advanced Micro Devices, Inc.

Inventor: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor

IPC: G06F12/0862, G06F9/52, G06F8/41

Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.

70.

Granted invention patent
Reconfigurable virtual graphics and compute processor pipeline (In force)

Publication (announcement) number: US10664942B2

Publication (announcement) date: 2020-05-26

Application number: US15331278

Application date: 2016-10-21

Applicant: Advanced Micro Devices, Inc.

Inventor: Timour T. Paltashev, Michael Mantor, Rex Eldon McCrary

IPC: G06T1/20, G06T15/00, G06T15/80, G06T1/60, G06T17/10

Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.
