Patent search: ap:("Advanced Micro Devices, Inc.") AND inv:"Bradford M. Beckmann"

1.

Granted patent
Fine-grained conditional dispatching (in force)

Publication number: US11809902B2

Publication date: 2023-11-07

Application number: US17031424

Filing date: 2020-09-24

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexandru Dutu, Marcus Nathaniel Chow, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood

IPC: G06F9/48, G06F9/54, G06F9/38

CPC classification number: G06F9/4881, G06F9/3838, G06F9/545

Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.
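The dispatch ordering described in this abstract can be illustrated with a small Python model. This is only a sketch under stated assumptions: the names `Dispatcher`, `prioritize`, `dispatch_next`, and `dispatch_order` are hypothetical and do not come from the patent, and a real GPU would implement this in its workgroup scheduler rather than in software queues.

```python
from collections import deque


class Dispatcher:
    """Toy model of a dispatcher for the pending workgroups of one kernel dispatch."""

    def __init__(self, workgroup_ids):
        self.pending = deque(workgroup_ids)  # default dispatch order
        self.prioritized = deque()           # workgroups named by dependency instructions

    def prioritize(self, workgroup_id):
        """Hypothetical stand-in for a workgroup dependency instruction."""
        if workgroup_id in self.pending:
            self.pending.remove(workgroup_id)
            self.prioritized.append(workgroup_id)

    def dispatch_next(self):
        """Return the next workgroup: prioritized ones first, then default order."""
        if self.prioritized:
            return self.prioritized.popleft()
        if self.pending:
            return self.pending.popleft()
        return None


def dispatch_order(workgroup_ids, prioritized_ids):
    """Return the full dispatch order after applying the given priority hints."""
    dispatcher = Dispatcher(workgroup_ids)
    for workgroup_id in prioritized_ids:
        dispatcher.prioritize(workgroup_id)
    order = []
    while (workgroup_id := dispatcher.dispatch_next()) is not None:
        order.append(workgroup_id)
    return order
```

In this model, a workgroup that was named by a dependency instruction is dispatched ahead of any workgroup of the same kernel that was not named, which is the ordering property the abstract states.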

2.

Patent application
DECOMPOSING MATRICES FOR PROCESSING AT A PROCESSOR-IN-MEMORY (in force)

Publication number: US20230102296A1

Publication date: 2023-03-30

Application number: US17490037

Filing date: 2021-09-30

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Michael W. Boyer, Ashish Gondimalla, Bradford M. Beckmann

IPC: G06F17/16

Abstract: A processing unit decomposes a matrix for partial processing at a processor-in-memory (PIM) device. The processing unit receives a matrix to be used as an operand in an arithmetic operation (e.g., a matrix multiplication operation). In response, the processing unit decomposes the matrix into two component matrices: a sparse component matrix and a dense component matrix. The processing unit itself performs the arithmetic operation with the dense component matrix, but sends the sparse component matrix to the PIM device for execution of the arithmetic operation. The processing unit thereby offloads at least some of the processing overhead to the PIM device, improving overall efficiency of the processing system.
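The decomposition relies on linearity: if a matrix M is split so that M = S + D, then M·x = S·x + D·x, and the two products can be computed in different places. The Python sketch below illustrates that idea only. The abstract does not say how the split is chosen, so the row-density rule used here (`max_nonzeros_per_row`) is an assumption, and all function names are hypothetical. The "PIM" side is just an ordinary function call standing in for the offloaded work.

```python
def decompose(matrix, max_nonzeros_per_row=1):
    """Split matrix into (sparse, dense) such that sparse + dense == matrix.

    Assumption for illustration: a row with at most max_nonzeros_per_row
    nonzero entries goes to the sparse component, any other row to the dense one.
    """
    sparse, dense = [], []
    for row in matrix:
        nonzeros = sum(1 for value in row if value != 0)
        zero_row = [0] * len(row)
        if nonzeros <= max_nonzeros_per_row:
            sparse.append(list(row))
            dense.append(zero_row)
        else:
            sparse.append(zero_row)
            dense.append(list(row))
    return sparse, dense


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def split_matvec(matrix, vector):
    """Compute matrix @ vector as the sum of the dense and sparse partial products."""
    sparse, dense = decompose(matrix)
    host_part = matvec(dense, vector)   # stays on the processing unit
    pim_part = matvec(sparse, vector)   # stand-in for the work sent to the PIM device
    return [h + p for h, p in zip(host_part, pim_part)]
```

The combined result equals the product with the original matrix, whatever rule is used for the split, as long as the two components sum to the original.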

3.

Granted patent
Techniques for improving operand caching (in force)

Publication number: US11436016B2

Publication date: 2022-09-06

Application number: US16703833

Filing date: 2019-12-04

Applicant: Advanced Micro Devices, Inc.

Inventor: Anthony T. Gutierrez, Bradford M. Beckmann, Marcus Nathaniel Chow

IPC: G06F9/30, G06F9/38

Abstract: A technique for determining whether a register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache is provided. The technique includes executing an instruction that accesses an operand that comprises the register value, performing one or both of a lookahead technique and a prediction technique to determine whether the register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache, and based on the determining, updating the operand cache.
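One way to picture the lookahead technique mentioned in this abstract is a check over the next few instructions for a reuse of the register. The Python sketch below is an assumption-laden illustration, not the patented mechanism: the instruction encoding as `(dest, sources)` tuples, the `window` size, and the function name `should_cache` are all hypothetical, and the prediction technique is not modeled.

```python
def should_cache(instructions, index, register, window=4):
    """Return True if `register` is read again soon enough to be worth caching.

    instructions: list of (dest, sources) tuples, one per instruction.
    index: position of the instruction that just accessed `register`.
    The scan stops early if the register is overwritten before being read.
    """
    for dest, sources in instructions[index + 1 : index + 1 + window]:
        if register in sources:
            return True       # reused within the window: keep or insert
        if dest == register:
            return False      # overwritten first: the old value is dead
    return False              # no reuse seen within the window
```

A value for which this returns False would be a candidate to skip the operand cache or to be evicted from it.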

4.

Patent application
FINE-GRAINED CONDITIONAL DISPATCHING (in force)

Publication number: US20220091880A1

Publication date: 2022-03-24

Application number: US17031424

Filing date: 2020-09-24

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexandru Dutu, Marcus Nathaniel Chow, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood

IPC: G06F9/48, G06F9/38, G06F9/54

Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.

5.

Granted patent
System performance management using prioritized compute units (in force)

Publication number: US11204871B2

Publication date: 2021-12-21

Application number: US14755401

Filing date: 2015-06-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Zhe Wang, Sooraj Puthoor, Bradford M. Beckmann

IPC: G06F12/08, G06F12/084

Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.
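The last sentence of this abstract, serving memory requests from priority compute units first, can be sketched as a two-level priority queue. This Python model is illustrative only: the class name `RequestQueue`, its methods, and the first-come-first-served tie-break within each level are assumptions, and the abstract's other mechanisms (shared-cache access, preferential workgroup dispatch) are not modeled.

```python
import heapq
from itertools import count


class RequestQueue:
    """Toy memory request queue that serves priority compute units first."""

    def __init__(self, priority_units):
        self.priority_units = set(priority_units)
        self._heap = []
        self._sequence = count()  # preserves arrival order within a priority level

    def submit(self, unit, address):
        """Queue a memory request from compute unit `unit`."""
        rank = 0 if unit in self.priority_units else 1
        heapq.heappush(self._heap, (rank, next(self._sequence), unit, address))

    def serve(self):
        """Pop the next request as (unit, address), or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, unit, address = heapq.heappop(self._heap)
        return unit, address
```

With compute unit 2 marked as a priority unit, its requests are served before earlier requests from units 0 and 1.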

6.

Granted patent
Message aggregation, combining and compression for efficient data communications in GPU-based clusters (in force)

Publication number: US10320695B2

Publication date: 2019-06-11

Application number: US15165953

Filing date: 2016-05-26

Applicant: Advanced Micro Devices, Inc.

Inventor: Steven K. Reinhardt, Marc S. Orr, Bradford M. Beckmann, Shuai Che, David A. Wood

IPC: G06F15/173, H04L12/805, H04L12/811

Abstract: A system and method for efficient management of network traffic in highly data-parallel computing. A processing node includes one or more processors capable of generating network messages. A network interface is used to receive and send network messages across a network. The processing node reduces at least one of a number or a storage size of the original network messages into one or more new network messages. The new network messages are sent to the network interface to send across the network.
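Reducing the number of network messages, as this abstract describes, can be pictured as grouping small messages by destination before they reach the network interface. The Python sketch below is a simplified assumption: the function name `aggregate`, the `(destination, payload)` message shape, and the `max_payload` limit are hypothetical, and plain concatenation discards message boundaries that a real protocol would need to record. Combining and compression are not modeled.

```python
from collections import defaultdict


def aggregate(messages, max_payload=64):
    """Merge messages bound for the same destination into fewer, larger ones.

    messages: list of (destination, payload) pairs, with payload as bytes.
    Returns a list of (destination, combined_payload) pairs. A combined
    payload only exceeds max_payload if a single input payload already does.
    """
    by_destination = defaultdict(list)
    for destination, payload in messages:
        by_destination[destination].append(payload)

    aggregated = []
    for destination, payloads in by_destination.items():
        buffer = b""
        for payload in payloads:
            # Flush the buffer if adding this payload would overflow it.
            if buffer and len(buffer) + len(payload) > max_payload:
                aggregated.append((destination, buffer))
                buffer = b""
            buffer += payload
        if buffer:
            aggregated.append((destination, buffer))
    return aggregated
```

Three small messages to two destinations become two messages here; lowering `max_payload` trades fewer merges for smaller packets.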

7.

Granted patent
Dynamic wavefront creation for processing units using a hybrid compactor (in force)

Publication number: US09898287B2

Publication date: 2018-02-20

Application number: US14682971

Filing date: 2015-04-09

Applicant: Advanced Micro Devices, Inc.

Inventor: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov

IPC: G06F9/38, G06F9/30, G06F9/46

CPC classification number: G06F9/30058, G06F9/3804, G06F9/3851, G06F9/3887, G06F9/46

Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads, are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.
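The repacking step at the end of this abstract can be sketched as regrouping threads by the control path they follow. The Python model below assumes that every wavefront has already reached the compaction point, so the detection logic described in the abstract is not modeled. The function name `repack`, the `(thread_id, path)` representation, and the fixed wavefront `width` are hypothetical.

```python
def repack(wavefronts, width):
    """Regroup threads so that each new wavefront holds threads on one control path.

    wavefronts: list of wavefronts, each a list of (thread_id, path) pairs.
    width: maximum number of threads per repacked wavefront.
    Returns a list of (path, [thread_ids]) pairs.
    """
    threads_by_path = {}
    for wavefront in wavefronts:
        for thread_id, path in wavefront:
            threads_by_path.setdefault(path, []).append(thread_id)

    repacked = []
    for path, thread_ids in threads_by_path.items():
        # Cut each path's thread list into wavefront-sized chunks.
        for start in range(0, len(thread_ids), width):
            repacked.append((path, thread_ids[start:start + width]))
    return repacked
```

Two divergent wavefronts of four threads each, where five threads take the branch and three do not, yield one full "taken" wavefront plus two partial ones, each of which executes without divergence.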

8.

Patent application
SELECTING A RESOURCE FROM A SET OF RESOURCES FOR PERFORMING AN OPERATION (in force)

Publication number: US20160062803A1

Publication date: 2016-03-03

Application number: US14935056

Filing date: 2015-11-06

Applicant: Advanced Micro Devices, Inc.

Inventor: Bradford M. Beckmann, Mithuna S. Thottethodi, James M. O'Connor, Mauricio Breternitz, Lisa R. Hsu, Gabriel H. Loh, Yasuko Eckert

IPC: G06F9/50, G06F12/08

CPC classification number: G06F9/5016, G06F9/5011, G06F12/0875, G06F2212/45

Abstract: The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism performs a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the resource is not available for performing the operation and until another resource is selected for performing the operation, the selection mechanism identifies a next resource in the table and selects the next resource for performing the operation when the next resource is available for performing the operation.

9.

Granted patent
Selecting a resource from a set of resources for performing an operation (in force)

Publication number: US09183055B2

Publication date: 2015-11-10

Application number: US13761985

Filing date: 2013-02-07

Applicant: Advanced Micro Devices, Inc.

Inventor: Bradford M. Beckmann, Mithuna S. Thottethodi, James M. O'Connor, Mauricio Breternitz, Lisa R. Hsu, Gabriel H. Loh, Yasuko Eckert

IPC: G06F9/46, G06F9/50

CPC classification number: G06F9/5016, G06F9/5011, G06F12/0875, G06F2212/45

Abstract: The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism is configured to perform a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the identified resource is not available for performing the operation and until a resource is selected for performing the operation, the selection mechanism is configured to identify a next resource in the table and select the next resource for performing the operation when the next resource is available for performing the operation.
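The lookup-then-fallback behavior in this abstract can be sketched as a walk through a table of candidate resources. The Python function below is a hypothetical illustration: the name `select_resource`, the modulo indexing by `key`, the wraparound at the end of the table, and the `None` result when nothing is available are all assumptions that the abstract does not specify.

```python
def select_resource(tables, table_id, key, is_available):
    """Pick a resource from the chosen table, moving to the next entry while busy.

    tables: mapping from table id to a list of resource identifiers.
    table_id: which table to use for this lookup.
    key: integer used to pick the starting entry (assumed modulo indexing).
    is_available: callable returning True if a resource can take the operation.
    """
    table = tables[table_id]
    if not table:
        return None
    start = key % len(table)
    for offset in range(len(table)):
        candidate = table[(start + offset) % len(table)]
        if is_available(candidate):
            return candidate
    return None  # every resource in the table is currently unavailable
```

For example, with a table of three banks and the second one busy, a lookup that lands on the second bank falls through to the third.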

10.

Granted patent
Multi-core processing device with invalidation cache tags and methods (in force)

Publication number: US09003130B2

Publication date: 2015-04-07

Application number: US13719730

Filing date: 2012-12-19

Applicant: Advanced Micro Devices, Inc.

Inventor: James O'Connor, Bradford M. Beckmann

IPC: G06F13/00, G06F12/08

CPC classification number: G06F12/0864, G06F12/0815

Abstract: A data processing device is provided that facilitates cache coherence policies. In one embodiment, a data processing device utilizes invalidation tags in connection with a cache that is associated with a processing engine. In some embodiments, the cache is configured to store a plurality of cache entries where each cache entry includes a cache line configured to store data and a corresponding cache tag configured to store address information associated with data stored in the cache line. Such address information includes invalidation flags with respect to addresses stored in the cache tags. Each cache tag is associated with an invalidation tag configured to store information related to invalidation commands of addresses stored in the cache tag. In such embodiment, the cache is configured to set invalidation flags of cache tags based upon information stored in respective invalidation tags.
