Patent search ap:("Advanced Micro Devices Page Inc.") AND inv:"Shuai Che"

11.

Granted patent
Method and apparatus for peer-to-peer messaging in heterogeneous machine clusters (Active)

Publication number: US11429462B2

Publication date: 2022-08-30

Application number: US16887643

Application date: 2020-05-29

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che

IPC: G06F9/54

Abstract: Various computing network messaging techniques and apparatus are disclosed. In one aspect, a method of computing is provided that includes executing a first thread and a second thread. A message is sent from the first thread to the second thread. The message includes a domain descriptor that identifies a first location of the first thread and a second location of the second thread.

12.

Granted patent
Flexible framework to support memory synchronization operations (Active)

Publication number: US10198261B2

Publication date: 2019-02-05

Application number: US15096205

Application date: 2016-04-11

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che, Marc S. Orr, Bradford M. Beckmann

IPC: G06F9/30, G06F12/10, G06F12/0802, G06F12/0811, G06F12/0815, G06F12/0837, G06F12/0875, G06F12/0897

Abstract: A method of performing memory synchronization operations is provided that includes receiving, at a programmable cache controller in communication with one or more caches, an instruction in a first language to perform a memory synchronization operation of synchronizing a plurality of instruction sequences executing on a processor, mapping the received instruction in the first language to one or more selected cache operations in a second language executable by the cache controller and executing the one or more cache operations to perform the memory synchronization operation. The method further comprises receiving a second mapping that provides mapping instructions to map the received instruction to one or more other cache operations, mapping the received instruction to one or more other cache operations and executing the one or more other cache operations to perform the memory synchronization operation.

13.

Granted patent
System and method for repurposing dead cache blocks (Active)

Publication number: US09990289B2

Publication date: 2018-06-05

Application number: US14491296

Application date: 2014-09-19

Applicant: Advanced Micro Devices, Inc.

Inventor: Gabriel H. Loh, Derek R. Hower, Shuai Che

IPC: G06F12/08, G06F12/0815, G06F12/0864, G06F12/0891

CPC classification number: G06F12/0815, G06F12/0864, G06F12/0891, Y02D10/13

Abstract: A processing system having a multilevel cache hierarchy employs techniques for repurposing dead cache blocks so as to use otherwise wasted space in a cache hierarchy employing a write-back scheme. For a cache line containing invalid data with a valid tag, the valid tag is maintained for cache coherence purposes or otherwise, resulting in a valid tag for a dead cache block. A cache controller repurposes the dead cache block by storing any of a variety of new data at the dead cache block, while storing the new tag in a tag entry of a dead block tag way with an identifier indicating the location of the new data.

14.

Patent application
Offloading Execution of an Application by a Network Connected Device (Pending, published)

Publication number: US20170353397A1

Publication date: 2017-12-07

Application number: US15174624

Application date: 2016-06-06

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che

IPC: H04L12/911, H04L29/08

CPC classification number: H04L67/10

Abstract: A client device detects one or more servers to which an application can be offloaded. The client device receives information from the servers regarding their graphics processing unit (GPU) compute resources. The client device selects one of the servers to offload the application based on such factors as the GPU compute resources, other performance metrics, power, and bandwidth/latency/quality of the communication channel between the server and the client device. The client device sends host code and a GPU computation kernel in intermediate language format to the server. The server compiles the host code and GPU kernel code into suitable machine instruction set architecture code for execution on CPU(s) and GPU(s) of the server. Once the application execution is complete, the server returns the results of the execution to the client device.

15.

Patent application
SYSTEMS AND METHODS OF SUPPORTING PARALLEL PROCESSOR MESSAGE-BASED COMMUNICATIONS (Pending, published)

Publication number: US20170289078A1

Publication date: 2017-10-05

Application number: US15084101

Application date: 2016-03-29

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che

IPC: H04L12/58, H04L29/08

Abstract: A method of message-based communication is provided which includes executing, on one or more accelerated processing units, a plurality of groups of work items, receiving a first message from a first group of work items of the plurality of groups of work items executing on the one or more accelerated processing units and storing the first message at a first segment of memory allocated to a second group of work items of the plurality of groups of work items executing on the accelerated processing unit.

16.

Patent application
GENERATING A SCHEDULE OF INSTRUCTIONS BASED ON A PROCESSOR MEMORY TREE (Pending, published)

Publication number: US20160239278A1

Publication date: 2016-08-18

Application number: US14623180

Application date: 2015-02-16

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che

IPC: G06F9/45

CPC classification number: G06F8/4441

Abstract: A processor employs a memory tree and a code generation and scheduling framework (CGSF) to generate instructions to access data at memory modules associated with the processor. The memory tree is a data structure having a plurality of nodes, with each node corresponding to a different memory module, memory cluster, or other portion of memory. The CGSF employs the memory tree to expose the memory hierarchy of the processor to a computer programmer. The computer programmer can employ compiler directives to identify nodes of the memory tree and to establish data ordering and manipulation formats for each node. Based on the directives and the memory tree, the CGSF generates schedules of instructions that, when executed at the processor, enforce the data ordering and manipulation formats.

17.

Patent application
METHOD AND SYSTEM FOR BLOCK SCHEDULING CONTROL IN A PROCESSOR BY REMAPPING (Active)

Publication number: US20160117206A1

Publication date: 2016-04-28

Application number: US14523682

Application date: 2014-10-24

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che, Derek R. Hower

IPC: G06F9/54, G06T1/20

CPC classification number: G06F9/547, G06F9/4881, G06T1/20, G06T2200/28

Abstract: A method and a system for block scheduling are disclosed. The method includes retrieving an original block ID, determining a corresponding new block ID from a mapping, executing a new block corresponding to the new block ID, and repeating the retrieving, determining, and executing for each original block ID. The system includes a program memory configured to store multi-block computer programs, an identifier memory configured to store block identifiers (ID's), management hardware configured to retrieve an original block ID from the program memory, scheduling hardware configured to receive the original block ID from the management hardware and determine a new block ID corresponding to the original block ID using a stored mapping, and processing hardware configured to receive the new block ID from the scheduling hardware and execute a new block corresponding to the new block ID.
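
Illustrative sketch (not taken from the patent): the retrieve, remap, execute loop described in the abstract can be modeled in a few lines of Python. The function name `run_remapped`, the dictionary-based mapping, and the `execute_block` callback are assumptions made for illustration only; the abstract describes hardware units, not a software API.

```python
def run_remapped(num_blocks, mapping, execute_block):
    """Execute blocks in the order given by `mapping` (original ID -> new ID)."""
    results = []
    for original_id in range(num_blocks):      # retrieve each original block ID
        new_id = mapping[original_id]          # determine the new block ID from the mapping
        results.append(execute_block(new_id))  # execute the block for the new ID
    return results


# Hypothetical mapping that reverses the issue order of four blocks.
reverse = {0: 3, 1: 2, 2: 1, 3: 0}
order = run_remapped(4, reverse, lambda block_id: block_id)
print(order)
```

Here the mapping is a plain lookup table; any permutation of block IDs could be substituted to change the order in which blocks run without modifying the block code itself.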

18.

Granted patent
Systems and methods of supporting parallel processor message-based communications (Active)

Publication number: US10681125B2

Publication date: 2020-06-09

Application number: US15084101

Application date: 2016-03-29

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che

IPC: G06F15/16, H04L29/08, G06F15/80

Abstract: A method of message-based communication is provided which includes executing, on one or more accelerated processing units, a plurality of groups of work items, receiving a first message from a first group of work items of the plurality of groups of work items executing on the one or more accelerated processing units and storing the first message at a first segment of memory allocated to a second group of work items of the plurality of groups of work items executing on the accelerated processing unit.
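
Illustrative sketch (not taken from the patent): the idea of storing a message from one group of work items in a memory segment allocated to another group can be modeled as per-group mailboxes. The class name `Mailboxes`, the fixed `capacity`, and the tuple layout are assumptions for illustration; the abstract does not specify a data layout or an API.

```python
from collections import deque


class Mailboxes:
    """One bounded message segment per work-item group (hypothetical model)."""

    def __init__(self, group_ids, capacity=4):
        self.capacity = capacity
        self.segments = {g: deque() for g in group_ids}

    def send(self, src_group, dst_group, payload):
        """Store the message in the segment allocated to the destination group."""
        segment = self.segments[dst_group]
        if len(segment) >= self.capacity:
            return False  # segment full; the sender must retry later
        segment.append((src_group, payload))
        return True

    def receive(self, dst_group):
        """Return the oldest pending (sender, payload) pair, or None if empty."""
        segment = self.segments[dst_group]
        return segment.popleft() if segment else None


mailboxes = Mailboxes(group_ids=[0, 1])
mailboxes.send(src_group=0, dst_group=1, payload="hello")
print(mailboxes.receive(1))
```

Giving each receiving group its own segment means senders never write into the same queue as an unrelated receiver, which is one plausible reason for allocating memory per group.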

19.

Granted patent
Two-phase hybrid vertex classification (Active)

Publication number: US10134355B2

Publication date: 2018-11-20

Application number: US14720293

Application date: 2015-05-22

Applicant: Advanced Micro Devices, Inc.

Inventor: Shuai Che

IPC: G06F13/14, G09G5/00, G09G5/04, G06T1/20

Abstract: A processor performs vertex coloring for a graph based at least in part on the degree of each vertex of the graph and based at least in part with another coloring approach, such as comparison of random values assigned to the vertices. For each vertex in the graph, a processor determines whether the degree of the vertex is a local maximum; that is, whether the degree of the vertex is greater than the degree of each of its connected vertices. Each vertex having a local-maximum degree is assigned a specified or randomly selected color, and is then omitted from future iterations of the coloring process. After a stop criterion is met, the processor assigns random values to the remaining uncolored vertices and assigns colors based on comparisons of the random values.
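
Illustrative sketch (not taken from the patent): the two phases described in the abstract can be approximated sequentially in Python. Several details are assumptions made for illustration: the function name `hybrid_coloring`, the use of one fresh color per round, comparing degrees only against still-uncolored neighbors, the stop criterion (a round limit, or a round with no local-maximum vertex), and breaking random-value ties by vertex ID.

```python
import random


def hybrid_coloring(adj, max_degree_rounds=8, seed=0):
    """Color an undirected graph given as {vertex: set_of_neighbors}."""
    rng = random.Random(seed)
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    color = {}
    uncolored = set(adj)
    next_color = 0

    # Phase 1: color vertices whose degree is a strict local maximum.
    for _ in range(max_degree_rounds):
        winners = [v for v in uncolored
                   if all(degree[v] > degree[u] for u in adj[v] if u in uncolored)]
        if not winners:  # assumed stop criterion: no local maximum remains
            break
        for v in winners:
            color[v] = next_color
        uncolored.difference_update(winners)
        next_color += 1

    # Phase 2: compare random values assigned to the remaining vertices.
    while uncolored:
        priority = {v: (rng.random(), v) for v in uncolored}
        winners = [v for v in uncolored
                   if all(priority[v] > priority[u] for u in adj[v] if u in uncolored)]
        for v in winners:
            color[v] = next_color
        uncolored.difference_update(winners)
        next_color += 1
    return color


# Small example: vertex 0 is a hub connected to a path 1-2-3-4.
graph = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2, 4}, 4: {0, 3}}
coloring = hybrid_coloring(graph)
```

Because the comparisons are strict, two adjacent vertices can never both win the same round, so every round's winners can safely share a color.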

20.

Patent application
Managing Cache Coherence Using Information in a Page Table (Active)

Publication number: US20170337136A1

Publication date: 2017-11-23

Application number: US15162464

Application date: 2016-05-23

Applicant: Advanced Micro Devices, Inc.

Inventor: Arkaprava Basu, Bradford M. Beckmann, Shuai Che, Sooraj Puthoor

IPC: G06F12/1009, G06F12/0815, G06F12/14

CPC classification number: G06F12/1009, G06F12/0817, G06F12/0837, G06F12/1027, G06F12/1483, G06F2212/1024, G06F2212/1052, G06F2212/621, G06F2212/657

Abstract: The described embodiments include a computing device with two or more types of processors and a memory that is shared between the two or more types of processors. The computing device performs operations for handling cache coherency between the two or more types of processors. During operation, the computing device sets a cache coherency indicator in metadata in a page table entry in a page table, the page table entry including information about a page of data that is stored in the memory. The computing device then uses the cache coherency indicator to determine operations to be performed when accessing data in the page of data in the memory. For example, the computing device can use the coherency indicator to determine whether a coherency operation is to be performed when a processor of a given type accesses data in the page of data in the memory.
