-
Publication number: US20250004963A1
Publication date: 2025-01-02
Application number: US18217079
Application date: 2023-06-30
Applicant: Advanced Micro Devices, Inc.
Inventor: William Peter Ehrett, Anthony Gutierrez, Vedula Venkata Srikant Bharadwaj, Karthik Ramu Sangaiah, Prachi Shukla, Sriseshan Srikanth, Ganesh Dasika, John Kalamatianos
IPC: G06F13/36
Abstract: A semiconductor device, referred to herein as a Globally Interconnected Operations (GIO) layer, provides global operations in the form of global data reduction for one or more PE arrays. The GIO layer includes processing elements that perform global data reduction on processing results from one or more PE arrays. The GIO layer includes connectors that allow it to be arranged in a 3D stack with one or more PE arrays, for example, on top of or beneath a PE array. This allows reduction operations to be implemented across PE arrays using an efficient topology with superior flexibility, scalability, latency and/or power characteristics that is customizable for particular use cases at assembly time, without requiring costly and time-consuming redesign of PE arrays, and without being constrained by particular PE array designs.
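As a rough software analogy for the global data reduction described in this abstract, the sketch below combines per-PE-array partial results pairwise, level by level. The function name `global_reduce`, the list-of-partials input, and the tree-shaped schedule are all illustrative assumptions; the abstract does not specify the GIO layer's topology or interface.

```python
import operator


def global_reduce(pe_array_results, op=operator.add):
    """Reduce partial results from several PE arrays to a single value.

    Hypothetical model: each entry of pe_array_results is the partial
    result produced by one PE array. Values are combined pairwise, one
    level at a time, the way a reduction tree would combine them.
    """
    level = list(pe_array_results)
    if not level:
        raise ValueError("at least one PE array result is required")
    while len(level) > 1:
        # Combine neighbours (0,1), (2,3), ... at this level.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An odd element out is carried to the next level unchanged.
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

Passing a different `op` (for example `max`) models a different reduction, provided the operation is associative.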
-
Publication number: US20230153218A1
Publication date: 2023-05-18
Application number: US17526218
Application date: 2021-11-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Shrikanth Ganapathy, Yasuko Eckert, Anthony Gutierrez, Karthik Ramu Sangaiah, Vedula Venkata Srikant Bharadwaj
CPC classification number: G06F11/3051 , G06F15/80 , G06F11/3024
Abstract: A processor includes a controller and a plurality of chiplets, each chiplet including a plurality of processor cores. The controller provides chiplet-level performance information for the chiplets that identifies a performance of each chiplet at each of a plurality of performance levels for specified sets of processor cores on that chiplet. The controller receives an identification of one or more selected chiplets from among the plurality of chiplets for which a specified number of processor cores are to be configured at a given performance level, the one or more selected chiplets having been selected based on the chiplet-level performance information and performance requirements. The controller configures the specified number of processor cores of the one or more selected chiplets at the given performance level. A task is then run on the specified number of processor cores of the one or more selected chiplets at the given performance level.
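The selection step in this abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the table layout (`perf_info` keyed by chiplet, then by performance level and core count), the use of power in watts as the reported metric, and a power ceiling as the performance requirement. The abstract itself does not say which metric or data layout is used.

```python
def select_chiplets(perf_info, level, cores_needed, max_power):
    """Return chiplet IDs that can run cores_needed cores at level
    within max_power, ordered from lowest to highest reported power.

    perf_info is a hypothetical table:
        {chiplet_id: {(performance_level, num_cores): power_watts}}
    """
    candidates = [
        (power, chiplet_id)
        for chiplet_id, table in perf_info.items()
        for (lvl, n_cores), power in table.items()
        if lvl == level and n_cores == cores_needed and power <= max_power
    ]
    # Sort by reported power so the most efficient chiplet comes first.
    return [chiplet_id for _, chiplet_id in sorted(candidates)]
```

In this model the caller would pass the first returned ID back to the controller, which then configures the requested cores at the chosen level.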
-
Publication number: US20250111195A1
Publication date: 2025-04-03
Application number: US18478639
Application date: 2023-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Karthik Ramu Sangaiah, Yao Cui Fehlis
Abstract: Disclosed is a computer-implemented method for model ensemble acceleration in an active learning loop. The method includes receiving a set of datapoint inputs, where each datapoint input is an unlabeled equivalent of other datapoint inputs in the set of datapoint inputs and has a different applied weight value. The method then executes a set of neural network models, where the execution of each neural network model is based on the received set of datapoint inputs. The outputs from the set of neural network models are analyzed, where an inference computation is performed, and a label for the set of datapoints is determined. The method then stores the labeled set of datapoint inputs in a database. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication number: US12105957B2
Publication date: 2024-10-01
Application number: US18087964
Application date: 2022-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos, Karthik Ramu Sangaiah, Anthony Thomas Gutierrez
CPC classification number: G06F3/061 , G06F3/0656 , G06F3/0659 , G06F3/0673
Abstract: A memory controller includes an arbiter, a vector arithmetic logic unit (VALU), a read buffer and a write buffer both coupled to the VALU, and an atomic memory operation scheduler. The VALU performs scattered atomic memory operations on arrays of data elements responsive to selected memory access commands. The atomic memory operation scheduler is for scheduling atomic memory operations at the VALU; identifying a plurality of scattered atomic memory operations with commutative and associative properties, the plurality of scattered atomic memory operations on at least one element of an array of data elements associated with an address; and commanding the VALU to perform the plurality of scattered atomic memory operations.
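One reason commutative and associative properties matter for scattered atomic operations is that such operations on the same address can be combined in any order before memory is touched. The sketch below shows that idea for atomic adds only. It is an assumption about how the properties might be exploited, not a description of the patented scheduler; `coalesce_atomic_adds` and the `(address, value)` tuple format are hypothetical.

```python
from collections import defaultdict


def coalesce_atomic_adds(memory, ops):
    """Apply scattered atomic adds, combining those that share an address.

    memory: a list modelling an array of data elements.
    ops:    an iterable of (address, value) atomic-add requests.

    Because addition is commutative and associative, all requests to one
    address can be summed first and applied with a single update.
    """
    combined = defaultdict(int)
    for address, value in ops:
        combined[address] += value
    # One read-modify-write per distinct address instead of one per request.
    for address, total in combined.items():
        memory[address] += total
    return memory
```

The final memory contents match what applying each request individually would give, in any order.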
-
Publication number: US11816490B2
Publication date: 2023-11-14
Application number: US17550878
Application date: 2021-12-14
Applicant: Advanced Micro Devices, Inc.
CPC classification number: G06F9/3853 , G06F1/189 , G06F9/30145 , G06F9/3885
Abstract: VLIW-directed power management is described. In accordance with described techniques, a program is compiled to generate instructions for execution by a very long instruction word machine. During the compiling, power configurations for the very long instruction word machine to execute the instructions are determined, and fields of the instructions are populated with the power configurations. In one or more implementations, an instruction that includes a power configuration for the very long instruction word machine and operations for execution by the very long instruction word machine is obtained. A power setting of the very long instruction word machine is adjusted based on the power configuration of the instruction, and the operations of the instruction are executed by the very long instruction word machine.
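The execution side of this abstract (an instruction carries a power configuration field alongside its operations, and the machine adjusts its power setting before executing them) can be modelled with a toy sketch. The class names, the string-valued `power_config` field, and the use of Python callables as operations are illustrative assumptions; the abstract does not define an instruction encoding.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VliwInstruction:
    # Hypothetical field populated at compile time, e.g. "low_power".
    power_config: str
    # Operations bundled into the one long instruction word.
    operations: List[Callable[[], int]]


class VliwMachine:
    """Toy machine that applies an instruction's power configuration
    before executing the operations bundled in it."""

    def __init__(self) -> None:
        self.power_setting = "nominal"

    def execute(self, instr: VliwInstruction) -> List[int]:
        # Adjust the power setting based on the instruction's field.
        self.power_setting = instr.power_config
        # Then execute every operation in the bundle.
        return [op() for op in instr.operations]
```

A compiler in this model would choose `power_config` per instruction, for example a lower setting for bundles with few active operations.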
-
Publication number: US11797410B2
Publication date: 2023-10-24
Application number: US17526218
Application date: 2021-11-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Shrikanth Ganapathy, Yasuko Eckert, Anthony Gutierrez, Karthik Ramu Sangaiah, Vedula Venkata Srikant Bharadwaj
CPC classification number: G06F11/3051 , G06F11/3024 , G06F15/80 , G06F11/3409 , Y02D10/00
Abstract: A processor includes a controller and a plurality of chiplets, each chiplet including a plurality of processor cores. The controller provides chiplet-level performance information for the chiplets that identifies a performance of each chiplet at each of a plurality of performance levels for specified sets of processor cores on that chiplet. The controller receives an identification of one or more selected chiplets from among the plurality of chiplets for which a specified number of processor cores are to be configured at a given performance level, the one or more selected chiplets having been selected based on the chiplet-level performance information and performance requirements. The controller configures the specified number of processor cores of the one or more selected chiplets at the given performance level. A task is then run on the specified number of processor cores of the one or more selected chiplets at the given performance level.
-
Publication number: US20250103342A1
Publication date: 2025-03-27
Application number: US18475918
Application date: 2023-09-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Ryan Lynn Swann, Alexander Sean Underwood, Derrick A. Aguren, Karthik Ramu Sangaiah, Sumanth Gudaparthi, Rose R. Thompson
IPC: G06F9/38
Abstract: A method, apparatus, and computer-readable medium use a lightweight finite state machine (FSM) control flow block to enable limited execution of data-dependent control flow, thereby enhancing the control flow flexibility of array-scale SIMD processors. In certain cases, the FSM block contains registers responsible for decoding and managing single global instructions into multiple local instructions that can incorporate data-dependent control flow.
-
Publication number: US20240211134A1
Publication date: 2024-06-27
Application number: US18087964
Application date: 2022-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos, Karthik Ramu Sangaiah, Anthony Thomas Gutierrez
IPC: G06F3/06
CPC classification number: G06F3/061 , G06F3/0656 , G06F3/0659 , G06F3/0673
Abstract: A memory controller includes an arbiter, a vector arithmetic logic unit (VALU), a read buffer and a write buffer both coupled to the VALU, and an atomic memory operation scheduler. The VALU performs scattered atomic memory operations on arrays of data elements responsive to selected memory access commands. The atomic memory operation scheduler is for scheduling atomic memory operations at the VALU; identifying a plurality of scattered atomic memory operations with commutative and associative properties, the plurality of scattered atomic memory operations on at least one element of an array of data elements associated with an address; and commanding the VALU to perform the plurality of scattered atomic memory operations.
-
Publication number: US11947487B2
Publication date: 2024-04-02
Application number: US17852306
Application date: 2022-06-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan Robert Alsop, Karthik Ramu Sangaiah, Anthony T. Gutierrez
IPC: G06F15/82
CPC classification number: G06F15/825
Abstract: Methods and systems are disclosed for performing dataflow execution by an accelerated processing unit (APU). Techniques disclosed include decoding information from one or more dataflow instructions. The decoded information is associated with dataflow execution of a computational task. Techniques disclosed further include configuring, based on the decoded information, dataflow circuitry, and, then, executing the dataflow execution of the computational task using the dataflow circuitry.
-
Publication number: US20230418782A1
Publication date: 2023-12-28
Application number: US17852306
Application date: 2022-06-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan Robert Alsop, Karthik Ramu Sangaiah, Anthony T. Gutierrez
IPC: G06F15/82
CPC classification number: G06F15/825
Abstract: Methods and systems are disclosed for performing dataflow execution by an accelerated processing unit (APU). Techniques disclosed include decoding information from one or more dataflow instructions. The decoded information is associated with dataflow execution of a computational task. Techniques disclosed further include configuring, based on the decoded information, dataflow circuitry, and, then, executing the dataflow execution of the computational task using the dataflow circuitry.