-
Publication number: US11989591B2
Publication date: 2024-05-21
Application number: US17037727
Filing date: 2020-09-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Anthony Gutierrez, Vedula Venkata Srikant Bharadwaj, Yasuko Eckert, Mark H. Oskin
IPC: G06F9/50, G06F9/30, G06F9/38, G06F9/4401
CPC classification number: G06F9/505, G06F9/30043, G06F9/3802, G06F9/3836, G06F9/4403, G06F9/4418
Abstract: A dynamically configurable overprovisioned microprocessor optimally supports a variety of different compute application workloads, with the capability to trade off among compute performance, energy consumption, and clock frequency on a per-compute application basis, using general-purpose microprocessor designs. In some embodiments, the overprovisioned microprocessor comprises a physical compute resource and a dynamic configuration logic configured to: detect an activation-warranting operating condition; undarken the physical compute resource responsive to detecting the activation-warranting operating condition; detect a configuration-warranting operating condition; and configure the overprovisioned microprocessor to use the undarkened physical compute resource responsive to detecting the configuration-warranting operating condition.
-
Publication number: US20230393855A1
Publication date: 2023-12-07
Application number: US17833504
Filing date: 2022-06-06
Applicant: Advanced Micro Devices, Inc.
Inventor: Gabriel H. Loh, Yasuko Eckert, Bradford Beckmann, Michael Estlick, Jay Fleischman
CPC classification number: G06F9/3887, G06F9/3877, G06F9/30098, G06F9/3555
Abstract: An approach is provided for implementing register based single instruction, multiple data (SIMD) lookup table operations. According to the approach, an instruction set architecture (ISA) can support one or more SIMD instructions that enable vectors or multiple values in source data registers to be processed in parallel using a lookup table or truth table stored in one or more function registers. The SIMD instructions can be flexibly configured to support functions with inputs and outputs of various sizes and data formats. Various approaches are also described for supporting very large lookup tables that span multiple registers.
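Illustrative sketch (not taken from the patent): the abstract describes applying a lookup table held in a function register to every element of a source register in parallel. The Python model below imitates that behavior with plain lists. The function name simd_lut, the index_bits parameter, and the 4-bit population-count table are assumptions chosen for the example; the actual instruction encoding and register layout are not given in the abstract.

```python
def simd_lut(src_lanes, function_register, index_bits=4):
    """Model a register-based SIMD lookup: each lane indexes the same table.

    src_lanes         -- list of integers, one per SIMD lane (hypothetical layout)
    function_register -- list holding the lookup table, 2**index_bits entries
    index_bits        -- number of low bits of each lane used as the table index
    """
    if len(function_register) != 1 << index_bits:
        raise ValueError("table size must be 2**index_bits")
    mask = (1 << index_bits) - 1
    # Hardware would do all lanes at once; here a comprehension stands in.
    return [function_register[lane & mask] for lane in src_lanes]


# Example table: population count of a 4-bit value (an assumed use case).
POPCOUNT4 = [bin(i).count("1") for i in range(16)]

lanes = [0x0, 0x3, 0x7, 0xF, 0xA]
print(simd_lut(lanes, POPCOUNT4))
```

A table that spans multiple registers, which the abstract also mentions, could be modeled by concatenating several such lists and widening index_bits accordingly.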
-
Publication number: US11797410B2
Publication date: 2023-10-24
Application number: US17526218
Filing date: 2021-11-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Shrikanth Ganapathy, Yasuko Eckert, Anthony Gutierrez, Karthik Ramu Sangaiah, Vedula Venkata Srikant Bharadwaj
CPC classification number: G06F11/3051, G06F11/3024, G06F15/80, G06F11/3409, Y02D10/00
Abstract: A processor includes a controller and a plurality of chiplets, each chiplet including a plurality of processor cores. The controller provides chiplet-level performance information for the chiplets that identifies a performance of each chiplet at each of a plurality of performance levels for specified sets of processor cores on that chiplet. The controller receives an identification of one or more selected chiplets from among the plurality of chiplets for which a specified number of processor cores are to be configured at a given performance level, the one or more selected chiplets having been selected based on the chiplet-level performance information and performance requirements. The controller configures the specified number of processor cores of the one or more selected chiplets at the given performance level. A task is then run on the specified number of processor cores of the one or more selected chiplets at the given performance level.
-
Publication number: US20220198261A1
Publication date: 2022-06-23
Application number: US17131546
Filing date: 2020-12-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov, Yasuko Eckert, John D. Wilkes
Abstract: A system and method for providing for adoption of solvers for solving at least one task is disclosed. The system and method include a controller, solvers capable of solving the at least one task, and at least one memory. The controller admits ones of the solvers into a competition for solving the at least one task, provides, via the at least one memory, an input of the task to the admitted solvers, provides, via the at least one memory, intermediate results of execution by the admitted solvers that are provided the input, receives a prediction of the next intermediate result from the admitted solvers predicting from at least one of the provided input and received intermediate results, and ranks the at least one of the admitted solvers for solving the task based on at least one of the next intermediate results, the provided input and received intermediate results.
-
Publication number: US20220188208A1
Publication date: 2022-06-16
Application number: US17118404
Filing date: 2020-12-10
Applicant: Advanced Micro Devices, Inc.
Inventor: Anthony Gutierrez, Yasuko Eckert, Sergey Blagodurov, Jagadish B. Kotra
IPC: G06F11/30, G06F1/20, G06F12/0815, G11C11/406, G06F9/48, G06F9/30
Abstract: A method may include, in response to a change in an operating parameter of a processing unit, modifying a signal pathway to a processing circuit component of the processing unit, and communicating with the processing circuit component via the signal pathway.
-
Publication number: US20210255871A1
Publication date: 2021-08-19
Application number: US16794124
Filing date: 2020-02-18
Applicant: Advanced Micro Devices, Inc.
Inventor: Onur Kayiran, Jieming Yin, Yasuko Eckert
Abstract: A technique for processing qubits in a quantum computing device is provided. The technique includes determining that, in a first cycle, a first quantum processing region is to perform a first quantum operation that does not use a qubit that is stored in the first quantum processing region, identifying a second quantum processing region that is to perform a second quantum operation at a second cycle that is later than the first cycle, wherein the second quantum operation uses the qubit, determining that between the first cycle and the second cycle, no quantum operations are performed in the second quantum processing region, and moving the qubit from the first quantum processing region to the second quantum processing region.
-
Publication number: US10719441B1
Publication date: 2020-07-21
Application number: US16274146
Filing date: 2019-02-12
Applicant: Advanced Micro Devices, Inc.
Inventor: Jieming Yin, Yasuko Eckert, Matthew R. Poremba, Steven E. Raasch, Doug Hunt
IPC: G06F12/0802
Abstract: An electronic device handles memory access requests for data in a memory. The electronic device includes a memory controller for the memory, a last-level cache memory, a request generator, and a predictor. The predictor determines a likelihood that a cache memory access request for data at a given address will hit in the last-level cache memory. Based on the likelihood, the predictor determines: whether a memory access request is to be sent by the request generator to the memory controller for the data in parallel with the cache memory access request being resolved in the last-level cache memory, and, when the memory access request is to be sent, a type of memory access request that is to be sent. When the memory access request is to be sent, the predictor causes the request generator to send a memory request of the type to the memory controller.
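Illustrative sketch (not taken from the patent): the abstract describes a predictor that estimates how likely a request is to hit in the last-level cache and, from that likelihood, decides whether to send a parallel memory request and of which type. The Python model below uses a table of saturating counters for the estimate. The class name HitPredictor, the counter scheme, the thresholds 0.75 and 0.5, and the request-type names "hint" and "full_read" are all assumptions made for illustration; the abstract does not specify any of them.

```python
class HitPredictor:
    """Toy last-level-cache hit predictor built from saturating counters."""

    def __init__(self, entries=1024, max_count=3):
        self.entries = entries
        self.max_count = max_count
        # Start each counter at 2 of 3: mildly biased toward "hit".
        self.counters = [2] * entries

    def _index(self, address):
        # Drop an assumed 64-byte line offset, then fold into the table.
        return (address >> 6) % self.entries

    def likelihood(self, address):
        """Estimated hit likelihood in [0.0, 1.0]."""
        return self.counters[self._index(address)] / self.max_count

    def update(self, address, hit):
        """Train on the observed outcome of a cache lookup."""
        i = self._index(address)
        if hit:
            self.counters[i] = min(self.max_count, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

    def decide(self, address):
        """Return None (send nothing) or a hypothetical request type."""
        p = self.likelihood(address)
        if p >= 0.75:
            return None          # likely hit: no parallel memory request
        if p >= 0.5:
            return "hint"        # uncertain: cheaper speculative request
        return "full_read"       # likely miss: full parallel read


predictor = HitPredictor()
print(predictor.decide(0x1000))      # untrained entry
predictor.update(0x1000, hit=False)
predictor.update(0x1000, hit=False)
print(predictor.decide(0x1000))      # after two observed misses
```

The point of the two-tier decision is that a wrong guess costs memory bandwidth, so a cheaper request type is used when the estimate is uncertain.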
-
Publication number: US20190391850A1
Publication date: 2019-12-26
Application number: US16019374
Filing date: 2018-06-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas Malaya, Yasuko Eckert
Abstract: Methods and systems for opportunistic load balancing in deep neural networks (DNNs) using metadata. Representative computational costs are captured, obtained or determined for a given architectural, functional or computational aspect of a DNN system. The representative computational costs are implemented as metadata for the given architectural, functional or computational aspect of the DNN system. In an implementation, the computed computational cost is implemented as the metadata. A scheduler detects whether there are neurons in subsequent layers that are ready to execute. The scheduler uses the metadata and neuron availability to schedule and load balance across compute resources and available resources.
-
Publication number: US10097091B1
Publication date: 2018-10-09
Application number: US15793951
Filing date: 2017-10-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Wei Huang, Yasuko Eckert, Xudong An, Muhammad Shoaib Bin Altaf, Jieming Yin
Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
-
Publication number: US20180115496A1
Publication date: 2018-04-26
Application number: US15331002
Filing date: 2016-10-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Yasuko Eckert, Onur Kayiran, Nuwan S. Jayasena, Gabriel H. Loh, Dong Ping Zhang
IPC: H04L12/911, H04L12/863
CPC classification number: H04L47/70, G06F9/5066, H04L47/50, H04L67/10, H04L67/2842, Y02D10/22, Y02D10/36
Abstract: Systems, apparatuses, and methods for implementing mechanisms to improve data locality for distributed processing units are disclosed. A system includes a plurality of distributed processing units (e.g., GPUs) and memory devices. Each processing unit is coupled to one or more local memory devices. The system determines how to partition a workload into a plurality of workgroups based on maximizing data locality and data sharing. The system determines which subset of the plurality of workgroups to dispatch to each processing unit of the plurality of processing units based on maximizing local memory accesses and minimizing remote memory accesses. The system also determines how to partition data buffer(s) based on data sharing patterns of the workgroups. The system maps to each processing unit a separate portion of the data buffer(s) so as to maximize local memory accesses and minimize remote memory accesses.
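Illustrative sketch (not taken from the patent): the abstract describes dispatching workgroups to processing units so that local memory accesses are maximized and remote accesses minimized. The Python model below covers only that dispatch step, not the workload or buffer partitioning. It assumes each workgroup's page accesses are known in advance and sends the workgroup to the unit that is home to most of them. The function names dispatch and remote_accesses, the page-to-unit map, and the tie-break rule (lowest unit name wins) are assumptions for the example.

```python
from collections import Counter


def dispatch(workgroups, page_home):
    """Map each workgroup to the unit that holds most of the pages it touches.

    workgroups -- dict: workgroup id -> list of page ids it accesses
    page_home  -- dict: page id -> processing unit whose local memory holds it
    """
    placement = {}
    for wg, pages in workgroups.items():
        votes = Counter(page_home[p] for p in pages)
        # Ties go to the lexicographically smallest unit name (an assumption).
        placement[wg] = max(sorted(votes), key=votes.__getitem__)
    return placement


def remote_accesses(workgroups, page_home, placement):
    """Count accesses that land on a unit other than the workgroup's own."""
    return sum(
        1
        for wg, pages in workgroups.items()
        for p in pages
        if page_home[p] != placement[wg]
    )


page_home = {0: "gpu0", 1: "gpu0", 2: "gpu1", 3: "gpu1"}
workgroups = {"wg0": [0, 1, 2], "wg1": [2, 3], "wg2": [1, 3, 3]}

placement = dispatch(workgroups, page_home)
print(placement)
print(remote_accesses(workgroups, page_home, placement))
```

This greedy rule ignores load balance across units; a fuller model would weigh locality against how many workgroups each unit already holds.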