-
Publication No.: US20230112432A1
Publication Date: 2023-04-13
Application No.: US17564747
Filing Date: 2021-12-29
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: John J. Wuu , Jaroslaw Kuszczak , Gaurav Singla
IPC: G06F3/06
Abstract: A system and method for efficiently capturing data by sequential circuits across multiple operating conditions are described. In various implementations, an integrated circuit includes multiple signal arrival adjusters both at its I/O boundaries and across its die. The signal arrival adjuster includes two internal timing paths, each with a respective latency. The signal arrival adjuster receives an input signal, and generates an output signal from a selected one of the first timing path and the second timing path. The signal arrival adjuster sends the output signal to a sequential circuit. The sequential circuit uses the output signal as one of an input data signal and an input clock signal. The selection between the two timing paths within the signal arrival adjuster aids in satisfying the setup and hold time requirements of the sequential circuit.
-
Publication No.: US11625807B2
Publication Date: 2023-04-11
Application No.: US17181300
Filing Date: 2021-02-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Jiasheng Chen , Timour Paltashev , Alexander Lyashevsky , Carl Kittredge Wakeland , Michael J. Mantor
Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
-
Publication No.: US11625249B2
Publication Date: 2023-04-11
Application No.: US17137140
Filing Date: 2020-12-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Jagadish B. Kotra , John Kalamatianos
Abstract: Preserving memory ordering between offloaded instructions and non-offloaded instructions is disclosed. An offload instruction for an operation to be offloaded is processed and a lock is placed on a memory address associated with the offload instruction. In response to completing a cache operation targeting the memory address, the lock on the memory address is removed. For multithreaded applications, upon determining that a plurality of processor cores have each begun executing a sequence of offload instructions, the execution of non-offload instructions that are younger than any of the offload instructions is restricted. In response to determining that each processor core has completed executing its sequence of offload instructions, the restriction is removed. The remote device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
-
Publication No.: US11620525B2
Publication Date: 2023-04-04
Application No.: US16141648
Filing Date: 2018-09-25
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Abhinav Vishnu
Abstract: A heterogeneous processing system includes at least one central processing unit (CPU) core and at least one graphics processing unit (GPU) core. The heterogeneous processing system is configured to compute an activation for each one of a plurality of neurons for a first network layer of a neural network. The heterogeneous processing system randomly drops a first subset of the plurality of neurons for the first network layer and keeps a second subset of the plurality of neurons for the first network layer. Activation for each one of the second subset of the plurality of neurons is forwarded to the CPU core and coalesced to generate a set of coalesced activation sub-matrices.
-
Publication No.: US11620224B2
Publication Date: 2023-04-04
Application No.: US16709831
Filing Date: 2019-12-10
Applicant: Advanced Micro Devices, Inc.
Inventor: Aparna Thyagarajan , Ashok Tirupathy Venkatachar , Marius Evers , Angelo Wong , William E. Jones
IPC: G06F12/0862 , G06F12/0875
Abstract: Techniques for controlling prefetching of instructions into an instruction cache are provided. The techniques include tracking either or both of branch target buffer misses and instruction cache misses, modifying a throttle toggle based on the tracking, and adjusting prefetch activity based on the throttle toggle.
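The tracking-and-toggle loop in this abstract can be illustrated with a minimal software sketch. Everything specific here is an assumption, not taken from the patent: the class name `PrefetchThrottle`, the fixed-size tracking window, the miss threshold, the two prefetch depths, and the choice to throttle when misses are frequent.

```python
from dataclasses import dataclass


@dataclass
class PrefetchThrottle:
    """Toy model of a miss-driven prefetch throttle (all parameters hypothetical)."""

    miss_threshold: int = 4    # assumed: misses per window that set the toggle
    window: int = 16           # assumed: number of events in one tracking window
    full_depth: int = 8        # assumed: lines prefetched ahead when not throttled
    throttled_depth: int = 2   # assumed: lines prefetched ahead when throttled
    throttled: bool = False    # the "throttle toggle"
    _misses: int = 0
    _events: int = 0

    def record(self, btb_miss: bool = False, icache_miss: bool = False) -> None:
        """Track one fetch event; re-evaluate the toggle at each window boundary."""
        self._events += 1
        if btb_miss or icache_miss:
            self._misses += 1
        if self._events == self.window:
            # Assumed policy: throttle when the window saw many misses.
            self.throttled = self._misses >= self.miss_threshold
            self._misses = 0
            self._events = 0

    def prefetch_depth(self) -> int:
        """Adjust prefetch activity based on the current toggle state."""
        return self.throttled_depth if self.throttled else self.full_depth
```

A caller would invoke `record(...)` once per fetch event and consult `prefetch_depth()` before issuing prefetches. The abstract does not say in which direction the toggle moves, so the policy above is only one plausible reading.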
-
Publication No.: US20230102901A1
Publication Date: 2023-03-30
Application No.: US17489221
Filing Date: 2021-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Richard T. Schultz
Abstract: A system and method for creating layout for standard cells are described. In various implementations, a standard cell uses Cross field effect transistors (FETs) that include vertically stacked gate all around (GAA) transistors with conducting channels oriented in an orthogonal direction between them. The direction of current flow of the top GAA transistor is orthogonal to the direction of current flow of the bottom GAA transistor. The channels of the vertically stacked transistors use opposite doping polarities. The orthogonal orientation allows both the top and bottom GAA transistors to have the maximum mobility for their respective carriers based on their orientation. The Cross FETs utilize a single metal layer and a single via layer for connections between the top and bottom GAA transistors.
-
Publication No.: US20230102680A1
Publication Date: 2023-03-30
Application No.: US17491058
Filing Date: 2021-09-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Hideki Kanayama , Eric M. Scott
IPC: G06F3/06
Abstract: A memory controller includes a command queue with multiple entry stacks, each with a plurality of entries holding memory access commands, one or more parameter indicators each holding a respective characteristic common to the plurality of entries, and a head indicator designating a current entry for arbitration. An arbiter has a single command input for each entry stack. A command queue loader circuit receives incoming memory access commands and loads entries of respective entry stacks with memory access commands having the respective characteristic of each of the one or more parameter indicators in common.
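The queue organization in this abstract can be modeled roughly in software as follows. This is a sketch under stated assumptions: the shared characteristic is taken to be a (bank, row) pair, and the names `Command`, `EntryStack`, `CommandQueue`, `load`, `issue`, and `arbiter_inputs` are hypothetical. The abstract does not specify any of them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Command:
    op: str    # e.g. "RD" or "WR"
    bank: int
    row: int
    col: int


@dataclass
class EntryStack:
    capacity: int
    param: Optional[Tuple[int, int]] = None   # parameter indicator (assumed: bank, row)
    entries: List[Command] = field(default_factory=list)
    head: int = 0                             # head indicator: entry offered to the arbiter

    def head_command(self) -> Optional[Command]:
        return self.entries[self.head] if self.entries else None


class CommandQueue:
    def __init__(self, num_stacks: int, capacity: int) -> None:
        self.stacks = [EntryStack(capacity) for _ in range(num_stacks)]

    def load(self, cmd: Command) -> bool:
        """Loader: place cmd in a stack sharing its characteristic, else an empty stack."""
        key = (cmd.bank, cmd.row)
        for s in self.stacks:
            if s.param == key and len(s.entries) < s.capacity:
                s.entries.append(cmd)
                return True
        for s in self.stacks:
            if not s.entries:
                s.param = key
                s.entries.append(cmd)
                return True
        return False  # no matching stack with room and no empty stack

    def arbiter_inputs(self) -> List[Optional[Command]]:
        """The arbiter sees exactly one candidate command per entry stack."""
        return [s.head_command() for s in self.stacks]

    def issue(self, index: int) -> Command:
        """Issue the head command of a non-empty stack and advance its head indicator."""
        s = self.stacks[index]
        cmd = s.entries[s.head]
        s.head += 1
        if s.head == len(s.entries):  # stack drained: free it for a new characteristic
            s.entries.clear()
            s.head = 0
            s.param = None
        return cmd
```

The point of the sketch is that the arbiter compares one input per stack rather than one per queued command, which is the reduction the abstract describes.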
-
Publication No.: US20230102296A1
Publication Date: 2023-03-30
Application No.: US17490037
Filing Date: 2021-09-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Michael W. Boyer , Ashish Gondimalla , Bradford M. Beckmann
IPC: G06F17/16
Abstract: A processing unit decomposes a matrix for partial processing at a processor-in-memory (PIM) device. The processing unit receives a matrix to be used as an operand in an arithmetic operation (e.g., a matrix multiplication operation). In response, the processing unit decomposes the matrix into two component matrices: a sparse component matrix and a dense component matrix. The processing unit itself performs the arithmetic operation with the dense component matrix, but sends the sparse component matrix to the PIM device for execution of the arithmetic operation. The processing unit thereby offloads at least some of the processing overhead to the PIM device, improving overall efficiency of the processing system.
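One way to see why such a decomposition preserves the result: if the operand is split additively as A = S + D, then A·B = S·B + D·B by distributivity, so the two partial products can be computed on different devices and summed. The sketch below assumes a row-density split criterion and uses hypothetical helper names (`split_by_row_density`, `matmul`, `matadd`). The abstract does not state how the matrix is actually partitioned, and both products run on the host here purely for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_by_row_density(a: Matrix, max_nonzeros: int) -> Tuple[Matrix, Matrix]:
    """Split a into (sparse, dense) with a == sparse + dense.

    Assumed criterion: rows with at most max_nonzeros nonzero entries go to
    the sparse component; all other rows go to the dense component.
    """
    sparse, dense = [], []
    for row in a:
        zeros = [0] * len(row)
        nonzeros = sum(1 for x in row if x != 0)
        if nonzeros <= max_nonzeros:
            sparse.append(list(row))
            dense.append(zeros)
        else:
            sparse.append(zeros)
            dense.append(list(row))
    return sparse, dense


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain triple-loop matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matadd(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Usage: the dense product stands in for the processing unit's share and the
# sparse product for the share sent to the PIM device.
a = [[1, 0, 0], [1, 2, 3], [0, 0, 4]]
b = [[1, 2], [3, 4], [5, 6]]
s, d = split_by_row_density(a, max_nonzeros=1)
combined = matadd(matmul(s, b), matmul(d, b))
```

Here `combined` equals `matmul(a, b)`, since the two components sum back to the original operand.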
-
Publication No.: US20230102183A1
Publication Date: 2023-03-30
Application No.: US17489182
Filing Date: 2021-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Deepak Vasant Kulkarni , Rahul Agarwal , Rajasekaran Swaminathan , Chintan Buch
IPC: H01L23/495 , H01L23/14
Abstract: Apparatuses, systems and methods for efficiently generating a package substrate. A semiconductor fabrication process (or process) fabricates each of a first glass package substrate and a second glass package substrate with a redistribution layer on a single side of a respective glass wafer. The process flips the second glass package substrate upside down and connects the glass wafers of the first and second glass package substrates together using a wafer bonding technique. In some implementations, the process uses copper-based wafer bonding. The resulting bonding between the two glass wafers contains no air gap, no underfill, and no solder bumps. Afterward, the side of the first glass package substrate opposite the glass wafer is connected to at least one integrated circuit. Additionally, the side of the second glass package substrate opposite the glass wafer is connected to a component on the motherboard through pads on the motherboard.
-
Publication No.: US20230101640A1
Publication Date: 2023-03-30
Application No.: US17483698
Filing Date: 2021-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Mihir Shaileshbhai Doctor , Alexander J. Branover , Benjamin Tsien , Indrani Paul , Christopher T. Weaver , Thomas J. Gibney , Stephen V. Kosonocky , John P. Petry
IPC: G06F1/3287 , G06F1/3234 , G06F1/3296
Abstract: Devices and methods for linear addressing are provided. A device is provided which comprises a plurality of components having assigned registers used to store data to execute a program and a power management controller, in communication with the components. The power management controller is configured to send one of a request to remove power to the components and a request to reduce power to the components when it is determined that the components are idle, execute a first process of one of removing power and reducing power to the components and entering a reduced power state when an acknowledgement of the request is received and execute a second process of restoring power to the components when one or more of the components are indicated to be active.