-
Publication No.: US20240273040A1
Publication date: 2024-08-15
Application No.: US18390893
Filing date: 2023-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael Ignatowski, Michael J. Schulte, Gabriel Hsiuwei Loh
IPC: G06F13/16, G06F1/3234, G06F3/06
CPC classification number: G06F13/1684, G06F1/3275, G06F3/0604, G06F13/1694
Abstract: A multi-stack compute chip and memory architecture is described. In accordance with the described techniques, a package includes a plurality of computing stacks, and each computing stack includes at least one compute chip and a memory. The package also includes one or more interconnects that couple each computing stack to at least one other computing stack for sharing the memory in a coherent fashion across the plurality of computing stacks.
-
Publication No.: US12062126B2
Publication date: 2024-08-13
Application No.: US17489008
Filing date: 2021-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Todd Martin, Tad Robert Litwiller, Nishank Pathak, Randy Wayne Ramsey
IPC: G06T15/00
CPC classification number: G06T15/005
Abstract: Systems, apparatuses, and methods for loading multiple primitives per thread in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with a geometry engine, shader processor input (SPI), and a plurality of compute units. The geometry engine generates primitives which are accumulated by the SPI into primitive groups. While accumulating primitives, the SPI tracks the number of vertices and primitives per group. The SPI determines wavefront boundaries based on mapping a single vertex to each thread of the wavefront while allowing more than one primitive per thread. The SPI launches wavefronts with one vertex per thread and potentially multiple primitives per thread. The compute units execute a vertex phase and a multi-cycle primitive phase for wavefronts with multiple primitives per thread.
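The boundary rule in this abstract can be illustrated with a minimal sketch. It assumes a wavefront of 32 threads, a cap of two primitives per thread, and input batches given as (new vertices, primitives) pairs; the names `pack_wavefronts` and `primitive_phase_cycles` and all of these numbers are hypothetical and are not taken from the patent.

```python
import math

def pack_wavefronts(batches, wave_size=32, max_prims_per_thread=2):
    """Greedily pack (vertices, primitives) batches into wavefronts.

    A wavefront holds at most one vertex per thread (wave_size vertices)
    and at most wave_size * max_prims_per_thread primitives. Both limits
    are illustrative assumptions.
    """
    prim_limit = wave_size * max_prims_per_thread
    waves = []
    verts = prims = 0
    for b_verts, b_prims in batches:
        if b_verts > wave_size or b_prims > prim_limit:
            raise ValueError("batch does not fit in a single wavefront")
        # Close the current wavefront when either tracked count would overflow.
        if verts + b_verts > wave_size or prims + b_prims > prim_limit:
            waves.append((verts, prims))
            verts = prims = 0
        verts += b_verts
        prims += b_prims
    if verts or prims:
        waves.append((verts, prims))
    return waves

def primitive_phase_cycles(prims, wave_size=32):
    """Passes needed over the threads to process all primitives of a wavefront."""
    return max(1, math.ceil(prims / wave_size))

waves = pack_wavefronts([(16, 28), (16, 30), (10, 12)])
```

In this sketch a wavefront with more primitives than threads needs more than one pass in the primitive phase, which is the multi-cycle case the abstract mentions.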
-
Publication No.: US12056352B2
Publication date: 2024-08-06
Application No.: US17955286
Filing date: 2022-09-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Harris Eleftherios Gasparakis
IPC: G06F3/06
CPC classification number: G06F3/0604, G06F3/0655, G06F3/0679
Abstract: Generating optimization instructions for data processing pipelines is described. A pipeline optimization system computes resource usage information that describes memory and compute usage metrics during execution of each stage of the data processing pipeline. The system additionally generates data storage information that describes how data output by each pipeline stage is utilized by other stages of the pipeline. The pipeline optimization system then generates the optimization instructions to control how memory operations are performed for a specific data processing pipeline during execution. In implementations, the optimization instructions cause a memory system to discard data (e.g., invalidate cache entries) without copying the discarded data to another storage location after the data is no longer needed by the pipeline. The optimization instructions alternatively or additionally control at least one of evicting, writing-back, or prefetching data to minimize latency during pipeline execution.
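One way to picture the "data storage information" is as a map from each stage to the earlier stages whose output it reads. The sketch below, under that assumption, derives a discard hint for each output at the last stage that consumes it. The function name `plan_discards`, the stage names, and the hint format are hypothetical and do not come from the patent.

```python
def plan_discards(stages, reads):
    """Return {stage: [producers whose output can be discarded after stage]}.

    stages: stage names in execution order.
    reads:  {stage: [earlier stages whose output this stage consumes]}.
    """
    last_reader = {}
    for stage in stages:
        for producer in reads.get(stage, []):
            # Later stages overwrite earlier ones, leaving the final consumer.
            last_reader[producer] = stage
    hints = {stage: [] for stage in stages}
    for producer, reader in last_reader.items():
        hints[reader].append(producer)
    return hints

stages = ["decode", "resize", "detect", "encode"]
reads = {"resize": ["decode"], "detect": ["resize"], "encode": ["decode", "detect"]}
hints = plan_discards(stages, reads)
```

Here the output of `decode` is kept until `encode` has run, because `encode` is its last reader; a real system would turn such hints into cache invalidations rather than write-backs.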
-
Publication No.: US20240259194A1
Publication date: 2024-08-01
Application No.: US18212739
Filing date: 2023-06-22
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Donald Matthews, JR.
CPC classification number: H04L9/0861, H04L63/0428, H04L9/3242
Abstract: A computing node in a computing cluster includes at least a key generator and an encryption engine. The key generator implements a key derivation function and generates a first data encryption key based on a key derivation key. The key derivation key is a global security association encryption key shared by a plurality of nodes in the computing cluster. The first data encryption key is unique to a node pair comprising the first node and a second node of the plurality of nodes. The encryption engine encrypts a data packet using the first data encryption key.
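The derivation of a pair-specific key from a shared key can be sketched as follows. This is only an illustration: it assumes HMAC-SHA-256 as the key derivation function, 32-bit integer node identifiers, and a key that is the same in both directions of a pair. None of these choices is stated in the abstract, and `derive_pair_key` is a hypothetical name.

```python
import hashlib
import hmac

def derive_pair_key(kdk: bytes, node_a: int, node_b: int) -> bytes:
    """Derive a 32-byte data encryption key unique to the pair (node_a, node_b).

    kdk is the key derivation key shared by all nodes in the cluster.
    Sorting the identifiers makes both nodes of a pair derive the same key
    (an assumption of this sketch).
    """
    lo, hi = sorted((node_a, node_b))
    info = b"pair-dek" + lo.to_bytes(4, "big") + hi.to_bytes(4, "big")
    return hmac.new(kdk, info, hashlib.sha256).digest()

kdk = bytes(range(32))          # placeholder shared key, for illustration only
key_1_2 = derive_pair_key(kdk, 1, 2)
key_2_1 = derive_pair_key(kdk, 2, 1)
key_1_3 = derive_pair_key(kdk, 1, 3)
```

Because every node holds the shared key, either node of a pair can compute the pair key locally without a per-pair key exchange.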
-
Publication No.: US20240258322A1
Publication date: 2024-08-01
Application No.: US18401038
Filing date: 2023-12-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Richard T. Schultz
IPC: H01L27/12, H01L21/84, H01L27/092, H01L29/06
CPC classification number: H01L27/1203, H01L21/84, H01L27/092, H01L29/0673
Abstract: A system and method for efficiently creating a layout for memory bit cells are described. In various implementations, cells of a library use Cross field effect transistors (FETs) that include vertically stacked gate all around (GAA) transistors with conducting channels oriented in an orthogonal direction between them. The channels of the vertically stacked transistors use opposite doping polarities. A first category of cells includes devices where each of the two devices in a particular vertical stack receives the same input signal. A second category of cells includes devices where the two devices in a particular vertical stack receive different input signals. The cells of the second category have a larger height dimension than the cells of the first category.
-
Publication No.: US12051154B2
Publication date: 2024-07-30
Application No.: US17562751
Filing date: 2021-12-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Anirudh R. Acharya, Ruijin Wu
CPC classification number: G06T17/10, G06T1/20, G06T15/005, G06T15/40
Abstract: Systems and methods for distributed rendering using two-level binning include processing primitives of a frame to be rendered at a first graphics processing unit (GPU) chiplet in a set of GPU chiplets to generate visibility information of primitives for each coarse bin and providing the visibility information to the other GPU chiplets in the set of GPU chiplets. Each coarse bin is then assigned to one of the GPU chiplets of the set of GPU chiplets and rendered at the assigned GPU chiplet based on the corresponding visibility information.
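The coarse-level step can be sketched as two small functions: one builds per-bin visibility lists and one assigns bins to chiplets. The sketch assumes screen-space bounding boxes as the visibility test, square coarse bins, and round-robin assignment; these are illustrative choices, and `coarse_bin_visibility` and `assign_bins` are hypothetical names.

```python
def coarse_bin_visibility(prim_bboxes, bin_size, bins_x, bins_y):
    """Return {(bx, by): [primitive ids]} for every coarse bin.

    prim_bboxes: list of integer pixel bounding boxes (x0, y0, x1, y1).
    A primitive is recorded in each bin its bounding box overlaps.
    """
    vis = {(bx, by): [] for bx in range(bins_x) for by in range(bins_y)}
    for pid, (x0, y0, x1, y1) in enumerate(prim_bboxes):
        bx_lo, bx_hi = max(0, x0 // bin_size), min(bins_x - 1, x1 // bin_size)
        by_lo, by_hi = max(0, y0 // bin_size), min(bins_y - 1, y1 // bin_size)
        for bx in range(bx_lo, bx_hi + 1):
            for by in range(by_lo, by_hi + 1):
                vis[(bx, by)].append(pid)
    return vis

def assign_bins(vis, num_chiplets):
    """Assign each coarse bin to one chiplet, round-robin over sorted bins."""
    return {b: i % num_chiplets for i, b in enumerate(sorted(vis))}

bboxes = [(0, 0, 10, 10), (60, 60, 70, 70), (100, 0, 120, 20)]
vis = coarse_bin_visibility(bboxes, bin_size=64, bins_x=2, bins_y=2)
owner = assign_bins(vis, num_chiplets=2)
```

Each chiplet would then render only the primitives listed for the bins it owns, so the visibility pass runs once and its result is shared.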
-
Publication No.: US20240249996A1
Publication date: 2024-07-25
Application No.: US18099949
Filing date: 2023-01-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Michael FLYNN, Otto JOE
IPC: H01L23/42, H01L21/48, H01L23/00, H01L23/367, H01L25/065
CPC classification number: H01L23/42, H01L21/4817, H01L23/3675, H01L24/16, H01L24/73, H01L25/0655, H01L2224/16225, H01L2224/73253, H01L2924/15311
Abstract: A method and apparatus are provided that manage the movement of thermal interface material (TIM) squeezed out from between a lid and an IC die of an IC (chip) package. In one embodiment, a chip package is provided that includes an IC die mounted on a substrate and covered by a lid. A bottom surface of the lid has a die overlapped region facing a top surface of the IC die. The bottom surface of the lid has a first gutter formed therein. An outer sidewall of the first gutter is formed outward of the die overlapped region so as to receive TIM squeezed out from between the lid and the IC die.
-
Publication No.: US12045182B1
Publication date: 2024-07-23
Application No.: US18298587
Filing date: 2023-04-11
Applicant: Advanced Micro Devices, Inc.
Inventor: Eric Christopher Morton, Pravesh Gupta, Bryan P Broussard, Li Ou
CPC classification number: G06F13/24, G06F9/30101, G06F9/4812, G06F9/4818, G06F9/4831, G06F13/26, G06F13/4221
Abstract: A computing system may implement a low priority arbitration interrupt method that includes receiving a message signaled interrupt (MSI) message from an input output hub (I/O hub) transmitted over an interconnect fabric, selecting a processor to interrupt from a cluster of processors based on arbitration parameters, and communicating an interrupt service routine to the selected processor, wherein the I/O hub and the cluster of processors are located within a common domain.
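The selection step can be pictured with a small sketch. It assumes the arbitration parameter is a per-processor task priority, that the MSI message names a set of eligible processors, and that ties go to the lowest processor id. The abstract does not specify any of this, and `Processor` and `select_target` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Processor:
    cpu_id: int
    task_priority: int  # lower value = less important work running (assumed)

def select_target(cluster, eligible_ids):
    """Pick the eligible processor running the lowest-priority work."""
    candidates = [p for p in cluster if p.cpu_id in eligible_ids]
    if not candidates:
        raise ValueError("no eligible processor in the cluster")
    # Ties are broken by cpu_id so the choice is deterministic.
    return min(candidates, key=lambda p: (p.task_priority, p.cpu_id))

cluster = [Processor(0, 5), Processor(1, 2), Processor(2, 2), Processor(3, 0)]
target = select_target(cluster, eligible_ids={0, 1, 2})
```

Processor 3 has the lowest priority overall but is not in the eligible set, so the sketch picks processor 1.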
-
Publication No.: US12033721B2
Publication date: 2024-07-09
Application No.: US17359446
Filing date: 2021-06-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Arijit Banerjee, John J. Wuu, Russell Schreiber
IPC: G11C8/16, G06F30/392, G11C11/418, G11C11/419
CPC classification number: G11C8/16, G06F30/392, G11C11/418, G11C11/419
Abstract: An apparatus and method for providing efficient floor planning, power, and performance tradeoffs of memory accesses are described. Adjacent bit cells in a column of an array use a split read port such that the bit cells do not share a read bit line while sharing a write bit line. The adjacent bit cells include asymmetrical read access circuits that convey data stored by latch circuitry of a corresponding bit cell to a corresponding read bit line. The layout of adjacent bit cells provides a number of contacted gate pitches per bit cell that is less than a sum of the maximum number of metal gates in the layout of each of the adjacent bit cells divided by the number of adjacent bit cells.
-
Publication No.: US12033238B2
Publication date: 2024-07-09
Application No.: US17030852
Filing date: 2020-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Brian D. Emberling, Joseph Lee Greathouse, Anthony Thomas Gutierrez
Abstract: Systems, apparatuses, and methods for implementing register compaction with early release are disclosed. A processor includes at least a command processor, a plurality of compute units, a plurality of registers, and a control unit. Registers are statically allocated to wavefronts by the control unit when wavefronts are launched by the command processor on the compute units. In response to determining that a first set of registers, previously allocated to a first wavefront, are no longer needed, the first wavefront executes an instruction to release the first set of registers. The control unit detects the executed instruction and releases the first set of registers to the available pool of registers to potentially be used by other wavefronts. Then, the control unit can allocate the first set of registers to a second wavefront for use by threads of the second wavefront while the first wavefront is still active.
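The early-release flow can be modeled with a toy register pool. The sketch assumes a flat pool of interchangeable registers and a release call that hands back the last `count` registers a wavefront owns; it does not model compaction itself, and `RegisterPool` with its methods is a hypothetical name.

```python
class RegisterPool:
    """Toy model of a control unit that allocates registers to wavefronts."""

    def __init__(self, total):
        self.free = list(range(total))
        self.owned = {}  # wavefront id -> list of register indices

    def launch(self, wave_id, count):
        """Statically allocate `count` registers at launch; False if unavailable."""
        if count > len(self.free):
            return False
        self.owned[wave_id] = [self.free.pop() for _ in range(count)]
        return True

    def release(self, wave_id, count):
        """Model the release instruction: return `count` registers to the pool
        while the wavefront stays active with the rest."""
        regs = self.owned[wave_id]
        keep = len(regs) - count
        if count < 0 or keep < 0:
            raise ValueError("invalid release count")
        self.owned[wave_id] = regs[:keep]
        self.free.extend(regs[keep:])

pool = RegisterPool(total=8)
first_ok = pool.launch("wave0", 8)      # wave0 takes every register
blocked = pool.launch("wave1", 4)       # wave1 cannot launch yet
pool.release("wave0", 4)                # wave0 no longer needs half of them
second_ok = pool.launch("wave1", 4)     # wave1 launches while wave0 is active
```

The point of the sketch is the ordering: the second wavefront becomes launchable as soon as the first releases registers, not when the first finishes.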