-
Publication No.: US20210141740A1
Publication Date: 2021-05-13
Application No.: US16683142
Application Date: 2019-11-13
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos , Apostolos Kokolis , Shrikanth Ganapathy
IPC: G06F12/126 , G06F12/1027 , G06F12/0804
Abstract: A technique for accessing a memory having a high latency portion and a low latency portion is provided. The technique includes detecting a promotion trigger to promote data from the high latency portion to the low latency portion, in response to the promotion trigger, copying cache lines associated with the promotion trigger from the high latency portion to the low latency portion, and in response to a read request, providing data from either or both of the high latency portion or the low latency portion, based on a state associated with data in the high latency portion and the low latency portion.
-
Publication No.: US20210141733A1
Publication Date: 2021-05-13
Application No.: US16680491
Application Date: 2019-11-11
Applicant: Advanced Micro Devices, Inc.
Inventor: Russell J. Schreiber
IPC: G06F12/0877 , G06F12/06 , G06F12/02 , G11C8/12 , G11C7/10
Abstract: Memories that are configurable to operate in either a banked mode or a bit-separated mode. The memories include a plurality of memory banks; multiplexing circuitry; input circuitry; and output circuitry. The input circuitry inputs at least a portion of a memory address and configuration information to the multiplexing circuitry. The multiplexing circuitry generates read data by combining a selected subset of data corresponding to the address from each of the plurality of memory banks, the subset selected based on the configuration information, if the configuration information indicates a bit-separated mode. The multiplexing circuitry generates the read data by combining data corresponding to the address from one of the memory banks, the one of the memory banks selected based on the configuration information, if the configuration information indicates a banked mode. The output circuitry outputs the generated read data from the memory.
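The two read modes described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `read_memory`, the `mode` strings, the `select` parameter standing in for the configuration information, and the equal-width bit slicing are all hypothetical choices made for illustration.

```python
from typing import List


def read_memory(banks: List[List[int]], address: int, mode: str,
                select: int, word_bits: int = 8) -> int:
    """Model a read from a memory that is either banked or bit-separated.

    Assumptions (not from the source): every bank stores `word_bits`-wide
    words, and in bit-separated mode each bank contributes one equal-width
    slice of its word, with `select` choosing which slice.
    """
    if mode == "banked":
        # Banked mode: the whole word comes from one selected bank.
        return banks[select][address]
    if mode == "bit_separated":
        slice_bits = word_bits // len(banks)
        mask = (1 << slice_bits) - 1
        result = 0
        for i, bank in enumerate(banks):
            # Take the selected slice of this bank's word ...
            piece = (bank[address] >> (select * slice_bits)) & mask
            # ... and place it in this bank's position of the output word.
            result |= piece << (i * slice_bits)
        return result
    raise ValueError(f"unknown mode: {mode}")


banks = [[0xAB], [0xCD]]
banked_word = read_memory(banks, 0, "banked", select=1)
separated_word = read_memory(banks, 0, "bit_separated", select=0)
```

With two 8-bit banks, banked mode returns one bank's full word, while bit-separated mode assembles the output from a 4-bit slice of each bank.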
-
Publication No.: US11004791B2
Publication Date: 2021-05-11
Application No.: US16382774
Application Date: 2019-04-12
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Richard Schultz
IPC: H01L23/528 , H01L21/768 , H01L23/522 , H01L23/532
Abstract: Various semiconductor chip metallization layers and methods of manufacturing the same are disclosed. In one aspect, a semiconductor chip is provided that includes a substrate, plural metallization layers on the substrate, a first conductor line in one of the metallization layers, and a second conductor line in the one of the metallization layers in spaced apart relation to the first conductor line. Each of the first conductor line and the second conductor line has a first line portion and a second line portion stacked on the first line portion. The chip also includes a dielectric layer that has a portion positioned between the first conductor line and the second conductor line, the portion having an air gap.
-
Publication No.: US20210132985A1
Publication Date: 2021-05-06
Application No.: US16668469
Application Date: 2019-10-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Michael ESTLICK , Erik SWANSON
Abstract: A processing system includes a processor core and a scheduler coupled to the processor core. The processing system executes a first active thread and a second active thread in the processor core and detects a swap event for the first active thread or the second active thread. Based on the swap event, and using a shadow-latch configured fixed mapping system, the processing system replaces either the first active thread or the second active thread with a shadow-based thread, the shadow-based thread being stored in a shadow-latch configured register file.
-
Publication No.: US10990120B2
Publication Date: 2021-04-27
Application No.: US16452869
Application Date: 2019-06-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Bhuvanachandran K. Nair
Abstract: A method operates a first-in-first-out (FIFO) buffer with a first clock, and operates one of a read pointer or a write pointer of the FIFO buffer with the first clock while operating the other one of the read pointer or write pointer with a second clock. One of a serializer fed from the FIFO buffer output, or a de-serializer feeding the FIFO buffer input, is operated with the second clock. Timing pulses indicate that the pointer operating with the second clock has reached a predetermined point in its cycle. The phase of the second clock is adjusted based on a relationship between the timing pulses and an advance period of the pointer operating with the first clock. The pointer operating with the first clock is reset to achieve a desired value for the relationship. A skew created from adjusting the phase of the second clock is corrected.
-
Publication No.: US20210117806A1
Publication Date: 2021-04-22
Application No.: US17138709
Application Date: 2020-12-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Chao Liu , Daniel Isamu Lowell , Wen Heng Chung , Jing Zhang
Abstract: A technique for manipulating a generic tensor is provided. The technique includes receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor, responsive to the first request, performing the first operation on the generic tensor descriptor, receiving a second request to perform a second operation on generic tensor raw data associated with the generic tensor, and responsive to the second request, performing the second operation on the generic tensor raw data, the performing the second operation including mapping a tensor coordinate specified by the second request to a memory address, the mapping including evaluating a delta function to determine an address delta value to add to a previously determined address for a previously processed tensor coordinate.
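The incremental address mapping in this abstract can be sketched for the simple case of a strided tensor. The abstract does not specify the form of the delta function, so the stride-based `address_delta` below is an assumption, as are the names `AddressTracker` and `move_to`.

```python
from typing import Sequence, Tuple


def address_delta(strides: Sequence[int], prev_coord: Sequence[int],
                  coord: Sequence[int]) -> int:
    """Address change when moving from prev_coord to coord.

    Assumes a plain strided layout; a generic tensor descriptor could
    use a more elaborate delta function.
    """
    return sum(s * (c - p) for s, c, p in zip(strides, coord, prev_coord))


class AddressTracker:
    """Keeps the last coordinate and address, and updates them by a delta."""

    def __init__(self, strides: Sequence[int], base: int = 0) -> None:
        self.strides: Tuple[int, ...] = tuple(strides)
        self.coord: Tuple[int, ...] = (0,) * len(self.strides)
        self.address = base  # address of the all-zero coordinate

    def move_to(self, coord: Sequence[int]) -> int:
        # Add the delta to the previously determined address instead of
        # recomputing the full coordinate-to-address mapping.
        self.address += address_delta(self.strides, self.coord, coord)
        self.coord = tuple(coord)
        return self.address


tracker = AddressTracker(strides=(12, 4, 1), base=100)
first = tracker.move_to((1, 2, 3))   # 100 + 12 + 8 + 3
second = tracker.move_to((1, 2, 4))  # one step along the last axis
```

When consecutive coordinates differ in only one axis, as in the second call, the delta reduces to a single stride multiple.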
-
Publication No.: US20210117196A1
Publication Date: 2021-04-22
Application No.: US16660495
Application Date: 2019-10-22
Applicant: ADVANCED MICRO DEVICES, INC. , ATI TECHNOLOGIES ULC
Inventor: Arun A. NAIR , Michael ESTLICK , Erik SWANSON , Sneha V. DESAI , Donglin JI
Abstract: A floating point unit includes a non-pickable scheduler queue (NSQ) that offers a load operation concurrently with a load store unit retrieving load data for an operand that is to be loaded by the load operation. The floating point unit also includes a renamer that renames architectural registers used by the load operation and allocates physical register numbers to the load operation in response to receiving the load operation from the NSQ. The floating point unit further includes a set of pickable scheduler queues that receive the load operation from the renamer and store the load operation prior to execution. A physical register file is implemented in the floating point unit and a free list is used to store physical register numbers of entries in the physical register file that are available for allocation.
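The interaction between the renamer and the free list can be sketched in software. This is a minimal illustration of generic register renaming, not the patented design: the class name `Renamer`, its methods, and the FIFO ordering of the free list are assumptions, and the scheduler queues are omitted.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class Renamer:
    """Maps architectural registers to physical register numbers (PRNs)."""

    def __init__(self, num_physical: int) -> None:
        # The free list holds PRNs of physical register file entries that
        # are available for allocation.
        self.free_list: Deque[int] = deque(range(num_physical))
        self.mapping: Dict[str, int] = {}

    def rename_dest(self, arch_reg: str) -> Tuple[int, Optional[int]]:
        """Allocate a new PRN for a destination register.

        Returns (new PRN, previous PRN or None) so that the previous PRN
        can be released once it is no longer needed.
        """
        if not self.free_list:
            raise RuntimeError("no free physical registers")
        prn = self.free_list.popleft()
        previous = self.mapping.get(arch_reg)
        self.mapping[arch_reg] = prn
        return prn, previous

    def release(self, prn: int) -> None:
        """Return a PRN to the free list."""
        self.free_list.append(prn)


renamer = Renamer(num_physical=2)
load1 = renamer.rename_dest("f0")  # first load writing f0
load2 = renamer.rename_dest("f0")  # second load writing f0
renamer.release(load2[1])          # the older mapping of f0 is freed
load3 = renamer.rename_dest("f1")
```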
-
Publication No.: US20210111861A1
Publication Date: 2021-04-15
Application No.: US17128720
Application Date: 2020-12-21
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Varun Gupta , Milam Paraschou , Gerald R. Talbot , Gurunath Dollin , Damon Tohidi , Eric Ian Carpenter , Chad S. Gallun , Jeffrey Cooper , Hanwoo Cho , Thomas H. Likens, III , Scott F. Dow , Michael J. Tresidder
Abstract: Systems, apparatuses, and methods for implementing a deskewing method for a physical layer interface on a multi-chip module are disclosed. A circuit connected to a plurality of communication lanes trains each lane to synchronize a local clock of the lane with a corresponding global clock at a beginning of a timing window. Next, the circuit symbol rotates each lane by a single step responsive to determining that all of the plurality of lanes have an incorrect symbol alignment. Responsive to determining that some but not all of the plurality of lanes have a correct symbol alignment, the circuit symbol rotates lanes which have an incorrect symbol alignment by a single step. When the end of the timing window has been reached, the circuit symbol rotates lanes which have a correct symbol alignment and adjusts a phase of a corresponding global clock to compensate for missed symbol rotations.
-
Publication No.: US10970120B2
Publication Date: 2021-04-06
Application No.: US16019374
Application Date: 2018-06-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas Malaya , Yasuko Eckert
Abstract: Methods and systems for opportunistic load balancing in deep neural networks (DNNs) using metadata. Representative computational costs are captured, obtained or determined for a given architectural, functional or computational aspect of a DNN system. The representative computational costs are implemented as metadata for the given architectural, functional or computational aspect of the DNN system. In an implementation, the computed computational cost is implemented as the metadata. A scheduler detects whether there are neurons in subsequent layers that are ready to execute. The scheduler uses the metadata and neuron availability to schedule and load balance across compute resources and available resources.
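One way to picture cost-metadata-driven load balancing is a greedy assignment of ready neurons to the least-loaded compute unit. The abstract does not state the scheduling policy, so the longest-cost-first greedy heuristic below is a swapped-in, well-known technique used only for illustration; the function name `schedule` and the cost dictionary format are hypothetical.

```python
import heapq
from typing import Dict, List


def schedule(ready_costs: Dict[str, int], num_units: int) -> Dict[int, List[str]]:
    """Assign ready work items to compute units using cost metadata.

    ready_costs maps each ready neuron (or layer fragment) to its
    representative computational cost. Items are placed, most expensive
    first, on whichever unit currently has the smallest total load.
    """
    heap = [(0, unit) for unit in range(num_units)]  # (current load, unit id)
    heapq.heapify(heap)
    assignment: Dict[int, List[str]] = {unit: [] for unit in range(num_units)}
    # Sort by descending cost, then by name for a deterministic order.
    for name, cost in sorted(ready_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, unit = heapq.heappop(heap)
        assignment[unit].append(name)
        heapq.heappush(heap, (load + cost, unit))
    return assignment


plan = schedule({"a": 5, "b": 3, "c": 2, "d": 2}, num_units=2)
```

In this example both units end with a total cost of 7, which shows how the metadata lets the scheduler even out load without running the work first.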
-
Publication No.: US20210098419A1
Publication Date: 2021-04-01
Application No.: US16585480
Application Date: 2019-09-27
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Skyler J. SALEH , Ruijin WU , Milind S. BHAGAVAT , Rahul AGARWAL
IPC: H01L23/00 , H01L25/065 , G06F8/41
Abstract: Various multi-die arrangements and methods of manufacturing the same are disclosed. In some embodiments, a method of manufacture includes a face-to-face process in which a first GPU chiplet and a second GPU chiplet are bonded to a temporary carrier wafer. A face surface of an active bridge chiplet is bonded to a face surface of the first and second GPU chiplets before mounting the GPU chiplets to a carrier substrate. In other embodiments, a method of manufacture includes a face-to-back process in which a face surface of an active bridge chiplet is bonded to a back surface of the first and second GPU chiplets.