-
公开(公告)号:US20220100813A1
公开(公告)日:2022-03-31
申请号:US17032314
申请日:2020-09-25
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Sateesh LAGUDU , Allen H. RUSH , Michael MANTOR , Arun Vaidyanathan ANANTHANARAYAN , Prasad NAGABHUSHANAMGARI
Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.
-
公开(公告)号:US20220100686A1
公开(公告)日:2022-03-31
申请号:US17548385
申请日:2021-12-10
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam , Eric Christopher Morton , Bryan P. Broussard , Paul James Moyer , William Louie Walker
Abstract: Systems, apparatuses, and methods for routing interrupts on a coherency probe network are disclosed. A computing system includes a plurality of processing nodes, a coherency probe network, and one or more control units. The coherency probe network carries coherency probe messages between coherent agents. Interrupts that are detected by a control unit are converted into messages that are compatible with coherency probe messages and then routed to a target destination via the coherency probe network. Interrupts are generated with a first encoding while coherency probe messages have a second encoding. Cache subsystems determine whether a message received via the coherency probe network is an interrupt message or a coherency probe message based on an encoding embedded in the received message. Interrupt messages are routed to interrupt controller(s) while coherency probe messages are processed in accordance with a coherence probe action field embedded in the message.
-
公开(公告)号:US20220100664A1
公开(公告)日:2022-03-31
申请号:US17132769
申请日:2020-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Masab Ahmad , Derrick Allen Aguren
IPC: G06F12/0862
Abstract: A system and method for efficiently processing memory requests are described. A processing unit includes at least a processor core, a cache, and a non-cache storage buffer capable of storing data prevented from being stored in the cache. While processing a memory request targeting the non-cache storage buffer, the processor core inspects a flag stored in a tag of the memory request. The processor core prevents data prefetching into one or more of the non-cache storage buffer and the cache based on determining the flag specifies preventing data prefetching into one or more of the non-cache storage buffer and the cache using the target address of the memory request during processing of this instance of the memory request. While processing a prefetch hint instruction, the processor core determines from the tag whether to prevent prefetching.
-
公开(公告)号:US20220100663A1
公开(公告)日:2022-03-31
申请号:US17116950
申请日:2020-12-09
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Robert B. COHEN , Tzu-Wei LIN , Anthony J. BYBELL , Sudherssen KALAISELVAN , James MOSSMAN
IPC: G06F12/0855
Abstract: A processor employs a plurality of op cache pipelines to concurrently provide previously decoded operations to a dispatch stage of an instruction pipeline. In response to receiving a first branch prediction at a processor, the processor selects a first op cache pipeline of the plurality of op cache pipelines of the processor based on the first branch prediction, and provides a first set of operations associated with the first branch prediction to the dispatch queue via the selected first op cache pipeline.
-
公开(公告)号:US20220100567A1
公开(公告)日:2022-03-31
申请号:US17120215
申请日:2020-12-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Guhan Krishnan
Abstract: An electronic device includes a memory; a plurality of clients; at least one arbiter circuit; and a management circuit. A given client of the plurality of clients communicates a request to the management circuit requesting an allocation of memory access bandwidth for accesses of the memory by the given client. The management circuit then determines, based on the request, a set of memory access bandwidths including a respective memory access bandwidth for each of the given client and other clients of the plurality of clients that are allocated memory access bandwidth. The management circuit next configures the at least one arbiter circuit to use respective memory access bandwidths from the set of memory access bandwidths for the given client and the other clients for subsequent accesses of the memory.
-
公开(公告)号:US20220100504A1
公开(公告)日:2022-03-31
申请号:US17032301
申请日:2020-09-25
Applicant: ADVANCED MICRO DEVICES, INC. , ATI TECHNOLOGIES ULC
Inventor: Benjamin TSIEN , Alexander J. BRANOVER , John PETRY , Chen-Ping YANG , Rostyslav KYRYCHYNSKYI , Vydhyanathan KALYANASUNDHARAM
IPC: G06F9/30 , G06F9/4401 , G06F9/38 , G06F9/46 , G06F13/40
Abstract: A processing system that includes a shared data fabric resets a first client processor while operating a second client processor. The first client processor is instructed to stop making requests to one or more devices of the shared data fabric. Status communications are blocked between the first client processor and a memory controller, the second client processor, or both, such that the first client processor enters a temporary offline state. The first client processor is indicated as being non-coherent. Accordingly, when the processor is reset some errors and efficiency losses due messages sent during or prior to the reset are prevented.
-
公开(公告)号:US20220100249A1
公开(公告)日:2022-03-31
申请号:US17127681
申请日:2020-12-18
Applicant: Advanced Micro Devices, Inc.
Inventor: Sriram Sambamurthy , Sriram Sundaram , Indrani Paul , Larry David Hewitt , Anil Harwani , Aaron Joseph Grenat , Dana Glenn Lewis , Leonardo Piga , Wonje Choi , Karthik Rao
Abstract: A system and method for updating power supply voltages due to variations from aging are described. A functional unit includes a power supply monitor capable of measuring power supply variations in a region of the functional unit. An age counter measures an age of the functional unit. A control unit notifies the power supply monitor to measure an operating voltage reference. When the control unit receives a measured operating voltage reference, the control unit determines an updated age of the region different from the current age based on the measured operating voltage reference. The control unit updates the age counter with the corresponding age, which is younger than the previous age in some cases due to the region not experiencing predicted stress and aging. The control unit is capable of determining a voltage adjustment for the operating voltage reference based on an age indicated by the age counter.
-
公开(公告)号:US20220093504A1
公开(公告)日:2022-03-24
申请号:US17030830
申请日:2020-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Richard T. Schultz , John J. Wuu
IPC: H01L23/528 , H01L27/11 , H01L23/522 , G11C11/418 , G11C11/419 , H01L29/06 , H01L29/423 , H01L29/786
Abstract: A layout for a 6T SRAM cell is disclosed. The cell layout takes a conventional 6T SRAM cell layout and restructures the layout into a more square cell layout with a single p-channel and a single n-channel across the width of the cell. Restructuring the cell layout reduces the height of wordlines and allows dual wordlines to be placed in the cell to reduce wordline resistance in the cell. Dual pairs of bitlines may also be placed in separate metal layers in the cell layout to reduce bitline resistance.
-
公开(公告)号:US20220091980A1
公开(公告)日:2022-03-24
申请号:US17031706
申请日:2020-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Onur Kayiran , Yasuko Eckert , Mark Henry Oskin , Gabriel H. Loh , Steven E. Raasch , Maxim V. Kazakov
IPC: G06F12/0811 , G06F12/084 , G06F12/0877 , G06F13/16 , G06F11/30
Abstract: A system and method for efficiently processing memory requests are described. A computing system includes multiple compute units, multiple caches of a memory hierarchy and a communication fabric. A compute unit generates a memory access request that misses in a higher level cache, which sends a miss request to a lower level shared cache. During servicing of the miss request, the lower level cache merges identification information of multiple memory access requests targeting a same cache line from multiple compute units into a merged memory access response. The lower level shared cache continues to insert information into the merged memory access response until the lower level shared cache is ready to issue the merged memory access response. An intermediate router in the communication fabric broadcasts the merged memory access response into multiple memory access responses to send to corresponding compute units.
-
公开(公告)号:US20220091880A1
公开(公告)日:2022-03-24
申请号:US17031424
申请日:2020-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu , Marcus Nathaniel Chow , Matthew D. Sinclair , Bradford M. Beckmann , David A. Wood
Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.
-
-
-
-
-
-
-
-
-