-
Publication number: US09336004B2
Publication date: 2016-05-10
Application number: US13781403
Application date: 2013-02-28
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: John M. King
CPC classification number: G06F9/384 , G06F9/30098 , G06F9/3857 , G06F9/3863
Abstract: The present invention provides a method and apparatus for checkpointing registers for transactional memory. Some embodiments of the apparatus include first rename logic configured to map up to a predetermined number of architectural registers to corresponding first physical registers that hold first values associated with the architectural registers. The mapping is responsive to a transaction modifying one or more of the first values associated with the architectural registers. Some embodiments of the apparatus also include microcode configured to write contents of the first physical registers to a memory in response to the transaction modifying first values associated with a number of the architectural registers that is larger than the predetermined number.
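The two-tier scheme in this abstract can be illustrated with a toy software model. This is a minimal sketch, not the patented hardware: the class name `RegisterCheckpoint`, the `limit` parameter, and the choice to snapshot every register on overflow are assumptions made for illustration.

```python
class RegisterCheckpoint:
    """Toy model of checkpointing registers for a transaction (illustrative only)."""

    def __init__(self, regs, limit):
        self.regs = dict(regs)   # architectural register name -> current value
        self.limit = limit       # assumed capacity of the rename-based checkpoint
        self.saved = {}          # register name -> pre-transaction value
        self.spilled = None      # full snapshot written by the "microcode" path

    def write(self, name, value):
        first_write = name not in self.saved
        if self.spilled is None and first_write:
            if len(self.saved) < self.limit:
                # Fast path: preserve the old value, as a rename map would.
                self.saved[name] = self.regs[name]
            else:
                # Overflow path (assumption): copy all pre-transaction values
                # to memory. Unmodified registers still hold their old values.
                self.spilled = {**self.regs, **self.saved}
        self.regs[name] = value

    def abort(self):
        # Restore pre-transaction values from whichever checkpoint is active.
        snapshot = self.spilled if self.spilled is not None else self.saved
        self.regs.update(snapshot)
        self.saved, self.spilled = {}, None


cp = RegisterCheckpoint({"rax": 1, "rbx": 2, "rcx": 3}, limit=2)
cp.write("rax", 10)
cp.write("rbx", 20)   # still within the limit
cp.write("rcx", 30)   # third modified register triggers the spill
cp.abort()            # registers return to their pre-transaction values
```

The model keeps the cheap path cheap: no memory traffic occurs until a transaction modifies more registers than `limit`.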
-
Publication number: US20240330185A1
Publication date: 2024-10-03
Application number: US18127054
Application date: 2023-03-28
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: John M. King
IPC: G06F12/0815
CPC classification number: G06F12/0815
Abstract: A buffer of a processing system allows younger stores to write to a data cache before an older store completes its write operation to the data cache while maintaining the appearance of committing stores in program order. To maintain the appearance that a blocked store completed its write operation to the data cache, the processing system cancels the blocked store while “locking” the cache line in the data cache in an exclusive state to which the blocked store is attempting to write. The data cache negatively acknowledges any probes to the cache line until the blocked store has completed the write operation. The buffer thus decouples completing the write operation from global observability of the write operation.
-
Publication number: US11847463B2
Publication date: 2023-12-19
Application number: US16585973
Application date: 2019-09-27
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Kai Troester , Scott Thomas Bingham , John M. King , Michael Estlick , Erik Swanson , Robert Weidner
CPC classification number: G06F9/3861 , G06F9/30036 , G06F9/30038 , G06F9/30043 , G06F9/3887 , G06F9/30018
Abstract: A processor includes a load/store unit and an execution pipeline to execute an instruction that represents a single-instruction-multiple-data (SIMD) operation, and which references a memory block storing operand data for one or more lanes of a plurality of lanes and a mask vector indicating which lanes of a plurality of lanes are enabled and which are disabled for the operation. The execution pipeline executes an instruction in a first execution mode unless a memory fault is generated during execution of the instruction in the first execution mode. In response to the memory fault, the execution pipeline re-executes the instruction in a second execution mode. In the first execution mode, a single load operation is attempted to access the memory block via the load/store unit. In the second execution mode, a separate load operation is performed by the load/store unit for each enabled lane of the plurality of lanes prior to executing the SIMD operation.
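The two execution modes in this abstract can be sketched in software. This is an illustrative model only: memory is a dictionary in which a missing address stands for an unmapped location, and the names `MemoryFault`, `load`, and `masked_load`, as well as zero-filling of disabled lanes, are assumptions rather than details from the patent.

```python
class MemoryFault(Exception):
    """Raised when an access touches an unmapped address (toy model)."""


def load(mem, addr):
    if addr not in mem:
        raise MemoryFault(addr)
    return mem[addr]


def masked_load(mem, base, mask):
    """Return (lane values, mode used) for a masked load of len(mask) lanes."""
    try:
        # First mode: one wide access that covers every lane, enabled or not.
        block = [load(mem, base + i) for i in range(len(mask))]
        mode = "wide"
    except MemoryFault:
        # Second mode: one access per enabled lane. Disabled lanes are never
        # touched, so a fault that only affects them disappears.
        block = [load(mem, base + i) if on else None
                 for i, on in enumerate(mask)]
        mode = "per-lane"
    # Assumption: disabled lanes read as zero.
    return [v if on else 0 for v, on in zip(block, mask)], mode


mem = {0: 5, 1: 6, 2: 7}  # address 3 is unmapped
values, mode = masked_load(mem, 0, [True, True, True, False])
```

In this model a fault on an enabled lane still propagates from the second mode, which matches the idea that only faults caused by disabled lanes should be suppressed.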
-
Publication number: US20220100662A1
Publication date: 2022-03-31
Application number: US17547148
Application date: 2021-12-09
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King , Gregory W. Smaus
IPC: G06F12/0844 , G06F12/0877 , G06F9/52 , G06F12/0815
Abstract: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
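A toy model can make the table's role concrete. The abstract does not enumerate the states, so the two states below, the class name `LockContentionTable`, and the rule that the entry is dropped once the lock is acquired are all assumptions for illustration.

```python
from enum import Enum


class State(Enum):
    CONTENDED = 1        # address recorded after a contended lock instruction
    SPIN_EXCLUSIVE = 2   # a spin-loop load obtained the line in exclusive state


class LockContentionTable:
    """Per-core table of contended lock addresses (illustrative only)."""

    def __init__(self):
        self.entries = {}  # address -> State

    def record_contended_lock(self, addr):
        self.entries.setdefault(addr, State.CONTENDED)

    def spin_load(self, addr, got_exclusive):
        # Advance the state only for tracked addresses.
        if addr in self.entries and got_exclusive:
            self.entries[addr] = State.SPIN_EXCLUSIVE

    def probe(self, addr):
        # Negatively acknowledge probes while the core holds the semaphore
        # line exclusively and is about to acquire the lock.
        if self.entries.get(addr) is State.SPIN_EXCLUSIVE:
            return "NACK"
        return "ACK"

    def lock_acquired(self, addr):
        # Assumption: stop protecting the line once the lock is taken.
        self.entries.pop(addr, None)


table = LockContentionTable()
table.record_contended_lock(0x40)
table.spin_load(0x40, got_exclusive=True)
reply = table.probe(0x40)
```

Other cores that probe the address during this window receive a negative acknowledgment and must retry, so the cache line stays with the core that is about to acquire the semaphore.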
-
Publication number: US11216378B2
Publication date: 2022-01-04
Application number: US15268798
Application date: 2016-09-19
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King , Gregory W. Smaus
IPC: G06F12/08 , G06F12/0808 , G06F12/0815 , G06F12/0844 , G06F12/0877 , G06F9/52
Abstract: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
-
Publication number: US20190171452A1
Publication date: 2019-06-06
Application number: US15828708
Application date: 2017-12-01
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King
Abstract: A system and method for load fusion fuses small load operations into fewer, larger load operations. The system detects that a pair of adjacent operations are consecutive load operations, where the adjacent micro-operations refers to micro-operations flowing through adjacent dispatch slots and the consecutive load micro-operations refers to both of the adjacent micro-operations being load micro-operations. The consecutive load operations are then reviewed to determine if the data sizes are the same and if the load operation addresses are consecutive. The two load operations are then fused together to form one load micro-operation with twice the data size and one load data micro-operation with no load component.
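The fusion check described in this abstract can be sketched as a small function. This is a simplified model under stated assumptions: the `MicroOp` fields, the `"load_data"` kind name, and the restriction to ascending addresses are illustrative choices, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class MicroOp:
    kind: str   # "load", "load_data", or any other operation type
    addr: int   # byte address
    size: int   # access size in bytes


def fuse_loads(first, second):
    """Fuse two adjacent micro-ops if they qualify; otherwise return None."""
    # Both adjacent micro-operations must be loads.
    if first.kind != "load" or second.kind != "load":
        return None
    # The data sizes must match.
    if first.size != second.size:
        return None
    # The addresses must be consecutive (assumption: ascending order only).
    if second.addr != first.addr + first.size:
        return None
    # One load of twice the size, plus a data-only micro-op that performs
    # no memory access of its own.
    fused = MicroOp("load", first.addr, 2 * first.size)
    data_only = MicroOp("load_data", second.addr, second.size)
    return fused, data_only


pair = fuse_loads(MicroOp("load", 0x100, 8), MicroOp("load", 0x108, 8))
```

Here two 8-byte loads at `0x100` and `0x108` become one 16-byte load at `0x100` plus a data-only micro-op for the upper half.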
-
Publication number: US20190163475A1
Publication date: 2019-05-30
Application number: US15822515
Application date: 2017-11-27
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King
IPC: G06F9/30 , G06F9/38 , G06F9/48 , G06F12/0875 , G06F7/57
Abstract: Described herein is a system and method for store fusion that fuses small store operations into fewer, larger store operations. The system detects that a pair of adjacent operations are consecutive store operations, where the adjacent micro-operations refers to micro-operations flowing through adjacent dispatch slots and the consecutive store micro-operations refers to both of the adjacent micro-operations being store micro-operations. The consecutive store operations are then reviewed to determine if the data sizes are the same and if the store operation addresses are consecutive. The two store operations are then fused together to form one store operation with twice the data size and one store data HI operation.
-
Publication number: US11835988B2
Publication date: 2023-12-05
Application number: US15828708
Application date: 2017-12-01
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King
CPC classification number: G06F9/3004 , G06F9/24 , G06F9/3017 , G06F9/30021 , G06F9/30043 , G06F9/30181 , G06F9/34 , G06F9/384 , G06F9/3842 , G06F9/3867
Abstract: A system and method for load fusion fuses small load operations into fewer, larger load operations. The system detects that a pair of adjacent operations are consecutive load operations, where the adjacent micro-operations refers to micro-operations flowing through adjacent dispatch slots and the consecutive load micro-operations refers to both of the adjacent micro-operations being load micro-operations. The consecutive load operations are then reviewed to determine if the data sizes are the same and if the load operation addresses are consecutive. The two load operations are then fused together to form one load micro-operation with twice the data size and one load data micro-operation with no load component.
-
Publication number: US11768771B2
Publication date: 2023-09-26
Application number: US17547148
Application date: 2021-12-09
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King , Gregory W. Smaus
IPC: G06F12/08 , G06F12/0808 , G06F12/0815 , G06F12/0844 , G06F12/0877 , G06F9/52
CPC classification number: G06F12/0844 , G06F9/522 , G06F12/0815 , G06F12/0877 , G06F2212/1008 , G06F2212/1016 , G06F2212/1032
Abstract: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
-
Publication number: US11175916B2
Publication date: 2021-11-16
Application number: US15846457
Application date: 2017-12-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Gregory W. Smaus , John M. King
IPC: G06F9/30
Abstract: A system and method for a lightweight fence is described. In particular, micro-operations including a fencing micro-operation are dispatched to a load queue. The fencing micro-operation allows micro-operations younger than the fencing micro-operation to execute, where the micro-operations are related to a type of fencing micro-operation. The fencing micro-operation is executed if the fencing micro-operation is the oldest memory access micro-operation, where the oldest memory access micro-operation is related to the type of fencing micro-operation. The fencing micro-operation determines whether micro-operations younger than the fencing micro-operation have load ordering violations and if load ordering violations are detected, the fencing micro-operation signals the retire queue that instructions younger than the fencing micro-operation should be flushed. The instructions to be flushed should include all micro-operations with load ordering violations.
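The decision the fencing micro-operation makes can be sketched as follows. This is a rough model, not the patented design: the queue is a list of dictionaries, a smaller `age` means an older micro-operation, and the `violation` flag stands in for whatever mechanism detects load ordering violations, which the abstract does not specify.

```python
def execute_fence(fence_age, load_queue):
    """Decide what a lightweight fence does when it is considered for execution.

    load_queue holds pending memory micro-ops as dicts with an 'age' key
    (smaller is older) and a boolean 'violation' key (hypothetical flag).
    """
    # The fence executes only once it is the oldest memory access micro-op.
    if any(op["age"] < fence_age for op in load_queue):
        return "wait"
    # Younger micro-ops were allowed to run ahead; check them afterwards.
    younger = [op for op in load_queue if op["age"] > fence_age]
    if any(op["violation"] for op in younger):
        # Signal that everything younger than the fence should be flushed.
        return "flush_younger"
    return "complete"


queue = [
    {"age": 6, "violation": False},
    {"age": 7, "violation": True},
]
outcome = execute_fence(5, queue)
```

The fence stays lightweight in this model because younger loads are not stalled; they pay a flush only when a violation is actually observed.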