-
Publication number: US11842200B2
Publication date: 2023-12-12
Application number: US16586247
Filing date: 2019-09-27
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: John M. King, Magiting Talisayon, Michael Estlick
CPC classification number: G06F9/3887, G06F9/3013, G06F9/30018, G06F9/30036, G06F9/30043, G06F9/3861
Abstract: An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit. The load store unit ignores exceptions or faults while performing the gather operation in the first mode and transitions to a second mode in response to an exception or fault. Two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
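The two-mode behavior in this abstract can be illustrated with a small behavioral model in Python. This is only a sketch under assumptions: the name `gather`, the dict-based memory (a missing key stands in for a fault), the cycle counting, and the choice to replay the faulting group in the second mode are all hypothetical and are not taken from the patent.

```python
class GatherFault(Exception):
    """Raised in the second (slow) mode when a lane's address is unmapped."""


def gather(memory, addresses):
    """Return (values, cycles, final_mode) for a gather over `addresses`.

    `memory` is a dict of address -> value; a missing address models a fault.
    """
    values = [None] * len(addresses)
    cycles = 0
    lane = 0
    mode = "fast"  # first mode: two lanes dispatched per cycle
    while lane < len(addresses):
        width = 2 if mode == "fast" else 1
        group = range(lane, min(lane + width, len(addresses)))
        cycles += 1
        faulted = [i for i in group if addresses[i] not in memory]
        if faulted:
            if mode == "fast":
                # First mode: the fault is not taken. Switch to the second
                # mode and replay this group one lane per cycle (assumption).
                mode = "slow"
                continue
            # Second mode: the fault is reported for the single lane in flight.
            raise GatherFault(f"lane {lane} faulted at address {addresses[lane]:#x}")
        for i in group:
            values[i] = memory[addresses[i]]
        lane += width
    return values, cycles, mode
```

With four mapped addresses the model finishes in two cycles and stays in the first mode; with an unmapped address it drops to the second mode and raises `GatherFault` for the offending lane.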
-
Publication number: US20180081544A1
Publication date: 2018-03-22
Application number: US15273304
Filing date: 2016-09-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Gregory W. Smaus, John M. King, Matthew A. Rafacz, Matthew M. Crum
IPC: G06F3/06, G06F12/084, G06F12/0842
Abstract: Techniques for selectively executing a lock instruction speculatively or non-speculatively based on lock address prediction and/or temporal lock prediction, including methods and devices for locking an entry in a memory device. In some techniques, a lock instruction executed by a thread for a particular memory entry of a memory device is detected. Whether contention occurred for the particular memory entry during an earlier speculative lock is detected on a condition that the lock instruction comprises a speculative lock instruction. The lock is executed non-speculatively if contention occurred for the particular memory entry during an earlier speculative lock. The lock is executed speculatively if contention did not occur for the particular memory entry during an earlier speculative lock.
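The address-based decision in this abstract can be sketched as a small predictor in Python. The class name `LockAddressPredictor`, its methods, and the unbounded set used as the history table are hypothetical; temporal lock prediction, which the abstract also mentions, is not modeled here.

```python
class LockAddressPredictor:
    """Remembers addresses that saw contention during a speculative lock."""

    def __init__(self):
        self._contended = set()  # hypothetical history table, unbounded here

    def choose_mode(self, address):
        # Contention during an earlier speculative lock on this address
        # means the next lock on it runs non-speculatively.
        if address in self._contended:
            return "non-speculative"
        return "speculative"

    def record_speculative_outcome(self, address, contention):
        # Called after a speculative lock finishes, with whether contention occurred.
        if contention:
            self._contended.add(address)
```

A lock on a fresh address is predicted speculative; once `record_speculative_outcome(addr, True)` has been called for an address, later locks on that address are predicted non-speculative.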
-
Publication number: US20150121010A1
Publication date: 2015-04-30
Application number: US14067564
Filing date: 2013-10-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: David A Kaplan, Daniel Hopper, John M. King, Jeff Rupley
CPC classification number: G06F12/0875, G06F9/3826, G06F9/3834, Y02D10/13
Abstract: Embodiments herein provide for improved store-to-load-forwarding (STLF) logic and linear aliasing effect reduction logic. In one embodiment, a load instruction to be executed is selected. Whether a first linear address associated with said load instruction matches a linear address of a store instruction of a plurality of store instructions in a queue is determined. Data associated with said store instruction for executing said load instruction is forwarded, in response to determining that the first linear address matches the linear address of the store instruction.
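The linear-address match at the core of this abstract can be sketched in Python as follows. The function name, the list-of-tuples store queue ordered oldest to youngest, and the youngest-first search are assumptions for illustration; the abstract does not state a search order, and the aliasing-reduction logic is not modeled.

```python
def forward_from_store_queue(store_queue, load_linear_address):
    """Return data forwarded from a matching store, or None if none matches.

    `store_queue` is a list of (linear_address, data) pairs, oldest first.
    """
    # Search youngest-first so the most recent matching store wins (assumption).
    for store_address, data in reversed(store_queue):
        if store_address == load_linear_address:
            return data
    return None
```

For a queue holding two stores to the same linear address, the load receives the data of the later store; a load to an address with no queued store gets `None` and would read memory instead.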
-
Publication number: US11868818B2
Publication date: 2024-01-09
Application number: US15273304
Filing date: 2016-09-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Gregory W. Smaus, John M. King, Matthew A. Rafacz, Matthew M. Crum
Abstract: Techniques for selectively executing a lock instruction speculatively or non-speculatively based on lock address prediction and/or temporal lock prediction, including methods and devices for locking an entry in a memory device. In some techniques, a lock instruction executed by a thread for a particular memory entry of a memory device is detected. Whether contention occurred for the particular memory entry during an earlier speculative lock is detected on a condition that the lock instruction comprises a speculative lock instruction. The lock is executed non-speculatively if contention occurred for the particular memory entry during an earlier speculative lock. The lock is executed speculatively if contention did not occur for the particular memory entry during an earlier speculative lock.
-
Publication number: US11106596B2
Publication date: 2021-08-31
Application number: US15389955
Filing date: 2016-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King, Michael T. Clark
IPC: G06F12/1027, G06F12/123, G06F12/127
Abstract: Methods, devices, and systems for determining an address in a physical memory which corresponds to a virtual address using a skewed-associative translation lookaside buffer (TLB) are described. A virtual address and a configuration indication are received using receiver circuitry. A physical address corresponding to the virtual address is output if a TLB hit occurs. A first subset of a plurality of ways of the TLB is configured to hold a first page size. The first subset includes a number of the ways based on the configuration indication. A physical address corresponding to the virtual address is retrieved from a page table if a TLB miss occurs, and at least a portion of the physical address is installed in a least recently used way of a subset of a plurality of ways of the TLB, determined according to a replacement policy based on the configuration indication.
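A toy model may help picture the configurable split of ways between page sizes. In the Python sketch below, every name and number is an assumption: `small_ways` stands in for the configuration indication, the two page sizes are 4 KiB and 2 MiB, the per-way index hash is made up, and recency is tracked per way rather than per set to keep the model short.

```python
class SkewedTLB:
    def __init__(self, num_ways=4, sets_per_way=16, small_ways=2):
        # Ways [0, small_ways) hold 4 KiB pages; the rest hold 2 MiB pages.
        self.page_shift = [12 if w < small_ways else 21 for w in range(num_ways)]
        self.sets = sets_per_way
        self.ways = [[None] * sets_per_way for _ in range(num_ways)]
        self.last_used = [0] * num_ways
        self.clock = 0

    def _index(self, way, vpn):
        # Hypothetical skewing function: each way hashes the page number differently.
        return (vpn ^ (vpn >> (way + 1))) % self.sets

    def lookup(self, vaddr):
        """Return the physical address on a hit, or None on a miss."""
        self.clock += 1
        for way, shift in enumerate(self.page_shift):
            vpn = vaddr >> shift
            entry = self.ways[way][self._index(way, vpn)]
            if entry is not None and entry[0] == vpn:
                self.last_used[way] = self.clock
                return (entry[1] << shift) | (vaddr & ((1 << shift) - 1))
        return None

    def install(self, vaddr, paddr, page_shift):
        """Install a translation after a page-table walk (walk not modeled)."""
        self.clock += 1
        # Only ways configured for this page size are replacement candidates.
        candidates = [w for w, s in enumerate(self.page_shift) if s == page_shift]
        way = min(candidates, key=lambda w: self.last_used[w])  # least recently used
        vpn = vaddr >> page_shift
        self.ways[way][self._index(way, vpn)] = (vpn, paddr >> page_shift)
        self.last_used[way] = self.clock
```

Changing `small_ways` changes how many ways serve each page size, which in turn changes which ways the replacement step may choose from. `install` assumes at least one way is configured for the requested page size.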
-
Publication number: US20210157590A1
Publication date: 2021-05-27
Application number: US16698808
Filing date: 2019-11-27
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King, Matthew T. Sobel
IPC: G06F9/30, G06F12/0811, G06F9/54
Abstract: A technique for performing store-to-load forwarding is provided. The technique includes determining a virtual address for data to be loaded for the load instruction, identifying a matching store instruction from one or more store instruction memories by comparing a virtual-address-based comparison value for the load instruction to one or more virtual-address-based comparison values of one or more store instructions, determining a physical address for the load instruction, and validating the load instruction based on a comparison between the physical address of the load instruction and a physical address of the matching store instruction.
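The match-then-validate flow in this abstract can be sketched in Python. The names `va_tag` and `store_to_load_forward`, the use of the low 12 virtual-address bits as the comparison value, the dict-based store records, and the `translate` callback are all assumptions made for illustration.

```python
def va_tag(virtual_address, bits=12):
    """Hypothetical virtual-address-based comparison value: the low `bits` bits."""
    return virtual_address & ((1 << bits) - 1)


def store_to_load_forward(stores, load_va, translate):
    """Return (data, validated) for a load at `load_va`.

    `stores` is a list of dicts with keys "va", "pa", "data", oldest first.
    `translate` maps a virtual address to a physical address.
    """
    tag = va_tag(load_va)
    # Step 1: pick a candidate store using only virtual-address-based values.
    match = next((s for s in reversed(stores) if va_tag(s["va"]) == tag), None)
    if match is None:
        return None, False
    # Step 2: once the load's physical address is known, validate the match.
    load_pa = translate(load_va)
    if load_pa != match["pa"]:
        return None, False  # the early match was a false positive
    return match["data"], True
```

Two virtual addresses that share the comparison value but map to different physical addresses match in step 1 and are rejected in step 2, which is the case the physical-address check exists to catch.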
-
Publication number: US10459726B2
Publication date: 2019-10-29
Application number: US15822515
Filing date: 2017-11-27
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King
IPC: G06F9/312, G06F9/46, G06F12/00, G06F7/57, G06F9/30, G06F9/38, G06F9/48, G06F12/0875, G06F8/41
Abstract: Described herein is a system and method for store fusion that fuses small store operations into fewer, larger store operations. The system detects that a pair of adjacent micro-operations are consecutive store micro-operations, where adjacent micro-operations are micro-operations flowing through adjacent dispatch slots and consecutive store micro-operations means that both of the adjacent micro-operations are store micro-operations. The consecutive store operations are then reviewed to determine if the data sizes are the same and if the store operation addresses are consecutive. The two store operations are then fused together to form one store operation with twice the data size and one store data HI operation.
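The fusion checks listed in this abstract can be sketched in Python. The `MicroOp` record, the `try_fuse` name, the `"store_data_hi"` label, and the assumption that "consecutive" means the second store begins exactly where the first one ends are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MicroOp:
    kind: str          # e.g. "store", "load", "store_data_hi"
    address: int = 0
    size: int = 0      # bytes
    data: int = 0


def try_fuse(first, second):
    """Fuse two adjacent micro-ops if they qualify; otherwise return None."""
    # Both adjacent micro-ops must be stores.
    if first.kind != "store" or second.kind != "store":
        return None
    # Data sizes must match.
    if first.size != second.size:
        return None
    # Addresses must be consecutive (ascending order assumed here).
    if second.address != first.address + first.size:
        return None
    # One store of twice the size, plus a store-data-HI op carrying the upper half.
    fused = MicroOp("store", first.address, 2 * first.size, first.data)
    data_hi = MicroOp("store_data_hi", data=second.data)
    return fused, data_hi
```

Two 4-byte stores to addresses 0x100 and 0x104 fuse into one 8-byte store at 0x100 plus a store-data-HI micro-op; a size mismatch or an address gap leaves the pair unfused.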
-
Publication number: US20190163471A1
Publication date: 2019-05-30
Application number: US15824729
Filing date: 2017-11-28
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King
Abstract: A system and method for a virtual load queue is described. Load micro-operations are processed through an instruction pipeline without requiring an entry in a load queue (LDQ). An address generation scheduler queue (AGSQ) entry is allocated to the load micro-operation and a LDQ entry is not allocated to the load micro-operation. The LDQ entries are reserved for the N oldest load micro-operations, where N is the depth of the LDQ. Deallocation of the AGSQ entry is done if the load micro-operation is one of the N oldest load micro-operations, or upon successful completion of the load micro-operation. Deallocation of the AGSQ entry is not done if the load micro-operation gets a bad status and is not one of the N oldest micro-operations. Consequently, the AGSQ acts as a virtual queue for the LDQ and mitigates the limiting effect of the LDQ depth.
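The AGSQ deallocation rule stated in this abstract reduces to a single predicate, sketched here in Python. The function and parameter names are hypothetical, and `load_age_rank` is assumed to count from 0 for the oldest in-flight load.

```python
def may_deallocate_agsq_entry(load_age_rank, ldq_depth, completed_ok):
    """Decide whether a load micro-op's AGSQ entry can be released.

    load_age_rank: 0 for the oldest in-flight load, 1 for the next, and so on.
    ldq_depth:     N, the number of LDQ entries.
    completed_ok:  True if the load completed successfully, False on bad status.
    """
    is_among_oldest = load_age_rank < ldq_depth
    # Released if the load is one of the N oldest (it owns an LDQ entry),
    # or if it finished successfully; otherwise it stays in the AGSQ.
    return is_among_oldest or completed_ok
```

A load with bad status that is not among the N oldest keeps its AGSQ entry, which is how the AGSQ ends up acting as a virtual extension of the LDQ.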
-
Publication number: US10095637B2
Publication date: 2018-10-09
Application number: US15267094
Filing date: 2016-09-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Gregory W. Smaus, John M. King, Michael D. Achenbach, Kevin M. Lepak, Matthew A. Rafacz, Noah Bamford
Abstract: Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.
-
Publication number: US20180074977A1
Publication date: 2018-03-15
Application number: US15267094
Filing date: 2016-09-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Gregory W. Smaus, John M. King, Michael D. Achenbach, Kevin M. Lepak, Matthew A. Rafacz, Noah Bamford
CPC classification number: G06F12/1466, G06F9/3004, G06F9/30043, G06F9/30087, G06F9/3834, G06F9/3859, G06F9/3863, G06F9/528, G06F2212/1052
Abstract: Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.