Multi-modal gather operation
    1.
    Granted patent

    Publication No.: US11842200B2

    Publication date: 2023-12-12

    Application No.: US16586247

    Filing date: 2019-09-27

    Abstract: An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit. The load store unit ignores exceptions or faults while performing the gather operation in the first mode and transitions to a second mode in response to an exception or fault. Two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
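
    The two-mode dispatch described in this abstract can be sketched as a small software model. This is an illustration under stated assumptions, not the patented hardware: the function name `gather`, the dictionary used as memory, the use of a missing address to stand in for a fault, and the choice to stay in the second mode for the rest of the operation are all hypothetical.

```python
def gather(memory, addresses):
    """Model a gather that dispatches two lanes per cycle (first mode) and
    falls back to one lane per cycle (second mode) after a fault.

    memory: dict mapping address -> value; a missing address models a fault
            (an assumption made for this sketch).
    Returns (values, cycles, faulted_addresses).
    """
    values = [None] * len(addresses)
    faults = []
    cycles = 0
    lanes = 2  # first mode: two lanes per clock cycle
    i = 0
    while i < len(addresses):
        batch = list(range(i, min(i + lanes, len(addresses))))
        cycles += 1
        missing = [k for k in batch if addresses[k] not in memory]
        if lanes == 2 and missing:
            # First mode does not report the fault; it only triggers the
            # transition to the second mode, and the batch is replayed.
            lanes = 1
            continue
        for k in batch:
            if addresses[k] in memory:
                values[k] = memory[addresses[k]]
            else:
                faults.append(addresses[k])  # second mode records the fault
        i += len(batch)
    return values, cycles, faults


memory = {0: 10, 1: 11, 2: 12, 3: 13}
print(gather(memory, [0, 1, 2, 3]))
print(gather(memory, [0, 1, 9, 3]))
```

    In this model a fault-free gather of four elements takes two cycles, while a fault in the second pair costs one wasted cycle plus one cycle per remaining element.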

    LOCK ADDRESS CONTENTION PREDICTOR
    2.
    Patent application

    Publication No.: US20180081544A1

    Publication date: 2018-03-22

    Application No.: US15273304

    Filing date: 2016-09-22

    Abstract: Techniques for selectively executing a lock instruction speculatively or non-speculatively based on lock address prediction and/or temporal lock prediction, including methods and devices for locking an entry in a memory device. In some techniques, a lock instruction executed by a thread for a particular memory entry of a memory device is detected. Whether contention occurred for the particular memory entry during an earlier speculative lock is detected on a condition that the lock instruction comprises a speculative lock instruction. The lock is executed non-speculatively if contention occurred for the particular memory entry during an earlier speculative lock. The lock is executed speculatively if contention did not occur for the particular memory entry during an earlier speculative lock.
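
    The address-based decision in this abstract can be sketched as a table of lock addresses that saw contention during an earlier speculative lock. This is a minimal model, not the patented design: the class `LockContentionPredictor`, its method names, and the unbounded set used as the table are hypothetical, and temporal lock prediction is not modeled.

```python
class LockContentionPredictor:
    """Remember lock addresses that saw contention while locked speculatively."""

    def __init__(self):
        # Hypothetical unbounded table; real hardware would use a small,
        # fixed-size structure.
        self._contended = set()

    def choose_mode(self, address):
        """Pick the execution mode for a lock instruction on this address."""
        if address in self._contended:
            return "non-speculative"
        return "speculative"

    def record_outcome(self, address, mode, contention):
        """Update the table after the lock instruction has executed."""
        if mode == "speculative" and contention:
            self._contended.add(address)


predictor = LockContentionPredictor()
mode = predictor.choose_mode(0x40)  # no history yet
predictor.record_outcome(0x40, mode, contention=True)
print(mode, predictor.choose_mode(0x40))
```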

    UNIFIED STORE QUEUE
    3.
    Patent application
    Status: under examination (published)

    Publication No.: US20150121010A1

    Publication date: 2015-04-30

    Application No.: US14067564

    Filing date: 2013-10-30

    CPC classification number: G06F12/0875 G06F9/3826 G06F9/3834 Y02D10/13

    Abstract: Embodiments herein provide for improved store-to-load-forwarding (STLF) logic and linear aliasing effect reduction logic. In one embodiment, a load instruction to be executed is selected. Whether a first linear address associated with said load instruction matches a linear address of a store instruction of a plurality of store instructions in a queue is determined. Data associated with said store instruction for executing said load instruction is forwarded, in response to determining that the first linear address matches the linear address of the store instruction.
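
    The address match and forward step in this abstract can be sketched as a search of the store queue. The names `Store` and `forward_from_store_queue` are hypothetical, and picking the youngest matching store is an assumption; the abstract only says that a matching store's data is forwarded.

```python
from collections import namedtuple

# Hypothetical store-queue entry: a linear address and the data to be written.
Store = namedtuple("Store", ["linear_address", "data"])


def forward_from_store_queue(store_queue, load_linear_address):
    """Return the data of a queued store whose linear address matches the load.

    store_queue is ordered oldest to youngest. The youngest match wins
    (an assumption of this sketch). Returns None when no store matches.
    """
    for store in reversed(store_queue):
        if store.linear_address == load_linear_address:
            return store.data
    return None


queue = [Store(0x100, 1), Store(0x200, 2), Store(0x100, 3)]
print(forward_from_store_queue(queue, 0x100))
```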

    Lock address contention predictor

    Publication No.: US11868818B2

    Publication date: 2024-01-09

    Application No.: US15273304

    Filing date: 2016-09-22

    CPC classification number: G06F9/52 G06F9/50

    Abstract: Techniques for selectively executing a lock instruction speculatively or non-speculatively based on lock address prediction and/or temporal lock prediction, including methods and devices for locking an entry in a memory device. In some techniques, a lock instruction executed by a thread for a particular memory entry of a memory device is detected. Whether contention occurred for the particular memory entry during an earlier speculative lock is detected on a condition that the lock instruction comprises a speculative lock instruction. The lock is executed non-speculatively if contention occurred for the particular memory entry during an earlier speculative lock. The lock is executed speculatively if contention did not occur for the particular memory entry during an earlier speculative lock.

    Configurable skewed associativity in a translation lookaside buffer

    Publication No.: US11106596B2

    Publication date: 2021-08-31

    Application No.: US15389955

    Filing date: 2016-12-23

    Abstract: Methods, devices, and systems for determining an address in a physical memory which corresponds to a virtual address using a skewed-associative translation lookaside buffer (TLB) are described. A virtual address and a configuration indication are received using receiver circuitry. A physical address corresponding to the virtual address is output if a TLB hit occurs. A first subset of a plurality of ways of the TLB is configured to hold a first page size. The first subset includes a number of the ways based on the configuration indication. A physical address corresponding to the virtual address is retrieved from a page table if a TLB miss occurs, and at least a portion of the physical address is installed in a least recently used way of a subset of a plurality of ways of the TLB, determined according to a replacement policy based on the configuration indication.
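
    The configurable split of ways between page sizes can be sketched as a toy skewed-associative TLB. Everything specific here is an assumption made for illustration: the class `SkewedTLB`, the two page sizes, the per-way index hash, the `small_ways` parameter standing in for the configuration indication, and the timestamp-based least-recently-used choice.

```python
PAGE_SHIFT = {"4K": 12, "2M": 21}  # assumed page sizes for this sketch


class SkewedTLB:
    def __init__(self, num_ways=4, sets=4, small_ways=2):
        self.sets = sets
        # Configuration indication (assumed form): the first `small_ways`
        # ways hold 4K pages, the remaining ways hold 2M pages.
        self.way_size = ["4K" if w < small_ways else "2M" for w in range(num_ways)]
        # Each entry is (virtual page number, physical frame number, last use).
        self.ways = [[None] * sets for _ in range(num_ways)]
        self.clock = 0

    def _index(self, way, vpn):
        # Skewing: each way hashes the page number differently (toy hash).
        return (vpn ^ (vpn >> (way + 1))) % self.sets

    def lookup(self, vaddr):
        """Return the physical address on a hit, or None on a miss."""
        self.clock += 1
        for way, size in enumerate(self.way_size):
            shift = PAGE_SHIFT[size]
            vpn = vaddr >> shift
            idx = self._index(way, vpn)
            entry = self.ways[way][idx]
            if entry is not None and entry[0] == vpn:
                self.ways[way][idx] = (vpn, entry[1], self.clock)
                return (entry[1] << shift) | (vaddr & ((1 << shift) - 1))
        return None

    def install(self, vaddr, paddr, size):
        """After a miss, install the translation in the subset for `size`."""
        self.clock += 1
        shift = PAGE_SHIFT[size]
        vpn = vaddr >> shift
        candidates = [w for w, s in enumerate(self.way_size) if s == size]

        def last_use(way):
            entry = self.ways[way][self._index(way, vpn)]
            return -1 if entry is None else entry[2]  # empty slots go first

        victim = min(candidates, key=last_use)  # least recently used way
        self.ways[victim][self._index(victim, vpn)] = (vpn, paddr >> shift, self.clock)


tlb = SkewedTLB()
tlb.install(0x1000, 0x5000, "4K")
print(hex(tlb.lookup(0x1234)))
```

    Changing `small_ways` changes how many ways serve each page size, which is the role the abstract assigns to the configuration indication.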

    TECHNIQUES FOR PERFORMING STORE-TO-LOAD FORWARDING

    Publication No.: US20210157590A1

    Publication date: 2021-05-27

    Application No.: US16698808

    Filing date: 2019-11-27

    Abstract: A technique for performing store-to-load forwarding is provided. The technique includes determining a virtual address for data to be loaded for the load instruction, identifying a matching store instruction from one or more store instruction memories by comparing a virtual-address-based comparison value for the load instruction to one or more virtual-address-based comparison values of one or more store instructions, determining a physical address for the load instruction, and validating the load instruction based on a comparison between the physical address of the load instruction and a physical address of the matching store instruction.
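
    The two stages in this abstract, a match on a virtual-address-based value followed by validation against physical addresses, can be sketched as follows. The names `StoreEntry`, `comparison_value`, `store_to_load_forward`, and `translate` are hypothetical, as is the choice of the low 12 address bits as the comparison value and of the youngest match as the winner.

```python
from dataclasses import dataclass


@dataclass
class StoreEntry:
    virtual_address: int
    physical_address: int
    data: int


def comparison_value(virtual_address, bits=12):
    """Assumed comparison value: the low `bits` bits of the virtual address."""
    return virtual_address & ((1 << bits) - 1)


def store_to_load_forward(stores, load_virtual_address, translate):
    """Return forwarded data, or None if there is no match or validation fails.

    stores is ordered oldest to youngest. translate is a hypothetical callback
    that maps a virtual address to a physical address.
    """
    key = comparison_value(load_virtual_address)
    # Stage 1: find a candidate store using only virtual-address bits.
    match = next(
        (s for s in reversed(stores) if comparison_value(s.virtual_address) == key),
        None,
    )
    if match is None:
        return None
    # Stage 2: validate the candidate by comparing physical addresses.
    if translate(load_virtual_address) != match.physical_address:
        return None
    return match.data


page_table = {0x1234: 0x9234, 0x5234: 0xA234}
stores = [StoreEntry(0x1234, 0x9234, 7)]
print(store_to_load_forward(stores, 0x1234, page_table.__getitem__))
```

    The load at `0x5234` in this example shares its low bits with the store, so it matches in stage 1 but is rejected in stage 2.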

    System and method for store fusion

    Publication No.: US10459726B2

    Publication date: 2019-10-29

    Application No.: US15822515

    Filing date: 2017-11-27

    Inventor: John M. King

    Abstract: Described herein is a system and method for store fusion that fuses small store operations into fewer, larger store operations. The system detects that a pair of adjacent micro-operations are consecutive store micro-operations, where "adjacent micro-operations" refers to micro-operations flowing through adjacent dispatch slots and "consecutive store micro-operations" refers to both of the adjacent micro-operations being store micro-operations. The consecutive store operations are then reviewed to determine whether the data sizes are the same and whether the store operation addresses are consecutive. If so, the two store operations are fused together to form one store operation with twice the data size and one store data HI operation.
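
    The size and address checks in this abstract can be sketched as a function over two store records. The names `StoreOp` and `try_fuse` are hypothetical. The sketch assumes ascending addresses and little-endian packing, and it models only the combined double-size store, not the separate store data HI operation.

```python
from dataclasses import dataclass


@dataclass
class StoreOp:
    address: int
    size: int  # size in bytes
    data: int


def try_fuse(first, second):
    """Fuse two adjacent stores into one of twice the size, or return None."""
    if first.size != second.size:
        return None  # data sizes differ
    if second.address != first.address + first.size:
        return None  # addresses are not consecutive (ascending only)
    # The second store supplies the high half of the fused data.
    fused_data = first.data | (second.data << (8 * first.size))
    return StoreOp(first.address, 2 * first.size, fused_data)


fused = try_fuse(StoreOp(0x100, 4, 0x11111111), StoreOp(0x104, 4, 0x22222222))
print(fused)
```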

    SYSTEM AND METHOD FOR VIRTUAL LOAD QUEUE
    8.
    Patent application

    Publication No.: US20190163471A1

    Publication date: 2019-05-30

    Application No.: US15824729

    Filing date: 2017-11-28

    Inventor: John M. King

    Abstract: A system and method for a virtual load queue is described. Load micro-operations are processed through an instruction pipeline without requiring an entry in a load queue (LDQ). An address generation scheduler queue (AGSQ) entry is allocated to the load micro-operation and a LDQ entry is not allocated to the load micro-operation. The LDQ entries are reserved for the N oldest load micro-operations, where N is the depth of the LDQ. Deallocation of the AGSQ entry is done if the load micro-operation is one of the N oldest load micro-operations, or upon successful completion of the load micro-operation. Deallocation of the AGSQ entry is not done if the load micro-operation gets a bad status and is not one of the N oldest micro-operations. Consequently, the AGSQ acts as a virtual queue for the LDQ and mitigates the limiting effect of the LDQ depth.
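
    The deallocation rule for AGSQ entries in this abstract reduces to a single predicate. The function name and its parameters are hypothetical, and ranking loads by age with 0 as the oldest is an assumption of this sketch.

```python
def can_deallocate_agsq_entry(load_age_rank, ldq_depth, completed_successfully):
    """Decide whether a load micro-operation may release its AGSQ entry.

    load_age_rank: position among outstanding loads, 0 being the oldest.
    ldq_depth: N, the depth of the load queue (LDQ).
    completed_successfully: False models a load that got a bad status.
    """
    is_among_n_oldest = load_age_rank < ldq_depth
    return is_among_n_oldest or completed_successfully


# A young load with a bad status keeps its AGSQ entry so it can be replayed.
print(can_deallocate_agsq_entry(load_age_rank=50, ldq_depth=44, completed_successfully=False))
```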

    Speculative retirement of post-lock instructions

    Publication No.: US10095637B2

    Publication date: 2018-10-09

    Application No.: US15267094

    Filing date: 2016-09-15

    Abstract: Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.
