Retire queue compression
    1.
    Granted patent

    Publication No.: US11144324B2

    Publication date: 2021-10-12

    Application No.: US16586642

    Filing date: 2019-09-27

    IPC classification: G06F9/38

    Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.
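
The packing idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not say what the compression conditions are, so the `compressible` flag, the limit of two operations per entry, and the class names `Op` and `RetireQueue` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    compressible: bool  # hypothetical flag standing in for the unspecified conditions


@dataclass
class RetireQueue:
    capacity: int                 # number of retire queue entries
    ops_per_entry: int = 2        # assumed packing limit per entry
    entries: list = field(default_factory=list)

    def dispatch(self, op: Op) -> bool:
        """Track a dispatched op; return False when no entry is available."""
        last = self.entries[-1] if self.entries else None
        # Share the newest entry only if it has room and every op involved qualifies.
        if (last is not None
                and len(last) < self.ops_per_entry
                and op.compressible
                and all(o.compressible for o in last)):
            last.append(op)
            return True
        if len(self.entries) >= self.capacity:
            return False          # queue has run out of entries
        self.entries.append([op])
        return True

    def retire(self) -> list:
        """Retire the oldest entry and return every op packed into it."""
        return self.entries.pop(0) if self.entries else []
```

With a capacity of two entries, two compressible operations followed by a non-compressible one occupy only two entries in this model, whereas an uncompressed queue would already be full after the second operation.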

    TRACKING SOURCE AVAILABILITY FOR INSTRUCTIONS IN A SCHEDULER INSTRUCTION QUEUE
    2.
    Patent application
    Status: In force

    Publication No.: US20160041853A1

    Publication date: 2016-02-11

    Application No.: US14452923

    Filing date: 2014-08-06

    IPC classification: G06F9/54 G06F9/48

    Abstract: A processor includes an execution unit to execute instructions and a scheduler unit to store a queue of instructions for execution by the execution unit. The scheduler unit includes a wake array including a plurality of source slots to store source identifiers for sources associated with the instructions, a picker to schedule a particular instruction for execution in the execution unit, broadcast a destination identifier associated with the particular instruction to a first subset of the source slots, and a delay element to receive the destination identifier broadcast by the picker and communicate a delayed version of the destination identifier to a second subset of the source slots different from the first subset.
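
A cycle-level toy model may help picture the two-subset broadcast described in this abstract. It is a sketch, not the patented circuit: the one-cycle delay, the names `WakeArray`, `fast`, and `slow`, and the dictionary layout are assumptions made for illustration only.

```python
class WakeArray:
    """Source slots split into a directly woken subset and a delayed subset."""

    def __init__(self, fast_sources, slow_sources):
        # slot id -> [source tag, ready flag]
        self.fast = {i: [tag, False] for i, tag in fast_sources.items()}
        self.slow = {i: [tag, False] for i, tag in slow_sources.items()}
        self.delayed_tag = None  # contents of the (assumed one-cycle) delay element

    @staticmethod
    def _wake(slots, tag):
        if tag is None:
            return
        for slot in slots.values():
            if slot[0] == tag:
                slot[1] = True

    def cycle(self, broadcast_tag=None):
        # The second subset sees the tag that was broadcast in the previous cycle.
        self._wake(self.slow, self.delayed_tag)
        # The first subset sees this cycle's broadcast directly.
        self._wake(self.fast, broadcast_tag)
        self.delayed_tag = broadcast_tag

    def ready(self, slot_id):
        slots = self.fast if slot_id in self.fast else self.slow
        return slots[slot_id][1]
```

In this model a slot in the first subset becomes ready in the cycle of the broadcast, while a slot in the second subset waiting on the same destination identifier becomes ready one cycle later.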

    STORE-TO-LOAD FORWARDING
    3.
    Patent application

    Publication No.: US20210311737A1

    Publication date: 2021-10-07

    Application No.: US17324563

    Filing date: 2021-05-19

    IPC classification: G06F9/30 G06F9/38

    Abstract: An arithmetic unit performs store-to-load forwarding based on predicted dependencies between store instructions and load instructions. In some embodiments, the arithmetic unit maintains a table of store instructions that are awaiting movement to a load/store unit of the instruction pipeline. In response to receiving a load instruction that is predicted to be dependent on a store instruction stored at the table, the arithmetic unit causes the data associated with the store instruction to be placed into the physical register targeted by the load instruction. In some embodiments, the arithmetic unit performs the forwarding by mapping the physical register targeted by the load instruction to the physical register where the data associated with the store instruction is located.
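
The register-mapping variant of forwarding mentioned in this abstract can be sketched as follows. The model is a simplification under assumptions: the dependency prediction is supplied by the caller, the load's target is represented by an architectural register name in a rename map, and `ForwardingUnit` and its method names are hypothetical.

```python
class ForwardingUnit:
    """Toy model of store-to-load forwarding by register remapping."""

    def __init__(self):
        self.pending_stores = {}  # store id -> physical register holding the store data
        self.rename_map = {}      # architectural register -> physical register

    def add_store(self, store_id, data_preg):
        # The store waits here until it moves to the load/store unit.
        self.pending_stores[store_id] = data_preg

    def move_to_lsu(self, store_id):
        self.pending_stores.pop(store_id, None)

    def rename_load(self, dest_areg, predicted_store_id, fresh_preg):
        """Return (physical register for the load result, whether data was forwarded)."""
        src = self.pending_stores.get(predicted_store_id)
        if src is not None:
            # Forward by pointing the load's destination at the store's data register.
            self.rename_map[dest_areg] = src
            return src, True
        # No predicted store in the table: the load gets an ordinary fresh register.
        self.rename_map[dest_areg] = fresh_preg
        return fresh_preg, False
```

Once the store has left the table, the same load is no longer forwarded in this model and takes the normal path through the load/store unit.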

    Dependent instruction suppression
    4.
    Granted patent
    Status: In force

    Publication No.: US09489206B2

    Publication date: 2016-11-08

    Application No.: US13943264

    Filing date: 2013-07-16

    IPC classification: G06F9/00 G06F9/38

    Abstract: A method includes suppressing execution of at least one dependent instruction of a first instruction by a processor responsive to an invalid status of an ancestor load instruction associated with the first instruction. A processor includes an instruction pipeline having an execution unit to execute instructions, a load store unit for retrieving data from a memory hierarchy, and a scheduler unit. The scheduler unit selects for execution in the execution unit a first load instruction having at least one dependent instruction linked to the first load instruction for data forwarding from the load store unit and suppresses execution of a second dependent instruction of the first dependent instruction responsive to an invalid status of the first load instruction.
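
The notion of suppressing descendants of an invalid ancestor load can be illustrated with a graph walk. This is a software sketch of the idea only, not the hardware mechanism; the function name `suppressed_ops` and the dictionary representation of the dependency links are assumptions.

```python
def suppressed_ops(dependents, invalid_loads):
    """Return every op that transitively depends on a load with invalid status.

    dependents: maps an op name to the ops that consume its result.
    invalid_loads: load ops whose status is invalid.
    """
    suppressed = set()
    stack = list(invalid_loads)
    while stack:
        op = stack.pop()
        for child in dependents.get(op, []):
            if child not in suppressed:
                suppressed.add(child)
                stack.append(child)  # the child's own dependents are suppressed too
    return suppressed
```

For example, with `{"ld1": ["add1"], "add1": ["mul1"], "ld2": ["sub1"]}` and `ld1` invalid, both `add1` and `mul1` are suppressed while `sub1` is unaffected.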

    STORE-TO-LOAD FORWARDING
    5.
    Patent application
    Status: Under examination (published)

    Publication No.: US20140181482A1

    Publication date: 2014-06-26

    Application No.: US13723103

    Filing date: 2012-12-20

    IPC classification: G06F9/30

    Abstract: An arithmetic unit performs store-to-load forwarding based on predicted dependencies between store instructions and load instructions. In some embodiments, the arithmetic unit maintains a table of store instructions that are awaiting movement to a load/store unit of the instruction pipeline. In response to receiving a load instruction that is predicted to be dependent on a store instruction stored at the table, the arithmetic unit causes the data associated with the store instruction to be placed into the physical register targeted by the load instruction. In some embodiments, the arithmetic unit performs the forwarding by mapping the physical register targeted by the load instruction to the physical register where the data associated with the store instruction is located.

    RETIRE QUEUE COMPRESSION
    6.
    Patent application

    Publication No.: US20210096874A1

    Publication date: 2021-04-01

    Application No.: US16586642

    Filing date: 2019-09-27

    IPC classification: G06F9/38

    Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.

    PROCESSOR AND METHODS FOR IMMEDIATE HANDLING AND FLAG HANDLING
    8.
    Patent application
    Status: Under examination (published)

    Publication No.: US20150121041A1

    Publication date: 2015-04-30

    Application No.: US14523718

    Filing date: 2014-10-24

    IPC classification: G06F9/38 G06F9/30

    Abstract: Described herein are methods and processors for flag renaming in groups to eliminate dependencies of instructions. Decoder and execution units in the processor may be configured to rename flags into groups that allow each group to be treated separately as appropriate. This flag renaming eliminates flag dependencies with respect to instructions. This allows an instruction to write exactly the flags that the instruction wants without having to create merge dependencies. Methods and processors are provided for handling immediate values embedded in instructions. A 16 bit immediate bus and a 4 bit encoding/control bus are added at the interface between decode and execution units. For an 8 or 12 bit immediate, the upper 4 bits of the immediate bus contain the encoding bits. For a 16 bit immediate, the encoding/control bus contains the encoding bits. The encoding/control bus indicates when to look at the top four bits of the immediate bus.
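
The immediate-bus layout described in this abstract can be sketched as a pair of pack and unpack helpers. The abstract does not give the actual control values, so `LOOK_AT_IMM_BUS = 0xF` is a hypothetical sentinel chosen for illustration, as are the function names; the consequence that a 16-bit immediate cannot use that value as its encoding is an artifact of this sketch, not a claim about the patent.

```python
LOOK_AT_IMM_BUS = 0xF  # hypothetical control value: "encoding bits are in imm_bus[15:12]"


def pack(imm, width, enc):
    """Return (imm_bus, ctrl_bus) for an immediate of the given bit width."""
    if not 0 <= enc < 16:
        raise ValueError("encoding must fit in 4 bits")
    if width not in (8, 12, 16) or not 0 <= imm < (1 << width):
        raise ValueError("unsupported width or immediate out of range")
    if width in (8, 12):
        # Short immediates leave the top 4 bits of the 16-bit bus free for the encoding.
        return (enc << 12) | imm, LOOK_AT_IMM_BUS
    if enc == LOOK_AT_IMM_BUS:
        raise ValueError("encoding collides with the sentinel in this sketch")
    # A 16-bit immediate fills the bus, so the encoding travels on the control bus.
    return imm, enc


def unpack(imm_bus, ctrl_bus):
    """Return (imm, enc) recovered from the two buses."""
    if ctrl_bus == LOOK_AT_IMM_BUS:
        return imm_bus & 0x0FFF, imm_bus >> 12
    return imm_bus, ctrl_bus
```

For instance, `pack(0xAB, 8, 0x3)` places `0x30AB` on the immediate bus, and `unpack` recovers the immediate `0xAB` and the encoding `0x3`.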

    PROCESSOR AND METHODS FOR FLOATING POINT REGISTER ALIASING
    9.
    Patent application
    Status: Under examination (published)

    Publication No.: US20150121040A1

    Publication date: 2015-04-30

    Application No.: US14523660

    Filing date: 2014-10-24

    IPC classification: G06F9/30

    Abstract: Methods, devices, and systems for accessing packed registers are presented. A state of the packed registers may be tracked and it may be determined whether the register is directly accessible based on the state. If the register is not directly accessible, an action may be performed which allows the register to be accessed directly. The action may include injecting at least one uop for reorganizing the physical storage of the register such that it is directly accessible. The action may include aligning the data with the least significant bit of a physical register or otherwise aligning the data with the datapath. The action may also include changing the state of the packed registers.
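
The track-then-fix flow in this abstract can be sketched with a small state table. Everything named here is hypothetical: the two states, the `realign` uop tuple, and the `PackedRegFile` class are illustrative assumptions, since the abstract does not enumerate the states or the injected operations.

```python
from enum import Enum


class State(Enum):
    ALIGNED = "aligned"  # data starts at the least significant bit; directly accessible
    PACKED = "packed"    # data sits at a nonzero bit offset; not directly accessible


class PackedRegFile:
    def __init__(self):
        self.state = {}     # register name -> (State, bit offset)
        self.injected = []  # uops injected to reorganize storage

    def define(self, reg, offset):
        kind = State.ALIGNED if offset == 0 else State.PACKED
        self.state[reg] = (kind, offset)

    def access(self, reg):
        """Make the register directly accessible, injecting a fix-up uop if needed."""
        kind, offset = self.state[reg]
        if kind is State.PACKED:
            # Assumed fix-up: shift the data down to bit 0 of the physical register.
            self.injected.append(("realign", reg, offset))
            self.state[reg] = (State.ALIGNED, 0)
        return reg
```

A second access to the same register injects nothing further in this model, because the tracked state already records the register as aligned.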

    Store-to-load forwarding
    10.
    Granted patent

    Publication No.: US11379234B2

    Publication date: 2022-07-05

    Application No.: US17324563

    Filing date: 2021-05-19

    IPC classification: G06F9/30 G06F9/38

    Abstract: An arithmetic unit performs store-to-load forwarding based on predicted dependencies between store instructions and load instructions. In some embodiments, the arithmetic unit maintains a table of store instructions that are awaiting movement to a load/store unit of the instruction pipeline. In response to receiving a load instruction that is predicted to be dependent on a store instruction stored at the table, the arithmetic unit causes the data associated with the store instruction to be placed into the physical register targeted by the load instruction. In some embodiments, the arithmetic unit performs the forwarding by mapping the physical register targeted by the load instruction to the physical register where the data associated with the store instruction is located.