Age-based management of instruction blocks in a processor instruction window

    Publication number: US09946548B2

    Publication date: 2018-04-17

    Application number: US14752747

    Filing date: 2015-06-26

    Abstract: A processor core in an instruction block-based microarchitecture includes a control unit that explicitly tracks instruction block state including age or priority for current blocks that have been fetched from an instruction cache. Tracked instruction blocks are maintained in an age-ordered or priority-ordered list. When an instruction block is identified by the control unit for commitment, the list is checked for a match and a matching instruction block can be refreshed without re-fetching from the instruction cache. If a match is not found, an instruction block can be committed and replaced based on either age or priority. Such instruction state tracking typically consumes little overhead and enables instruction blocks to be reused and mispredicted instructions to be skipped to increase processor core efficiency.
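
The match-then-refresh-or-replace behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `InstructionWindow`, the `capacity` parameter, and the choice to treat a refreshed block as the youngest entry are all hypothetical.

```python
from collections import OrderedDict


class InstructionWindow:
    """Toy model of an age-ordered list of resident instruction blocks."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Keys are block addresses; iteration order is oldest first.
        self.blocks = OrderedDict()
        self.fetches = 0  # counts simulated instruction-cache fetches

    def request(self, address):
        """Make the block at `address` current; return 'refresh' or 'fetch'."""
        if address in self.blocks:
            # Match found: reuse the resident block without re-fetching.
            # Assumption: a refreshed block becomes the youngest entry.
            self.blocks.move_to_end(address)
            return "refresh"
        if len(self.blocks) >= self.capacity:
            # No match and the window is full: replace the oldest block.
            self.blocks.popitem(last=False)
        self.blocks[address] = "resident"
        self.fetches += 1
        return "fetch"


window = InstructionWindow(capacity=2)
results = [window.request(a) for a in (0x100, 0x200, 0x100, 0x300)]
```

In this run the third request hits a resident block and is refreshed, while the fourth request replaces the oldest block (`0x200`). A priority-ordered variant would pick the victim by a priority field instead of by insertion order.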

    DECOUPLED PROCESSOR INSTRUCTION WINDOW AND OPERAND BUFFER
    Invention application (status: under examination, published)

    Publication number: US20160378479A1

    Publication date: 2016-12-29

    Application number: US14752724

    Filing date: 2015-06-26

    Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.

    AGE-BASED MANAGEMENT OF INSTRUCTION BLOCKS IN A PROCESSOR INSTRUCTION WINDOW
    Invention application (status: granted)

    Publication number: US20160378502A1

    Publication date: 2016-12-29

    Application number: US14752747

    Filing date: 2015-06-26

    Abstract: A processor core in an instruction block-based microarchitecture includes a control unit that explicitly tracks instruction block state including age or priority for current blocks that have been fetched from an instruction cache. Tracked instruction blocks are maintained in an age-ordered or priority-ordered list. When an instruction block is identified by the control unit for commitment, the list is checked for a match and a matching instruction block can be refreshed without re-fetching from the instruction cache. If a match is not found, an instruction block can be committed and replaced based on either age or priority. Such instruction state tracking typically consumes little overhead and enables instruction blocks to be reused and mispredicted instructions to be skipped to increase processor core efficiency.

    BULK ALLOCATION OF INSTRUCTION BLOCKS TO A PROCESSOR INSTRUCTION WINDOW
    Invention application (status: granted)

    Publication number: US20160378493A1

    Publication date: 2016-12-29

    Application number: US14752685

    Filing date: 2015-06-26

    Abstract: A processor core in an instruction block-based microarchitecture includes a control unit that allocates instructions into an instruction window in bulk by fetching blocks of instructions and associated resources including control bits and operands at once. Such bulk allocation supports increased efficiency in processor core operations by enabling consistent management and policy implementation across all the instructions in the block during execution. For example, when an instruction block branches back on itself, it may be reused in a refresh process rather than being re-fetched from the instruction cache. As all of the resources for that instruction block are in one place, the instructions can remain in place and only valid bits need to be cleared. Bulk allocation also facilitates operand sharing by instructions in a block and explicit messaging among instructions.
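
The refresh case in this abstract (instructions stay in place, only valid bits are cleared) can be sketched as follows. This is a simplified illustration, not the patented design: the names `Slot`, `allocate_block`, `deliver`, and `refresh` are hypothetical, and two operands per instruction is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One instruction together with its operand storage and valid bits."""
    opcode: str
    operands: list = field(default_factory=lambda: [None, None])
    valid: list = field(default_factory=lambda: [False, False])


def allocate_block(opcodes):
    """Allocate every instruction of a block and its resources in one step."""
    return [Slot(op) for op in opcodes]


def deliver(block, index, position, value):
    """Write an operand to an instruction in the block and mark it valid."""
    block[index].operands[position] = value
    block[index].valid[position] = True


def refresh(block):
    """Reuse the block in place: instructions stay, only valid bits clear."""
    for slot in block:
        slot.valid = [False, False]


block = allocate_block(["add", "branch"])
deliver(block, 0, 0, 7)
refresh(block)
```

After `refresh`, the opcodes are untouched and the stale operand value is still stored, but it is no longer marked valid, so the block can run again without a new fetch.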

    Decoupled processor instruction window and operand buffer

    Publication number: US11048517B2

    Publication date: 2021-06-29

    Application number: US16450172

    Filing date: 2019-06-24

    Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.

    Decoding information about a group of instructions including a size of the group of instructions

    Publication number: US10409599B2

    Publication date: 2019-09-10

    Application number: US14752682

    Filing date: 2015-06-26

    Abstract: A method including fetching a group of instructions, where the group of instructions is configured to execute atomically by a processor, is provided. The method further includes decoding at least one of a first instruction or a second instruction, where: (1) decoding the first instruction results in a processing of information about a group of instructions, including information about a size of the group of instructions, and (2) decoding the second instruction results in a processing of at least one of: (a) a reference to a memory location having the information about the group of instructions, including information about the size of the group of instructions or (b) a processor status word having information about the group of instructions, including information about the size of the group of instructions.
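
The two ways of obtaining the group size named in this abstract (carried inline, or reached through a memory reference) can be sketched with a toy decoder. The instruction kinds `GROUP_HEADER` and `GROUP_INFO_PTR`, the tuple encoding, and the dictionary standing in for memory are all hypothetical; the abstract does not specify an encoding, and the processor-status-word variant is omitted here.

```python
def decode_group_info(instruction, memory):
    """Return group metadata from a hypothetical (kind, payload) instruction."""
    kind, payload = instruction
    if kind == "GROUP_HEADER":
        # Case 1: the instruction itself carries the group size.
        return {"size": payload}
    if kind == "GROUP_INFO_PTR":
        # Case 2(a): the instruction refers to a memory location
        # that holds the group size.
        return {"size": memory[payload]}
    raise ValueError(f"not a group-information instruction: {kind}")


memory = {0x40: 12}
inline = decode_group_info(("GROUP_HEADER", 8), memory)
indirect = decode_group_info(("GROUP_INFO_PTR", 0x40), memory)
```

Either path yields the same kind of record, so the rest of a fetch unit modeled this way would not need to know how the size was encoded.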

    Parallel decision tree processor architecture

    Publication number: US10332008B2

    Publication date: 2019-06-25

    Application number: US14216990

    Filing date: 2014-03-17

    Abstract: A decision tree multi-processor system includes a plurality of decision tree processors that access a common feature vector and execute one or more decision trees with respect to the common feature vector. A related method includes providing a common feature vector to a plurality of decision tree processors implemented within an on-chip decision tree scoring system, and executing, by the plurality of decision tree processors, a plurality of decision trees, by reference to the common feature vector. A related decision tree-walking system includes feature storage that stores a common feature vector and a plurality of decision tree processors that access the common feature vector from the feature storage and execute a plurality of decision trees by comparing threshold values of the decision trees to feature values within the common feature vector.
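
A software analogy of several tree processors walking different trees against one shared feature vector might look like the sketch below. It uses a thread pool in place of on-chip processors, and the nested-dictionary tree layout, the `<=` comparison direction, and the function names `walk` and `score` are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def walk(tree, features):
    """Walk one tree by comparing node thresholds to shared feature values."""
    node = tree
    while "leaf" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def score(trees, features, workers=4):
    """Evaluate all trees against the same read-only feature vector."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda tree: walk(tree, features), trees))


tree_a = {"feature": 0, "threshold": 0.5,
          "left": {"leaf": 1.0}, "right": {"leaf": 2.0}}
tree_b = {"feature": 1, "threshold": 3.0,
          "left": {"leaf": -1.0}, "right": {"leaf": 4.0}}
features = (0.2, 5.0)  # common feature vector, never modified by workers
scores = score([tree_a, tree_b], features)
```

Because the feature vector is only read, the workers need no synchronization, which is the property that makes sharing one vector across many tree walkers attractive.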

    Bulk allocation of instruction blocks to a processor instruction window

    Publication number: US09720693B2

    Publication date: 2017-08-01

    Application number: US14752685

    Filing date: 2015-06-26

    Abstract: A processor core in an instruction block-based microarchitecture includes a control unit that allocates instructions into an instruction window in bulk by fetching blocks of instructions and associated resources including control bits and operands at once. Such bulk allocation supports increased efficiency in processor core operations by enabling consistent management and policy implementation across all the instructions in the block during execution. For example, when an instruction block branches back on itself, it may be reused in a refresh process rather than being re-fetched from the instruction cache. As all of the resources for that instruction block are in one place, the instructions can remain in place and only valid bits need to be cleared. Bulk allocation also facilitates operand sharing by instructions in a block and explicit messaging among instructions.

    Explicit Instruction Scheduler State Information for a Processor
    Invention application (status: under examination, published)

    Publication number: US20160378496A1

    Publication date: 2016-12-29

    Application number: US14752797

    Filing date: 2015-06-26

    CPC classification number: G06F9/3836 G06F8/41 G06F9/38 G06F9/3802 G06F9/3814

    Abstract: A method including fetching a group of instructions, where the group of instructions is configured to execute atomically by a processor, is provided. The method further includes scheduling at least one of the group of instructions for execution by the processor before decoding the at least one of the group of instructions based at least on pre-computed ready state information associated with the at least one of the group of instructions.
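
Scheduling ahead of decode, as this abstract describes, can be sketched by attaching a ready bit to each undecoded instruction. The `(ready, raw)` pair format, the whitespace-splitting `decode` stand-in, and the function names are hypothetical; how the ready state is actually pre-computed and stored is not stated in the abstract.

```python
def schedule_before_decode(group):
    """Pick instructions using only their pre-computed ready bits."""
    return [index for index, (ready, _raw) in enumerate(group) if ready]


def decode(raw):
    """Stand-in decoder: split a textual instruction into fields."""
    return raw.split()


def run_group(group):
    """Schedule first, then decode only the instructions that were selected."""
    issued = []
    for index in schedule_before_decode(group):
        _ready, raw = group[index]
        issued.append((index, decode(raw)))
    return issued


group = [
    (True, "movi r1 5"),      # no inputs, so marked ready ahead of time
    (False, "add r2 r1 r3"),  # waits on operands
    (True, "movi r3 2"),
]
issued = run_group(group)
```

The scheduler in this model never inspects the instruction text, which mirrors the idea that the selection decision does not depend on decode having happened.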

    DECOUPLED PROCESSOR INSTRUCTION WINDOW AND OPERAND BUFFER

    Publication number: US20190310852A1

    Publication date: 2019-10-10

    Application number: US16450172

    Filing date: 2019-06-24

    Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.
