Processors having fully-connected interconnects shared by vector conflict instructions and permute instructions

    Publication No.: US10678541B2

    Publication date: 2020-06-09

    Application No.: US13977126

    Filing date: 2011-12-29

    IPC classification: G06F9/30 G06F9/38

    Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.
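
The sharing described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function names `crossbar`, `permute` and `vector_conflict` are hypothetical, and the conflict semantics (each result element is a bitmask of earlier elements holding the same value) are assumed from the common definition of vector conflict detection, which the abstract itself does not spell out.

```python
from typing import List, Sequence


def crossbar(inputs: Sequence[int]) -> List[List[int]]:
    """Model a fully-connected interconnect: each output position
    receives a view of every input element (hypothetical helper)."""
    return [list(inputs) for _ in range(len(inputs))]


def permute(src: Sequence[int], indices: Sequence[int]) -> List[int]:
    """Each output selects one input, chosen by its index element."""
    fabric = crossbar(src)
    return [fabric[out][idx] for out, idx in enumerate(indices)]


def vector_conflict(src: Sequence[int]) -> List[int]:
    """Each output compares its own element with all earlier elements.
    Assumed semantics: bit j of result[i] is set when src[j] == src[i], j < i."""
    fabric = crossbar(src)
    result = []
    for out, visible in enumerate(fabric):
        mask = 0
        for j in range(out):
            if visible[j] == visible[out]:
                mask |= 1 << j
        result.append(mask)
    return result
```

Both functions read their operands through the same `crossbar` model, which mirrors the idea that one all-to-all routing structure serves both instructions. For example, `permute([10, 20, 30, 40], [3, 2, 1, 0])` reverses the vector, and `vector_conflict([7, 3, 7, 7])` reports which earlier elements duplicate each position.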

    PROCESSORS HAVING FULLY-CONNECTED INTERCONNECTS SHARED BY VECTOR CONFLICT INSTRUCTIONS AND PERMUTE INSTRUCTIONS
    Invention application
    Status: Under examination (published)

    Publication No.: US20140181466A1

    Publication date: 2014-06-26

    Application No.: US13977126

    Filing date: 2011-12-29

    IPC classification: G06F9/30 G06F9/38

    Abstract: An apparatus includes a decode unit to decode a permute instruction and a vector conflict instruction. A vector execution unit is coupled with the decode unit and includes a fully-connected interconnect. The fully-connected interconnect has at least four inputs to receive at least four corresponding data elements of at least one source vector. The fully-connected interconnect has at least four outputs. Each of the at least four inputs is coupled with each of the at least four outputs. The execution unit also includes a permute instruction execution logic coupled with the at least four outputs and operable to store a first vector result in response to the permute instruction. The execution unit also includes a vector conflict instruction execution logic coupled with the at least four outputs and operable to store a second vector result in a destination storage location in response to the vector conflict instruction.


    METHOD AND APPARATUS FOR EFFICIENTLY MANAGING ARCHITECTURAL REGISTER STATE OF A PROCESSOR
    Invention application
    Status: Granted

    Publication No.: US20160179527A1

    Publication date: 2016-06-23

    Application No.: US14581535

    Filing date: 2014-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus and method for efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: a source mask register to be logically subdivided into at least a first portion to store a usable portion of a mask value and a second portion to store an indication of whether the usable portion of the mask value has been updated; a control register to store an unusable portion of the mask value; architectural state management logic to read the indication to determine whether the mask value has been updated prior to performing a store operation, wherein if the mask value has been updated, then the architectural state management logic is to read the usable portion of the mask value from the first portion of the source mask register and zero out bits of the unusable portion of the mask value to generate a final mask value to be saved to memory, and wherein if the mask value has not been updated, then the architectural state management logic is to concatenate the usable portion of the mask value with the unusable portion of the mask value read from the control register to generate a final mask value to be saved to memory.
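
The two save paths in the abstract can be sketched as a short function. This is a minimal illustration under stated assumptions, not the claimed hardware: the field widths (`USABLE_BITS`, `UNUSABLE_BITS`), the placement of the usable portion in the low-order bits, the single-bit "updated" indication stored directly above it, and the name `final_mask` are all hypothetical choices made for the example.

```python
USABLE_BITS = 16    # hypothetical width of the usable mask portion
UNUSABLE_BITS = 48  # hypothetical width of the unusable mask portion


def final_mask(source_mask_reg: int, control_reg: int) -> int:
    """Build the mask value that would be saved to memory.

    Assumed layout: the low USABLE_BITS of source_mask_reg hold the usable
    portion, and the next bit is the 'updated' indication.
    """
    usable = source_mask_reg & ((1 << USABLE_BITS) - 1)
    updated = (source_mask_reg >> USABLE_BITS) & 1

    if updated:
        # Mask was updated: keep the usable portion, zero the unusable bits.
        return usable

    # Mask was not updated: concatenate the usable portion with the
    # unusable portion held in the control register.
    unusable = control_reg & ((1 << UNUSABLE_BITS) - 1)
    return (unusable << USABLE_BITS) | usable
```

With these assumed widths, a source register whose indication bit is set yields only the usable bits, while a register whose indication bit is clear yields the usable bits with the control register contents placed above them.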
