Apparatus and method of improved extract instructions
    3.
    Granted invention patent
    Legal status: in force

    Publication number: US09588764B2

    Publication date: 2017-03-07

    Application number: US13976998

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions. The first and second instructions each select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. The first group has a first bit width, and each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and a second granularity.

    Fusible instructions and logic to provide OR-test and AND-test functionality using multiple test sources
    4.
    Granted invention patent
    Legal status: in force

    Publication number: US09483266B2

    Publication date: 2016-11-01

    Application number: US13843020

    Filing date: 2013-03-15

    IPC classification: G06F9/30 G06F9/38

    Abstract: Fusible instructions and logic provide OR-test and AND-test functionality on multiple test sources. Some embodiments include a processor decode stage to decode a test instruction for execution, the instruction specifying first, second, and third source data operands and an operation type. Execution units, responsive to the decoded test instruction, perform a first logical operation, according to the specified operation type, between data from the first and second source data operands, and perform a second logical operation between the data from the third source data operand and the result of the first logical operation to set a condition flag. Some embodiments generate the test instruction dynamically by fusing one logical instruction with a prior-art test instruction. Other embodiments generate the test instruction through a just-in-time compiler. Some embodiments also fuse the test instruction with a subsequent conditional branch instruction, and perform a branch according to how the condition flag is set.
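
The three-operand test described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not name the second logical operation or the flag, so the model assumes a bitwise AND (as in a conventional test instruction) that sets a zero flag. The names `fused_test` and `branch_if_zero` and the 64-bit default width are hypothetical.

```python
def fused_test(op_type, src1, src2, src3, width=64):
    """Model a three-source test: (src1 OP src2) AND src3, returning a zero flag.

    Assumption: the second operation is a bitwise AND and the condition flag
    is a zero flag; the abstract leaves both unspecified.
    """
    limit = (1 << width) - 1
    if op_type == "or":
        first = (src1 | src2) & limit       # OR-test: combine the two sources with OR
    elif op_type == "and":
        first = (src1 & src2) & limit       # AND-test: combine the two sources with AND
    else:
        raise ValueError(f"unsupported operation type: {op_type!r}")
    result = first & src3 & limit           # second logical operation against the third source
    return result == 0                      # zero flag is set when no bits survive


def branch_if_zero(zero_flag, taken, not_taken):
    """Model a conditional branch that consumes the flag produced by the test."""
    return taken if zero_flag else not_taken


# One fused step replaces a separate OR/AND, a test, and a branch.
target = branch_if_zero(fused_test("and", 0b0100, 0b0001, 0b0011), "slow_path", "fast_path")
```

In this model the AND-test of `0b0100` and `0b0001` yields zero before the third source is applied, so the zero flag is set and `target` is `"slow_path"`.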

    METHODS, APPARATUS, INSTRUCTIONS, AND LOGIC TO PROVIDE VECTOR ADDRESS CONFLICT DETECTION FUNCTIONALITY
    5.
    Invention patent application
    Legal status: in force

    Publication number: US20140189308A1

    Publication date: 2014-07-03

    Application number: US13731006

    Filing date: 2012-12-29

    IPC classification: G06F9/30

    Abstract: Instructions and logic provide SIMD address conflict detection functionality. Some embodiments include processors with a register with a variable plurality of data fields, each of the data fields to store an offset for a data element in a memory. A destination register has corresponding data fields, each of these data fields to store a variable second plurality of bits to store a conflict mask having a mask bit for each offset. Responsive to decoding a vector conflict instruction, execution units compare the offset in each data field with every less significant data field to determine if they hold a matching offset, and in corresponding conflict masks in the destination register, set any mask bits corresponding to a less significant data field with a matching offset. Vector address conflict detection can be used with variable-sized elements and to generate conflict masks to resolve dependencies in gather-modify-scatter SIMD operations.
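
The comparison rule in this abstract (each data field is compared with every less significant field, and matching positions set bits in that field's conflict mask) can be sketched in scalar code. This is an illustrative model only, not the patented hardware; the function name `vector_conflict` and the list-of-integers representation are assumptions.

```python
def vector_conflict(offsets):
    """Return one conflict mask per element.

    Bit j of masks[i] is set when j < i and offsets[j] == offsets[i],
    i.e. when a less significant field holds a matching offset.
    """
    masks = []
    for i, offset in enumerate(offsets):
        mask = 0
        for j in range(i):                  # only less significant fields are compared
            if offsets[j] == offset:
                mask |= 1 << j
        masks.append(mask)
    return masks


# Elements 0, 2, and 5 share offset 5; elements 1 and 4 share offset 3.
conflict_masks = vector_conflict([5, 3, 5, 7, 3, 5])
# conflict_masks == [0, 0, 0b1, 0, 0b10, 0b101]
```

An element whose mask is zero has no earlier element writing to the same location, so in a gather-modify-scatter loop it could be processed in the same pass as the elements before it.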

    INSTRUCTION EXECUTION UNIT THAT BROADCASTS DATA VALUES AT DIFFERENT LEVELS OF GRANULARITY
    7.
    Invention patent application
    Legal status: in force

    Publication number: US20130339664A1

    Publication date: 2013-12-19

    Application number: US13976003

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The first data structure is four times as large as the second data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and to replicate the second data structure when executing the second instruction to create a second replication data structure.
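
The size relationships in this abstract can be made concrete with a small model. The specific widths are assumptions chosen to satisfy the stated ratios: a first structure of four 64-bit values (256 bits) and a second structure of two 32-bit values (64 bits), each replicated into a 512-bit result. The abstract gives none of these numbers, and the helper name `broadcast` is hypothetical.

```python
def broadcast(packed, element_bits, result_bits=512):
    """Replicate a packed data structure until it fills the result width.

    Assumption: the result is 512 bits wide; the abstract does not say.
    """
    struct_bits = len(packed) * element_bits
    if struct_bits == 0 or result_bits % struct_bits:
        raise ValueError("structure width must evenly divide the result width")
    return packed * (result_bits // struct_bits)


# First structure: 4 x 64-bit values = 256 bits, replicated twice.
first_replication = broadcast([1, 2, 3, 4], element_bits=64)
# Second structure: 2 x 32-bit values = 64 bits, replicated eight times.
second_replication = broadcast([9, 8], element_bits=32)
```

With these widths the first structure's values are twice as large as the second's (64 versus 32 bits) and the first structure is four times as large as the second (256 versus 64 bits), matching the ratios in the abstract.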

    APPARATUS AND METHOD OF IMPROVED EXTRACT INSTRUCTIONS
    8.
    Invention patent application
    Legal status: in force

    Publication number: US20130275730A1

    Publication date: 2013-10-17

    Application number: US13976998

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus is described that includes instruction execution logic circuitry to execute first, second, third, and fourth instructions. Both the first instruction and the second instruction select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups of the first and third instructions at a first granularity, where the respective resultants produced therewith are the respective resultants of the first and third instructions. The masking circuitry is also to mask the first and second groups of the second and fourth instructions at a second granularity, where the respective resultants produced therewith are the respective resultants of the second and fourth instructions.
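
The four instruction variants in this abstract differ along two axes: the width of the extracted group and the granularity of the mask. A minimal software sketch follows, assuming a 512-bit input held as sixteen 32-bit elements, group widths of 128 and 256 bits, mask granularities of 32 and 64 bits, and masking that zeroes the unselected lanes. None of these numbers or the zeroing behavior is stated in the abstract, and the function name `extract` is hypothetical.

```python
ELEM_BITS = 32  # assumption: the input vector is modeled as a list of 32-bit elements


def extract(vector, section_bits, section_index, mask, granularity_bits):
    """Select one non-overlapping section, then mask it lane by lane.

    Mask bit k controls lane k of the extracted group, where each lane is
    granularity_bits wide. Assumption: masked-off lanes are zeroed.
    """
    per_section = section_bits // ELEM_BITS
    start = section_index * per_section
    if start + per_section > len(vector):
        raise IndexError("section index is out of range for this vector")
    group = vector[start:start + per_section]

    per_lane = granularity_bits // ELEM_BITS
    result = []
    for i, value in enumerate(group):
        lane = i // per_lane                # which mask bit governs this element
        result.append(value if (mask >> lane) & 1 else 0)
    return result


vec = list(range(1, 17))                    # sixteen 32-bit elements = 512 bits
narrow = extract(vec, 128, 2, 0b0101, 32)   # 128-bit group, 32-bit mask granularity
wide = extract(vec, 256, 1, 0b0101, 64)     # 256-bit group, 64-bit mask granularity
# narrow == [9, 0, 11, 0]
# wide == [9, 10, 0, 0, 13, 14, 0, 0]
```

The same mask value produces different results at the two granularities because each mask bit covers one element in the first call and two elements in the second.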
