PROCESSOR HAVING INCREASED PERFORMANCE VIA ELIMINATION OF SERIAL DEPENDENCIES
    2.
    Patent application
    Legal status: under examination (published)

    Publication number: US20120166769A1

    Publication date: 2012-06-28

    Application number: US12979946

    Filing date: 2010-12-28

    CPC classification number: G06F9/3838 G06F9/3017

    Abstract: Methods and apparatuses are provided for achieving increased performance via elimination of serial dependencies in instructions or instruction sequences. The apparatus comprises an operational unit for determining whether an instruction will cause dependencies during completion in an execution unit. Responsive to that determination the instruction is replaced with an alternative instruction for completion in the execution unit. In this way, the alternative instruction is completed without causing dependencies in the execution unit. The method comprises determining that an instruction will cause dependencies during completion in a processor and replacing the instruction with an alternative instruction for completion in the processor.

    THREE OPERAND INSTRUCTION EXTENSION FOR X86 ARCHITECTURE
    3.
    Patent application
    Legal status: in force

    Publication number: US20090031116A1

    Publication date: 2009-01-29

    Application number: US11954623

    Filing date: 2007-12-12

    Abstract: A method and apparatus are contemplated for increasing the number of available instructions in an instruction set architecture. The new instructions extend the number of general-purpose registers and include three or more operands. A combination of an escape code field, an opcode field, an operation configuration field and an operation size field determines a unique new instruction operation. A source operand extension field includes bits to be combined with other fields in order to extend the number of source operand values for general-purpose registers.

    Reliable execution using compare and transfer instruction on an SMT machine
    4.
    Granted patent
    Legal status: in force

    Publication number: US08082425B2

    Publication date: 2011-12-20

    Application number: US12432146

    Filing date: 2009-04-29

    Abstract: A system and method for efficient reliable execution on a simultaneous multithreading machine. A processor is placed in a reliable execution mode (REM) to detect possible errors during execution of a software application. Only two threads may be configured to operate in this mode. Floating-point store and integer-transfer unary instructions may be converted to new instructions. Each new instruction has two source operands, each corresponding to a different thread is specified by a same logical register number as a single source operand of the original unary instruction. All other instructions are replicated, wherein the original instruction and its twin are assigned to different threads. Simultaneous multi-threaded (SMT) floating-point logic may only be able to provide lockstep execution when it communicates using the new instruction with instantiated integer independent clusters. The new instruction cannot begin until both source operands are ready, which are subsequently compared to determine any mismatches or errors.
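    The compare step described in this abstract can be illustrated with a minimal software sketch. It assumes each redundant thread keeps its own register file (modeled here as a dictionary) and that the converted instruction reads the same logical register from both before transferring the value. The names `compare_and_transfer` and `LockstepMismatch` are hypothetical and do not come from the patent.

```python
class LockstepMismatch(Exception):
    """Raised when the two redundant threads disagree on a value (hypothetical name)."""


def compare_and_transfer(regs_t0: dict, regs_t1: dict, logical_reg: str) -> int:
    """Model of the two-operand instruction that replaces a unary transfer.

    Both source operands use the same logical register number, one taken
    from each thread's register file. The value is transferred only if
    the two copies agree.
    """
    a = regs_t0[logical_reg]
    b = regs_t1[logical_reg]
    if a != b:
        # A mismatch signals a possible error in one of the threads.
        raise LockstepMismatch(f"{logical_reg}: {a!r} != {b!r}")
    return a
```

    In hardware the instruction would additionally wait until both operands are ready; the sketch models only the comparison.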

    Superscalar register-renaming for a stack-addressed architecture
    5.
    Granted patent
    Legal status: in force

    Publication number: US08539397B2

    Publication date: 2013-09-17

    Application number: US12482977

    Filing date: 2009-06-11

    CPC classification number: G06F9/30032 G06F9/30134 G06F9/3869 G06F17/5045

    Abstract: A system and method for increasing processor throughput by decreasing a loop critical path. In one embodiment, a table comprises multiple stack entries, each comprising an x87 floating-point (FP) stack specifier. The combinatorial logic for operand translation of N FP instructions per clock cycle may require N instantiated copies of a combinatorial logic block. Each instantiated copy may determine a new ordering of the stack entries. Control logic may receive necessary information from the corresponding N FP instructions and determine a corresponding combined computational effect, or stack reordering, on entries within the table based on two or more instructions. Resulting control signals are conveyed to the N instantiated copies. A resulting accumulative delay from an input of the first copy to the output of the Nth copy may be less than or equal to (N−1)*time_delay versus a longer N*time_delay.

    COMBINED BYTE-PERMUTE AND BIT SHIFT UNIT
    6.
    Patent application
    Legal status: in force

    Publication number: US20100318771A1

    Publication date: 2010-12-16

    Application number: US12482974

    Filing date: 2009-06-11

    CPC classification number: G06F9/30032 G06F9/3851 G06F9/3891

    Abstract: A processor includes a decode unit and a byte permute unit. The byte permute unit receives an instruction from the decode unit. The byte permute unit determines whether the instruction corresponds to a shuffle instruction or a shift instruction. For a shuffle instruction, the byte permute unit uses a byte shuffler to perform a shuffle operation indicated by the instruction. For a shift instruction that indicates a shift magnitude, the byte permute unit uses the byte shuffler to byte-level shift a source operand corresponding to the instruction by an integer number of bytes. The byte permute unit also generates a sequence of output bits by bit-shifting the byte-level shifted source operand by a number of bits such that the sum of the number of bits and the integer number of bytes is equal to the shift magnitude.
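    The two-stage shift in this abstract can be sketched in software. The sketch assumes a left shift on a 16-byte operand and assumes the split is "whole bytes first, then the remaining 0 to 7 bits"; the abstract only requires that the byte and bit amounts add up to the shift magnitude. The function name and the `width_bytes` parameter are illustrative, not taken from the patent.

```python
def shift_left_two_stage(value: int, magnitude: int, width_bytes: int = 16) -> int:
    """Left-shift `value` by `magnitude` bits using a byte stage and a bit stage."""
    mask = (1 << (8 * width_bytes)) - 1
    # Split the magnitude so that 8 * n_bytes + n_bits == magnitude.
    n_bytes, n_bits = divmod(magnitude, 8)

    # Stage 1: byte-level shift, the kind of move a byte shuffler can do.
    # In little-endian order, prepending zero bytes moves data toward
    # the most significant end; bytes pushed past the width are dropped.
    data = (value & mask).to_bytes(width_bytes, "little")
    byte_shifted = (bytes(n_bytes) + data)[:width_bytes]

    # Stage 2: shift the byte-shifted operand by the remaining bits.
    return (int.from_bytes(byte_shifted, "little") << n_bits) & mask
```

    For any magnitude, the result should match a single masked shift, `(value << magnitude) & mask`.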

    RELIABLE EXECUTION USING COMPARE AND TRANSFER INSTRUCTION ON AN SMT MACHINE
    7.
    Patent application
    Legal status: in force

    Publication number: US20100281239A1

    Publication date: 2010-11-04

    Application number: US12432146

    Filing date: 2009-04-29

    Abstract: A system and method for efficient reliable execution on a simultaneous multithreading machine. A processor is placed in a reliable execution mode (REM) to detect possible errors during execution of a mission critical software application. Only two threads may be configured to operate in this mode. Floating-point store and integer-transfer unary instructions may be converted to new binary instructions. Each new instruction has two source operands, each one corresponding to a different thread is specified by a same logical register number as a single source operand of the original unary instruction. All other instructions are replicated, wherein the original instruction and its twin are assigned to different threads. Simultaneous multi-threaded (SMT) floating-point logic may only be able to provide lockstep execution when it communicates using the new instruction with instantiated integer independent clusters. The new instruction cannot begin until both source operands are ready, which are subsequently compared to determine any mismatches or errors.

    Combined byte-permute and bit shift unit
    8.
    Granted patent
    Legal status: in force

    Publication number: US08909904B2

    Publication date: 2014-12-09

    Application number: US12482974

    Filing date: 2009-06-11

    CPC classification number: G06F9/30032 G06F9/3851 G06F9/3891

    Abstract: A processor includes a decode unit and a byte permute unit. The byte permute unit receives an instruction from the decode unit. The byte permute unit determines whether the instruction corresponds to a shuffle instruction or a shift instruction. For a shuffle instruction, the byte permute unit uses a byte shuffler to perform a shuffle operation indicated by the instruction. For a shift instruction that indicates a shift magnitude, the byte permute unit uses the byte shuffler to byte-level shift a source operand corresponding to the instruction by an integer number of bytes. The byte permute unit also generates a sequence of output bits by bit-shifting the byte-level shifted source operand by a number of bits such that the sum of the number of bits and the integer number of bytes is equal to the shift magnitude.

    Three operand instruction extension for X86 architecture
    9.
    Granted patent
    Legal status: in force

    Publication number: US07836278B2

    Publication date: 2010-11-16

    Application number: US11954623

    Filing date: 2007-12-12

    Abstract: A method and apparatus are contemplated for increasing the number of available instructions in an instruction set architecture. The new instructions extend the number of general-purpose registers and include three or more operands. A combination of an escape code field, an opcode field, an operation configuration field and an operation size field determines a unique new instruction operation. A source operand extension field includes bits to be combined with other fields in order to extend the number of source operand values for general-purpose registers.

    PROCESSOR HAVING INCREASED PERFORMANCE AND ENERGY SAVING VIA MOVE ELIMINATION
    10.
    Patent application
    Legal status: under examination (published)

    Publication number: US20120005459A1

    Publication date: 2012-01-05

    Application number: US12979948

    Filing date: 2010-12-28

    CPC classification number: G06F9/384 G06F9/30032 G06F9/3017

    Abstract: Methods and apparatuses are provided for increasing processor performance and energy saving via eliminating physical data movement to accomplish a move instruction. The apparatus comprises a first plurality of available physical registers mapped to a second plurality of logical registers, including a source logical register and a destination logical register. A renaming unit remaps the destination logical register to the same physical register mapping as the source logical register in response to a move instruction. In this way, the move instruction is effectively executed without moving data between physical registers. A method is provided for increasing processor performance and energy saving via eliminating physical data movement to accomplish a move instruction. The method comprises determining a mapping of a logical source register and a logical destination register to physical registers of a processor and then remapping the logical destination register to the same physical register mapping as the logical source register to effect an equivalent of the move instruction without actual data movement between physical registers.
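    The remapping idea in this abstract can be illustrated with a toy rename table. The sketch assumes a simple free list and omits the bookkeeping a real design needs to reclaim physical registers that are no longer mapped. The class `RenameMap` and its methods are hypothetical names, not the patent's.

```python
class RenameMap:
    """Toy logical-to-physical register map (illustrative only)."""

    def __init__(self, logical_regs, num_physical):
        self.phys = [0] * num_physical  # physical register file
        # Initially, logical register i maps to physical register i.
        self.map = {name: i for i, name in enumerate(logical_regs)}
        self.free = list(range(len(logical_regs), num_physical))

    def write(self, dst, value):
        # An ordinary write allocates a fresh physical register for dst.
        p = self.free.pop()
        self.phys[p] = value
        self.map[dst] = p

    def read(self, src):
        return self.phys[self.map[src]]

    def move(self, dst, src):
        # Move elimination: point dst at the physical register src
        # already uses. No value is copied between physical registers.
        self.map[dst] = self.map[src]
```

    After `move("rbx", "rax")`, both logical registers share one physical register. A later write to `rax` allocates a new physical register, so `rbx` still reads the old value.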
