Processor and method for out-of-order completion of floating-point
operations during load/store multiple operations
    1.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5850563A

    Publication date: 1998-12-15

    Application No.: US526610

    Filing date: 1995-09-11

    IPC classification: G06F9/312 G06F9/38

    Abstract: A method and apparatus in a superscalar microprocessor for early completion of floating-point instructions prior to a previous load/store multiple instruction is provided. The microprocessor's load/store execution unit loads or stores data to or from the general purpose registers, and the microprocessor's dispatch unit dispatches instructions to a plurality of execution units, including the load/store execution unit and the floating point execution unit. The method comprises the dispatch unit dispatching a multi-register instruction to the load/store unit to begin execution of the multi-register instruction, wherein the multi-register instruction, such as a store multiple or a load multiple, stores or loads data from more than one of the plurality of general purpose registers to memory, and further, prior to the multi-register instruction finishing execution in the load/store unit, the dispatch unit dispatches a floating-point instruction, which is dependent upon source operand data stored in one or more floating-point registers of the plurality of floating point registers, to the floating-point execution unit, wherein the dispatched floating-point instruction completes execution prior to the multi-register instruction finishing execution.

    Method and system for selective support of non-architected instructions
within a superscaler processor system utilizing a special access bit
within a machine state register
    2.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5758141A

    Publication date: 1998-05-26

    Application No.: US386977

    Filing date: 1995-02-10

    IPC classification: G06F9/30 G06F9/318 G06F9/455

    Abstract: A method and system for permitting the selective support of non-architected instructions within a superscalar processor system. A special access bit within the system machine state register is provided and set in response to each initiation of an application during which execution of non-architected instructions is desired. Thereafter, each time a non-architected instruction is decoded the status of the special access bit is determined. The non-architected instruction is executed in response to a set state of the special access bit. The illegal instruction program interrupt is issued in response to an attempted execution of a non-architected instruction if the special access bit is not set. In this manner, for example, complex instruction set computing (CISC) instructions may be selectively enabled for execution within a reduced instruction set computing (RISC) data processing system while maintaining full architectural compliance with the reduced instruction set computing (RISC) instructions.
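
    The decode-time check described in this abstract can be sketched roughly as follows. This is only an illustrative model, not the patented implementation: the bit position, the opcode names, and the `IllegalInstruction` exception are all hypothetical.

```python
# Hypothetical model: bit position and opcode names are illustrative only.
SPECIAL_ACCESS_BIT = 1 << 0  # assumed position within the machine state register

ARCHITECTED = {"add", "load", "store"}  # hypothetical architected (RISC) opcodes
NON_ARCHITECTED = {"cisc_movs"}         # hypothetical non-architected opcode


class IllegalInstruction(Exception):
    """Stands in for the illegal instruction program interrupt."""


def decode_and_execute(opcode: str, msr: int) -> str:
    """Execute architected opcodes always; gate the others on the MSR bit."""
    if opcode in ARCHITECTED:
        return f"executed {opcode}"
    if opcode in NON_ARCHITECTED and (msr & SPECIAL_ACCESS_BIT):
        # The special access bit is checked each time such an opcode is decoded.
        return f"executed {opcode}"
    # Bit clear (or unknown opcode): raise the interrupt instead of executing.
    raise IllegalInstruction(opcode)
```

    With the bit clear, the non-architected opcode raises the interrupt, so architected behavior is unchanged for applications that never set it.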

    Method for executing speculative load instructions in high-performance
processors
    3.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5611063A

    Publication date: 1997-03-11

    Application No.: US597647

    Filing date: 1996-02-06

    IPC classification: G06F9/312 G06F9/38 G06F9/30

    Abstract: A method for selectively executing speculative load instructions in a high-performance processor is disclosed. In accordance with the present disclosure, when a speculative load instruction for which the data is not stored in a data cache is encountered, a bit within an enable speculative load table which is associated with that particular speculative load instruction is read in order to determine a state of the bit. If the associated bit is in a first state, data for the speculative load instruction is requested from a system bus and further execution of the speculative load instruction is then suspended to wait for control signals from a branch processing unit. If the associated bit is in a second state, the execution of the speculative load instruction is immediately suspended to wait for control signals from the branch processing unit. If the speculative load instruction is executed in response to the control signals, then the associated bit in the enable speculative load table will be set to the first state. However, if the speculative load instruction is not executed in response to the control signals, then the associated bit in the enable speculative load table is set to the second state. In this manner, the displacement of useful data in the data cache due to wrongful execution of the speculative load instruction is avoided.
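
    The enable speculative load table behaves like a one-bit history per load. A minimal sketch follows, assuming a fixed table size, indexing by load address modulo the size, and an initial "first state"; none of these details are given in the abstract.

```python
ENABLED, DISABLED = 1, 0  # the abstract's "first state" and "second state"


class SpeculativeLoadTable:
    def __init__(self, size: int = 64, initial: int = ENABLED):
        # Table size, indexing scheme, and initial state are assumptions.
        self.bits = [initial] * size

    def _index(self, load_addr: int) -> int:
        return load_addr % len(self.bits)

    def on_cache_miss(self, load_addr: int) -> bool:
        """Return True if data should be requested from the system bus before
        the load is suspended to wait for the branch processing unit."""
        return self.bits[self._index(load_addr)] == ENABLED

    def on_branch_resolved(self, load_addr: int, load_executed: bool) -> None:
        """Record whether the speculative load turned out to be needed."""
        self.bits[self._index(load_addr)] = ENABLED if load_executed else DISABLED
```

    A load whose last speculation was wrong no longer triggers a bus request on a miss, which is how the scheme avoids displacing useful cache lines.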

    Method and processor that permit concurrent execution of a store
multiple instruction and a dependent instruction
    4.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5867684A

    Publication date: 1999-02-02

    Application No.: US873013

    Filing date: 1997-06-11

    IPC classification: G06F9/312 G06F9/38 G06F12/00

    Abstract: A method and device of executing a load multiple instruction in a superscaler microprocessor is provided. The method comprises the steps of dispatching a load multiple instruction to a load/store unit, wherein the load/store unit begins execution of a dispatched load multiple instruction, and wherein the load multiple instruction loads data from memory into a plurality of registers. The method further includes the step of maintaining a table that lists each register of the plurality of registers and that indicates when data has been loaded into each register by the executing load multiple instruction. The method concludes by executing an instruction that is dependent upon source operand data loaded by the load multiple instruction into a register of the plurality of registers indicated by the instruction as a source register, prior to the load multiple instruction completing its execution, when the table indicates the source operand data has been loaded into the source register. Also, according to the present invention, a method of executing a store multiple instruction in a superscaler microprocessor is provided. This method comprises the steps of dispatching a store multiple instruction to a load/store unit, whereupon the load/store unit begins executing the store multiple instruction, wherein the load store instruction stores data from a plurality of registers to memory; and executing a fixed point instruction that is dependent upon data being stored by the store multiple instruction from a register of the plurality of registers indicated by the fixed point instruction as a source register, prior to the store multiple instruction completing its execution, but prohibiting the executing fixed point instruction from writing to a register of the plurality of registers prior to the store multiple instruction completing.
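
    The per-register table used for the load multiple case can be modeled as a small scoreboard. The sketch below is an assumption-laden illustration: the class name, the contiguous register range, and the method names are hypothetical and are not taken from the patent.

```python
class LoadMultipleTracker:
    def __init__(self, first_reg: int, count: int):
        # One entry per target register; False until its data has arrived.
        self.loaded = {r: False for r in range(first_reg, first_reg + count)}

    def mark_loaded(self, reg: int) -> None:
        """Called as the load/store unit writes each target register."""
        self.loaded[reg] = True

    def can_issue(self, source_regs) -> bool:
        """A dependent instruction may issue once every source register that
        the load multiple targets has been written; others never block."""
        return all(self.loaded.get(r, True) for r in source_regs)

    def complete(self) -> bool:
        """True once the load multiple has written all of its registers."""
        return all(self.loaded.values())
```

    The point of the table is that `can_issue` can become true well before `complete` does, so dependent instructions need not wait for the whole load multiple.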

    Method and system for enhanced management operation utilizing intermixed
user level and supervisory level instructions with partial concept
synchronization
    5.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5764969A

    Publication date: 1998-06-09

    Application No.: US387149

    Filing date: 1995-02-10

    CPC classification: G06F9/461 G06F12/1475

    Abstract: A method and system for enhanced system management operations in a superscalar data processing system. Those supervisory level instructions which execute selected privileged operations within protected memory space are first identified as not requiring a full context synchronization. Each time execution of such an instruction is initiated an enable special access (ESA) instruction is executed as an entry point to that instruction or group of instructions. A portion of the machine state register for the data processing system is stored and the machine state register is then modified as follows: a problem bit is set, changing the execution privilege state to "supervisor;" external interrupts are disabled; an access privilege state bit is set; and, a special access mode bit is set, allowing execution of special instructions. The instructions which execute the selected privileged operations within the protected memory space are then executed. A disable special access (DSA) instruction is then executed which restores the bits within the machine state register which were modified during the ESA instruction. The ESA and DSA instructions are implemented without modifying the instruction stream by utilizing user level procedure calls, thereby reducing the overhead of the branch table necessary to determine the desired execution path.
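
    The ESA/DSA pairing amounts to saving a few machine state register bits, modifying them, and restoring them afterwards. A rough sketch follows, in which the field names are hypothetical labels for the bits the abstract mentions and the register is modeled as a dictionary rather than a bit vector.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the four MSR bits that the abstract says ESA modifies.
ESA_FIELDS = ("supervisor", "external_interrupts_enabled",
              "access_privilege", "special_access")


@dataclass
class MachineState:
    msr: dict = field(default_factory=lambda: {
        "supervisor": False,
        "external_interrupts_enabled": True,
        "access_privilege": False,
        "special_access": False,
    })
    saved: dict = field(default_factory=dict)

    def esa(self) -> None:
        """Enable special access: save the affected bits, then modify them."""
        self.saved = {name: self.msr[name] for name in ESA_FIELDS}
        self.msr.update(supervisor=True,
                        external_interrupts_enabled=False,
                        access_privilege=True,
                        special_access=True)

    def dsa(self) -> None:
        """Disable special access: restore the bits that ESA modified."""
        self.msr.update(self.saved)
        self.saved = {}
```

    Only the modified portion of the register is saved and restored, which is consistent with the abstract's point that a full context synchronization is not required.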

    Method and device for early deallocation of resources during load/store
multiple operations to allow simultaneous dispatch/execution of
subsequent instructions
    6.

    Publication No.: US5694565A

    Publication date: 1997-12-02

    Application No.: US526343

    Filing date: 1995-09-11

    IPC classification: G06F9/312 G06F9/38 G06F12/00

    Abstract: A method and device of executing a load multiple instruction in a superscaler microprocessor is provided. The method comprises the steps of dispatching a load multiple instruction to a load/store unit, wherein the load/store unit begins execution of a dispatched load multiple instruction, and wherein the load multiple instruction loads data from memory into a plurality of registers. The method further includes the step of maintaining a table that lists each register of the plurality of registers and that indicates when data has been loaded into each register by the executing load multiple instruction. The method concludes by executing an instruction that is dependent upon source operand data loaded by the load multiple instruction into a register of the plurality of registers indicated by the instruction as a source register, prior to the load multiple instruction completing its execution, when the table indicates the source operand data has been loaded into the source register. Also, according to the present invention, a method of executing a store multiple instruction in a superscaler microprocessor is provided. This method comprises the steps of dispatching a store multiple instruction to a load/store unit, whereupon the load/store unit begins executing the store multiple instruction, wherein the load store instruction stores data from a plurality of registers to memory; and executing a fixed point instruction that is dependent upon data being stored by the store multiple instruction from a register of the plurality of registers indicated by the fixed point instruction as a source register, prior to the store multiple instruction completing its execution, but prohibiting the executing fixed point instruction from writing to a register of the plurality of registers prior to the store multiple instruction completing.

    Apparatus and method for generating packed sum of absolute differences
    7.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08051116B2

    Publication date: 2011-11-01

    Application No.: US12037596

    Filing date: 2008-02-26

    IPC classification: G06F7/00

    CPC classification: G06F9/3001 G06F9/30036

    Abstract: A method for executing an MMX PSADBW instruction by a microprocessor. The method includes generating packed differences of packed operands of the instruction and generating borrow bits associated with each of the packed differences; for each of the packed differences: determining whether the borrow bit indicates the packed difference is positive or negative and selecting a value in response to the determining, the value comprising the packed difference if the associated borrow bit is positive and a complement of the packed difference if the associated borrow bit is negative; adding the selected values to generate a first sum and a first carry and in parallel adding the borrow bits to generate a second sum and a second carry; adding the first and second sums and the first and second carries to generate a result of the instruction; storing the result in a register of the microprocessor.
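
    The arithmetic behind this abstract relies on the identity -x = ~x + 1: a negative byte difference is replaced by its one's complement, and the missing +1 is recovered by adding the borrow bits. The sketch below is a software model only, assuming 8-byte operands and using ordinary integer addition in place of the parallel sum/carry adders the abstract describes; the function name is illustrative.

```python
def psadbw(a: bytes, b: bytes) -> int:
    """Sum of absolute differences of two packed 8-byte operands."""
    if len(a) != 8 or len(b) != 8:
        raise ValueError("expected two 8-byte packed operands")
    selected = []
    borrows = []
    for x, y in zip(a, b):
        raw = x - y
        borrow = 1 if raw < 0 else 0   # borrow set means the difference is negative
        diff = raw & 0xFF              # 8-bit packed difference
        # Negative difference: select the one's complement. The +1 needed for
        # a true negation is supplied by summing the borrow bits separately.
        selected.append((~diff & 0xFF) if borrow else diff)
        borrows.append(borrow)
    return sum(selected) + sum(borrows)
```

    For example, with x = 3 and y = 5 the 8-bit difference is 0xFE, its complement is 0x01, and adding the borrow bit gives the expected absolute difference of 2.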

    EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS
    8.
    Type: Invention patent application
    Status: In force

    Publication No.: US20110010501A1

    Publication date: 2011-01-13

    Application No.: US12763938

    Filing date: 2010-04-20

    IPC classification: G06F12/08 G06F12/00

    Abstract: A BIU prioritizes L1 requests above L2 requests. The L2 generates a first request to the BIU and detects the generation of a snoop request and L1 request to the same cache line. The L2 determines whether a bus transaction to fulfill the first request may be retried and, if so, generates a miss, and otherwise generates a hit. Alternatively, the L2 detects the L1 generated a request to the L2 for the same line and responsively requests the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted the bus. Alternatively, a prefetch cache and the L2 allow the same line to be simultaneously present. If an L1 request hits in both the L2 and in the prefetch cache, the prefetch cache invalidates its copy of the line and the L2 provides the line to the L1.

    EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS
    9.
    Type: Invention patent application
    Status: In force

    Publication No.: US20120272003A1

    Publication date: 2012-10-25

    Application No.: US13535152

    Filing date: 2012-06-27

    IPC classification: G06F12/08

    Abstract: A microprocessor configured to access an external memory includes a first-level cache, a second-level cache, and a bus interface unit (BIU) configured to interface the first-level and second-level caches to a bus used to access the external memory. The BIU is configured to prioritize requests from the first-level cache above requests from the second-level cache. The second-level cache is configured to generate a first request to the BIU to fetch a cache line from the external memory. The second-level cache is also configured to detect that the first-level cache has subsequently generated a second request to the second-level cache for the same cache line. The second-level cache is also configured to request the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted ownership of the bus to fulfill the first request.
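
    The two BIU behaviors named in this abstract, prioritizing first-level requests and dropping a second-level request that has not yet been granted the bus, can be sketched as a simple queue model. The class and method names are hypothetical, and bus arbitration is reduced to a single `grant_next` call.

```python
class BusInterfaceUnit:
    def __init__(self):
        self.pending = []    # queued (source, line_addr) requests, oldest first
        self.granted = None  # the request most recently granted the bus, if any

    def request(self, source: str, line_addr: int) -> None:
        """Queue a request from "L1" or "L2" for a cache line."""
        self.pending.append((source, line_addr))

    def grant_next(self):
        """Grant the bus to the oldest L1 request, else the oldest L2 request."""
        for wanted in ("L1", "L2"):
            for req in self.pending:
                if req[0] == wanted:
                    self.pending.remove(req)
                    self.granted = req
                    return req
        return None

    def cancel_if_not_granted(self, source: str, line_addr: int) -> bool:
        """Drop a queued request; returns False if it is no longer queued,
        for example because it has already been granted the bus."""
        req = (source, line_addr)
        if req in self.pending:
            self.pending.remove(req)
            return True
        return False
```

    In this model the second-level cache would call `cancel_if_not_granted` for its own earlier request once it sees a first-level request for the same line, and a `False` return means the bus transaction is already under way.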

    APPARATUS AND METHOD FOR GENERATING PACKED SUM OF ABSOLUTE DIFFERENCES
    10.
    Type: Invention patent application
    Status: In force

    Publication No.: US20080162896A1

    Publication date: 2008-07-03

    Application No.: US12037596

    Filing date: 2008-02-26

    IPC classification: G06F9/302

    CPC classification: G06F9/3001 G06F9/30036

    Abstract: A method for executing an MMX PSADBW instruction by a microprocessor. The method includes generating packed differences of packed operands of the instruction and generating borrow bits associated with each of the packed differences; for each of the packed differences: determining whether the borrow bit indicates the packed difference is positive or negative and selecting a value in response to the determining, the value comprising the packed difference if the associated borrow bit is positive and a complement of the packed difference if the associated borrow bit is negative; adding the selected values to generate a first sum and a first carry and in parallel adding the borrow bits to generate a second sum and a second carry; adding the first and second sums and the first and second carries to generate a result of the instruction; storing the result in a register of the microprocessor.
