Computer system performance evaluation system and method
    1.
    Granted patent
    Legal status: Expired

    Publication No.: US5450349A

    Publication date: 1995-09-12

    Application No.: US967110

    Filing date: 1992-10-27

    IPC classification: G06F11/267 G06F11/22

    CPC classification: G06F11/2236

    Abstract: A system for evaluating the performance of a computer system having a processor that passes through a plurality of processor states during operation and an associated system memory includes an operating unit for receiving a request to monitor specific processor states from a user. Firmware causes the processor to enter the desired processor state requested by the user. Hardware identifies the occurrence of the desired processor state. Information relating to the occurrence of the desired processor state is accumulated in the memory. The accumulated information is read from memory and a report is provided to the user.

    Decode and execution synchronized pipeline processing using decode generated memory read queue with stop entry to allow execution generated memory read
    2.
    Granted patent
    Legal status: Expired

    Publication No.: US06240508B1

    Publication date: 2001-05-29

    Application No.: US08505810

    Filing date: 1995-07-21

    IPC classification: G06F9/38

    Abstract: A macropipelined microprocessor chip adheres to strict read and write ordering by sequentially buffering operands in queues during instruction decode, then removing the operands in order during instruction execution. Any instruction that requires additional access to memory inserts the requests into the queued sequence (in a specifier queue) such that read and write ordering is preserved. A specifier queue synchronization counter captures synchronization points to coordinate memory request operations among the autonomous instruction decode unit, instruction execution unit, and memory sub-system. The synchronization method does not restrict the benefit of overlapped execution in the pipeline. Another feature is treatment of a variable bit field operand type that does not restrict the location of operand data. Instruction execution flows in a pipelined processor having such an operand type are vastly different depending on whether operand data resides in registers or memory. Thus, an operand context queue (field queue) is used to simplify context-dependent execution flow and increase overlap. The field queue allows the instruction decode unit to issue instructions with variable bit field operands normally, sequentially identifying and fetching operands, and communicating the operand context that specifies register or memory residence across the pipeline boundaries to the autonomous execution unit. The mechanism creates opportunity for increasing the overlap of pipelined functions and greatly simplifies the splitting of execution flows.

    Method and apparatus for tracing unpredictable execution flows in a trace buffer of a high-speed computer system
    3.
    Granted patent
    Legal status: Expired

    Publication No.: US5802272A

    Publication date: 1998-09-01

    Application No.: US359252

    Filing date: 1994-12-19

    IPC classification: G06F11/34 G06F11/36

    Abstract: An operation of a processor is traced while fetching instructions from a memory to operate the processor. The tracing involves detecting an unpredictable fetching of instructions on the assumption that a predictable fetching can be reconstructed without any further input. The unpredictable fetching is identified as being due to either computable, conditional, or unanticipated events. Upon detecting the events, process control information, such as the next instruction to be fetched, is recorded in a queue, and from the queue the information can be stored in a trace buffer. During reconstruction of the operation, the trace buffer and the image including the instructions can be examined to analyze the real-time operation of the processor.
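
The idea in this abstract, recording only what cannot be rebuilt from the program image, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the toy instruction set (`op`, `jump`, `branch`, `halt`), the `IMAGE` layout, and the function names. Only conditional branches are modeled as unpredictable events, and the intermediate queue is folded into a plain list.

```python
from collections import deque

# Hypothetical program image: address -> (opcode, operand).
IMAGE = {
    0: ("op", None),
    1: ("branch", 4),   # conditional: the outcome is only known at run time
    2: ("op", None),
    3: ("jump", 5),     # direct jump: the target is in the image, so predictable
    4: ("op", None),
    5: ("halt", None),
}

def run_and_trace(image, branch_outcomes):
    """Execute the image and record the next fetch address after each branch."""
    outcomes = deque(branch_outcomes)
    trace, path, pc = [], [], 0
    while True:
        opcode, operand = image[pc]
        path.append(pc)
        if opcode == "halt":
            return path, trace
        if opcode == "branch":
            pc = operand if outcomes.popleft() else pc + 1
            trace.append(pc)      # unpredictable event: store the next address
        elif opcode == "jump":
            pc = operand          # predictable: nothing is recorded
        else:
            pc += 1               # sequential fetch: nothing is recorded

def reconstruct(image, trace):
    """Rebuild the executed path from the image plus the trace entries."""
    entries = deque(trace)
    path, pc = [], 0
    while True:
        opcode, operand = image[pc]
        path.append(pc)
        if opcode == "halt":
            return path
        if opcode == "branch":
            pc = entries.popleft()
        elif opcode == "jump":
            pc = operand
        else:
            pc += 1

path, trace = run_and_trace(IMAGE, [False])
assert reconstruct(IMAGE, trace) == path
```

In this toy run the trace holds a single entry for five executed instructions, which is the point of filtering out predictable fetches.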

    Apparatus and method for tracing data flows in high-speed computer systems
    4.
    Granted patent
    Legal status: Expired

    Publication No.: US5764885A

    Publication date: 1998-06-09

    Application No.: US359216

    Filing date: 1994-12-19

    IPC classification: G06F11/36 G06F11/00

    CPC classification: G06F11/3636

    Abstract: A data flow of a processor is traced while accessing data stored in a memory and in a plurality of registers during operation of the processor. The tracing involves detecting an unpredictable accessing of data on the assumption that a predictable accessing can be reconstructed without any further input. The unpredictable accessing is identified by setting and clearing a trace bit associated with each of the registers according to identifying the accessing as direct memory-to-register, register-to-register, constant-to-register, and indirect memory. If a trace bit is set on a register storing data being used as a base address during the indirect memory accessing, data flow control information, such as the base address stored in the register being used during the indirect accessing, is recorded in a queue, and from the queue the information can be stored in a trace buffer. During reconstruction of the operation, the trace buffer and a copy of the data having an initial state can be examined to analyze the data flows during the real-time operation of the processor.

    Atomic update of CPO state
    5.
    Granted patent
    Legal status: In force

    Publication No.: US07185183B1

    Publication date: 2007-02-27

    Application No.: US09921400

    Filing date: 2001-08-02

    Applicant: G. Michael Uhler

    Inventor: G. Michael Uhler

    IPC classification: G06F9/48

    Abstract: A group of bit set and bit clear instructions is provided for a microprocessor to allow atomic modification of privileged architecture control registers. The bit set and bit clear instructions include an opcode that designates to the microprocessor that the instructions are to execute in privileged (kernel) state only, and that the instructions are to communicate with privileged control registers. Two operands are provided for the instructions, the first designating which of the privileged control registers is to be modified, the second designating a general purpose register that contains a bit mask. The bit set instructions set bits in the designated control register according to bits set in the bit mask. The bit clear instructions clear bits in the designated control register according to bits set in the bit mask. By atomically modifying privileged control registers, a requirement for strict nesting of interrupt routines is eliminated.
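
The mask semantics described in this abstract can be written down as a short Python sketch. It models only what the two operations do to the register value; atomicity and privilege checks are hardware properties that plain Python cannot demonstrate. The 32-bit register width and the names `bit_set`, `bit_clear`, and `MASK32` are assumptions made for illustration, not details from the patent.

```python
MASK32 = 0xFFFFFFFF  # assumed register width of 32 bits

def bit_set(control: int, mask: int) -> int:
    """Return the register value with every bit that is 1 in mask forced to 1."""
    return (control | mask) & MASK32

def bit_clear(control: int, mask: int) -> int:
    """Return the register value with every bit that is 1 in mask forced to 0."""
    return control & ~mask & MASK32

status = 0b1010
status = bit_set(status, 0b0101)    # bits 0 and 2 are set
status = bit_clear(status, 0b0011)  # bits 0 and 1 are cleared
```

Without such instructions, software has to read the register, modify the copy, and write it back. An interrupt between the read and the write can change the register in the meantime, and the later write-back then silently discards that change.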

    Processor having an arithmetic extension of an instruction set architecture
    6.
    Granted patent
    Legal status: In force

    Publication No.: US06714197B1

    Publication date: 2004-03-30

    Application No.: US09364787

    Filing date: 1999-07-30

    IPC classification: G06T15/20

    Abstract: A processor having an arithmetic extension of an instruction set architecture which incorporates a set of high performance floating point operations. The instruction set architecture incorporates a variety of data formats including single precision and double precision data formats, as well as the paired-single data format that allows two simultaneous operations on a pair of operands. The extension includes instructions directed to reduction add, reduction multiply, reciprocal, and reciprocal square root.

    Locked read/write on separate address/data bus using write barrier
    7.
    Granted patent
    Legal status: In force

    Publication No.: US06490642B1

    Publication date: 2002-12-03

    Application No.: US09373092

    Filing date: 1999-08-12

    IPC classification: G06F13/00

    CPC classification: G06F13/364

    Abstract: An apparatus is presented for improving the efficiency of data transfers between devices interconnected over an on-chip system bus in a multi-master computer system configuration. Bus efficiency is improved by providing an apparatus for controlling a read-modify-write transaction to an address in a bus slave device that does not suspend essential features of the system bus during the transaction, namely, pipelining and transaction splitting. The apparatus includes transaction control logic in a bus master device and transaction response logic in a bus slave device. The transaction control logic provides a write barrier command from the bus master device over the on-chip system bus to the bus slave device. The transaction response logic receives the write barrier command, and precludes execution of future transactions to the address within the bus slave device until completion of the read-modify-write transaction while allowing execution of transactions to other addresses within the bus slave device to complete.
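
The slave-side behavior in this abstract can be sketched as a small Python model. The model is hypothetical: the `BusSlave` class, the use of an owner name to identify the master that issued the barrier, and the choice to defer blocked writes and replay them afterwards are all assumptions for illustration. The abstract only states that transactions to the barrier address are precluded until the read-modify-write completes while other addresses proceed. Reads from other masters are not modeled.

```python
class BusSlave:
    """Toy slave that honors a write barrier on one address at a time."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.locked = None   # (address, owner) while a read-modify-write is open
        self.pending = []    # writes deferred because they hit the locked address

    def write_barrier(self, owner, address):
        """Open a read-modify-write on address for the given master."""
        self.locked = (address, owner)

    def read(self, owner, address):
        """Read by the barrier owner (other masters' reads are not modeled)."""
        return self.memory[address]

    def write(self, owner, address, value):
        """Apply a write, or defer it if another master holds the barrier."""
        if self.locked and self.locked[0] == address and self.locked[1] != owner:
            self.pending.append((owner, address, value))
            return False
        self.memory[address] = value
        if self.locked == (address, owner):
            # The owner's write completes the read-modify-write: release the
            # barrier and replay the deferred writes in arrival order.
            self.locked = None
            deferred, self.pending = self.pending, []
            for args in deferred:
                self.write(*args)
        return True

slave = BusSlave({0x10: 1, 0x20: 7})
slave.write_barrier("A", 0x10)
old = slave.read("A", 0x10)
same_address = slave.write("B", 0x10, 99)   # deferred until A finishes
other_address = slave.write("B", 0x20, 8)   # proceeds immediately
slave.write("A", 0x10, old + 1)             # completes the read-modify-write
```

The write to `0x20` is not held up by the barrier on `0x10`, which mirrors the claim that pipelining and transaction splitting stay available during the locked transaction.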

    Pipelined computer with operand context queue to simplify context-dependent execution flow
    8.
    Granted patent
    Legal status: Expired

    Publication No.: US5542058A

    Publication date: 1996-07-30

    Application No.: US317427

    Filing date: 1994-10-04

    IPC classification: F02B75/02 G06F9/38 G06F9/30

    Abstract: A macropipelined microprocessor chip adheres to strict read and write ordering by sequentially buffering operands in queues during instruction decode, then removing the operands in order during instruction execution. Any instruction that requires additional access to memory inserts the requests into the queued sequence (in a specifier queue) such that read and write ordering is preserved. A specifier queue synchronization counter captures synchronization points to coordinate memory request operations among the autonomous instruction decode unit, instruction execution unit, and memory sub-system. The synchronization method does not restrict the benefit of overlapped execution in the pipeline. Another feature is treatment of a variable bit field operand type that does not restrict the location of operand data. Instruction execution flows in a pipelined processor having such an operand type are vastly different depending on whether operand data resides in registers or memory. Thus, an operand context queue (field queue) is used to simplify context-dependent execution flow and increase overlap. The field queue allows the instruction decode unit to issue instructions with variable bit field operands normally, sequentially identifying and fetching operands, and communicating the operand context that specifies register or memory residence across the pipeline boundaries to the autonomous execution unit. The mechanism creates opportunity for increasing the overlap of pipelined functions and greatly simplifies the splitting of execution flows.

    Conversion of internal processor register commands to I/O space addresses
    9.
    Granted patent
    Legal status: Expired

    Publication No.: US5481689A

    Publication date: 1996-01-02

    Application No.: US106317

    Filing date: 1993-08-13

    Abstract: A pipelined CPU executing instructions of variable length, and referencing memory using various data widths. Macroinstruction pipelining is employed (instead of microinstruction pipelining), with queuing between units of the CPU to allow flexibility in instruction execution times. A wide bandwidth is available for memory access, fetching 64-bit data blocks on each cycle. Internal processor registers are accessed with short (byte width) addresses instead of full physical addresses as used for memory and I/O references, but off-chip processor registers are memory-mapped and accessed by the same busses using the same controls as the memory and I/O.

    Method and apparatus for filtering invalidate requests
    10.
    Granted patent
    Legal status: Expired

    Publication No.: US5058006A

    Publication date: 1991-10-15

    Application No.: US212416

    Filing date: 1988-06-27

    CPC classification: G06F12/0808

    Abstract: An apparatus which filters the number of invalidates to be propagated onto a private processor bus is provided. This is desirable so that the processor bus is not overloaded with invalidate requests. The present invention describes a method of filtering the number of invalidates to be propagated to each processor. A memory interface filters the invalidates by using a second private bus, the invalidate bus, which communicates with the cache controller. The cache controller can tell the memory interface whether data corresponding to the address on the invalidate bus is resident in the private cache memory of that processor. In this way, the memory interface only has to request the private processor bus when necessary, in order to perform the invalidate.
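
The filtering step in this abstract amounts to a residency check before the processor bus is requested. The Python sketch below is a simplified model under stated assumptions: the class and method names are hypothetical, the cache is reduced to a set of resident addresses, and the invalidate bus is represented by a direct method call.

```python
class CacheController:
    """Tracks which addresses are resident in the private cache."""

    def __init__(self, resident_addresses):
        self.resident = set(resident_addresses)

    def is_resident(self, address):
        # Stands in for the lookup performed over the invalidate bus.
        return address in self.resident

    def invalidate(self, address):
        self.resident.discard(address)

class MemoryInterface:
    """Forwards an invalidate only when the cache actually holds the line."""

    def __init__(self, cache):
        self.cache = cache
        self.processor_bus_requests = 0

    def handle_invalidate(self, address):
        if not self.cache.is_resident(address):
            return False                  # filtered: processor bus left alone
        self.processor_bus_requests += 1  # processor bus requested only now
        self.cache.invalidate(address)
        return True

cache = CacheController({0x100, 0x200})
interface = MemoryInterface(cache)
results = [interface.handle_invalidate(a) for a in (0x100, 0x300, 0x200, 0x100)]
```

In this run four invalidate requests arrive but the processor bus is requested only twice. The request for `0x300` is filtered because the line was never cached, and the second request for `0x100` is filtered because the line has already been invalidated.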