Method and apparatus for instruction sampling for performance monitoring and debug
    1.
    Granted patent
    Status: Expired

    Publication No.: US06574727B1

    Publication date: 2003-06-03

    Application No.: US09435069

    Filing date: 1999-11-04

    IPC classification: G06F9/00

    Abstract: A method and apparatus for selecting an instruction to be monitored within a pipelined processor in a data processing system is presented. A plurality of instructions are fetched, and the plurality of instructions are matched against at least one match condition to generate instructions that are eligible for sampling. The match conditions may include matching the opcode of an instruction, the pre-decode bits of an instruction, a type of instruction, or other conditions. The matched instructions may be marked using a match bit that accompanies the instruction through the selection process. The instructions eligible for sampling are then sampled to generate a sampled instruction. A sampled instruction may be marked with a sample bit that accompanies the instruction through the instruction execution process in order to monitor the sampled instruction while it is executing within the pipelined processor.
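
The two-stage selection in this abstract (match first, then sample) can be sketched in software as follows. This is only an illustration under stated assumptions, not the patented hardware: the `Instr` record, the helpers `mark_matches` and `sample_one`, and the rule of sampling the first eligible instruction are all hypothetical, since the abstract does not say how the sampled instruction is picked.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Instr:
    """Hypothetical instruction record; field names are illustrative only."""
    opcode: int
    match_bit: bool = False   # set when the instruction meets a match condition
    sample_bit: bool = False  # set when the instruction is chosen for monitoring


def mark_matches(fetched: Iterable[Instr],
                 conditions: List[Callable[[Instr], bool]]) -> List[Instr]:
    """Stage 1: set the match bit on each instruction meeting at least one condition."""
    out = []
    for ins in fetched:
        ins.match_bit = any(cond(ins) for cond in conditions)
        out.append(ins)
    return out


def sample_one(instrs: List[Instr]) -> Optional[Instr]:
    """Stage 2: choose one eligible instruction and set its sample bit.

    Assumption: the first eligible instruction is taken.
    """
    for ins in instrs:
        if ins.match_bit:
            ins.sample_bit = True
            return ins
    return None


# Example: only instructions with the (made-up) opcode 0x20 are eligible.
fetched = [Instr(opcode=0x10), Instr(opcode=0x20), Instr(opcode=0x20)]
marked = mark_matches(fetched, [lambda i: i.opcode == 0x20])
sampled = sample_one(marked)
```

Keeping the match bit and the sample bit separate mirrors the abstract: many instructions may be eligible, but only the sampled one carries its bit through execution.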

    Method and apparatus for identifying instructions for performance monitoring in a microprocessor
    2.
    Granted patent
    Status: Expired

    Publication No.: US06539502B1

    Publication date: 2003-03-25

    Application No.: US09436109

    Filing date: 1999-11-08

    IPC classification: G06F11/30

    Abstract: A method and apparatus for selecting an instruction to be monitored within a pipelined processor is presented. One or more pairs of match values stored in control registers are allocated for use in instruction sampling or instruction matching. These pairs, referred to as V0 and V1, are used together to filter instructions for sampling or for instruction matching. During the fetch or decode stage, the instruction word is compared bit by bit to the V0 and V1 pair(s). For each bit in the instruction word, the corresponding bits in V0 and V1 are used to determine if a match exists. If every bit position in the instruction word results in a match, the instruction is eligible for sampling. If any bit position does not match, the instruction is not eligible. In response to a determination that the instruction is eligible for sampling, the execution of the instruction may be monitored.
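
A bitwise sketch of the V0/V1 comparison may help. The abstract does not state how a V0/V1 bit pair encodes a match, so the encoding below is an assumption: a set V0 bit accepts an instruction bit of 0, a set V1 bit accepts an instruction bit of 1, both set means "don't care", and neither set means the position can never match. The function name `is_eligible`, the `width` parameter, and the example values are hypothetical.

```python
def is_eligible(instr_word: int, v0: int, v1: int, width: int = 32) -> bool:
    """Return True if every bit position of instr_word matches the (V0, V1) pair.

    Assumed encoding (not given in the abstract):
      V0 bit = 1 -> an instruction bit of 0 is accepted at this position
      V1 bit = 1 -> an instruction bit of 1 is accepted at this position
    """
    mask = (1 << width) - 1
    word = instr_word & mask
    # A position matches if the bit is 0 and V0 accepts 0, or the bit is 1 and V1 accepts 1.
    matched = (~word & v0) | (word & v1)
    return (matched & mask) == mask


# Example with 8-bit words: upper nibble must be 1010, lower nibble is "don't care".
V0 = 0b0101_1111
V1 = 0b1010_1111
hit = is_eligible(0b1010_0011, V0, V1, width=8)   # upper nibble is 1010
miss = is_eligible(0b1011_0011, V0, V1, width=8)  # bit 4 is 1 where 0 is required
```

Under this encoding, one pair of registers can express exact values, wildcards, and impossible positions per bit, which is one plausible reason to use two values rather than a value and a mask.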

    Method and apparatus for monitoring the performance of internal queues in a microprocessor
    3.
    Granted patent
    Status: Expired

    Publication No.: US06530042B1

    Publication date: 2003-03-04

    Application No.: US09436108

    Filing date: 1999-11-08

    IPC classification: G06F11/30

    Abstract: A method and apparatus for monitoring an internal queue within a processor, such as an instruction completion table or instruction re-order buffer, is presented. The performance monitoring unit of the processor contains multiple counters, and each counter counts occurrences of specified events. An internal queue of the processor may be specified to be monitored. A count of event signals indicating a successful allocation request for an entry in the internal queue is divided by a count of event signals indicating a passage of units of time to obtain the average rate for allocation requests for queue entries in the specified internal queue. A count of event signals indicating an occupation of a specific entry in the internal queue during a unit of time is divided by a count of event signals indicating an allocation of a specific entry in the internal queue to obtain the average time spent in the internal queue. An average number of entries in the internal queue is computed as a product of the average rate for allocation requests for queue entries and the average time spent in the internal queue. An event signal that indicates failure of an allocation request for an entry in the internal queue may be monitored.
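
The arithmetic in this abstract is an application of Little's law (average occupancy = arrival rate × average time in the queue). A minimal sketch of the three computations, assuming four raw counter values have already been read from the performance monitor, is shown below; the function name, parameter names, and the sample numbers are hypothetical.

```python
def queue_metrics(alloc_ok: int, time_units: int,
                  entry_occupied_units: int, entry_allocs: int):
    """Derive the three averages described in the abstract from four event counts.

    alloc_ok             -- successful allocation requests for any queue entry
    time_units           -- elapsed units of time (for example, cycles)
    entry_occupied_units -- units of time during which one watched entry was occupied
    entry_allocs         -- number of times that watched entry was allocated
    """
    rate = alloc_ok / time_units                    # allocations per unit of time
    avg_time = entry_occupied_units / entry_allocs  # units of time per stay in the queue
    avg_entries = rate * avg_time                   # Little's law: L = lambda * W
    return rate, avg_time, avg_entries


# Made-up counter readings for illustration.
rate, avg_time, avg_entries = queue_metrics(
    alloc_ok=500, time_units=1000, entry_occupied_units=120, entry_allocs=30)
```

With these sample readings the queue receives 0.5 allocations per unit of time, each entry stays 4 units on average, and so about 2 entries are occupied on average.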

    Method and system for detecting a flush of an instruction without a flush indicator
    5.
    Granted patent
    Status: Expired

    Publication No.: US06550002B1

    Publication date: 2003-04-15

    Application No.: US09435067

    Filing date: 1999-11-04

    IPC classification: G06F11/30

    Abstract: A method and system for detecting flushed instructions without a flush indicator is provided. In order to monitor the flushing of an instruction in an instruction pipeline of a processor, an instruction is selected as a sampled instruction and the progress of the sampled instruction through the instruction pipeline is monitored. Upon selection of an instruction as a sampled instruction, a countdown value is initialized to a value equal to the maximum number of instructions within the instruction pipeline, and as instructions complete, the countdown value is decremented. If progress of the sampled instruction is detected as the instruction moves through the instruction pipeline, the countdown value is reinitialized. If the countdown value reaches zero, then a flush of the sampled instruction from the instruction pipeline is presumed, and an indication that the sampled instruction has been flushed is generated. In response to the indication that the sampled instruction has been flushed, a subsequent instruction may be selected as a subsequently sampled instruction.
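
The countdown scheme can be modeled as a small state machine. The sketch below follows the steps in the abstract; the class name `FlushDetector` and its method names are hypothetical, and the case where the sampled instruction itself completes normally is left out because the abstract does not describe it.

```python
class FlushDetector:
    """Presume a flush when the sampled instruction shows no progress for too long."""

    def __init__(self, max_in_flight: int):
        # Maximum number of instructions that can be in the pipeline at once.
        self.max_in_flight = max_in_flight
        self.countdown = 0
        self.active = False

    def select_sample(self) -> None:
        """An instruction was selected as the sampled instruction."""
        self.countdown = self.max_in_flight
        self.active = True

    def on_progress(self) -> None:
        """The sampled instruction was seen moving through the pipeline."""
        if self.active:
            self.countdown = self.max_in_flight

    def on_completion(self) -> bool:
        """Some instruction completed; return True when a flush is presumed."""
        if not self.active:
            return False
        self.countdown -= 1
        if self.countdown <= 0:
            self.active = False  # a new sample may now be selected
            return True
        return False


det = FlushDetector(max_in_flight=3)
det.select_sample()
first = det.on_completion()    # countdown 3 -> 2
det.on_progress()              # countdown back to 3
results = [det.on_completion() for _ in range(3)]  # 2, 1, 0
```

The reasoning behind the initial value: if as many instructions as the pipeline can hold have completed since the sample last moved, the sample can no longer be in flight.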

    Method and system for providing temporal threshold support during performance monitoring of a pipelined processor
    6.
    Granted patent
    Status: In force

    Publication No.: US06446029B1

    Publication date: 2002-09-03

    Application No.: US09343449

    Filing date: 1999-06-30

    IPC classification: G06F15/00

    Abstract: A method and system for monitoring the performance of an instruction pipeline is provided. The processor may contain a performance monitor for monitoring for the occurrence of an event within a data processing system. An event to be monitored may be specified through software control, and the occurrence of the specified event is monitored during the execution of an instruction in the execution pipeline of the processor. A particular instruction may be specified to execute within a threshold time for each stage of the instruction pipeline. The specified event may be the completion of a single tagged instruction beyond the specified threshold interval for a stage of the instruction pipeline. The performance monitor may contain a number of counters for counting multiple occurrences of specified events during the execution of multiple instructions, in which case the specified events may be the completion of tagged instructions beyond a threshold interval for any stage of the multiple stages of the execution pipeline. As the instruction moves through the processor, the performance monitor collects the events and provides the events for optimization analysis.
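
As a rough software analogue of the per-stage threshold events, the sketch below counts how often tagged instructions exceed the threshold of each pipeline stage. The stage names, cycle counts, and the function `count_threshold_events` are assumptions made for illustration; the abstract does not name stages or units.

```python
from collections import Counter
from typing import Dict, Iterable


def count_threshold_events(tagged_timings: Iterable[Dict[str, int]],
                           thresholds: Dict[str, int]) -> Counter:
    """Count, per stage, how many tagged instructions took longer than the threshold.

    tagged_timings -- one dict per tagged instruction, mapping stage name to time spent
    thresholds     -- allowed time per stage
    """
    counts = Counter()
    for timing in tagged_timings:
        for stage, spent in timing.items():
            if spent > thresholds[stage]:
                counts[stage] += 1  # one "completed beyond threshold" event
    return counts


# Made-up thresholds and timings, in cycles.
thresholds = {"fetch": 2, "decode": 2, "execute": 5}
tagged = [
    {"fetch": 1, "decode": 2, "execute": 9},
    {"fetch": 4, "decode": 1, "execute": 6},
]
events = count_threshold_events(tagged, thresholds)
```

Here both tagged instructions overran the execute threshold and one overran the fetch threshold, which is the kind of per-stage tally an optimization analysis could start from.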

    Hierarchical selection of direct and indirect counting events in a performance monitor unit
    7.
    Granted patent
    Status: In force

    Publication No.: US06718403B2

    Publication date: 2004-04-06

    Application No.: US09734116

    Filing date: 2000-12-11

    IPC classification: G06E3/00

    Abstract: A microprocessor including a performance monitor unit is disclosed. The performance monitor unit includes a set of performance monitor counters and a corresponding set of control circuits and programmable control registers. The performance monitor unit receives a first set of event signals from functional units of the processor. Each of the first set of events is routed directly from the appropriate functional unit to the performance monitor unit. The performance monitor unit further receives at least a second set of event signals. In one embodiment, the second set of event signals is received via a performance monitor bus of the processor. The performance monitor bus is typically a shared bus that may receive signals from any of the functional units of the processor. The functional units may include multiplexing circuitry that determines which of the functional units has mastership of the shared bus. Whereas the performance monitor unit is typically capable of monitoring the direct event signals in any of its counters, the indirect event signals may be selectively routed to the counters. The shared bus may be divided into sub-groups or byte lanes where the byte lanes are selectively routed to the set of performance monitor counters. The state of a control register may determine the event that is monitored in the corresponding counter. In one embodiment, the control register provides a set of signals that are connected to the select inputs of one or more multiplexers. The multiplexers receive multiple event signals and, based on the state of their select signals, route one of the received event signals to the corresponding performance monitor counter. Specified states of the select signals may result in the disabling of the corresponding counter or enabling the counter to count system clock cycles rather than any performance event.
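
The multiplexer-based event selection at the end of this abstract can be modeled with a select value per counter. In the sketch below, the encodings `SEL_DISABLED`, `SEL_CYCLES`, and `SEL_EVENT_BASE` are invented for the example; the abstract only says that certain select states disable the counter or make it count clock cycles.

```python
from typing import List, Sequence

# Hypothetical select encodings; the real control-register layout is not given.
SEL_DISABLED = 0    # counter does not count
SEL_CYCLES = 1      # counter counts every clock cycle
SEL_EVENT_BASE = 2  # select values from here on pick an event signal by index


def mux_output(select: int, event_signals: Sequence[int]) -> int:
    """Return the increment the multiplexer feeds to the counter for one cycle."""
    if select == SEL_DISABLED:
        return 0
    if select == SEL_CYCLES:
        return 1
    return 1 if event_signals[select - SEL_EVENT_BASE] else 0


def run_counter(select: int, trace: List[Sequence[int]]) -> int:
    """Accumulate the counter over a trace with one list of event signals per cycle."""
    return sum(mux_output(select, signals) for signals in trace)


# Three cycles, two event signals per cycle (made-up values).
trace = [[1, 0], [0, 0], [1, 1]]
disabled = run_counter(SEL_DISABLED, trace)
cycles = run_counter(SEL_CYCLES, trace)
event0 = run_counter(SEL_EVENT_BASE + 0, trace)
event1 = run_counter(SEL_EVENT_BASE + 1, trace)
```

Reprogramming only the select value changes what the counter measures, which is the point of routing events through multiplexers rather than wiring one event per counter.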

    Method and system for tracking the progress of an instruction in an out-of-order processor
    8.
    Granted patent
    Status: Expired

    Publication No.: US06415378B1

    Publication date: 2002-07-02

    Application No.: US09343359

    Filing date: 1999-06-30

    IPC classification: G06F11/00

    CPC classification: G06F11/3466

    Abstract: A method and system for debugging the execution of an instruction within an instruction pipeline is provided. A processor in a data processing system contains instruction pipeline units. An instruction may be tagged, and in response to an instruction pipeline unit completing its processing of the tagged instruction, a stage completion signal is asserted. An execution monitor external to the pipelined processor monitors the stage completion signals during the execution of the tagged instruction. The execution monitor may be a logic analyzer that displays the stage completion signals in real-time on a display device of the execution monitor. An instruction to be tagged may be selected based upon an instruction selection rule, such as the address of the instruction.

    Method system and apparatus for instruction tracing with out of order processors
    9.
    Granted patent
    Status: Expired

    Publication No.: US06694427B1

    Publication date: 2004-02-17

    Application No.: US09552859

    Filing date: 2000-04-20

    IPC classification: G06F9/00

    Abstract: A method, system and apparatus for instruction tracing with out of order speculative processors. With the present invention, information corresponding to the state of an instruction cache and a data cache is stored in a trace storage device along with information corresponding to instructions fetched by the processor. When a cache load is necessary, updated cache information is stored in the trace storage device. Thereby, the state of the cache at all times during fetching of instructions may be known from the information stored in the trace storage device. Additionally, the particular instructions fetched are known from the fetched instructions information stored in the trace storage device. Hence the instruction stream may be reconstructed from the information stored in the trace storage device.

    Autonomic Hotspot Profiling Using Paired Performance Sampling
    10.
    Patent application
    Status: In force

    Publication No.: US20140059334A1

    Publication date: 2014-02-27

    Application No.: US14067212

    Filing date: 2013-10-30

    IPC classification: G06F9/38

    Abstract: A processor performance profiler is enabled to identify specific instructions causing performance issues within a program being executed by a microprocessor through random sampling to find the worst-case offenders of a particular event type such as a cache miss or a branch mis-prediction. Tracking all instructions causing a particular event generates large data logs, creates performance penalties, and makes code analysis more difficult. However, identifying and tracking the worst offenders within a random sample of events, without having to hash all events, results in smaller memory requirements for the performance profiler, lower performance impact while profiling, and decreased complexity in analyzing the program to identify major performance issues, which, in turn, enables better optimization of the program in shorter developer time.
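
The general idea of finding the worst offenders from a random sample, rather than logging every event, can be illustrated as follows. This is not the paired-sampling mechanism of the application, which the abstract does not detail; it is plain random sampling with a per-address tally, and the function name, sample rate, seed, and addresses are all made up.

```python
import random
from collections import Counter
from typing import Iterable, List, Tuple


def sample_hotspots(event_addrs: Iterable[int], sample_rate: float,
                    top_n: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Tally only a random subset of events and return the most frequent addresses.

    event_addrs -- instruction address for each event (for example, each cache miss)
    sample_rate -- probability that any given event is recorded
    """
    rng = random.Random(seed)  # seeded so that the example is repeatable
    counts = Counter()
    for addr in event_addrs:
        if rng.random() < sample_rate:
            counts[addr] += 1  # only sampled events cost memory and time
    return counts.most_common(top_n)


# Made-up event stream: one instruction causes most of the events.
events = [0x4000] * 900 + [0x4010] * 60 + [0x4020] * 40
top = sample_hotspots(events, sample_rate=0.1, top_n=1)
```

Because the hot instruction dominates the event stream, it also dominates the sample, so the profiler records roughly a tenth of the events yet still reports the same worst offender.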
