Counting latencies of an instruction table flush, refill and instruction execution using a plurality of assigned counters
    1.
    Granted invention patent
    Status: no longer in force

    Publication number: US06970999B2

    Publication date: 2005-11-29

    Application number: US10210415

    Filing date: 2002-07-31

    IPC classification: G06F9/38 G06F9/44 G06F15/00

    Abstract: A method and system for analyzing cycles per instruction (CPI) performance in a processor. A completion table corresponds to the instructions in a group to be processed by the processor. An empty completion table indicates that there has been some type of catastrophe that caused a table flush. While the table is empty, a performance monitoring counter (PMC), located in a performance monitoring unit (PMU) in the processor, counts the number of clock cycles that the table is empty. Preferably, a separate PMC is utilized depending on the reason that the completion table is empty. A second PMC likewise counts the number of clock cycles spent re-filling the empty completion table. A third PMC counts the number of clock cycles spent actually executing the instructions in the completion table. The information in the PMCs can be used to evaluate the true cause of degradation of CPI performance.

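    The counting scheme in this abstract can be illustrated in software. The following is a minimal sketch, not the patented hardware: it assumes a per-cycle trace of completion-table states, and the state names, the reason strings, and the functions `count_cycles` and `cpi_breakdown` are hypothetical names chosen for illustration.

```python
from collections import Counter


def count_cycles(trace):
    """Tally clock cycles per completion-table state.

    `trace` holds one (state, reason) tuple per clock cycle. The states
    "empty", "refill" and "execute" and the reason strings are assumptions
    made for this sketch; they are not taken from the patent.
    """
    pmc = Counter()
    for state, reason in trace:
        if state == "empty":
            # One counter per flush reason, as the abstract prefers.
            pmc[f"empty:{reason}"] += 1
        elif state in ("refill", "execute"):
            pmc[state] += 1
        else:
            raise ValueError(f"unknown state: {state!r}")
    return pmc


def cpi_breakdown(pmc, instructions_completed):
    """Split total CPI into one component per counter."""
    return {name: cycles / instructions_completed for name, cycles in pmc.items()}
```

    Because every cycle lands in exactly one counter, the components returned by `cpi_breakdown` add up to the overall CPI, which is what makes the per-cause attribution possible.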

    Analyzing instruction completion delays in a processor
    3.
    Granted invention patent
    Status: no longer in force

    Publication number: US07047398B2

    Publication date: 2006-05-16

    Application number: US10210358

    Filing date: 2002-07-31

    IPC classification: G06F11/34

    Abstract: A method and system for identifying instruction completion delays for a group of instructions in a computer processor. Each instruction in the group of instructions has a status indicator that identifies what is preventing that instruction from completing execution. Examples of completion delays are cache misses, data dependencies, or simply the time required for an execution unit in the computer processor to process the instruction. As each instruction finishes executing, its associated status indicator is cleared to indicate that the instruction is no longer waiting to execute. The last instruction to execute is the instruction that is holding up completion of the entire group, and thus the cause for the completion delay of the last instruction is recorded as the cause of completion delay for the entire group.

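    The attribution rule described here (the last instruction to finish determines the group's delay cause) can be sketched as follows. This is an illustrative software model under stated assumptions: the function name `group_delay_cause`, the event format, and the reason strings are hypothetical and do not come from the patent.

```python
def group_delay_cause(finish_events, stall_reasons):
    """Return the delay cause recorded for a whole instruction group.

    finish_events: (cycle, instruction_id) pairs, in any order.
    stall_reasons: instruction_id -> reason string; this dict plays the
                   role of the per-instruction status indicators.
    Both formats are assumptions made for this sketch.
    """
    pending = dict(stall_reasons)  # copy, so the caller's dict is untouched
    last_cause = None
    for _cycle, instruction_id in sorted(finish_events):
        # Clearing the indicator: the instruction is no longer waiting.
        last_cause = pending.pop(instruction_id)
    if pending:
        raise ValueError(f"instructions never finished: {sorted(pending)}")
    # The final indicator cleared belongs to the instruction that held up
    # the group, so its reason is reported for the group as a whole.
    return last_cause
```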

    Method of seamlessly integrating thermal event information data with performance monitor data
    5.
    Granted invention patent
    Status: in force

    Publication number: US07472315B2

    Publication date: 2008-12-30

    Application number: US11054292

    Filing date: 2005-02-09

    IPC classification: G06F11/00

    CPC classification: G06F11/00

    Abstract: An apparatus, system and method of integrating performance monitor data with thermal event information are provided. A thermal event, in this case, occurs when the temperature of a chip in which a processor is embedded exceeds a user-configurable value while the processor is processing instructions and/or using storage devices that are being monitored. When the thermal event occurs, the temperature of the chip, along with the performance monitor data, is stored for future use, including performance and diagnostic analyses.

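    As a rough illustration of pairing a temperature reading with performance monitor data, the sketch below stores both whenever a sample exceeds a configurable threshold. The names `ThermalRecord` and `log_thermal_events`, the Celsius unit, and the sample format are assumptions made for this example; the patent does not specify them.

```python
from dataclasses import dataclass


@dataclass
class ThermalRecord:
    """Chip temperature stored together with a counter snapshot."""
    temperature_c: float
    counters: dict


def log_thermal_events(samples, threshold_c):
    """Keep a record for every sample whose temperature exceeds the threshold.

    `samples` is an iterable of (temperature_c, counters) pairs, where
    `counters` maps a counter name to its value (hypothetical format).
    """
    records = []
    for temperature_c, counters in samples:
        if temperature_c > threshold_c:  # "exceeds", so strictly greater
            # Copy the counters so later updates do not alter the record.
            records.append(ThermalRecord(temperature_c, dict(counters)))
    return records
```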

    Apparatus, system and computer program product for seamlessly integrating thermal event information data with performance monitor data
    6.
    Granted invention patent
    Status: no longer in force

    Publication number: US07711994B2

    Publication date: 2010-05-04

    Application number: US12131070

    Filing date: 2008-05-31

    IPC classification: G06F11/00

    CPC classification: G06F11/00

    Abstract: An apparatus, system and method of integrating performance monitor data with thermal event information are provided. A thermal event, in this case, occurs when the temperature of a chip in which a processor is embedded exceeds a user-configurable value while the processor is processing instructions and/or using storage devices that are being monitored. When the thermal event occurs, the temperature of the chip, along with the performance monitor data, is stored for future use, including performance and diagnostic analyses.


    Methods to randomly or pseudo-randomly, without bias, select instruction for performance analysis in a microprocessor
    7.
    Granted invention patent
    Status: no longer in force

    Publication number: US07620801B2

    Publication date: 2009-11-17

    Application number: US11055848

    Filing date: 2005-02-11

    IPC classification: G06F9/30

    Abstract: A method for pseudo-randomly, without bias, selecting instructions for marking in a microprocessor. Responsive to reading an instruction from an instruction cache, an instruction tag associated with the instruction is compared against a pseudo-randomly generated value in a linear feedback shift register (LFSR). If the instruction tag matches the value in the LFSR, a mark bit, indicating the instruction is a marked instruction, is sent with the instruction to an execution unit. Responsive to an indication from the performance monitor, the value in the LFSR is incremented prior to selecting a next instruction to mark. If the value equals a predetermined prime number of increments, the value is reset to all ones to avoid any harmonics with the code stream being executed. Upon receiving the marked instruction, the execution unit combines the mark bit with a selected event and reports the marked event to the performance monitor.

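    The tag-matching idea can be sketched with a small software LFSR. Several details below are assumptions, not facts from the patent: the 4-bit width, the feedback polynomial x^4 + x^3 + 1, the reset interval of 13 steps, and the reading of "a predetermined prime number of increments" as a step count. The class name `MarkSelector` is hypothetical.

```python
class MarkSelector:
    """Pick instructions to mark by comparing their tags with an LFSR value."""

    WIDTH = 4                      # assumed register width
    ALL_ONES = (1 << WIDTH) - 1    # reset value named in the abstract

    def __init__(self, reset_after=13):
        self.value = self.ALL_ONES
        self.steps = 0
        self.reset_after = reset_after  # assumed prime step count

    def advance(self):
        """Step the LFSR once, e.g. when the performance monitor asks for it."""
        self.steps += 1
        if self.steps == self.reset_after:
            # Resetting after a prime number of steps is meant to avoid
            # locking onto periodic patterns in the code stream.
            self.value = self.ALL_ONES
            self.steps = 0
            return
        # Fibonacci LFSR for x^4 + x^3 + 1: XOR the two lowest bits,
        # shift right, and feed the result into the top bit.
        bit = (self.value ^ (self.value >> 1)) & 1
        self.value = (self.value >> 1) | (bit << (self.WIDTH - 1))

    def should_mark(self, instruction_tag):
        """True when the low bits of the tag equal the current LFSR value."""
        return (instruction_tag & self.ALL_ONES) == self.value
```

    A caller would test `should_mark(tag)` for each instruction read from the instruction cache and call `advance()` once the performance monitor signals that a new instruction should be selected.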

    APPARATUS, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR SEAMLESSLY INTEGRATING THERMAL EVENT INFORMATION DATA WITH PERFORMANCE MONITOR DATA
    8.
    Invention patent application
    Status: no longer in force

    Publication number: US20080244330A1

    Publication date: 2008-10-02

    Application number: US12131070

    Filing date: 2008-05-31

    IPC classification: G06F11/30

    CPC classification: G06F11/00

    Abstract: An apparatus, system and method of integrating performance monitor data with thermal event information are provided. A thermal event, in this case, occurs when the temperature of a chip in which a processor is embedded exceeds a user-configurable value while the processor is processing instructions and/or using storage devices that are being monitored. When the thermal event occurs, the temperature of the chip, along with the performance monitor data, is stored for future use, including performance and diagnostic analyses.


    Method system and apparatus for instruction tracing with out of order processors
    9.
    Granted invention patent
    Status: no longer in force

    Publication number: US06694427B1

    Publication date: 2004-02-17

    Application number: US09552859

    Filing date: 2000-04-20

    IPC classification: G06F9/00

    Abstract: A method, system and apparatus for instruction tracing with out-of-order speculative processors. With the present invention, information corresponding to the state of an instruction cache and a data cache is stored in a trace storage device along with information corresponding to instructions fetched by the processor. When a cache load is necessary, updated cache information is stored in the trace storage device. Thereby, the state of the cache at all times during fetching of instructions may be known from the information stored in the trace storage device. Additionally, the particular instructions fetched are known from the fetched instructions information stored in the trace storage device. Hence the instruction stream may be reconstructed from the information stored in the trace storage device.

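    The reconstruction step can be pictured as replaying the trace against a model of the cache. The sketch below is a simplified illustration: the record shapes `("load", base_address, words)` and `("fetch", address)`, the word-granular addressing, and the function name `reconstruct_stream` are all assumptions made for this example and are not taken from the patent.

```python
def reconstruct_stream(trace):
    """Rebuild the fetched instruction stream from a recorded trace.

    Assumed record formats (hypothetical):
      ("load", base_address, words)  cache contents written at a cache load
      ("fetch", address)             an instruction fetch by the processor
    """
    cache = {}   # address -> instruction word currently held in the cache
    stream = []
    for record in trace:
        kind = record[0]
        if kind == "load":
            _, base_address, words = record
            # Apply the updated cache information stored in the trace.
            for offset, word in enumerate(words):
                cache[base_address + offset] = word
        elif kind == "fetch":
            _, address = record
            # The cache state at this point in the trace identifies
            # which instruction was actually fetched.
            stream.append(cache[address])
        else:
            raise ValueError(f"unknown record type: {kind!r}")
    return stream
```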

    Autonomic Hotspot Profiling Using Paired Performance Sampling
    10.
    Invention patent application
    Status: in force

    Publication number: US20140059334A1

    Publication date: 2014-02-27

    Application number: US14067212

    Filing date: 2013-10-30

    IPC classification: G06F9/38

    Abstract: A processor performance profiler is enabled to identify specific instructions causing performance issues within a program being executed by a microprocessor through random sampling to find the worst-case offenders of a particular event type such as a cache miss or a branch mis-prediction. Tracking all instructions causing a particular event generates large data logs, creates performance penalties, and makes code analysis more difficult. However, identifying and tracking the worst offenders within a random sample of events, without having to hash all events, results in smaller memory requirements for the performance profiler, lower performance impact while profiling, and less complexity in analyzing the program to identify major performance issues, which, in turn, enables better optimization of the program in shorter developer time.

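    One way to picture sampling-based hotspot detection is sketched below. This is an assumed interpretation, not the claimed method: a randomly chosen event supplies a candidate instruction address, and the profiler then counts how often that address recurs in the events that follow. The function name `find_hotspots` and the parameters `rounds`, `window` and `seed` are hypothetical.

```python
import random


def find_hotspots(event_addresses, rounds, window, seed=0):
    """Rank sampled instruction addresses by how often they recur.

    event_addresses: instruction address of each event (for example each
                     cache miss), in program order.
    Only sampled candidates are stored, so memory use is bounded by
    `rounds` rather than by the number of distinct addresses.
    """
    if len(event_addresses) <= window:
        raise ValueError("need more events than the window size")
    rng = random.Random(seed)  # seeded so that runs are repeatable
    worst = {}
    for _ in range(rounds):
        start = rng.randrange(len(event_addresses) - window)
        candidate = event_addresses[start]
        # Count repeats of the candidate among the next `window` events.
        following = event_addresses[start + 1 : start + 1 + window]
        hits = following.count(candidate)
        worst[candidate] = max(worst.get(candidate, 0), hits)
    return sorted(worst.items(), key=lambda item: item[1], reverse=True)
```

    An address that causes most of the events is both likely to be sampled and likely to recur within the window, so it rises to the top of the list without any table covering every event.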