Determining each stall reason for each stalled instruction within a group of instructions during a pipeline stall
    1.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08635436B2

    Publication date: 2014-01-21

    Application No.: US13097284

    Filing date: 2011-04-29

    IPC classification: G06F11/30

    Abstract: During a pipeline stall in an out-of-order processor, until a next to complete instruction group completes, a monitoring unit receives, from a completion unit of a processor, a next to finish indicator indicating the finish of an oldest previously unfinished instruction from among a plurality of instructions of a next to complete instruction group. The monitoring unit receives, from a plurality of functional units of the processor, a plurality of finish reports including completion reasons for a plurality of separate instructions. The monitoring unit determines at least one stall reason from among multiple stall reasons for the oldest instruction from a selection of completion reasons from a selection of finish reports aligned with the next to finish indicator from among the plurality of finish reports. Once the monitoring unit receives a complete indicator from the completion unit, indicating the completion of the next to complete instruction group, the monitoring unit stores each determined stall reason aligned with each next to finish indicator in memory.
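The flow in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: "aligned with the next to finish indicator" is modeled as a finish report carrying the same instruction tag as the indicator, and the class name `StallMonitor`, its methods, and the reason strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StallMonitor:
    """Toy model of the monitoring unit (all names are illustrative)."""
    pending: List[str] = field(default_factory=list)        # reasons for the group still stalling
    memory: List[List[str]] = field(default_factory=list)   # reasons stored per completed group

    def on_cycle(self, ntf_tag: int, finish_reports: Dict[int, str]) -> None:
        """ntf_tag: instruction named by the next-to-finish indicator.
        finish_reports: instruction tag -> completion reason from functional units."""
        # Assumption: a report is "aligned" when its tag matches the indicator.
        reason = finish_reports.get(ntf_tag)
        if reason is not None:
            self.pending.append(reason)

    def on_group_complete(self) -> None:
        """Complete indicator received: store the collected reasons in memory."""
        self.memory.append(self.pending)
        self.pending = []
```

Reports for instructions other than the oldest unfinished one are ignored in this model, so only reasons that actually held up completion are recorded for the group.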

    DELAY IDENTIFICATION IN DATA PROCESSING SYSTEMS
    2.
    Invention patent application
    Status: In force

    Publication No.: US20130151816A1

    Publication date: 2013-06-13

    Application No.: US13314052

    Filing date: 2011-12-07

    IPC classification: G06F9/30 G06F9/312

    Abstract: Methods, systems, and computer program products may provide delay identification in data processing systems. An apparatus may include a delay-identification unit having a delay counter, a threshold register, a delay register, and a delay detector. The delay detector may be configured to start the delay counter in response to detecting that one group of instructions is delayed, and stop the delay counter in response to detecting that the one group of instructions is no longer delayed. The delay detector may additionally be configured to compare the number of cycles counted by the delay counter with a threshold number of cycles in the threshold register, and store at least one effective address of one of the instructions of the one group of instructions when the number of cycles counted by the delay counter is greater than the threshold number of cycles stored in the threshold register.
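The counter-and-threshold behaviour described in this abstract can be sketched in software. This is a minimal model under assumptions, not the claimed apparatus: the class `DelayIdentificationUnit`, the per-cycle `tick` interface, and the choice to pass the effective address in on each cycle are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DelayIdentificationUnit:
    """Toy model of the delay counter, threshold register, and delay register."""
    threshold_cycles: int                                     # threshold register
    delay_counter: int = 0                                    # cycles counted for the latest delay
    counting: bool = False                                    # True while the counter runs
    delay_register: List[int] = field(default_factory=list)  # saved effective addresses

    def tick(self, group_delayed: bool, effective_address: int) -> None:
        """Advance one cycle for the tracked instruction group."""
        if group_delayed:
            if not self.counting:
                # Delay detected: start the counter from zero.
                self.counting = True
                self.delay_counter = 0
            self.delay_counter += 1
        elif self.counting:
            # Group no longer delayed: stop the counter and compare.
            self.counting = False
            if self.delay_counter > self.threshold_cycles:
                self.delay_register.append(effective_address)
```

The comparison is strictly "greater than", matching the abstract's wording, so a delay of exactly the threshold length is not recorded.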

    IDENTIFYING LOAD-HIT-STORE CONFLICTS

    Publication No.: US20140075158A1

    Publication date: 2014-03-13

    Application No.: US13611006

    Filing date: 2012-09-12

    IPC classification: G06F9/312

    CPC classification: G06F9/44552 G06F9/3834

    Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter.
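The bookkeeping in this abstract can be sketched as follows. This is an illustrative model under assumptions, not the patented mechanism: a conflict is approximated as a load reading a data address that still has an in-flight store, the tagging of the stored data set is left out, and `LhsTracker`, `observe_load`, and the `pending_stores` map (data address to store instruction address) are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LhsTracker:
    """Toy model of the three special purpose registers and the conflict counter."""
    spr_load_pc: Optional[int] = None    # first SPR: address of the tagged load instruction
    spr_data_addr: Optional[int] = None  # second SPR: address of the data set
    spr_store_pc: Optional[int] = None   # third SPR: address of the conflicting store instruction
    conflict_counter: int = 0

    def observe_load(self, load_pc: int, data_addr: int,
                     pending_stores: Dict[int, int]) -> bool:
        """Record a tagged load; return True if it hits a pending store."""
        self.spr_load_pc = load_pc
        # Assumption: a store still in flight to the same data address is a conflict.
        store_pc = pending_stores.get(data_addr)
        if store_pc is None:
            return False
        self.spr_data_addr = data_addr
        self.spr_store_pc = store_pc
        self.conflict_counter += 1
        return True
```

After a conflict, the three registers together identify the load, the store, and the data they share, which is the pairing the abstract sets out to expose.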

    HARDWARE ASSIST THREAD FOR DYNAMIC PERFORMANCE PROFILING
    4.
    Invention patent application
    Status: Lapsed

    Publication No.: US20110302395A1

    Publication date: 2011-12-08

    Application No.: US12796124

    Filing date: 2010-06-08

    IPC classification: G06F9/30

    Abstract: A method and data processing system for managing running of instructions in a program. A processor of the data processing system receives a monitoring instruction of a monitoring unit. The processor determines if at least one secondary thread of a set of secondary threads is available for use as an assist thread. The processor selects the at least one secondary thread from the set of secondary threads to become the assist thread in response to a determination that the at least one secondary thread of the set of secondary threads is available for use as an assist thread. The processor changes profiling of running of instructions in the program from the main thread to the assist thread.
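The selection step in this abstract can be sketched in a few lines. This is a hedged illustration only: "available" is modeled as "not busy", the first idle thread by id is chosen (the abstract does not specify a policy), and the function names `select_assist_thread` and `assign_profiling` are hypothetical.

```python
from typing import Dict, Optional


def select_assist_thread(secondary_busy: Dict[int, bool]) -> Optional[int]:
    """Return the id of the first idle secondary thread, or None if all are busy."""
    for thread_id in sorted(secondary_busy):
        if not secondary_busy[thread_id]:
            return thread_id
    return None


def assign_profiling(main_thread: int, secondary_busy: Dict[int, bool]) -> int:
    """Return the thread that should run profiling after a monitoring instruction."""
    assist = select_assist_thread(secondary_busy)
    # Profiling moves to the assist thread only when one is available.
    return main_thread if assist is None else assist
```

For example, with secondary threads `{1: True, 2: False}` profiling moves to thread 2, while with every secondary thread busy it stays on the main thread.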

    TEMPORAL LOCALITY AWARE INSTRUCTION SAMPLING
    7.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20140075164A1

    Publication date: 2014-03-13

    Application No.: US13610958

    Filing date: 2012-09-12

    IPC classification: G06F9/30 G06F11/30

    Abstract: A method and system are disclosed for sampling instructions executing on a computer processor. A computer processor determines a number of times a specified event has occurred within a specified temporal window. The computer processor determines to mark an instruction to be executed for monitoring based on the number of times the specified event has occurred within the temporal window, and in response, the computer processor marks the instruction.
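The windowed counting in this abstract can be sketched as below. The abstract only says the marking decision is "based on" the event count, so the rule used here (mark when at least `min_events` events fall inside the window) is an assumption, as are the class name `WindowedSampler` and its parameters.

```python
from collections import deque


class WindowedSampler:
    """Toy model: count events in a sliding window of cycles and decide whether to mark."""

    def __init__(self, window_cycles: int, min_events: int) -> None:
        self.window_cycles = window_cycles  # length of the temporal window
        self.min_events = min_events        # hypothetical marking threshold
        self.event_cycles = deque()         # cycles at which the event occurred

    def record_event(self, cycle: int) -> None:
        """Note one occurrence of the specified event."""
        self.event_cycles.append(cycle)

    def should_mark(self, cycle: int) -> bool:
        """Decide whether the instruction issued at `cycle` is marked for monitoring."""
        # Drop events that have fallen out of the window ending at `cycle`.
        while self.event_cycles and self.event_cycles[0] <= cycle - self.window_cycles:
            self.event_cycles.popleft()
        return len(self.event_cycles) >= self.min_events
```

A deque keeps the eviction of stale events cheap, since cycles arrive in increasing order and only the oldest entries ever need to be dropped.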

    Identifying load-hit-store conflicts
    8.
    Granted invention patent
    Status: In force

    Publication No.: US09229745B2

    Publication date: 2016-01-05

    Application No.: US13611006

    Filing date: 2012-09-12

    IPC classification: G06F9/00 G06F9/445 G06F9/38

    CPC classification: G06F9/44552 G06F9/3834

    Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter.

    Workload performance projection for future information handling systems using microarchitecture dependent data
    10.
    Granted invention patent
    Status: In force

    Publication No.: US09135142B2

    Publication date: 2015-09-15

    Application No.: US12343482

    Filing date: 2008-12-24

    Abstract: A performance projection system includes a test IHS and a currently existing IHS. The performance projection system includes surrogate programs and user application software. The test IHS employs a memory that includes a virtual future IHS, currently existing IHS, surrogate programs, and user application software for determination of runtime and HW counter performance data. The user application software and surrogate programs execute on the currently existing IHS to provide designers with runtime data and HW counter or microarchitecture dependent data. Designers execute surrogate programs on the future IHS to provide runtime and HW counter data. Designers normalize and weight the runtime and HW counter data to provide a representative surrogate program for comparison to user application software performance on the future IHS. Using a scaling factor, designers may generate a projection of runtime performance for the user application software executing on the future IHS.
