Counting latencies of an instruction table flush, refill and instruction execution using a plurality of assigned counters
    1.
    Granted invention patent (status: lapsed)

    Publication No.: US06970999B2

    Publication date: 2005-11-29

    Application No.: US10210415

    Filing date: 2002-07-31

    IPC classification: G06F9/38 G06F9/44 G06F15/00

    Abstract: A method and system for analyzing cycles per instruction (CPI) performance in a processor. A completion table corresponds to the instructions in a group to be processed by the processor. An empty completion table indicates that some type of catastrophe has caused a table flush. While the table is empty, a performance monitoring counter (PMC), located in a performance monitoring unit (PMU) in the processor, counts the number of clock cycles for which the table is empty. Preferably, a separate PMC is used depending on the reason that the completion table is empty. A second PMC likewise counts the number of clock cycles spent refilling the empty completion table. A third PMC counts the number of clock cycles spent actually executing the instructions in the completion table. The information in the PMCs can be used to evaluate the true cause of degradation in CPI performance.

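    The counting scheme in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the per-cycle trace format, the function name `tally_cycles`, and the flush reasons `branch_mispredict` and `icache_miss` are all hypothetical and do not come from the patent.

```python
from collections import Counter


def tally_cycles(trace):
    """Tally clock cycles the way the abstract's PMCs would.

    `trace` is a hypothetical per-cycle log: one (state, reason) tuple
    per clock cycle, where state is "empty", "refill" or "execute".
    Empty-table cycles get one counter per flush reason; refill and
    execute cycles each get a single counter.
    """
    counters = Counter()
    for state, reason in trace:
        if state == "empty":
            counters[f"empty:{reason}"] += 1
        elif state in ("refill", "execute"):
            counters[state] += 1
        else:
            raise ValueError(f"unknown state: {state!r}")
    return counters


# Assumed example trace; the reason names are illustrative only.
trace = (
    [("empty", "branch_mispredict")] * 3
    + [("refill", None)] * 2
    + [("execute", None)] * 5
    + [("empty", "icache_miss")] * 4
)
counters = tally_cycles(trace)
total = sum(counters.values())
for name, cycles in sorted(counters.items()):
    print(f"{name}: {cycles} cycles ({cycles / total:.0%})")
```

    Splitting the empty-table cycles by reason is what lets the totals point at a specific cause of CPI degradation rather than a single aggregate stall figure.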

    Multiprocessor system
    2.
    Invention patent application (status: lapsed)

    Publication No.: US20050198441A1

    Publication date: 2005-09-08

    Application No.: US11044454

    Filing date: 2005-01-28

    Applicant: Masahiro Tokoro

    Inventor: Masahiro Tokoro

    IPC classification: G06F12/08 G06F12/00

    CPC classification: G06F12/0817

    Abstract: A shared memory multiprocessor is provided which includes a plurality of nodes connected to one another. Each node includes: a main memory for storing data; a cache memory for storing a copy of data obtained from the main memory; and a CPU for accessing the main memory and the cache memory and processing data. The node further includes a directory and a memory region group. The directory is made up of directory entries, each including one or more directory bits, each of which indicates whether the cache memory of another node stores a copy of a part of a memory region group of the main memory of this node. The memory region group consists of memory regions having the same memory address portion, including a cache index portion. Each node is assigned to one of the one or more directory bits.


    Multiprocessor system and method ensuring coherency between a main memory and a cache memory
    3.
    Granted invention patent (status: lapsed)

    Publication No.: US07613884B2

    Publication date: 2009-11-03

    Application No.: US11044454

    Filing date: 2005-01-28

    Applicant: Masahiro Tokoro

    Inventor: Masahiro Tokoro

    IPC classification: G06F13/00

    CPC classification: G06F12/0817

    Abstract: A directory of each node in a shared memory multiprocessor is made up of directory entries, each including one or more directory bits indicating whether the cache memory of another node stores a copy of a part of a memory region group of the main memory of one node. The memory region group includes memory regions having the same memory address portion, including a cache index portion. Each node is assigned one of the directory bits. When accessing the main memory, the node checks whether the directory bits of the directory entry corresponding to a memory region to be accessed are set to a predetermined value, and if one or more of the directory bits of the directory entry are set to the predetermined value, an access address is multicast or broadcast to other nodes to perform coherency control.

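    The directory lookup described in the abstract can be sketched as a bitmask per memory region group. This is a rough model under assumptions, not the claimed design: the 64-byte line size, the 1024-set cache, the use of exactly the cache index bits as the group key, and the names `group_index`, `Directory`, `record_copy` and `snoop_targets` are all hypothetical.

```python
CACHE_LINE_BITS = 6   # assumption: 64-byte cache lines
INDEX_BITS = 10       # assumption: 1024 cache sets


def group_index(address):
    """Return the address portion (here, just the cache index bits)
    shared by every memory region in one memory region group."""
    return (address >> CACHE_LINE_BITS) & ((1 << INDEX_BITS) - 1)


class Directory:
    """One node's directory: one entry per memory region group,
    with one directory bit per remote node."""

    def __init__(self):
        self.entries = {}  # group index -> bitmask of node ids

    def record_copy(self, address, node_id):
        """Set the bit of the node that cached a copy of this region."""
        idx = group_index(address)
        self.entries[idx] = self.entries.get(idx, 0) | (1 << node_id)

    def snoop_targets(self, address, num_nodes):
        """Nodes whose bit is set; an empty list means the access needs
        no coherency traffic."""
        bits = self.entries.get(group_index(address), 0)
        return [n for n in range(num_nodes) if bits & (1 << n)]


directory = Directory()
directory.record_copy(0x1040, node_id=2)
print(directory.snoop_targets(0x1040, num_nodes=4))   # region that was cached
print(directory.snoop_targets(0x11040, num_nodes=4))  # same group, other region
print(directory.snoop_targets(0x2000, num_nodes=4))   # different group
```

    Because one entry covers a whole group of regions, the check is conservative in this model: a set bit can trigger a multicast for a region the remote node never cached, in exchange for a much smaller directory.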

    Analyzing instruction completion delays in a processor
    5.
    Granted invention patent (status: lapsed)

    Publication No.: US07047398B2

    Publication date: 2006-05-16

    Application No.: US10210358

    Filing date: 2002-07-31

    IPC classification: G06F11/34

    Abstract: A method and system for identifying instruction completion delays for a group of instructions in a computer processor. Each instruction in the group has a status indicator that identifies what is preventing that instruction from completing execution. Examples of completion delays are cache misses, data dependencies, or simply the time required for an execution unit in the computer processor to process the instruction. As each instruction finishes executing, its associated status indicator is cleared to indicate that the instruction is no longer waiting to execute. The last instruction to execute is the one holding up completion of the entire group, and thus the cause of the completion delay of the last instruction is recorded as the cause of the completion delay for the entire group.

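    The attribution rule in the abstract (the last instruction to finish determines the group's delay cause) can be sketched in a few lines. This is an illustrative model only: the function name `group_delay_cause`, the instruction ids, and the dictionary representation of the status indicators are assumptions, not details from the patent.

```python
def group_delay_cause(status, finish_order):
    """Return the delay cause recorded for a whole instruction group.

    `status` maps a (hypothetical) instruction id to its status
    indicator, i.e. what is preventing it from completing.
    `finish_order` lists the ids in the order they finish executing.
    """
    pending = dict(status)  # indicators that are still set
    last_cause = None
    for instr in finish_order:
        # Clearing the indicator marks the instruction as finished;
        # remember the cause of whichever one was cleared most recently.
        last_cause = pending.pop(instr)
    if pending:
        raise ValueError(f"instructions never finished: {sorted(pending)}")
    return last_cause


status = {
    "i0": "execution_unit",
    "i1": "data_dependency",
    "i2": "cache_miss",
}
print(group_delay_cause(status, ["i0", "i1", "i2"]))
print(group_delay_cause(status, ["i2", "i0", "i1"]))
```

    The same group yields a different recorded cause depending on which instruction finishes last, which is the point of the scheme: only the delay that actually held up group completion is charged.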