Processor with prefetch function
    21.
    Invention application
    Processor with prefetch function (pending, published)

    Publication No.: US20090106499A1

    Publication date: 2009-04-23

    Application No.: US12071022

    Filing date: 2008-02-14

    IPC class: G06F12/08

    Abstract: Non-speculatively prefetched data is prevented from being discarded from a cache memory before being accessed. In a cache memory including a cache control unit for reading data from a main memory into the cache memory and registering the data in the cache memory upon reception of a fill request from a processor, and for accessing the data in the cache memory upon reception of a memory instruction from the processor, a cache line of the cache memory includes a registration information storage unit for storing information indicating whether the registered data was written into the cache line in response to the fill request and whether the registered data has been accessed by a memory instruction. The cache control unit sets the information in the registration information storage unit when performing a prefetch based on the fill request, and resets the information when accessing the cache line based on the memory instruction.
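A minimal software model of the registration-bit scheme the abstract describes (all identifiers here are ours, not the patent's): a line filled by a prefetch is marked, the mark is cleared on first access, and marked lines are skipped as eviction victims while already-accessed candidates exist.

```python
class CacheLine:
    def __init__(self, tag, data, prefetched):
        self.tag = tag
        self.data = data
        self.prefetched = prefetched  # registration-information bit

class Cache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = []  # oldest first (FIFO replacement for simplicity)

    def fill(self, tag, data):
        """Register data brought in by a fill (prefetch) request."""
        if len(self.lines) >= self.capacity:
            self._evict()
        self.lines.append(CacheLine(tag, data, prefetched=True))

    def access(self, tag):
        """Memory instruction: clears the registration bit on a hit."""
        for line in self.lines:
            if line.tag == tag:
                line.prefetched = False  # reset: data has been consumed
                return line.data
        return None  # miss

    def _evict(self):
        # Prefer to discard a line that has already been accessed.
        for i, line in enumerate(self.lines):
            if not line.prefetched:
                del self.lines[i]
                return
        del self.lines[0]  # every line is still awaiting its first access
```

With a two-line cache, filling A and B, touching A, then filling C evicts A (already consumed) while the untouched prefetched line B survives, which is the behavior the registration bit exists to guarantee.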


    Computing system and control method
    22.
    Granted patent
    Computing system and control method (expired)

    Publication No.: US07293092B2

    Publication date: 2007-11-06

    Application No.: US10341505

    Filing date: 2003-01-14

    Applicant: Naonobu Sukegawa

    Inventor: Naonobu Sukegawa

    IPC class: G06F15/173

    CPC class: G06F9/4843

    Abstract: A parallel or grid computing system that has a plurality of nodes and achieves job scheduling for the nodes with a view toward system efficiency optimization. The parallel or grid computing system has a plurality of nodes for transmitting and receiving data and a communication path for exchanging data among the nodes, each node being either a transmitting node for transmitting data or a receiving node for processing a job dependent on transmitted data. It further has a time measuring means for measuring the time interval between the instant at which data is called for by a job and the instant at which the data is transmitted from a transmitting node to a receiving node, a time counting means for adding up the measured wait-time data for each job, and a job scheduling means for determining the priority of jobs in accordance with the counted wait time and for scheduling the jobs.
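An illustrative sketch of the wait-time-driven priority described above (names and structure are our invention): the measured interval between a job requesting data and the data arriving is accumulated per job, and jobs with the largest accumulated wait are scheduled first.

```python
from collections import defaultdict

class WaitTimeScheduler:
    def __init__(self):
        self.total_wait = defaultdict(float)  # per-job accumulated wait time

    def record_wait(self, job_id, requested_at, arrived_at):
        """Time-measuring + time-counting means: add one measured interval."""
        self.total_wait[job_id] += arrived_at - requested_at

    def schedule(self, job_ids):
        """Job-scheduling means: largest accumulated wait runs first."""
        return sorted(job_ids, key=lambda j: self.total_wait[j], reverse=True)
```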


    Multiprocessor system
    23.
    Invention application
    Multiprocessor system (in force)

    Publication No.: US20050102477A1

    Publication date: 2005-05-12

    Application No.: US10886036

    Filing date: 2004-07-08

    Applicant: Naonobu Sukegawa

    Inventor: Naonobu Sukegawa

    Abstract: A splittable/connectible bus 140 and a network 1000 for transmitting coherence transactions between CPUs are provided between the CPUs, and a directory 160 and a group setup register 170 for storing bus-splitting information are provided in a directory control circuit 150 that controls cache invalidation. The bus is dynamically set to a split or connected state to fit the particular execution form of a job, and the directory control circuit uses the directory to manage all inter-CPU coherence control sequences in response to that setting while, in accordance with information in the group setup register, omitting cache coherence control between dynamically bus-connected CPUs and conducting only bus-split CPU-to-CPU cache coherence control through the network. Thus, decreases in performance scalability due to the inter-CPU coherence-processing overhead are relieved in a system that has multiple CPUs and guarantees inter-CPU cache coherence by hardware.
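A rough model of the group-setup-register check (our naming throughout): CPUs that share a dynamically connected bus can snoop it directly, so the directory sends explicit invalidations only to sharers outside the writer's bus group.

```python
class DirectoryControl:
    def __init__(self, group_of):
        self.group_of = group_of   # group setup register: cpu -> bus group
        self.sharers = {}          # directory: address -> set of sharing CPUs

    def read(self, cpu, addr):
        """Record the reader in the directory."""
        self.sharers.setdefault(addr, set()).add(cpu)

    def write(self, cpu, addr):
        """Return the CPUs that need a network invalidation message.

        CPUs in the writer's own bus group are omitted: the connected
        bus carries their coherence traffic directly.
        """
        targets = {c for c in self.sharers.get(addr, set())
                   if c != cpu and self.group_of[c] != self.group_of[cpu]}
        self.sharers[addr] = {cpu}  # writer becomes the sole owner
        return targets
```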


    Processing instructions up to load instruction after executing sync flag monitor instruction during plural processor shared memory store/load access synchronization
    24.
    Granted patent
    Processing instructions up to load instruction after executing sync flag monitor instruction during plural processor shared memory store/load access synchronization (expired)

    Publication No.: US5968135A

    Publication date: 1999-10-19

    Application No.: US972539

    Filing date: 1997-11-18

    CPC class: G06F9/52

    Abstract: An information processing system is connected to a common storage and executes programs by use of processors. The system includes a common storage and a plurality of processors connected to the common storage. Each processor executes an instruction to store data into the common storage and an instruction to load data from the common storage into a cache storage. Each processor includes a communication controller that, for attaining synchronization of instruction execution among the plurality of processors, sends synchronization completion information and receives synchronization information from another processor; an instruction executing section that detects a specified change of a flag at a specified location in the common storage by executing a Monitor instruction included in a program in response to synchronization information from the communication controller; and an execution controller that executes the instructions subsequent to the Monitor instruction, exclusive of a Load instruction that loads data into the cache storage, until the change of the flag is detected by the executing section. The processor allows the instruction for loading data from the common storage into the cache storage to be executed after the flag detection, and the execution controller may include an inhibit resetting circuit that issues an inhibit control signal to terminate the instruction send-out inhibiting action of the instruction inhibit circuit according to input from a service processor.
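A minimal software analogue of the store/monitor/load sequence (this illustrates the synchronization pattern, not the hardware mechanism itself): one processor stores data into shared storage and then sets a synchronization flag; the other executes a monitor-style wait on the flag and issues its load only after the flag change is detected.

```python
import threading

shared = {"flag": 0, "data": None}
cond = threading.Condition()

def producer():
    with cond:
        shared["data"] = 42      # Store into the common storage
        shared["flag"] = 1       # Store the synchronization flag
        cond.notify_all()

def monitor_then_load(out):
    with cond:
        while shared["flag"] == 0:   # Monitor: wait for the flag change
            cond.wait()
        out.append(shared["data"])   # Load runs only after flag detection

result = []
t = threading.Thread(target=monitor_then_load, args=(result,))
t.start()
producer()
t.join()
```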


    Optimum code generation method and compiler device for multiprocessor
    25.
    Granted patent
    Optimum code generation method and compiler device for multiprocessor (in force)

    Publication No.: US08296746B2

    Publication date: 2012-10-23

    Application No.: US12068421

    Filing date: 2008-02-06

    IPC class: G06F9/45

    CPC class: G06F8/452

    Abstract: A method of generating optimum parallel code from a source code for a computer system composed of plural processors that share a cache memory or a main memory is provided. Preset code is read, and operation amounts and process contents are analyzed while distinguishing dependence and independence among the processes in the code. Then the amount of data reused among processes is analyzed, and the amount of data that accesses the main memory is analyzed. Further, upon reception of a parallel-code generation policy inputted by a user, the processes of the code are divided and, while estimating an execution cycle from the operation amount and process contents, the cache use of the reused data, and the main-memory access data amount, the parallelization method with which the execution cycle becomes shortest is executed.
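A toy cost model in the spirit of the abstract (function names, cost constants, and the max-of-two-bottlenecks formula are all our assumptions): each candidate split gets an execution-cycle estimate from its per-processor operation count and its main-memory traffic, and the split with the smallest estimate wins.

```python
def estimate_cycles(ops, mem_bytes, n_procs,
                    cycles_per_op=1.0, cycles_per_byte=4.0):
    """Estimate execution cycles for one parallelization candidate."""
    compute = ops * cycles_per_op / n_procs  # work divides across processors
    memory = mem_bytes * cycles_per_byte     # shared main-memory bottleneck
    return max(compute, memory)              # slower of the two limits

def best_parallelization(candidates):
    """candidates: list of (name, ops, mem_bytes, n_procs) tuples."""
    return min(candidates,
               key=lambda c: estimate_cycles(c[1], c[2], c[3]))[0]
```

For example, a split that keeps reused data in cache (less main-memory traffic) beats one with the same operation count but four times the memory traffic.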


    Method and computer for reducing power consumption of a memory
    26.
    Granted patent
    Method and computer for reducing power consumption of a memory (in force)

    Publication No.: US08108629B2

    Publication date: 2012-01-31

    Application No.: US11707114

    Filing date: 2007-02-16

    IPC class: G06F12/00 G06F13/00 G06F13/28

    Abstract: Provided is a method of managing the memory in a computer that includes a processor and a memory storing information referred to by the processor. The memory includes a plurality of memory banks whose power supplies are independently controlled, and each memory bank includes a plurality of physical pages. The method includes collecting physical pages having the same degree of use frequency in the same memory bank, selecting, on the basis of use frequency, the memory bank whose power supply is to be controlled, and controlling the power supply for the selected memory bank.
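A sketch of the grouping step (our construction; the page sizes, thresholds, and function names are invented for illustration): sort physical pages by use frequency and pack pages of similar frequency into the same bank, so banks holding only cold pages become power-down candidates.

```python
def assign_pages_to_banks(page_freq, pages_per_bank):
    """page_freq: {page_id: use frequency}. Returns banks as page lists."""
    ordered = sorted(page_freq, key=page_freq.get, reverse=True)
    return [ordered[i:i + pages_per_bank]
            for i in range(0, len(ordered), pages_per_bank)]

def banks_to_power_down(banks, page_freq, threshold=0):
    """A bank is a power-down candidate if every page in it is cold."""
    return [i for i, bank in enumerate(banks)
            if all(page_freq[p] <= threshold for p in bank)]
```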


    Method of power-aware job management and computer system
    27.
    Granted patent
    Method of power-aware job management and computer system (in force)

    Publication No.: US07958508B2

    Publication date: 2011-06-07

    Application No.: US12068086

    Filing date: 2008-02-01

    IPC class: G06F9/46 G06F1/00

    Abstract: Provided is a method used in a computer system that includes at least one host computer, for managing jobs to be executed by the host computer and the power supply of the host computer. The method includes the procedures of: receiving a job; storing the received job; scheduling an execution plan for the stored job; determining, based on the execution plan of the job, a timing at which to execute power control of the host computer; determining, when the determined timing is reached, the host computer on which to execute the power control; controlling the power supply of the determined host computer; and executing the scheduled job.
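A hypothetical illustration of the timing decision (the boot lead time and function names are assumptions, not from the patent): given each job's planned start time on a host, power the host on a fixed lead time before the earliest job that needs it.

```python
BOOT_LEAD = 60  # seconds assumed for a host to power on and become ready

def power_on_times(execution_plan):
    """execution_plan: list of (host, job_start_time) pairs.

    Returns {host: power_on_time}, the latest moment each host can be
    switched on and still be ready for its earliest scheduled job.
    """
    t_on = {}
    for host, start in execution_plan:
        candidate = start - BOOT_LEAD
        t_on[host] = min(t_on.get(host, candidate), candidate)
    return t_on
```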


    Computer system and control method for controlling processor execution of a prefetch command
    28.
    Granted patent
    Computer system and control method for controlling processor execution of a prefetch command (in force)

    Publication No.: US07895399B2

    Publication date: 2011-02-22

    Application No.: US11705410

    Filing date: 2007-02-13

    IPC class: G06F12/00

    CPC class: G06F12/0862 G06F2212/6028

    Abstract: A processor reads a program, including a prefetch command and a load command, and data from a main memory, and executes the program. The processor includes: a processor core that executes the program; an L2 cache that stores data from the main memory for each predetermined unit of data storage; and a prefetch unit that pre-reads data into the L2 cache from the main memory on the basis of a request for prefetch from the processor core. The prefetch unit includes: an L2 cache management table with an area in which a storage state is held for each position in the unit of data storage of the L2 cache and an area in which a request for prefetch is reserved; and a prefetch control unit that instructs the L2 cache to perform the reserved request for prefetch or the request for prefetch from the processor core.
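A simplified model of the reservation area (identifiers and the two-state slot model are ours): a prefetch request that targets a cache slot still being filled is reserved in the management table instead of being dropped, and is reissued once the slot becomes free.

```python
FREE, FILLING = "free", "filling"

class PrefetchUnit:
    def __init__(self, n_slots):
        self.state = [FREE] * n_slots  # L2 cache management table: slot states
        self.reserved = []             # reserved prefetch requests

    def request(self, slot, addr):
        """Issue a prefetch, or reserve it if the slot is busy."""
        if self.state[slot] == FILLING:
            self.reserved.append((slot, addr))  # reserve instead of dropping
            return "reserved"
        self.state[slot] = FILLING
        return "issued"

    def fill_done(self, slot):
        """Fill completed: replay any prefetch reserved for this slot."""
        self.state[slot] = FREE
        for i, (s, addr) in enumerate(self.reserved):
            if s == slot:
                del self.reserved[i]
                return self.request(slot, addr)
        return None
```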


    Method and program for generating execution code for performing parallel processing
    29.
    Granted patent
    Method and program for generating execution code for performing parallel processing (in force)

    Publication No.: US07739530B2

    Publication date: 2010-06-15

    Application No.: US11707146

    Filing date: 2007-02-16

    IPC class: G06F1/26 G06F1/22

    Abstract: Provided is a method of reliably reducing the power consumption of a computer while promoting prompt compilation of a source code and execution of the output code. The method includes the steps of: reading preset code and analyzing, based on the code, the amount of CPU operations and the amount of accesses to the cache memory; obtaining the execution rate of the CPU and the access rate with respect to the cache memory from the operation amount and the access amount; determining, based on the code, an area in which the access rate with respect to the cache memory is higher than the execution rate of the CPU; adding to that area a code that enables the power-consumption reduction function; and generating an execution code executable on the computer based on the code.
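A toy version of that decision (the rate representation and annotation strings are invented for illustration): a region whose cache-access rate exceeds its CPU execution rate is treated as memory-bound, so the generated code marks it for the power-consumption reduction function.

```python
def classify_regions(regions):
    """regions: list of (name, cpu_execution_rate, cache_access_rate).

    Returns each region tagged with the code-generation decision: enable
    the power-reduction function where memory access dominates.
    """
    annotated = []
    for name, cpu_rate, cache_rate in regions:
        if cache_rate > cpu_rate:
            annotated.append((name, "enable_power_reduction"))
        else:
            annotated.append((name, "full_speed"))
    return annotated
```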


    Processor having a cache memory which is comprised of a plurality of large scale integration
    30.
    Invention application
    Processor having a cache memory which is comprised of a plurality of large scale integration (in force)

    Publication No.: US20090172288A1

    Publication date: 2009-07-02

    Application No.: US12068700

    Filing date: 2008-02-11

    Applicant: Naonobu Sukegawa

    Inventor: Naonobu Sukegawa

    IPC class: G06F12/08

    Abstract: To provide an easy way to constitute a processor from a plurality of LSIs, the processor includes: a first LSI containing a processor; second LSIs each having a cache memory; and information transmission paths connecting the first LSI to a plurality of the second LSIs. The first LSI contains an address information issuing unit that broadcasts address information of data to the second LSIs via the information transmission paths. Each second LSI includes: a partial address information storing unit that stores a part of the address information; a partial data storing unit that stores data associated with the address information; and a comparison unit that compares the broadcast address information with the address information stored in the partial address information storing unit to judge whether a cache hit occurs. The comparison units of the plurality of the second LSIs are connected to the information transmission paths.
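A sketch of the broadcast-and-compare step (the interleaving rule and all names are our assumptions): the processor LSI broadcasts an address, each cache LSI holds only its own part of the address space and compares locally, and a hit on any LSI answers the request.

```python
class CacheLSI:
    def __init__(self, lsi_id, n_lsis):
        self.lsi_id = lsi_id
        self.n_lsis = n_lsis
        self.tags = {}  # partial address/data storing units

    def owns(self, addr):
        # Simple address interleaving across the cache LSIs (assumed)
        return addr % self.n_lsis == self.lsi_id

    def store(self, addr, data):
        if self.owns(addr):
            self.tags[addr] = data

    def compare(self, addr):
        """Comparison unit: respond only on a local hit."""
        return self.tags.get(addr) if self.owns(addr) else None

def broadcast_lookup(lsis, addr):
    """Address information issuing unit: broadcast to every cache LSI."""
    for lsi in lsis:
        hit = lsi.compare(addr)
        if hit is not None:
            return hit
    return None  # cache miss on every LSI
```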
