Abstract:
A parallel processor system controls access to a distributed shared memory and to plural cache memories to prevent frequently-used local data from being flushed out of a cache memory. The parallel processor system includes a plurality of nodes each including a processor and a shared memory in a distributed shared memory arrangement, and a local-remote divided cache memory system, wherein local data and remote data are controlled separately. Each local-remote divided cache memory system includes a local data area, a remote data area, and a cache memory controller by which either the local data area or the remote data area is accessed according to the contents of an access request.
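The separation of local and remote data described above can be illustrated with a small model. This is only a sketch under stated assumptions: the class name DividedCache, the LRU replacement policy, the area sizes, and the rule that the home node of an address is addr // node_mem_size are all hypothetical and are not taken from the abstract.

```python
from collections import OrderedDict


class DividedCache:
    """Toy model of a cache split into a local area and a remote area."""

    def __init__(self, node_id, node_mem_size, local_lines, remote_lines):
        self.node_id = node_id
        self.node_mem_size = node_mem_size  # assumed: contiguous memory per node
        self.capacity = {"local": local_lines, "remote": remote_lines}
        self.areas = {"local": OrderedDict(), "remote": OrderedDict()}

    def area_for(self, addr):
        # Controller decision: the home node of the address selects the area.
        home = addr // self.node_mem_size
        return "local" if home == self.node_id else "remote"

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        name = self.area_for(addr)
        area = self.areas[name]
        if addr in area:
            area.move_to_end(addr)        # refresh the LRU position
            return True
        if len(area) >= self.capacity[name]:
            area.popitem(last=False)      # evict the LRU line of this area only
        area[addr] = True
        return False


cache = DividedCache(node_id=0, node_mem_size=1000, local_lines=2, remote_lines=2)
cache.access(10)                  # local miss, line is filled
for addr in (1000, 2000, 3000):   # a stream of remote lines
    cache.access(addr)
local_still_cached = cache.access(10)
```

In this model the stream of remote accesses evicts only lines in the remote area, so the local line at address 10 still hits afterwards.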
Abstract:
To increase the usable memory capacity of a parallel processing computer system as a whole and to use the address space without waste, a variable-length Global/Local allocation field is provided in a fixed-length address. When the field is set to local, the address is used as an address in a local memory area to which the local processor refers. When the field is set to global, the remainder of the address consists of a variable-length logical processor number (which is converted into a physical processor number) and a variable-length offset address. Together these specify a global memory area belonging to one processor among the global areas of the memories of a group of processors, and that global memory area can be referred to by all the processors of the group. A memory access interface executes memory accesses to the local or global area of the memory of the local processor or to the global area of the memory of another processor.
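The address decoding described above might look roughly like the following. The abstract does not give the bit layout, so everything concrete here is an assumption made for illustration: a 32-bit address, a field layout of [G/L field | logical processor number | offset], an all-ones G/L field meaning "global", and a lookup table named logical_to_physical.

```python
ADDR_BITS = 32  # assumed fixed address length


def decode(addr, gl_bits, pe_bits, logical_to_physical):
    """Decode an address under the assumed layout [G/L | logical PE | offset]."""
    rest_bits = ADDR_BITS - gl_bits
    gl_field = addr >> rest_bits
    global_pattern = (1 << gl_bits) - 1       # assumption: all ones means "global"
    if gl_field != global_pattern:
        return ("local", addr)                # whole address refers to local memory
    offset_bits = rest_bits - pe_bits
    rest = addr & ((1 << rest_bits) - 1)
    logical_pe = rest >> offset_bits
    offset = rest & ((1 << offset_bits) - 1)
    # The logical processor number is converted into a physical one.
    return ("global", logical_to_physical[logical_pe], offset)


table = {5: 17}                               # hypothetical logical -> physical map
global_addr = (0b11 << 30) | (5 << 26) | 0x123
global_result = decode(global_addr, gl_bits=2, pe_bits=4, logical_to_physical=table)
local_result = decode(0x1000, gl_bits=2, pe_bits=4, logical_to_physical=table)
```

Because gl_bits and pe_bits are parameters, the same fixed-length address can give more bits to the offset when few processors share memory, or more bits to the processor number when many do.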
Abstract:
A main memory shared by plural processing units in a parallel computer system is composed of plural partial main memories. A directory for each data line of the main memory is generated after the data line has been cached in one of the processing units. The directory is held in one of the partial main memories in place of the data line and indicates a processing unit which has cached the data line. A status bit C provided for the data line is set. When a subsequent read request is issued for the data line, status bit C is checked and the directory is used to identify a processing unit that has cached the data line. The request is transferred to the identified processing unit, and the data line is transferred from that processing unit to the processing unit that issued the request. When a processing unit that has cached the data line replaces it, a check is made as to whether any other processing unit has cached the data line. If none has, the data line is written back into the partial main memory; otherwise it is not written back. Another status bit, RO, is also provided for each data line and indicates whether the data line is read-only. If a data line is read-only, generating the directory and storing it in the partial main memory are prohibited.
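The read and replacement flows described above can be walked through with a small simulation. It is a sketch only: the class names MemoryLine and Machine, the use of a Python set as the directory, and the choice of an arbitrary directory entry as the forwarding target are assumptions, and write handling is left out entirely.

```python
class MemoryLine:
    def __init__(self, data, read_only=False):
        self.content = data       # the data line, or a set of caching units if c_bit
        self.c_bit = False        # status bit C: content currently holds a directory
        self.ro_bit = read_only   # status bit RO: no directory is ever generated


class Machine:
    def __init__(self, lines):
        self.memory = lines       # addr -> MemoryLine
        self.caches = {}          # unit -> {addr: data}

    def read(self, unit, addr):
        line = self.memory[addr]
        cache = self.caches.setdefault(unit, {})
        if line.ro_bit:
            data = line.content                 # read-only: memory keeps the data
        elif line.c_bit:
            owner = next(iter(line.content))    # directory names a caching unit
            data = self.caches[owner][addr]     # cache-to-cache transfer
            line.content.add(unit)
        else:
            data = line.content
            line.content = {unit}               # directory replaces the data line
            line.c_bit = True
        cache[addr] = data
        return data

    def replace(self, unit, addr):
        data = self.caches[unit].pop(addr)
        line = self.memory[addr]
        if line.ro_bit:
            return
        line.content.discard(unit)
        if not line.content:                    # no other cached copy: write back
            line.content = data
            line.c_bit = False


machine = Machine({0x40: MemoryLine("payload")})
first = machine.read("P0", 0x40)       # served from memory, directory created
second = machine.read("P1", 0x40)      # served from P0's cache via the directory
machine.replace("P0", 0x40)            # P1 still caches the line: no write-back
c_after_first_replace = machine.memory[0x40].c_bit
machine.replace("P1", 0x40)            # last copy: data returns to memory
```

The point of the scheme is visible in the model: the directory costs no extra storage, because it occupies the memory slot whose data currently lives in some cache.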
Abstract:
Provided is a method used in a computer system which includes at least one host computer, for managing a job to be executed by the host computer and a power supply of the host computer. The method includes the procedures of: receiving the job; storing the received job; scheduling an execution plan for the stored job; determining, based on the execution plan of the job, a timing to execute power control of the host computer; determining a host computer on which to execute the power control when the determined timing is reached; controlling the power supply of the determined host computer; and executing the scheduled job.
Abstract:
A splittable/connectible bus 140 and a network 1000 for transmitting coherence transactions between CPUs are provided between the CPUs, and a directory 160 and a group setup register 170 for storing bus-splitting information are provided in a directory control circuit 150 that controls cache invalidation. The bus is dynamically set to a split or connected state to fit a particular execution form of a job, and the directory control circuit uses the directory to manage all inter-CPU coherence control sequences in response to that setting, while at the same time, in accordance with the information in the group setup register, omitting cache coherence control between dynamically bus-connected CPUs and conducting only the cache coherence control between bus-split CPUs through the network. Thus, the loss of performance scalability caused by inter-CPU coherence-processing overhead is mitigated in a system that has multiple CPUs and guarantees inter-CPU cache coherence in hardware.
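The filtering role of the group setup register can be sketched as follows. The function name, the representation of the register as a mapping from CPU number to bus segment, and the premise that CPUs on the same connected bus segment need no invalidation over the network are assumptions made for illustration; the abstract does not specify the register format.

```python
def network_invalidation_targets(writer, sharers, group_of):
    """Return the sharers that must be invalidated through the network.

    group_of models the group setup register: CPU id -> bus segment id.
    Sharers on the writer's own bus segment are skipped (assumed to be
    handled without network coherence transactions).
    """
    return sorted(
        cpu for cpu in sharers
        if cpu != writer and group_of[cpu] != group_of[writer]
    )


# Bus split into two segments: CPUs 0-1 on segment 0, CPUs 2-3 on segment 1.
split_register = {0: 0, 1: 0, 2: 1, 3: 1}
split_targets = network_invalidation_targets(0, {1, 2, 3}, split_register)

# Bus fully connected: every CPU is on the same segment.
connected_register = {0: 0, 1: 0, 2: 0, 3: 0}
connected_targets = network_invalidation_targets(0, {1, 2, 3}, connected_register)
```

With the bus split, only CPUs 2 and 3 receive invalidations over the network; with the bus fully connected, none do.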
Abstract:
A processor reads, from a main memory, data and a program including a prefetch command and a load command, and executes the program. The processor includes: a processor core that executes the program; an L2 cache that stores data from the main memory in predetermined units of data storage; and a prefetch unit that pre-reads the data into the L2 cache from the main memory on the basis of a request for prefetch from the processor core. The prefetch unit includes: an L2 cache management table including an area in which a storage state is held for each position in the unit of data storage of the L2 cache and an area in which a request for prefetch is reserved; and a prefetch control unit that instructs the L2 cache to perform the reserved request for prefetch or the request for prefetch from the processor core.
Abstract:
Interrupt processing generated in a processor for arithmetic operation is offloaded onto a system control processor, thereby reducing disturbance to the processor for arithmetic operation. A heterogeneous multiprocessor system includes: means which accepts an interrupt in each CPU; means which looks up the accepted interrupt in an interrupt destination management table to select an interrupt destination CPU; means which queues the accepted interrupt; means which generates an inter-CPU interrupt to the selected interrupt destination CPU; means which receives the inter-CPU interrupt in the interrupt source CPU, performs the interrupt processing of the interrupt source CPU, and generates the inter-CPU interrupt to the interrupt source CPU in the interrupt destination CPU; means which performs an interrupt end process; and means which performs the interrupt processing in its own CPU when the interrupt destination CPU selected as a result of the lookup in the interrupt destination management table is its own CPU.
Abstract:
A shared-main-memory multiprocessor is of a switch-connected type. The multiprocessor provides an instruction for outputting a synchronization transaction. When a CPU executes this instruction, the synchronization transaction is output to the main memory and the coherence controller after all the transactions of the preceding instructions have been output. On receiving the synchronization transaction, the main memory serializes the memory accesses and the coherence controller guarantees the completion of the cache coherence control. This makes it possible to serialize the memory accesses and guarantee the completion of the cache coherence control at the same time.
Abstract:
A memory system has a DRAM or synchronous DRAM as a memory unit. A memory controller, which controls the memory unit in response to memory access requests received from a memory access request generator, has: a row address buffer for storing a row address extracted from an issued memory access request while avoiding registration of the same row address at different positions; a pointer register for storing a pointer to the entry in the row address buffer that holds the row address; a correspondence detection circuit that detects whether the row addresses of issued access requests match by comparing the stored pointers; and a memory unit control circuit which successively issues to the DRAM the column addresses of plural requests whose row addresses match.
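The interplay of the row address buffer, the pointers, and the grouping of column addresses can be sketched in software. This is an illustrative model, not the controller itself: the names split, RowAddressBuffer and schedule, the 10-bit column field, the unbounded buffer, and the ACTIVATE/COLUMN command tuples are all assumptions, and precharge and timing are ignored.

```python
def split(addr, col_bits):
    """Split an address into (row, column) under an assumed row|column layout."""
    return addr >> col_bits, addr & ((1 << col_bits) - 1)


class RowAddressBuffer:
    def __init__(self):
        self.entries = []

    def register(self, row):
        """Return a pointer to the entry for row, never storing a row twice."""
        if row in self.entries:
            return self.entries.index(row)
        self.entries.append(row)
        return len(self.entries) - 1


def schedule(addresses, col_bits=10):
    buf = RowAddressBuffer()
    commands = []
    open_ptr = None                   # pointer of the currently open row
    for addr in addresses:
        row, col = split(addr, col_bits)
        ptr = buf.register(row)
        if ptr != open_ptr:           # comparing pointers stands in for comparing rows
            commands.append(("ACTIVATE", buf.entries[ptr]))
            open_ptr = ptr
        commands.append(("COLUMN", col))
    return buf, commands


requests = [(3 << 10) | 1, (3 << 10) | 2, (7 << 10) | 0, (3 << 10) | 5]
row_buffer, commands = schedule(requests)
```

Since each row address is registered only once, two requests target the same row exactly when their pointers are equal, so the comparison works on short pointers instead of full row addresses.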