Abstract:
According to a method of operating a data processing system, one or more monitoring parameter sets are established in a processing unit within the data processing system. The processing unit monitors, in hardware, execution of each of a plurality of schedulable software entities within the processing unit in accordance with a monitoring parameter set among the one or more monitoring parameter sets. The processing unit then reports to software executing in the data processing system utilization of hardware resources by each of the plurality of schedulable software entities. The hardware utilization information reported by the processing unit may be stored and utilized by software to schedulable execution of the schedulable software entities reported by the processing unit. The hardware utilization information may also be utilized to generate a classification of at least one executing schedulable software entity, which may be communicated to the processing unit to dynamically modify an allocation of hardware resources to the schedulable software entity.
Abstract:
A processor communication register (PCR) contained in each processor within a multiprocessor system provides enhanced processor communication. Each PCR stores identical processor communication information that is useful in pipelined or parallel multi-processing. Each processor has exclusive rights to store to a sector within each PCR and has continuous access to read the contents of its own PCR. Each processor updates its exclusive sector within all of the PCRs, instantly allowing all of the other processors to see the change within the PCR data, and bypassing the cache subsystem. Efficiency is enhanced within the multiprocessor system by providing processor communications to be immediately transferred into all processors without momentarily restricting access to the information or forcing all the processors to be continually contending for the same cache line, and thereby overwhelming the interconnect and memory system with an endless stream of load, store and invalidate commands.
Abstract:
A method and system are disclosed for running a manufacturing-level test program on an installed processor by interrupting processor execution of a non-test process. Periodic execution of the manufacturing-level test program allows the processor to continually self-test during normal function operation, in order to facilitate proper maintenance and function of the processor and a data processing system of which the processor is a part.
Abstract:
A data processing system includes a global promotion facility containing a plurality of promotion bit fields, an interconnect, and a plurality of processing units coupled to the global promotion facility and to the interconnect. A first processing unit includes an instruction sequencing unit, an execution unit that executes an acquisition instruction to acquire a particular promotion bit field within the global promotion facility, and a promotion awareness facility. In response to the first processing unit snooping a request by a second processing unit for the particular promotion bit field, the first processing unit records an association between the second processing unit and the particular promotion bit field in the global promotion facility. After the request and release of the particular promotion bit field by the first processing unit, the first processing unit checks the promotion awareness facility for an association for the particular promotion bit and responsive to the checking, pushes the particular promotion bit field to the second processing unit utilizing an unsolicited operation on the interconnect such that no additional request by the second processing unit is required.
Abstract:
A method of improving memory access for a computer system, by sending load requests to a lower level storage subsystem along with associated information pertaining to intended use of the requested information by the requesting processor, without using a high level load queue. Returning the requested information to the processor along with the associated use information allows the information to be placed immediately without using reload buffers. A register load bus separate from the cache load bus (and having a smaller granularity) is used to return the information. An upper level (Li) cache may then be imprecisely reloaded (the upper level cache can also be imprecisely reloaded with store instructions). The lower level (L2) cache can monitor L1 and L2 cache activity, which can be used to select a victim cache block in the L1 cache (based on the additional L2 information), or to select a victim cache block in the L2 cache (based on the additional L1 information). L2 control of the L1 directory also allows certain snoop requests to be resolved without waiting for L1 acknowledgement. The invention can be applied to, e.g., instruction, operand data and translation caches.
Abstract:
Disclosed is a method of processing instructions in a data processing system. An instruction sequence that includes a memory access instruction is received at a processor in program order. In response to receipt of the memory access instruction a memory access request and a barrier operation are created. The barrier operation is placed on an interconnect after the memory access request is issued to a memory system. After the barrier operation has completed, the memory access request is completed in program order. When the memory access request is a load request, the load request is speculatively issued if a barrier operation is pending. Data returned by the speculatively issued load request is only returned to a register or execution unit of the processor when an acknowledgment is received for the barrier operation.
Abstract:
A method for reserving memory buffers for receiving data prior to the actual movement of data on a data processing system. A naked write operation is generated that includes a destination address and an address of the processor generating the write operation. The naked write operation is then issued on the fabric of said data processing system without any accompanying data. The naked write operation is snooped by the memory controller associated with the destination address. The memory controller then provides a response that is sent to the processor. The response sent depends on whether the memory controller is able to allocate a buffer to the naked write operation. When the memory controller is able to allocate a buffer to the naked write operation, the memory controller issues a Null response, which triggers a read operation that sends the corresponding data to the buffer at a later time.
Abstract:
A move engine and operating system transparently reconfigure physical memory to accomplish addition, subtraction, or replacement of a memory module. The operating system stores FROM and TO real addresses in unique fields in memory that are used to virtualize the physical address of the memory module being reconfigured and provide the reconfiguration in real-time through the use of hardware functionality and not software. Using the FROM and TO real addresses to select a source and a target, the move engine copies the contents of the memory module to be removed or reconfigured into the remaining or inserted memory module. Then, the real address associated with the reconfigured memory module is re-assigned to the memory module receiving the copied contents, thereby creating a virtualized physical mapping from the addressable real address space being utilized by the operating system into a virtual physical address space. During the process of moving the memory contents, the operating system stalls. Write memory requests addressed to the real address space currently associated with the sourcing memory module indicated by either the FROM or TO real address space. As will be appreciated, a memory module can be inserted, removed or replaced in physical memory without the operating system having to stop all memory operations in the memory to accomplish the physical memory change.
Abstract:
A non-uniform memory access (NUMA) computer system includes at least one remote node and a home node coupled by a node interconnect. The home node contains a home system memory and a memory controller. In response to receipt of a data request from a remote node, the memory controller determines whether to grant exclusive or non-exclusive ownership of requested data specified in the data request by reference to history information indicative of prior data accesses originating in the remote node. The memory controller then transmits the requested data and an indication of exclusive or non-exclusive ownership to the remote node.
Abstract:
A computer system of a number of processing nodes operate either in a loop configuration or off of a common bus with high speed, high performance wide bandwidth characteristics. The processing nodes in the system are also interconnected by a separate narrow bandwidth, low frequency recovery bus. When problems arise with node operations on the high speed bus, operations are transferred to the low frequency recovery bus and continue there at a slower rate for recovery operations. The recovery technique may be used to increase system speed and performance on a dynamic basis.