摘要:
A processor having multiple cores coordinates functions performed on the cores to automatically, dynamically and repeatedly reconfigure the cores for optimal performance based on characteristics of currently executing software. A core running a thread detects a multi-core characteristic of the thread and assigns one or more other cores to the thread to dynamically combine the cores into what functionally amounts to a common core for more efficient execution of the thread.
摘要:
A multiprocessor system having plural heterogeneous processing units schedules instruction sets for execution on a selected of the processing units by matching workload processing characteristics of processing units and the instruction sets. To establish an instruction set's processing characteristics, the homogeneous instruction set is executed on each of the plural processing units with one or more performance metrics tracked at each of the processing units to determine which processing unit most efficiently executes the instruction set. Instruction set workload processing characteristics are stored for reference in scheduling subsequent execution of the instruction set.
摘要:
A snoop coherency method, system and program are provided for intervening a requested cache line from a plurality of candidate memory sources in a multiprocessor system on the basis of the sensed temperature or power dissipation value at each memory source. By providing temperature or power dissipation sensors in each of the candidate memory sources (e.g., at cores, cache memories, memory controller, etc.) that share a requested cache line, control logic may be used to determine which memory source should source the cache line by using the power sensor signals to signal only the memory source with acceptable power dissipation to provide the cache line to the requester.
摘要:
A method, system and program are provided for dynamically reconfiguring a pipelined processor to operate with reduced power consumption without reducing existing performance. By monitoring or detecting the performance of individual units or stages in the processor as they execute a given workload, each stage may use high-performance circuitry until such time as a drop in the throughput performance is detected, at which point the stages are reconfigured to use lower-performance circuitry so as to meet the reduced performance throughput requirements using less power. By configuring the processor to back off from high-performance designs to low-performance designs to meet the detected performance characteristics of the executing workload warrant, power dissipation may be optimized.
摘要:
A method, system and program are provided for dynamically assigning priority values to instruction threads in a computer system based on one or more predetermined thread performance tests, and using the assigned instruction priorities to determine how resources are used in the system. By storing the assigning priority values for each thread as a tag in the thread's instructions, tagged instructions from different threads that are dispatched through the system are allocated system resources based on the tagged priority values assigned to the respective instruction threads. Priority values for individual threads may be updated with control software which tests thread performance and uses the test results to apply predetermined adjustment policies. The test results may be used to optimize the workload allocation of system resources by dynamically assigning thread priority values to individual threads using any desired policy, such as achieving thread execution balance relative to thresholds and to performance of other threads, reducing thread response time, lowering power consumption, etc.
摘要:
A reduced number of voltage regulator modules provides a reduced number of supply voltages to the package. The package includes a voltage plane for each of the voltage regulator modules. Each core or other component on the die is tied to a switch on the package, and each switch is electrically connected to all of the voltage planes. A wafer-level test determines a voltage that optimizes performance of each core or other component. Given these voltage values, an engineer may determine voltage settings for the voltage regulator modules and which cores are to be connected to which voltage regulator modules. A database stores voltage setting data, such as the optimal voltage for each component, switch values, or voltage settings for each voltage regulator module. An engineering wire may permanently set each switch to customize the voltage supply to each core or other component.
摘要:
A reduced number of voltage regulator modules provides a reduced number of supply voltages to the package. The package includes a voltage plane for each of the voltage regulator modules. Each core or other component on the die is tied to a switch on the package, and each switch is electrically connected to all of the voltage planes. A wafer-level test determines a voltage that optimizes performance of each core or other component. Given these voltage values, an engineer may determine voltage settings for the voltage regulator modules and which cores are to be connected to which voltage regulator modules. A database stores voltage setting data, such as the optimal voltage for each component, switch values, or voltage settings for each voltage regulator module. An engineering wire may permanently set each switch to customize the voltage supply to each core or other component.
摘要:
An apparatus and method for providing a multi-core integrated circuit chip that reduces the cost of the package and board while optimizing performance of the cores for use with a single voltage plane. The apparatus and method of the illustrative embodiments make use of a dynamic burn-in technique that optimizes all of the cores on the chip to run at peak performance at a single voltage. Each core is burned-in with a customized burn-in voltage that provides uniform power and performance across the whole chip. This results in a higher burn-in yield and lower overall power in the integrated circuit chip. The optimization of the cores to run at peak performance at a single voltage is achieved through use of the negative bias temperature instability affects on the cores imparted by the burn-in voltages applied.
摘要:
A logic circuit mechanism for inducing a processing unit to release a LOCK signal that the processing unit uses to secure continuous access to a memory system during read modify write operations requiring "atomic" (continuous) access. The processing unit has an internal cache enabling it to set up consecutive memory access operations at a pace such that the LOCK signal could be held continuously active while a string of atomic memory accesses is carried out. The present circuit mechanism prevents premature release of the processing unit's LOCK signal, by asserting a Hold signal which requires the processing unit to release its LOCK signal but only after that unit has fully completed its current atomic access operation. The logic circuit reduces its impact on processing unit performance by detecting when the LOCK signal has been active continuously for N consecutive atomic operations coinciding with external contention, and calling for release of the CPUs LOCK signal only while the Nth such operation is being conducted.
摘要:
A personal computer has two memory banks respectively connected to two parity check units operative to detect parity errors. Upon doing so, each unit feeds a parity error signal to a separate latch. The latches are connected to a logic circuit which is in turn connected to an interrupt controller that initiates an interrupt when a parity error occurs. One latch is further connected to a check bit of a register of an I/O port and the check bit is set by the one latch. An interrupt handler reads the register and provides messages indicating which memory bank caused the parity error.