摘要:
Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
摘要:
Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor, Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
摘要:
Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.
摘要:
A method and apparatus are provided for calibrating digital thermal sensors. A processor chip with a plurality of digital thermal sensors receives an analog voltage. A test circuit coupled to the processor chip receives a clock signal and a register coupled to the test circuit outputs a value on each clock cycle to a digital thermal sensor in the plurality of digital thermal sensors. The digital thermal sensor transitions an output state in response to the value of the register received in the digital thermal sensor equaling a temperature threshold of the digital thermal sensor. The value of the register at the point of transition is used to calibrate the digital thermal sensor. An incrementer increments the value of the register on each clock cycle in response to the value of the register received in the digital thermal sensor failing to equal the temperature threshold of the digital thermal sensor.
摘要:
Mechanisms for communicating with a processor event facility are provided. The mechanisms make use of a channel interface as the primary mechanism for communicating with the processor event facility. The channel interface provides channels for communicating with processor facilities, memory flow control facilities, machine state registers, and external processor interrupt facilities, for example. These channels may be designated as blocking or non-blocking. With blocking channels, when no data is available to be read from the corresponding registers, or there is no space available to write to the corresponding registers, the processor is placed in a low power “stall” state. The processor is automatically awakened, via communication across the blocking channel, when data becomes available or space is freed. Thus, the channels of the present invention permit the processor to stay in a low power state.
摘要:
A mechanism for communicating instructions and data between a processor and external devices are provided. The mechanism makes use of a channel interface as the primary mechanism for communicating between the processor and a memory flow controller. The channel interface provides channels for communicating with processor facilities, memory flow control facilities, machine state registers, and external processor interrupt facilities, for example. These channels may be designated as blocking or non-blocking. With blocking channels, when no data is available to be read from the corresponding registers, or there is no space available to write to the corresponding registers, the processor is placed in a low power “stall” state. The processor is automatically awakened, via communication across the blocking channel, when data becomes available or space is freed. Thus, the channels of the present invention permit the processor to stay in a low power state.
摘要:
A mechanism for programming a direct memory access engine operating as a single thread processor is provided. A program is received from a host processor in a local memory associated with the direct memory access engine. A request is received in the direct memory access engine from the host processor indicating that the program located in the local memory is to be executed. The direct memory access engine executes the program without intervention by a host processor. Responsive to the program completing execution, the direct memory access engine sends a completion notification to the host processor that indicates that the program has completed execution.
摘要:
A load when reservation lost instruction for performing cacheline polling is disclosed. Initially, a first process requests an action to be performed by a second process. The request is made via a store operation to a cacheable memory location. The first process then reads the cacheable memory location via a conditional load operation to determine whether or not the requested action has been completed by the second process, and the first process sets a reservation at the cacheable memory location if the requested action has not been completed by the second process. The conditional load operation of the first process is stalled until the reservation at the cacheable memory location has been lost. After the requested action has been completed, the reservation in the cacheable memory location is reset by the second process.
摘要:
Systems and methods for positioning thermal sensors within an integrated circuit in a manner that provides useful thermal measurements corresponding to different parts of the integrated circuit. In one embodiment, an integrated circuit includes multiple, duplicate functional blocks. A separate thermal sensor is coupled to each of the duplicate functional blocks, preferably in the same relative location on each of the duplicate functional blocks, and preferably at a hotspot. One embodiment also includes thermal sensors on one or more functional blocks of other types in the integrated circuit. One embodiment includes a thermal sensor positioned at a cool spot, such as at the edge of the integrated circuit chip. Each of the thermal sensors may have ports to enable power and ground connections or data connections between the sensors and external components or devices.
摘要:
The present invention provides a method, system, and computer program product for displaying images of conference call participants. A method in accordance with an embodiment of the present invention includes receiving a call from a user to join a conference call, obtaining a phone number of the user, matching the phone number to a stored graphical representation, and distributing and displaying the matching graphical representation to a predetermined set of users. A voice identification/recognition process can also be used to match the user to a stored graphical representation.