摘要:
Loading software on a plurality of processors is presented. A processing unit (PU) retrieves a file from system memory and loads it into its internal memory. The PU extracts a processor type from the file's header which identifies whether the file should execute on the PU or a synergistic processing unit (SPU). If an SPU should execute the file, the PU DMA's the file to the SPU for execution. In one embodiment, the file is a combined file which includes both PU and SPU code. In this embodiment, the PU identifies one or more section headers included in the file which indicates embedded SPU code within the combined file. In this embodiment, the PU extracts the SPU code from the combined file and DMA's the extracted code to an SPU for execution.
摘要:
Grouping processors is presented. A processing unit (PU) initiates an application and identifies the application's requirements. The PU assigns one or more synergistic processing units (SPUs) and a memory space to the application in the form of a group. The application specifies whether the task requires shared memory or private memory. Shared memory is a memory space that is accessible by the SPUs and the PU. Private memory, however, is a memory space that is only accessible by the SPUs that are included in the group. When the application executes, the resources within the group are allocated to the application's execution thread. Each group has its own group properties, such as address space, policies (i.e. real-time, FIFO, run-to-completion, etc.) and priority (i.e. low or high). These group properties are used during thread execution to determine which groups take precedence over other tasks.
摘要:
An approach that uses a handler to detect asynchronous lock line reservation lost events, and switching tasks based upon whether a condition is true or a mutex lock is acquired is presented. A synergistic processing unit (SPU) invokes a first thread and, during execution, the first thread requests external data that is shared with other threads or processors in the system. This shared data may be protected with a mutex lock or other shared memory synchronization constructs. When requested data is not available, the SPU switches to a second thread and monitors lock line reservation lost events in order to check when the data is available. When the data is available, the SPU switches back to the first thread and processes the first thread's request.
摘要:
Asynchronously traversing a disjoint linked data structure is presented. A synergistic processing unit (SPU) includes a handler that works in conjunction with a memory flow controller (MFC) to traverse a disjoint linked data structure. The handler compares a search value with a node value, and provides the MFC with an effective address of the next node to traverse based upon the comparison. In turn, the MFC retrieves the corresponding node data from system memory and stores the node data in the SPU's local storage area. The MFC stalls processing and sends an asynchronous event interrupt to the SPU which, as a result, instructs the handler to retrieve and compare the latest node data in the local storage area with the search value. The traversal continues until the handler matches the search value with a node value or until the handler determines a failed search.
摘要:
A system and method for using a handler to detect asynchronous lock line reservation lost events, and switching tasks based upon whether a condition is true or a mutex lock is acquired is presented. A synergistic processing unit (SPU) invokes a first thread and, during execution, the first thread requests external data that is shared with other threads or processors in the system. This shared data may be protected with a mutex lock or other shared memory synchronization constructs. When requested data is not available, the SPU switches to a second thread and monitors lock line reservation lost events in order to check when the data is available. When the data is available, the SPU switches back to the first thread and processes the first thread's request.
摘要:
Code handling, such as interpreting language instructions or performing “just-in-time” compilation, is performed using a heterogeneous processing environment that shares a common memory. In a heterogeneous processing environment that includes a plurality of processors, one of the processors is programmed to perform a dedicated code-handling task, such as perform just-in-time compilation or interpretation of interpreted language instructions, such as Java. The other processors request code handling processing that is performed by the dedicated processor. Speed is achieved using a shared memory map so that the dedicated processor can quickly retrieve data provided by one of the other processors.
摘要:
A computer implemented method, data processing system, and computer usable code are provided for generation of hardware thermal profiles for a set of processors. Sampling is performed of the thermal states of the set of processors during the execution of a set of workloads to create sampled information. The sampled information and thermal characteristics of the set of processors are combined and a thermal index is generated based on the sampled information and characteristics of the set of processors.
摘要:
An approach that optimizes system performance using performance monitors is presented. The system gathers thread performance data using performance monitors for threads running on either a first ISA processor or a second ISA processor. Multiple first processors and multiple second processors may be included in a single computer system. The first processors and second processors can each access data stored in a common shared memory. The gathered thread performance data is analyzed to determine whether the corresponding thread needs additional CPU time in order to optimize system performance. If additional CPU time is needed, the amount of CPU time that the thread receives is altered (increased) so that the thread receives the additional time when it is scheduled by the scheduler. In one embodiment, the increased CPU time is accomplished by altering a priority value that corresponds to the thread.
摘要:
A computer implemented method, data processing system, and computer usable code are provided for generation of software thermal profiles for applications executing on a set of processors using thermal sampling. Sampling is performed of the thermal states of the set of processors for the set of workloads to create sampled information. A thermal index is then generated based on the sampled information.
摘要:
A system and method is provided to improve software simulation. A software emulator is used in conjunction with a hardware simulator. A special snapshot instruction is included in the software code that is emulated. When the snapshot instruction is encountered, values such as register, memory, and program stack values, are stored creating an initial snapshot. Code continues to be emulated and, when the next snapshot instruction is encountered, the values are written to create a second snapshot. The initial values are used to set an initial state in a hardware model that is simulated on a hardware simulator. The results of the hardware simulation are compared to the second snapshot to uncover software errors and/or hardware errors so that the software can be modified or the hardware design can be modified. Multiple sets of snapshots can be taken to analyze multiple sections of the software program.