Abstract:
Embodiments relate to clustering execution in a processing system. An aspect includes accessing a control flow graph that defines a data dependency and an execution sequence of a plurality of tasks of an application that executes on a plurality of system components. The execution sequence of the tasks in the control flow graph is modified as a clustered control flow graph that clusters active and idle phases of a system component while maintaining the data dependency. The clustered control flow graph is sent to an operating system, where the operating system utilizes the clustered control flow graph for scheduling the tasks.
Abstract:
An aspect includes optimizing an application workflow. The optimizing includes characterizing the application workflow by determining at least one baseline metric related to an operational control knob of an embedded system processor. The application workflow performs a real-time computational task encountered by at least one mobile embedded system of a wirelessly connected cluster of systems supported by a server system. The optimizing of the application workflow further includes performing an optimization operation on the at least one baseline metric of the application workflow while satisfying at least one runtime constraint. An annotated workflow that is the result of performing the optimization operation is output.
Abstract:
Direct communication of data between processing elements is provided. An aspect includes sending, by a first processing element, data over an inter-processing element chaining bus. The data is destined for another processing element via a data exchange component that is coupled between the first processing element and a second processing element via a communication line disposed between corresponding multiplexors of the first processing element and the second processing element. A further aspect includes determining, by the data exchange component, whether the data has been received at the data exchange element. If so, an indicator is set in a register of the data exchange component and the data is forwarded to the other processing element. Setting the indicator causes the first processing element to stall. If the data has not been received, the other processing element is stalled while the data exchange component awaits receipt of the data.
Abstract:
Direct communication of data between processing elements is provided. An aspect includes sending, by a first processing element, data over an inter-processing element chaining bus. The data is destined for another processing element via a data exchange component that is coupled between the first processing element and a second processing element via a communication line disposed between corresponding multiplexors of the first processing element and the second processing element. A further aspect includes determining, by the data exchange component, whether the data has been received at the data exchange element. If so, an indicator is set in a register of the data exchange component and the data is forwarded to the other processing element. Setting the indicator causes the first processing element to stall. If the data has not been received, the other processing element is stalled while the data exchange component awaits receipt of the data.
Abstract:
A computer detects a request by a process for access to a shadow control page, wherein the shadow control page allows the process access to one or more devices. The computer assigns the shadow control page and a key to the process associated with the request. The computer detects a request by the process via the assigned shadow control page for creation of a subset of devices from the one or more devices. The computer inputs information detailing an association between the subset of devices and the assigned key into a subset definition table, wherein the subset definition table includes one or more keys and one or more corresponding subsets.
Abstract:
According to one embodiment, a method for traffic prioritization in a memory device includes sending a memory access request including a priority value from a processing element in the memory device to a crossbar interconnect in the memory device. The memory access request is routed through the crossbar interconnect to a memory controller in the memory device associated with the memory access request. The memory access request is received at the memory controller. The priority value of the memory access request is compared to priority values of a plurality of memory access requests stored in a queue of the memory controller to determine a highest priority memory access request. A next memory access request is performed by the memory controller based on the highest priority memory access request.
Abstract:
Embodiments relate to storing data in memory. An aspect includes applying a power savings technique to at least a subset of a processor. Pending work items scheduled to be executed by the processor are monitored. The pending work items are grouped based on the power savings technique. The grouping includes delaying a scheduled execution time of at least one of the pending work items to increase an overall number of clock cycles that the power savings technique is applied to the processor. It is determined that an execution criteria has been met. The pending work items are executed based on the execution criteria being met and the grouping.
Abstract:
Embodiments relate to storing data in memory. An aspect includes applying a power savings technique to at least a subset of a processor. Pending work items scheduled to be executed by the processor are monitored. The pending work items are grouped based on the power savings technique. The grouping includes delaying a scheduled execution time of at least one of the pending work items to increase an overall number of clock cycles that the power savings technique is applied to the processor. It is determined that an execution criteria has been met. The pending work items are executed based on the execution criteria being met and the grouping.
Abstract:
A heterogeneous processing system includes a compiler for performing power-constrained code generation and scheduling of work in the heterogeneous processing system. The compiler produces source code that is executable by a computer. The compiler performs a method. The method includes dividing a power budget for the heterogeneous processing system into a discrete number of power tokens. Each of the power tokens has an equal value of units of power. The method also includes determining a power requirement for executing a code segment on a processing element of the heterogeneous processing system. The determining is based on characteristics of the processing element and the code segment. The method further includes allocating, to the processing element at runtime, at least one of the power tokens to satisfy the power requirement.
Abstract:
An apparatus for detecting hard errors in a circuit includes a storage device and a processing circuit. The storage has stored therein test data and normal data. The processing circuit includes combinational logic in series with at least one set of input latches and at least one set of output latches. The apparatus includes a test control module configured to control the processing circuit to halt a flow of normal data through the processing circuit and run the test data through the processing circuit while subjecting the processing circuit to a stress condition.