Abstract:
An aspect includes pruning a design space when generating a maximum power stressmark. A multi-stage design space search process is performed. Each stage includes calculating a number of instructions per cycle (IPC) for each instruction sequence in a set of instruction sequences that place a power stress on a system under analysis, removing one or more of the instruction sequences having an IPC lower than a pruning threshold from the set, evaluating at least one power metric of the remaining instruction sequences in the set, removing one or more of the instruction sequences having at least one power metric evaluated outside of one or more pruning ranges from the set, and passing the remaining instruction sequences in the set to a next stage. A maximum power stressmark is generated based on the evaluating of the at least one power metric from a final stage.
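The staged pruning described above can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the `Candidate` and `Stage` types, the precomputed `ipc` and `power` fields, and the rule that the highest-power survivor of the final stage becomes the stressmark are all assumptions made for the example. In practice each stage would measure or simulate IPC and power rather than read stored values.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical instruction sequence with already-evaluated metrics."""
    name: str
    ipc: float    # instructions per cycle
    power: float  # power metric, arbitrary units


@dataclass(frozen=True)
class Stage:
    """Hypothetical per-stage pruning settings."""
    ipc_threshold: float
    power_range: Tuple[float, float]  # inclusive (low, high)


def run_stage(candidates: Sequence[Candidate], stage: Stage) -> List[Candidate]:
    # Step 1: remove sequences whose IPC is below the pruning threshold.
    by_ipc = [c for c in candidates if c.ipc >= stage.ipc_threshold]
    # Step 2: remove sequences whose power metric falls outside the pruning range.
    low, high = stage.power_range
    return [c for c in by_ipc if low <= c.power <= high]


def search(candidates: Sequence[Candidate], stages: Sequence[Stage]) -> Optional[Candidate]:
    remaining = list(candidates)
    for stage in stages:
        # Survivors of each stage are passed to the next stage.
        remaining = run_stage(remaining, stage)
    # Assumption: pick the highest-power survivor of the final stage.
    return max(remaining, key=lambda c: c.power) if remaining else None


example_candidates = [
    Candidate("a", 1.0, 50.0),
    Candidate("b", 2.5, 80.0),
    Candidate("c", 3.0, 95.0),
    Candidate("d", 3.2, 60.0),
]
example_stages = [Stage(2.0, (55.0, 100.0)), Stage(2.8, (70.0, 100.0))]
best = search(example_candidates, example_stages)
```

Applying the cheap IPC filter before the power evaluation mirrors the order given in the abstract, so the more expensive metric is only evaluated for sequences that survive the first cut.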
Abstract:
One aspect is a method that includes analyzing, by a processor of an analysis system, an instruction set architecture of a targeted complex-instruction set computer (CISC) processor to generate an instruction set profile for each CISC architectural instruction variant of the instruction set architecture. A combination of instruction sequences for the targeted CISC processor is determined from the instruction set profile that corresponds to a desired stressmark type. The desired stressmark type defines a metric representative of functionality of interest of the targeted CISC processor. Performance of the targeted CISC processor is monitored with respect to the desired stressmark type while executing each of the instruction sequences. One of the instruction sequences is identified as most closely aligning with the desired stressmark type based on performance results of execution of the instruction sequences with respect to the desired stressmark type.
Abstract:
An aspect includes optimizing an application workflow. The optimizing includes characterizing the application workflow by determining at least one baseline metric related to an operational control knob of an embedded system processor. The application workflow performs a real-time computational task encountered by at least one mobile embedded system of a wirelessly connected cluster of systems supported by a server system. The optimizing of the application workflow further includes performing an optimization operation on the at least one baseline metric of the application workflow while satisfying at least one runtime constraint. An annotated workflow that is the result of performing the optimization operation is output.
Abstract:
An aspect includes receiving a write request that includes a memory address and write data. Stored data is read from a memory location at the memory address. Based on determining that the memory location was not previously modified, the stored data is compared to the write data. Based on the stored data matching the write data, the write request is completed without writing the write data to the memory, and a corresponding silent store bit in a silent store bitmap is set. Based on the stored data not matching the write data, the write data is written to the memory location, the silent store bit is reset, and a corresponding modified bit is set. At least one of an application and an operating system is provided access to the silent store bitmap.
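The write path described above might be modeled in software along these lines. This is a sketch under stated assumptions: the `SilentStoreMemory` class, its word-granular lists, and the plain write used when a location was already modified are all hypothetical, since the abstract does not describe that last case.

```python
from typing import List


class SilentStoreMemory:
    """Toy model of a memory that tracks silent stores per location."""

    def __init__(self, size: int) -> None:
        self.data: List[int] = [0] * size
        self.modified: List[bool] = [False] * size  # modified bits
        self.silent: List[bool] = [False] * size    # silent store bitmap

    def write(self, addr: int, value: int) -> bool:
        """Handle a write request; return True if memory was actually written."""
        stored = self.data[addr]  # read the stored data first
        if not self.modified[addr]:
            if stored == value:
                # Silent store: complete the request without writing.
                self.silent[addr] = True
                return False
            # Values differ: write, reset the silent bit, set the modified bit.
            self.data[addr] = value
            self.silent[addr] = False
            self.modified[addr] = True
            return True
        # Assumption: a previously modified location is simply written.
        self.data[addr] = value
        return True

    def silent_store_bitmap(self) -> List[bool]:
        """Copy of the bitmap, as an application or OS might read it."""
        return list(self.silent)
```

Returning a copy from `silent_store_bitmap` keeps callers from mutating the tracking state, which is one plausible way to give software read-only access.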
Abstract:
A time-of-day (TOD) clock, which provides a high-resolution measure of real time suitable for indicating date and time, is leveraged to perform cycle-level thread synchronization. A time-of-day value provided by the time-of-day clock is used in a spin lock, along with a configurable mask, to meet a specified condition. The condition is met at regular time intervals and at the same time for all the hardware threads to be synchronized. When the condition is met and synchronization is reached, execution of the threads continues, ensuring that the activity generated on each thread is synchronized.
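A masked spin loop of the kind described above could look like the sketch below. The specific condition `(tod & mask) == 0`, the function name `wait_for_sync_point`, and the injectable `read_tod` callable are assumptions for illustration; the abstract says only that a TOD value and a configurable mask are combined to meet a specified condition. A counter-based fake clock stands in for the hardware TOD clock, because a software clock sampled from Python can skip values and miss the condition.

```python
import itertools
from typing import Callable


def wait_for_sync_point(mask: int, read_tod: Callable[[], int]) -> int:
    """Spin until the masked time-of-day value is zero, then return that value.

    With mask = 2**k - 1 the condition holds once every 2**k clock ticks,
    so every thread spinning on the same clock is released at the same tick.
    """
    while True:
        tod = read_tod()
        if (tod & mask) == 0:
            return tod


def make_fake_clock(start: int) -> Callable[[], int]:
    """Hypothetical stand-in for a TOD clock: increments on every read."""
    counter = itertools.count(start)
    return lambda: next(counter)


# Two "threads" that start spinning at different times leave at the same tick.
release_a = wait_for_sync_point(0b111, make_fake_clock(3))
release_b = wait_for_sync_point(0b111, make_fake_clock(6))
```

Widening the mask lengthens the interval between synchronization points, which is presumably why the mask is described as configurable.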
Abstract:
In an approach for sharing memory bandwidth in one or more processors, a processor receives a first set of monitored usage information for one or more processors executing one or more threads. A processor calculates the impact of hardware data prefetching for each thread of the one or more threads, based on the first set of monitored usage information. A processor adjusts prefetch settings for the one or more threads, based on the calculated impact of hardware data prefetching for each thread of the one or more threads.
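One way to picture the monitor, calculate, adjust loop is the sketch below. The abstract does not say how impact is computed, so everything here is assumed: the `ThreadUsage` fields, the use of an IPC ratio as the impact measure, and the `min_speedup` cutoff are hypothetical choices for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ThreadUsage:
    """Hypothetical monitored usage information for one thread."""
    ipc_prefetch_on: float
    ipc_prefetch_off: float


def prefetch_impact(usage: ThreadUsage) -> float:
    """Assumed impact measure: speedup of the thread with prefetching enabled."""
    if usage.ipc_prefetch_off <= 0:
        raise ValueError("ipc_prefetch_off must be positive")
    return usage.ipc_prefetch_on / usage.ipc_prefetch_off


def adjust_prefetch(usage_by_thread: Dict[str, ThreadUsage],
                    min_speedup: float = 1.05) -> Dict[str, bool]:
    """Return a per-thread setting: True keeps hardware prefetching enabled."""
    return {
        thread_id: prefetch_impact(usage) >= min_speedup
        for thread_id, usage in usage_by_thread.items()
    }


settings = adjust_prefetch({
    "t0": ThreadUsage(ipc_prefetch_on=2.0, ipc_prefetch_off=1.0),
    "t1": ThreadUsage(ipc_prefetch_on=1.01, ipc_prefetch_off=1.0),
})
```

Under this assumed policy, threads that gain little from prefetching give up their prefetch traffic, leaving more memory bandwidth for the threads that benefit.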
Abstract:
Embodiments relate to storing data in memory. An aspect includes applying a power savings technique to at least a subset of a processor. Pending work items scheduled to be executed by the processor are monitored. The pending work items are grouped based on the power savings technique. The grouping includes delaying a scheduled execution time of at least one of the pending work items to increase an overall number of clock cycles that the power savings technique is applied to the processor. It is determined that an execution criterion has been met. The pending work items are executed based on the execution criterion being met and the grouping.
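The grouping step can be illustrated with a simple greedy batcher. This is only a sketch: the `WorkItem` fields, the per-item `max_delay` tolerance, and the choice of "the earliest deadline in the group has arrived" as the execution criterion are assumptions, not details given in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class WorkItem:
    """Hypothetical pending work item, with times in clock cycles."""
    name: str
    scheduled: int  # originally scheduled execution time
    max_delay: int  # how long the item may be postponed


def group_items(items: Iterable[WorkItem]) -> List[Tuple[int, List[WorkItem]]]:
    """Group items into batches; each batch is (run_at, items).

    Items are delayed up to the earliest deadline in their batch, so the
    processor can stay in its power-saving state between batches.
    """
    batches: List[Tuple[int, List[WorkItem]]] = []
    current: List[WorkItem] = []
    run_at: Optional[int] = None
    for item in sorted(items, key=lambda w: w.scheduled):
        deadline = item.scheduled + item.max_delay
        if current and run_at is not None and item.scheduled <= run_at:
            # The item is ready before the batch runs: fold it in.
            current.append(item)
            run_at = min(run_at, deadline)
        else:
            if current and run_at is not None:
                batches.append((run_at, current))
            current = [item]
            run_at = deadline
    if current and run_at is not None:
        batches.append((run_at, current))
    return batches


example_batches = group_items([
    WorkItem("a", 0, 10),
    WorkItem("b", 4, 3),
    WorkItem("c", 8, 5),
    WorkItem("d", 20, 2),
])
```

In this example, item `a` is postponed from cycle 0 to cycle 7 so that it runs together with `b`, turning two wake-ups into one.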
Abstract:
Systems and methods to manage memory on a spin transfer torque magnetoresistive random-access memory (STT-MRAM) are provided. A particular method of managing memory includes determining a temperature associated with the memory and determining a level of write queue utilization associated with the memory. A write operation may be performed based on the level of write queue utilization and the temperature.