Abstract:
A work stealer apparatus includes a determination module. The determination module is to determine to steal work from a first hardware computation unit of a first type for a second hardware computation unit of a second type that is different than the first type. The work is to be queued in a first work queue, which is to correspond to the first hardware computation unit, and which is to be stored in a shared memory that is to be shared by the first and second hardware computation units. A synchronized work stealer module is to steal the work through a synchronized memory access to the first work queue. The synchronized memory access is to be synchronized relative to memory accesses to the first work queue from the first hardware computation unit.
Abstract:
Methods, apparatus, systems, and articles of manufacture are disclosed to steal work in heterogeneous computing systems. An apparatus includes load balancing circuitry to obtain tasks from a workload by encoding minimum and maximum index ranges of a data parallel operation, allocate a first task from the workload to a first work queue based on a first capability of first computation circuitry, the first computation circuitry to process the first task in the first work queue, and allocate a second task from the workload to a second work queue, second computation circuitry to process the second task in the second work queue. The apparatus further includes first work stealer logic to steal the second task from the second work queue using an atomic operation to access the second work queue.
Abstract:
A work stealer apparatus includes a determination module. The determination module is to determine to steal work from a first hardware computation unit of a first type for a second hardware computation unit of a second type that is different than the first type. The work is to be queued in a first work queue, which is to correspond to the first hardware computation unit, and which is to be stored in a shared memory that is to be shared by the first and second hardware computation units. A synchronized work stealer module is to steal the work through a synchronized memory access to the first work queue. The synchronized memory access is to be synchronized relative to memory accesses to the first work queue from the first hardware computation unit.
Abstract:
Disclosed examples include scheduler circuitry to allocate a first task to a first work queue in memory; and a first processor circuit of a first type, the first processor circuit to cause movement of the first task from the first work queue to a second work queue in the memory, the second work queue accessible by a second processor circuit of a second type, the movement atomically performed via a read operation and a write operation to update the second work queue in a same bus cycle to prevent multiple entities from moving the first task in the same bus cycle.
Abstract:
Methods, apparatus, systems, and articles of manufacture are disclosed to steal work in heterogeneous computing systems. An apparatus includes load balancing circuitry to obtain tasks from a workload by encoding minimum and maximum index ranges of a data parallel operation, allocate a first task from the workload to a first work queue based on a first capability of first computation circuitry, the first computation circuitry to process the first task in the first work queue, and allocate a second task from the workload to a second work queue, second computation circuitry to process the second task in the second work queue. The apparatus further includes first work stealer logic to steal the second task from the second work queue using an atomic operation to access the second work queue.
Abstract:
Systems and methods may provide for identifying an object in a managed runtime environment and determining an age of the object at a software level of the managed runtime environment. Additionally, the object may be selectively allocated in one of a dynamic random access memory (DRAM) or a non-volatile random access memory (NVRAM) based at least in part on the age of the object. In one example, the data type of the object is also determined, wherein the object is selectively allocated further based on the data type.