Function callback mechanism between a Central Processing Unit (CPU) and an auxiliary processor

    Publication No.: US10706496B2

    Publication Date: 2020-07-07

    Application No.: US16282553

    Filing Date: 2019-02-22

    Applicant: INTEL CORPORATION

    IPC Classes: G06F9/54 G06T1/20

    Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for implementing function callback requests between a first processor (e.g., a GPU) and a second processor (e.g., a CPU). The system may include a shared virtual memory (SVM) coupled to the first and second processors, the SVM configured to store at least one double-ended queue (Deque). An execution unit (EU) of the first processor may be associated with a first of the Deques and configured to push the callback requests to that first Deque. A request handler thread executing on the second processor may be configured to: pop one of the callback requests from the first Deque; execute a function specified by the popped callback request; and generate a completion signal to the EU in response to completion of the function.
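The push/pop/signal flow in the abstract can be sketched in ordinary Python, with a `collections.deque` standing in for the SVM-resident Deque, a thread standing in for the CPU-side request handler, and an `Event` standing in for the completion signal. All names here are illustrative, not from the patent.

```python
import threading
import time
from collections import deque

class CallbackRequest:
    def __init__(self, fn, args):
        self.fn = fn                    # function the handler should execute
        self.args = args
        self.done = threading.Event()   # stands in for the completion signal
        self.result = None

deque_0 = deque()                       # the "first Deque" tied to one EU
lock = threading.Lock()

def request_handler(stop):
    """CPU-side thread: pop requests, run them, signal completion."""
    while not stop.is_set() or deque_0:
        with lock:
            req = deque_0.pop() if deque_0 else None
        if req is None:
            time.sleep(0.001)           # idle briefly when the Deque is empty
            continue
        req.result = req.fn(*req.args)
        req.done.set()                  # notify the waiting EU

def eu_call(fn, *args):
    """EU side: push a callback request, then wait for completion."""
    req = CallbackRequest(fn, args)
    with lock:
        deque_0.appendleft(req)
    req.done.wait()
    return req.result

stop = threading.Event()
handler = threading.Thread(target=request_handler, args=(stop,))
handler.start()
value = eu_call(lambda a, b: a + b, 2, 3)   # returns 5
stop.set()
handler.join()
```

A real implementation would place the Deque in memory visible to both processors and use per-EU Deques with work stealing; the sketch only shows the request/execute/signal round trip.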

    ADAPTIVE SCHEDULING FOR TASK ASSIGNMENT AMONG HETEROGENEOUS PROCESSOR CORES

    Publication No.: US20190080429A1

    Publication Date: 2019-03-14

    Application No.: US16185965

    Filing Date: 2018-11-09

    Applicant: Intel Corporation

    Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for adaptive scheduling of task assignment among heterogeneous processor cores. The system may include any number of CPUs, a graphics processing unit (GPU) and memory configured to store a pool of work items to be shared by the CPUs and GPU. The system may also include a GPU proxy profiling module associated with one of the CPUs to profile execution of a first portion of the work items on the GPU. The system may further include profiling modules, each associated with one of the CPUs, to profile execution of a second portion of the work items on each of the CPUs. The measured profiling information from the CPU profiling modules and the GPU proxy profiling module is used to calculate a distribution ratio for execution of a remaining portion of the work items between the CPUs and the GPU.
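One plausible reading of the ratio calculation is throughput-proportional splitting: profile a first portion of work items on the GPU (via its proxy module) and a second portion on the CPUs, then divide the remainder in proportion to measured rates. The function name and the throughput numbers below are made-up illustrations, not taken from the patent.

```python
def distribution_ratio(gpu_items, gpu_seconds, cpu_items, cpu_seconds):
    """Fraction of the remaining work items to assign to the GPU."""
    gpu_rate = gpu_items / gpu_seconds    # items/s from the GPU proxy profiler
    cpu_rate = cpu_items / cpu_seconds    # aggregate items/s from the CPU profilers
    return gpu_rate / (gpu_rate + cpu_rate)

# Illustrative profiling run: GPU finished 600 items/s, CPUs 200 items/s.
ratio = distribution_ratio(gpu_items=600, gpu_seconds=1.0,
                           cpu_items=200, cpu_seconds=1.0)
remaining = 1000
gpu_share = round(remaining * ratio)      # 750 items to the GPU
cpu_share = remaining - gpu_share         # 250 items to the CPUs
```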

    Adaptive scheduling for task assignment among heterogeneous processor cores

    Publication No.: US10186007B2

    Publication Date: 2019-01-22

    Application No.: US14583247

    Filing Date: 2014-12-26

    Applicant: Intel Corporation

    Abstract: An example system for adaptive scheduling of task assignment among heterogeneous processor cores may include any number of CPUs, a graphics processing unit (GPU) and memory configured to store a pool of work items to be shared by the CPUs and GPU. The system may also include a GPU proxy profiling module associated with one of the CPUs to profile execution of a first portion of the work items on the GPU. The system may further include profiling modules, each associated with one of the CPUs, to profile execution of a second portion of the work items on each of the CPUs. The measured profiling information from the CPU profiling modules and the GPU proxy profiling module is used to calculate a distribution ratio for execution of a remaining portion of the work items between the CPUs and the GPU.

    DYNAMIC RUNTIME TASK MANAGEMENT
    Invention Application

    Publication No.: US20180173563A1

    Publication Date: 2018-06-21

    Application No.: US15383738

    Filing Date: 2016-12-19

    Applicant: Intel Corporation

    IPC Classes: G06F9/48 G06F9/50

    CPC Classes: G06F9/4881 G06F9/5027

    Abstract: A dynamic runtime scheduling system includes task manager circuitry capable of detecting a correspondence between at least a portion of the output arguments from one or more first tasks and at least a portion of the input arguments to one or more second tasks. Upon detecting that the output arguments from the first task represent a superset of the second task input arguments, the task manager circuitry apportions the first task into a plurality of new subtasks. At least one of the new subtasks includes output arguments having a 1:1 correspondence to the second task input arguments. Upon detecting that the output arguments from a first task represent a subset of the second task input arguments, the task manager circuitry may autonomously apportion the second task into a plurality of new subtasks. At least one of the new subtasks may include input arguments having a 1:1 correspondence to the first task output arguments.
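The superset/subset test and the resulting split can be sketched with plain set operations, modeling each task by its argument names. The task representation and function name are assumptions for illustration; the patent's task manager operates on circuitry-level task descriptors, not Python lists.

```python
def apportion(first_outputs, second_inputs):
    """Return (subtasks of first task, subtasks of second task) after splitting.

    Each subtask is represented simply by the sorted list of argument
    names it produces (for the first task) or consumes (for the second).
    """
    out, inp = set(first_outputs), set(second_inputs)
    if out > inp:
        # Outputs are a strict superset of inputs: split the FIRST task so
        # one new subtask's outputs correspond 1:1 to the second task's inputs.
        return [sorted(inp), sorted(out - inp)], [sorted(inp)]
    if out < inp:
        # Outputs are a strict subset of inputs: split the SECOND task so
        # one new subtask's inputs correspond 1:1 to the first task's outputs.
        return [sorted(out)], [sorted(out), sorted(inp - out)]
    return [sorted(out)], [sorted(inp)]   # already a 1:1 correspondence

# First task produces {a, b, c}; second task consumes {a, b}: superset case.
firsts, seconds = apportion(["a", "b", "c"], ["a", "b"])
# firsts[0] now maps 1:1 to the second task's inputs; firsts[1] carries the rest.
```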