摘要:
Provided herein is a method for optimizing communication for system calls. The method includes storing a system call for each work item in a wavefront and transmitting said stored system calls to a processor for execution. The method also includes receiving a result to each work item in the wavefront responsive to said transmitting.
摘要:
A method, system and article of manufacture for balancing a workload on heterogeneous processing devices. The method comprising accessing a memory storage of a processor of one type by a dequeuing entity associated with a processor of a different type, identifying a task from a plurality of tasks within the memory that can be processed by the processor of the different type, synchronizing a plurality of dequeuing entities capable of accessing the memory storage, and dequeuing the task form the memory storage.
摘要:
A method, system and article of manufacture for balancing a workload on heterogeneous processing devices. The method comprising accessing a memory storage of a processor of one type by a dequeuing entity associated with a processor of a different type, identifying a task from a plurality of tasks within the memory that can be processed by the processor of the different type, synchronizing a plurality of dequeuing entities capable of accessing the memory storage, and dequeuing the task form the memory storage
摘要:
A processor may include a stack file and an execution core. The stack file may include an entry configured to store an addressing pattern and a tag. The addressing pattern identifies a memory location within the stack area of memory. The stack file may be configured to link a data value identified by the tag stored in the entry to the speculative result of a memory operation if the addressing pattern of the memory operation matches the addressing pattern stored in the entry. The execution core may be configured to access the speculative result when executing another operation that is dependent on the memory operation.
摘要:
A system may include a memory file and an execution core. The memory file may include an entry configured to store an addressing pattern and a tag. If an addressing pattern of a memory operation matches the addressing pattern stored in the entry, the memory file may be configured to link a data value identified by the tag to a speculative result of the memory operation. The addressing pattern of the memory operation includes an identifier of a logical register, and the memory file may be configured to predict whether the logical register is being specified as a general purpose register or a stack frame pointer register in order to determine whether the addressing pattern of the memory operation matches the addressing pattern stored in the entry. The execution core may be configured to access the speculative result when executing another operation that is dependent on the memory operation.
摘要:
A memory controller includes a threshold register that stores a value indicating a length of time and a control unit. In response to a first memory access request, the control unit generates signals that cause a memory device to open a page of memory. The control unit generates signals that cause the memory device to close the page if the page has been open for the length of time indicated by the value in the threshold register. The control unit modifies the value in the threshold register in response to receiving a second memory access request. For example, if the second memory access request causes a page miss for a most recently open page, the control unit may increase the value in the threshold register. The control unit may decrease the value in the threshold register in response to a page conflict caused by the second memory access request.
摘要:
Methods, systems and computer-readable mediums for task scheduling on an accelerated processing device (APD) are provided. In an embodiment, a method comprises: enqueuing one or more tasks in a memory storage module based on the APD; using a software-based enqueuing module; and dequeuing the one or more tasks from the memory storage module using a hardware-based command processor, wherein the command processor forwards the one or more tasks to the shader core.
摘要:
A system may include a dispatch unit, a scheduler, and an execution core. The dispatch unit may be configured to modify a load operation to include a register-to-register move operation in response to an indication that a speculative result of the load operation is linked to a data value identified by a first tag. The scheduler may be coupled to the dispatch unit and configured to issue the register-to-register move operation in response to availability of the data value. The execution core may be configured to execute the register-to-register move operation by outputting the data value and a tag indicating that the data value is the result of the load operation.
摘要:
A system may include a scheduler and an execution core. The scheduler includes an entry allocated to an operation. The entry includes a non-speculative tag and a speculative tag, and both the non-speculative tag and the speculative tag are associated with a first operand of the operation. The scheduler is configured to issue the operation in response to a data value identified by the speculative tag being available. The execution core may be configured to execute the operation using the data value identified by the speculative tag. The scheduler may be configured to reissue the operation if the non-speculative tag appears on a result bus.
摘要:
A system, method and article of manufacture for an accelerated processing device (APD) to request a central processing unit (CPU) to process a task, comprising enqueuing a plurality of tasks on a queue using the APD, generating a user-level interrupt and transmitting to the CPU the plurality of tasks in the queue using an interrupt handler associated with a CPU thread.