Hardware accelerated dynamic work creation on a graphics processing unit

    Publication No.: US11550627B2

    Publication date: 2023-01-10

    Application No.: US17215171

    Filing date: 2021-03-29

    IPC classification: G06F9/48 G06F9/54 G06F9/38

    Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.
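The future-task behavior described in this abstract can be illustrated with a minimal software sketch. It is not the patented hardware mechanism: `CompletionObject`, `FutureTask`, `step`, and `demo` are hypothetical names, and the queues are plain Python deques standing in for whatever the coprocessor manages. The sketch only shows the control flow: a future task polls a completion object, enqueues its dependent task when the object is signaled, and otherwise re-enqueues a continuation of itself.

```python
from collections import deque


class CompletionObject:
    """Hypothetical flag that a producer sets when its work is done."""

    def __init__(self):
        self.signaled = False


class FutureTask:
    """Hypothetical future task: watches a completion object for a dependent task."""

    def __init__(self, completion, dependent_task):
        self.completion = completion
        self.dependent_task = dependent_task


def step(future, ready_queue, continuation_queue):
    """Run one future task once; return True if the dependent task was enqueued."""
    if future.completion.signaled:
        # Completion detected: enqueue the task associated with this future.
        ready_queue.append(future.dependent_task)
        return True
    # Not yet complete: self-enqueue a continuation future for a later poll.
    continuation_queue.append(FutureTask(future.completion, future.dependent_task))
    return False


def demo():
    done = CompletionObject()
    ready, continuations = deque(), deque()
    log = []
    continuations.append(FutureTask(done, lambda: log.append("child ran")))

    polls = 0
    while continuations:
        future = continuations.popleft()
        polls += 1
        if polls == 3:
            done.signaled = True  # simulate the producer finishing before poll 3
        step(future, ready, continuations)

    while ready:
        ready.popleft()()  # execute the enqueued dependent task
    return polls, log
```

In this sketch the first two polls fail and each adds a continuation; the third poll sees the signal and enqueues the dependent task, so no thread ever blocks waiting on the completion object.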

    Data transfer acceleration
    Granted invention patent

    Publication No.: US11086809B2

    Publication date: 2021-08-10

    Application No.: US16693638

    Filing date: 2019-11-25

    Inventor: Anthony Gutierrez

    Abstract: Data transfer acceleration includes receiving, by a data transfer accelerator in a first node of a plurality of nodes, from a second node of the plurality of nodes, a request for data in a second state, wherein the second node stores an instance of the data in a first state; generating a message including one or more operations to transform the data from the first state to the second state; and sending the message to the second node in response to the request.
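The idea of sending operations instead of the full data can be sketched as follows. The abstract does not say what form the operations take, so this sketch assumes, purely for illustration, that the data is a key/value mapping and that the operations are `set` and `delete` entries; `make_delta` and `apply_delta` are hypothetical names.

```python
def make_delta(first_state, second_state):
    """First node: build operations that turn first_state into second_state."""
    ops = []
    for key in first_state:
        if key not in second_state:
            ops.append(("delete", key, None))
    for key, value in second_state.items():
        if key not in first_state or first_state[key] != value:
            ops.append(("set", key, value))
    return ops


def apply_delta(state, ops):
    """Second node: apply the received operations to its local copy."""
    result = dict(state)
    for op, key, value in ops:
        if op == "delete":
            result.pop(key, None)
        elif op == "set":
            result[key] = value
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


# The requesting node holds the old copy; the responder holds the new one.
old_copy = {"a": 1, "b": 2, "c": 3}
new_copy = {"a": 1, "b": 20, "d": 4}
message = make_delta(old_copy, new_copy)
```

Here `message` carries three operations rather than the whole mapping, and `apply_delta(old_copy, message)` reproduces `new_copy` on the requesting node.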

    Combining Quantum States of Qubits on a Quantum Processor

    Publication No.: US20230153672A1

    Publication date: 2023-05-18

    Application No.: US17840417

    Filing date: 2022-06-14

    IPC classification: G06N10/20

    CPC classification: G06N10/20

    Abstract: An electronic device includes a quantum processor including a plurality of qubits. The quantum processor runs a plurality of instances of a quantum program using a separate set of qubits from among the qubits for each instance of the quantum program. The quantum processor then sets quantum states for ancilla qubits from among the qubits based on quantum states of respective groups of associated qubits from the separate sets of qubits. The quantum processor next provides an output of the instances of the quantum program based on the quantum states of the ancilla qubits.
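The abstract does not state how the ancilla states are derived from each group of associated qubits. As a purely classical analogy, and assuming for illustration only that the combination is a bitwise majority vote over computational-basis values, the data flow might look like the sketch below. It does not simulate quantum states, and `combine_by_majority` is a hypothetical name.

```python
def combine_by_majority(instance_outputs):
    """Classical analogy: one 'ancilla' bit per position, set by majority vote.

    instance_outputs is a list of equal-length bit lists, one per program
    instance. Position i across all instances plays the role of a group of
    associated qubits.
    """
    if not instance_outputs:
        raise ValueError("need at least one instance")
    width = len(instance_outputs[0])
    if any(len(bits) != width for bits in instance_outputs):
        raise ValueError("all instances must have the same number of bits")

    n = len(instance_outputs)
    ancilla = []
    for position in range(width):
        ones = sum(bits[position] for bits in instance_outputs)
        ancilla.append(1 if 2 * ones > n else 0)  # strict majority of ones
    return ancilla


# Three instances of the same program, each on its own set of "qubits".
combined = combine_by_majority([[1, 0, 1], [1, 1, 1], [0, 0, 1]])
```

With an odd number of instances the vote never ties; with an even number this sketch resolves ties to 0, which is an arbitrary choice.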

    Hardware accelerated dynamic work creation on a graphics processing unit

    Publication No.: US10963299B2

    Publication date: 2021-03-30

    Application No.: US16134695

    Filing date: 2018-09-18

    IPC classification: G06F9/48 G06F9/54 G06F9/38

    Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.

    GARBAGE COLLECTING WAVEFRONT
    Invention patent application

    Publication No.: US20230097115A1

    Publication date: 2023-03-30

    Application No.: US17485662

    Filing date: 2021-09-27

    IPC classification: G06F9/50 G06F12/02

    Abstract: A processing system executes a specialized wavefront, referred to as a “garbage collecting wavefront” or GCWF, to identify and deallocate resources such as, for example, scalar registers, vector registers, and local data share space, that are no longer being used by wavefronts of a workgroup executing at the processing system (i.e., dead resources). In some embodiments, the GCWF is programmed to have compiler information regarding the resource requirements of the other wavefronts of the workgroup and specifies the program counter after which there will be a permanent drop in resource requirements for the other wavefronts. In other embodiments, the standard compute wavefronts signal the GCWF when they have completed using resources. The GCWF sends a command to deallocate the dead resources so the dead resources can be made available for additional wavefronts.
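The compiler-informed embodiment can be modeled in software as a rough sketch. Everything here is an assumption made for illustration: the `Wavefront` fields, the single register pool, and the `garbage_collect` function are hypothetical and do not reflect the actual hardware interface. The sketch shows only the bookkeeping: once a wavefront's program counter passes the compiler-specified point, the registers it no longer needs are returned to the free pool.

```python
from dataclasses import dataclass


@dataclass
class Wavefront:
    """Hypothetical model of a compute wavefront's register bookkeeping."""
    name: str
    pc: int               # current program counter
    allocated_regs: int   # registers currently held
    drop_pc: int          # compiler hint: PC after which the need drops for good
    regs_after_drop: int  # registers still required after drop_pc


def garbage_collect(wavefronts, free_regs):
    """One GCWF pass: reclaim dead registers and return the new free count."""
    for wf in wavefronts:
        past_drop = wf.pc > wf.drop_pc
        if past_drop and wf.allocated_regs > wf.regs_after_drop:
            dead = wf.allocated_regs - wf.regs_after_drop
            wf.allocated_regs = wf.regs_after_drop  # deallocate dead registers
            free_regs += dead                       # make them available again
    return free_regs


workgroup = [
    Wavefront("w0", pc=120, allocated_regs=64, drop_pc=100, regs_after_drop=16),
    Wavefront("w1", pc=40, allocated_regs=64, drop_pc=100, regs_after_drop=16),
]
free_after_pass = garbage_collect(workgroup, free_regs=0)
```

Only `w0` has passed its drop point, so one pass frees 48 registers and leaves `w1` untouched; running the pass again frees nothing more until `w1` advances.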