Hardware accelerated dynamic work creation on a graphics processing unit

    公开(公告)号:US11550627B2

    公开(公告)日:2023-01-10

    申请号:US17215171

    申请日:2021-03-29

    Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.

    Data transfer acceleration
    4.
    发明授权

    公开(公告)号:US11086809B2

    公开(公告)日:2021-08-10

    申请号:US16693638

    申请日:2019-11-25

    Abstract: Data transfer acceleration includes receiving, by a data transfer accelerator in a first node of a plurality of nodes, from a second node of the plurality of nodes, a request for data in a second state, wherein the second node stores an instance of the data in a first state; generating a message including one or more operations to transform the data from the first state to the second state; and sending the message to the second node in response to the request.

    Hardware accelerated dynamic work creation on a graphics processing unit

    公开(公告)号:US12131186B2

    公开(公告)日:2024-10-29

    申请号:US17993490

    申请日:2022-11-23

    CPC classification number: G06F9/4881 G06F9/3877 G06F9/542 G06F9/545 G06F9/546

    Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.

    DYNAMICALLY CONFIGURABLE OVERPROVISIONED MICROPROCESSOR

    公开(公告)号:US20220100563A1

    公开(公告)日:2022-03-31

    申请号:US17037727

    申请日:2020-09-30

    Abstract: A dynamically configurable overprovisioned microprocessor optimally supports a variety of different compute application workloads and with the capability to tradeoff among compute performance, energy consumption, and clock frequency on a per-compute application basis, using general-purpose microprocessor designs. In some embodiments, the overprovisioned microprocessor comprises a physical compute resource and a dynamic configuration logic configured to: detect an activation-warranting operating condition; undarken the physical compute resource responsive to detecting the activation-warranting operating condition; detect a configuration-warranting operating condition; and configure the overprovisioned microprocessor to use the undarkened physical compute resource responsive to detecting the configuration-warranting operating condition.

    Dynamic kernel memory space allocation

    公开(公告)号:US11720993B2

    公开(公告)日:2023-08-08

    申请号:US16138708

    申请日:2018-09-21

    CPC classification number: G06T1/60 G06F9/30098 G06F12/02 G06F12/023 G06T1/20

    Abstract: A processing unit includes one or more processor cores and a set of registers to store configuration information for the processing unit. The processing unit also includes a coprocessor configured to receive a request to modify a memory allocation for a kernel concurrently with the kernel executing on the at least one processor core. The coprocessor is configured to modify the memory allocation by modifying the configuration information stored in the set of registers. In some cases, initial configuration information is provided to the set of registers by a different processing unit. The initial configuration information is stored in the set of registers prior to the coprocessor modifying the configuration information.

    Chiplet-Level Performance Information for Configuring Chiplets in a Processor

    公开(公告)号:US20230153218A1

    公开(公告)日:2023-05-18

    申请号:US17526218

    申请日:2021-11-15

    CPC classification number: G06F11/3051 G06F15/80 G06F11/3024

    Abstract: A processor includes a controller and plurality of chiplets, each chiplet including a plurality of processor cores. The controller provides chiplet-level performance information for the chiplets that identifies a performance of each chiplet at each of a plurality of performance levels for specified sets of processor cores on that chiplet. The controller receives an identification of one or more selected chiplets from among the plurality of chiplets for which a specified number of processor cores are to be configured at a given performance level, the one or more selected chiplets having been selected based on the chiplet-level performance information and performance requirements. The controller configures the specified number of processor cores of the one or more selected chiplets at the given performance level. A task is then run on the specified number of processor cores of the one or more selected chiplets at the given performance level.

Patent Agency Ranking