APPARATUS AND METHOD
    Invention publication

    Publication No.: US20240231924A1

    Publication date: 2024-07-11

    Application No.: US18617682

    Application date: 2024-03-27

    IPC classification: G06F9/50

    CPC classification: G06F9/5027

    Abstract: An apparatus is provided comprising interface circuitry, machine-readable instructions, and processing circuitry to execute the machine-readable instructions. The machine-readable instructions comprise instructions to identify a processing flow pattern of a large language model (LLM), wherein the LLM is executed on processor circuitry comprising a plurality of processor cores and wherein the processing flow pattern comprises a plurality of processing phases. The machine-readable instructions further comprise instructions to identify a processing phase of the LLM from the processing flow pattern. The machine-readable instructions further comprise instructions to allocate processing resources to the processor circuitry based on the identified processing phase of the LLM.
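
The phase-based allocation described in this abstract can be illustrated with a minimal sketch. The abstract does not name the phases or the allocation rule, so the phase names (`prefill`, `decode`), the `CORE_SHARE` table, and both function names below are assumptions made for illustration only, not the claimed method.

```python
# Hypothetical flow pattern: the abstract only says "a plurality of
# processing phases"; prefill/decode are assumed names.
FLOW_PATTERN = ("prefill", "decode")

# Assumed policy: fraction of the available cores granted in each phase.
# The values are illustrative, not taken from the patent.
CORE_SHARE = {"prefill": 1.0, "decode": 0.5}


def identify_phase(tokens_generated: int) -> str:
    """Identify the current phase from the flow pattern.

    Assumption: before any output token exists the model is in the
    first phase; afterwards it is in the second.
    """
    return FLOW_PATTERN[0] if tokens_generated == 0 else FLOW_PATTERN[1]


def allocate_cores(phase: str, total_cores: int) -> int:
    """Return how many cores to grant for the identified phase (at least 1)."""
    return max(1, int(total_cores * CORE_SHARE[phase]))


if __name__ == "__main__":
    for generated in (0, 5):
        phase = identify_phase(generated)
        print(phase, allocate_cores(phase, total_cores=16))
```

A real implementation would presumably derive the phase from runtime observations and could adjust resources other than core count; the lookup table merely keeps the two steps (identify, then allocate) visible.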

    MANAGEMENT OF WORKLOAD PROCESSING USING DISTRIBUTED NETWORKED PROCESSING UNITS

    Publication No.: US20230135645A1

    Publication date: 2023-05-04

    Application No.: US18090764

    Application date: 2022-12-29

    IPC classification: G06F9/50

    Abstract: Various approaches for deploying and controlling distributed compute operations with the use of infrastructure processing units (IPUs) and similar networked processing units are disclosed. A system that includes a networked processing unit may perform workload processing with operations that: receive, from another networked processing unit, workload information for a workload having respective tasks to be processed among distributed computing entities; perform an analysis of network conditions for a predicted execution of the workload, based on the workload information, to analyze network availability among the distributed computing entities; perform an analysis of compute conditions for the predicted execution of the workload, based on the workload information, to analyze processing availability among the distributed computing entities; and identify locations of the distributed computing entities to deploy the workload, based on the analysis of network conditions and the analysis of compute conditions.
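
The sequence in this abstract (network analysis, compute analysis, then location selection) can be sketched as a two-stage filter followed by a ranking. The `Entity` fields, the thresholds, and the ranking key are all hypothetical; the abstract does not specify how the analyses are performed or combined.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """A distributed computing entity with assumed availability metrics."""
    name: str
    free_bandwidth_mbps: float
    free_cpu_cores: int


def identify_locations(entities, need_mbps, need_cores, count):
    """Pick up to `count` entities that satisfy both analyses."""
    # Network analysis: keep entities with enough free bandwidth.
    network_ok = [e for e in entities if e.free_bandwidth_mbps >= need_mbps]
    # Compute analysis: of those, keep entities with enough free cores.
    compute_ok = [e for e in network_ok if e.free_cpu_cores >= need_cores]
    # Assumed ranking: most free cores first, then most free bandwidth.
    ranked = sorted(
        compute_ok,
        key=lambda e: (e.free_cpu_cores, e.free_bandwidth_mbps),
        reverse=True,
    )
    return [e.name for e in ranked[:count]]


if __name__ == "__main__":
    pool = [
        Entity("a", 100.0, 4),
        Entity("b", 50.0, 8),
        Entity("c", 200.0, 2),
    ]
    print(identify_locations(pool, need_mbps=80.0, need_cores=2, count=2))
```

Hard thresholds are the simplest way to combine the two analyses; a weighted score over predicted conditions would be an equally plausible reading.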

    SOFTWARE-DEFINED COHERENT CACHING OF POOLED MEMORY

    Publication No.: US20210064531A1

    Publication date: 2021-03-04

    Application No.: US17092803

    Application date: 2020-11-09

    IPC classification: G06F12/0817

    Abstract: Methods and apparatus for software-defined coherent caching of pooled memory. The pooled memory is implemented in an environment having a disaggregated architecture where compute resources such as compute platforms are connected to disaggregated memory via a network or fabric. Software-defined caching policies are implemented in hardware in a processor SoC or discrete device such as a Network Interface Controller (NIC) by programming logic in an FPGA or accelerator on the SoC or discrete device. The programmed logic is configured to implement software-defined caching policies in hardware for effecting disaggregated memory (DM) caching in an associated DM cache of at least a portion of an address space allocated for the software application in the disaggregated memory. In connection with DM cache operations, such as cache lines evicted from a CPU, logic implemented in hardware determines whether a cache line in a DM cache is to be evicted and implements the software-defined caching policy for the DM cache including associated memory coherency operations.
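
The idea of a caching policy supplied by software and applied when the CPU evicts a line can be modelled in a few lines. This is a software analogy only: the abstract describes logic programmed into an FPGA or accelerator, and the class `DMCache`, the `pick_victim` callback, and the `oldest_first` policy are hypothetical names. Coherency operations and write-back to pooled memory are omitted.

```python
from collections import OrderedDict


class DMCache:
    """Toy model of a DM cache whose eviction policy is supplied by software."""

    def __init__(self, capacity, pick_victim):
        self.capacity = capacity
        self.pick_victim = pick_victim  # software-defined policy callback
        self.lines = OrderedDict()      # address -> data, oldest entry first

    def on_cpu_eviction(self, addr, data):
        """Accept a line evicted from the CPU cache.

        Returns the address evicted from the DM cache to make room, or None.
        """
        victim = None
        if addr not in self.lines and len(self.lines) >= self.capacity:
            victim = self.pick_victim(self.lines)
            # A real design would write the victim back to pooled memory here.
            del self.lines[victim]
        self.lines[addr] = data
        self.lines.move_to_end(addr)
        return victim


def oldest_first(lines):
    """Example policy: evict the entry that was inserted or updated longest ago."""
    return next(iter(lines))


if __name__ == "__main__":
    cache = DMCache(capacity=2, pick_victim=oldest_first)
    for address in (0x10, 0x20, 0x30):
        print(hex(address), cache.on_cpu_eviction(address, b"line"))
```

Passing the policy as a callback mirrors the separation the abstract draws between the policy (defined in software) and the mechanism that applies it.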

    AUTOMATIC LOCALIZATION OF ACCELERATION IN EDGE COMPUTING ENVIRONMENTS

    Publication No.: US20200026575A1

    Publication date: 2020-01-23

    Application No.: US16586576

    Application date: 2019-09-27

    IPC classification: G06F9/50

    Abstract: Methods, apparatus, systems and machine-readable storage media of an edge computing device which is enabled to access and select the use of local or remote acceleration resources for edge computing processing are disclosed. In an example, an edge computing device obtains first telemetry information that indicates availability of local acceleration circuitry to execute a function, and obtains second telemetry information that indicates availability of a remote acceleration resource to execute the function. An estimated time (and cost or other identifiable or estimable considerations) to execute the function at each respective location is identified. The use of the local acceleration circuitry or the remote acceleration resource is selected based on the estimated time and other appropriate factors in relation to a service level agreement.
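
The selection step in this abstract can be sketched as a comparison of two telemetry records against a service level agreement (SLA) deadline. The `Telemetry` fields, the single-deadline SLA, and the tie-breaking rule (cheapest option that meets the deadline) are assumptions; the abstract leaves the "other appropriate factors" open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Telemetry:
    """Assumed telemetry for one acceleration option."""
    available: bool
    est_time_ms: float
    est_cost: float


def select_accelerator(local: Telemetry, remote: Telemetry,
                       sla_deadline_ms: float) -> Optional[str]:
    """Return "local", "remote", or None if neither option meets the SLA."""
    candidates = {"local": local, "remote": remote}
    # Keep only options that are available and finish within the deadline.
    eligible = {
        name: t for name, t in candidates.items()
        if t.available and t.est_time_ms <= sla_deadline_ms
    }
    if not eligible:
        return None
    # Assumed preference: lowest cost, then lowest estimated time.
    return min(eligible,
               key=lambda n: (eligible[n].est_cost, eligible[n].est_time_ms))


if __name__ == "__main__":
    local = Telemetry(available=True, est_time_ms=30.0, est_cost=2.0)
    remote = Telemetry(available=True, est_time_ms=12.0, est_cost=5.0)
    print(select_accelerator(local, remote, sla_deadline_ms=20.0))
    print(select_accelerator(local, remote, sla_deadline_ms=50.0))
```

With a tight deadline only the faster remote option qualifies; with a looser one both qualify and the cheaper local option wins.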