NETWORK CACHE INJECTION FOR COHERENT GPUS
    Invention Application

    Publication Number: US20180314638A1

    Publication Date: 2018-11-01

    Application Number: US15498076

    Filing Date: 2017-04-26

    Abstract: Methods, devices, and systems for GPU cache injection. A GPU compute node includes a network interface controller (NIC) which includes NIC receiver circuitry which can receive data for processing on the GPU, and NIC transmitter circuitry which can send the data to a main memory of the GPU compute node and which can send coherence information to a coherence directory of the GPU compute node based on the data. The GPU compute node also includes a GPU which includes GPU receiver circuitry which can receive the coherence information; GPU processing circuitry which can determine, based on the coherence information, whether the data satisfies a heuristic; and GPU loading circuitry which can load the data into a cache of the GPU from the main memory if the data satisfies the heuristic.
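
    A minimal sketch of the injection decision described above, with invented types and thresholds: coherence information arriving from the NIC is tested against a heuristic (here a GPU-destined flag plus a size cap, both illustrative) before the data is pulled from main memory into the GPU cache; gpu_cache_prefetch is a hypothetical stand-in for the GPU loading circuitry.

        // Sketch only: types, flags, and thresholds are assumptions, not the patent's.
        #include <cstddef>
        #include <cstdint>

        struct CoherenceInfo {
            uint64_t addr;   // physical address of the NIC-written data
            size_t   size;   // payload size in bytes
            uint32_t flags;  // e.g., marks the buffer as destined for the GPU
        };

        constexpr uint32_t GPU_DESTINED = 1u << 0;
        constexpr size_t CACHE_INJECT_LIMIT = 64 * 1024;  // cap on injected buffer size

        // Heuristic: inject only small, GPU-destined buffers so that large
        // streaming transfers do not evict the GPU's working set.
        bool satisfies_heuristic(const CoherenceInfo& info) {
            return (info.flags & GPU_DESTINED) != 0 && info.size <= CACHE_INJECT_LIMIT;
        }

        // Hypothetical stand-in for the loading circuitry that pulls the
        // lines from main memory into the GPU cache.
        void gpu_cache_prefetch(uint64_t addr, size_t size) { (void)addr; (void)size; }

        // Invoked when the GPU receiver circuitry sees coherence information from the NIC.
        void on_coherence_message(const CoherenceInfo& info) {
            if (satisfies_heuristic(info)) {
                gpu_cache_prefetch(info.addr, info.size);
            }
            // Otherwise the data stays in main memory until the GPU demands it.
        }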

    PROGRAMMING IN-MEMORY ACCELERATORS TO IMPROVE THE EFFICIENCY OF DATACENTER OPERATIONS

    Publication Number: US20180081583A1

    Publication Date: 2018-03-22

    Application Number: US15269495

    Filing Date: 2016-09-19

    CPC classification number: G06F12/00 G06F9/30

    Abstract: Systems, apparatuses, and methods for utilizing in-memory accelerators to perform data conversion operations are disclosed. A system includes one or more main processors coupled to one or more memory modules. Each memory module includes one or more memory devices coupled to a processing in memory (PIM) device. The main processors are configured to generate an executable for a PIM device to accelerate data conversion tasks of data stored in the local memory devices. In one embodiment, the system detects a read request for data stored in a given memory module. In order to process the read request, the system determines that a conversion from a first format to a second format is required. In response to detecting the read request, the given memory module's PIM device performs the conversion of the data from the first format to the second format and then provides the data to a consumer application.
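
    The conversion step lends itself to a short sketch, under assumed formats: the PIM device services a read by converting the stored representation to the one the consumer expects, so the main processor never touches the data. A hypothetical Q8.8 fixed-point to 32-bit float conversion stands in for whatever formats a real deployment would use.

        // Sketch only: the formats and names are assumptions for illustration.
        #include <cstdint>
        #include <vector>

        // Runs on the PIM device when a read request needs conversion:
        // storage format is Q8.8 fixed point, consumer format is float.
        std::vector<float> pim_convert_on_read(const std::vector<int16_t>& stored) {
            std::vector<float> out;
            out.reserve(stored.size());
            for (int16_t v : stored) {
                out.push_back(static_cast<float>(v) / 256.0f);  // Q8.8 -> float
            }
            return out;  // handed to the consumer application in the second format
        }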

    Remote Task Queuing by Networked Computing Devices
    Invention Application (In Force)

    Publication Number: US20140331230A1

    Publication Date: 2014-11-06

    Application Number: US14164220

    Filing Date: 2014-01-26

    Abstract: The described embodiments include a networking subsystem in a second computing device that is configured to receive a task message from a first computing device. Based on the task message, the networking subsystem updates an entry in a task queue with task information from the task message. A processing subsystem in the second computing device subsequently retrieves the task information from the task queue and performs the corresponding task. In these embodiments, the networking subsystem processes the task message (e.g., stores the task information in the task queue) without causing the processing subsystem to perform operations for processing the task message.
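
    The hand-off the abstract describes maps naturally onto a single-producer/single-consumer ring buffer; in the sketch below, with all names assumed, the networking subsystem writes entries as task messages arrive while the processing subsystem drains them later, so neither side interrupts the other.

        // Sketch only: a lock-free SPSC ring standing in for the task queue.
        #include <array>
        #include <atomic>
        #include <cstddef>
        #include <cstdint>

        struct TaskInfo {
            uint32_t task_id;  // identifies the task to perform
            uint64_t arg;      // payload carried by the task message
        };

        template <size_t N>
        class TaskQueue {
            std::array<TaskInfo, N> slots_;
            std::atomic<size_t> head_{0};  // advanced by the networking subsystem
            std::atomic<size_t> tail_{0};  // advanced by the processing subsystem
        public:
            // Called on receipt of a task message, without involving the processor.
            bool enqueue(const TaskInfo& t) {
                size_t h = head_.load(std::memory_order_relaxed);
                if (h - tail_.load(std::memory_order_acquire) == N) return false;  // full
                slots_[h % N] = t;
                head_.store(h + 1, std::memory_order_release);
                return true;
            }
            // Called later by the processing subsystem to pick up work.
            bool dequeue(TaskInfo& t) {
                size_t tl = tail_.load(std::memory_order_relaxed);
                if (tl == head_.load(std::memory_order_acquire)) return false;  // empty
                t = slots_[tl % N];
                tail_.store(tl + 1, std::memory_order_release);
                return true;
            }
        };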

    Network packet templating for GPU-initiated communication

    Publication Number: US10740163B2

    Publication Date: 2020-08-11

    Application Number: US16022498

    Filing Date: 2018-06-28

    Abstract: Systems, apparatuses, and methods for performing network packet templating for graphics processing unit (GPU)-initiated communication are disclosed. A central processing unit (CPU) creates a network packet according to a template and populates a first subset of fields of the network packet with static data. Next, the CPU stores the network packet in a memory. A GPU initiates execution of a kernel and detects a network communication request within the kernel and prior to the kernel completing execution. Responsive to this determination, the GPU populates a second subset of fields of the network packet with runtime data. Then, the GPU generates a notification that the network packet is ready to be processed. A network interface controller (NIC) processes the network packet using data retrieved from the first subset of fields and from the second subset of fields responsive to detecting the notification.
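
    A hedged sketch of the static/runtime field split, with an invented packet layout: the CPU fills the static subset once, a GPU-side routine (written as a host function here for brevity) fills the runtime subset when the kernel hits a communication request, and a ready flag serves as the notification the NIC watches for.

        // Sketch only: the field layout and names are assumptions.
        #include <atomic>
        #include <cstdint>
        #include <cstring>

        struct PacketTemplate {
            // Static subset: populated once by the CPU.
            uint8_t  dst_mac[6];
            uint16_t ether_type;
            uint32_t queue_pair;
            // Runtime subset: populated by the GPU just before send.
            uint64_t payload_addr;
            uint32_t payload_len;
            std::atomic<uint32_t> ready{0};  // the notification the NIC detects
        };

        // CPU side: build the template in memory visible to the GPU and the NIC.
        void cpu_prepare(PacketTemplate& p) {
            std::memset(p.dst_mac, 0xAB, sizeof(p.dst_mac));  // placeholder address
            p.ether_type = 0x0800;
            p.queue_pair = 7;
        }

        // GPU side: fill the runtime fields from within the kernel, then notify.
        void gpu_populate_and_notify(PacketTemplate& p, uint64_t addr, uint32_t len) {
            p.payload_addr = addr;
            p.payload_len  = len;
            p.ready.store(1, std::memory_order_release);  // NIC may now process
        }

        // NIC side (not shown): on seeing ready == 1, assemble the wire packet
        // from the static and runtime subsets and transmit it.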

    NETWORK PACKET TEMPLATING FOR GPU-INITIATED COMMUNICATION

    Publication Number: US20200004610A1

    Publication Date: 2020-01-02

    Application Number: US16022498

    Filing Date: 2018-06-28

    Abstract: Systems, apparatuses, and methods for performing network packet templating for graphics processing unit (GPU)-initiated communication are disclosed. A central processing unit (CPU) creates a network packet according to a template and populates a first subset of fields of the network packet with static data. Next, the CPU stores the network packet in a memory. A GPU initiates execution of a kernel and detects a network communication request within the kernel and prior to the kernel completing execution. Responsive to this determination, the GPU populates a second subset of fields of the network packet with runtime data. Then, the GPU generates a notification that the network packet is ready to be processed. A network interface controller (NIC) processes the network packet using data retrieved from the first subset of fields and from the second subset of fields responsive to detecting the notification.

    Remote task queuing by networked computing devices
    Invention Grant (In Force)

    Publication Number: US09582402B2

    Publication Date: 2017-02-28

    Application Number: US14164220

    Filing Date: 2014-01-26

    Abstract: The described embodiments include a networking subsystem in a second computing device that is configured to receive a task message from a first computing device. Based on the task message, the networking subsystem updates an entry in a task queue with task information from the task message. A processing subsystem in the second computing device subsequently retrieves the task information from the task queue and performs the corresponding task. In these embodiments, the networking subsystem processes the task message (e.g., stores the task information in the task queue) without causing the processing subsystem to perform operations for processing the task message.

    GPU NETWORKING USING AN INTEGRATED COMMAND PROCESSOR

    Publication Number: US20230120934A1

    Publication Date: 2023-04-20

    Application Number: US18068836

    Filing Date: 2022-12-20

    Abstract: Systems, apparatuses, and methods for generating network messages on a parallel processor are disclosed. A system includes at least a parallel processor, a general purpose processor, and a network interface unit. The parallel processor includes at least a plurality of compute units, a command processor, and a cache. A thread within a kernel executing on a compute unit of the parallel processor generates a network message and stores the network message and a corresponding indication in the cache. In response to detecting the indication of the network message in the cache, the command processor processes and conveys the network message to the network interface unit without involving the general purpose processor.
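
    A sketch of the command-processor path, with assumed names: a thread in a running kernel deposits a message descriptor and sets an indication that becomes visible through the cache; the command processor polls for that indication and hands the descriptor to the network interface unit, leaving the general purpose processor out of the loop.

        // Sketch only: descriptor layout and functions are assumptions.
        #include <atomic>
        #include <cstdint>

        struct NetMessage {
            uint64_t dest;                        // destination node
            uint64_t payload_addr;                // where the payload lives
            uint32_t payload_len;
            std::atomic<uint32_t> indication{0};  // set by the producing thread
        };

        // Produced by a thread within a kernel (host function here for brevity).
        void thread_emit_message(NetMessage& m, uint64_t dest, uint64_t addr, uint32_t len) {
            m.dest = dest;
            m.payload_addr = addr;
            m.payload_len = len;
            m.indication.store(1, std::memory_order_release);  // visible via the cache
        }

        // Hypothetical stand-in for the doorbell into the network interface unit.
        void nic_submit(const NetMessage&) {}

        // Command processor loop body: detect the indication, forward, consume.
        void command_processor_poll(NetMessage& m) {
            if (m.indication.load(std::memory_order_acquire) == 1) {
                nic_submit(m);
                m.indication.store(0, std::memory_order_relaxed);
            }
        }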

    GPU networking using an integrated command processor

    Publication Number: US11544121B2

    Publication Date: 2023-01-03

    Application Number: US15815043

    Filing Date: 2017-11-16

    Abstract: Systems, apparatuses, and methods for generating network messages on a parallel processor are disclosed. A system includes at least a parallel processor, a general purpose processor, and a network interface unit. The parallel processor includes at least a plurality of compute units, a command processor, and a cache. A thread within a kernel executing on a compute unit of the parallel processor generates a network message and stores the network message and a corresponding indication in the cache. In response to detecting the indication of the network message in the cache, the command processor processes and conveys the network message to the network interface unit without involving the general purpose processor.

    GPU NETWORKING USING AN INTEGRATED COMMAND PROCESSOR

    Publication Number: US20190146857A1

    Publication Date: 2019-05-16

    Application Number: US15815043

    Filing Date: 2017-11-16

    Abstract: Systems, apparatuses, and methods for generating network messages on a parallel processor are disclosed. A system includes at least a parallel processor, a general purpose processor, and a network interface unit. The parallel processor includes at least a plurality of compute units, a command processor, and a cache. A thread within a kernel executing on a compute unit of the parallel processor generates a network message and stores the network message and a corresponding indication in the cache. In response to detecting the indication of the network message in the cache, the command processor processes and conveys the network message to the network interface unit without involving the general purpose processor.
