COLLECTIVE OPERATION USING A NETWORK-ATTACHED MEMORY

    Publication number: US20250021273A1

    Publication date: 2025-01-16

    Application number: US18349318

    Application date: 2023-07-10

    Abstract: In some examples, a processor receives a first request to allocate a memory region for a collective operation by process entities in a plurality of computer nodes. In response to the first request, the processor creates a virtual address for the memory region and allocates the memory region in a network-attached memory coupled to the plurality of computer nodes over a network. The processor correlates the virtual address to an address of the memory region in mapping information. The processor identifies the memory region in the network-attached memory by obtaining the address of the memory region from the mapping information using the virtual address in a second request. In response to the second request, the processor performs the collective operation.
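The two-request flow in the abstract (allocate and map a region on the first request, resolve the virtual address and run the collective on the second) can be sketched as follows. This is a minimal illustration, not the patented implementation; every name here (`FamAllocator`, the address arithmetic, the sum reduction standing in for "a collective operation") is hypothetical.

```python
from itertools import count

class FamAllocator:
    """Sketch: maps virtual addresses to regions in a simulated network-attached memory."""
    def __init__(self):
        self._next_va = count(0x1000, 0x1000)  # hypothetical virtual-address space
        self._mapping = {}                     # mapping information: VA -> region address
        self._fam = {}                         # simulated network-attached memory

    def allocate(self, size):
        """First request: create a virtual address, allocate the region, correlate them."""
        va = next(self._next_va)
        region_addr = len(self._fam)           # stand-in for a FAM region address
        self._fam[region_addr] = [0] * size
        self._mapping[va] = region_addr        # correlate VA with the region address
        return va

    def reduce_sum(self, va, contributions):
        """Second request: resolve the VA via the mapping, then perform the collective."""
        region_addr = self._mapping[va]        # obtain the region address from the mapping
        region = self._fam[region_addr]
        for values in contributions:           # one contribution per process entity
            for i, v in enumerate(values):
                region[i] += v
        return region

alloc = FamAllocator()
va = alloc.allocate(4)
result = alloc.reduce_sum(va, [[1, 2, 3, 4], [10, 20, 30, 40]])
print(result)  # [11, 22, 33, 44]
```

The key point the sketch preserves is the indirection: callers hold only the virtual address, and the region in network-attached memory is reached solely through the mapping information.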

    FABRIC-ATTACHED MEMORY FOR APPLICATIONS USING MESSAGE PASSING PROCEDURE

    Publication number: US20240362163A1

    Publication date: 2024-10-31

    Application number: US18308953

    Application date: 2023-04-28

    CPC classification number: G06F12/0284 G06F13/16 G06F2213/16

    Abstract: Some examples relate to providing a fabric-attached memory (FAM) for applications using message passing procedure. In an example, a remotely accessible memory creation function of a message passing procedure is modified to include a reference to a region of memory in a FAM. A remotely accessible memory data structure representing a remotely accessible memory is created through the remotely accessible memory creation function. When an application calls a message passing function of the message passing procedure, a determination is made whether the remotely accessible memory data structure in the message passing function includes a reference to the region of memory in the FAM. In response to a determination that the remotely accessible memory data structure includes a reference to the region of memory in the FAM, the message passing function call is routed to a FAM message passing function corresponding to the message passing function.
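The routing decision the abstract describes (check whether the remotely accessible memory data structure references a FAM region, and if so dispatch to the FAM variant of the message passing function) can be sketched as below. The names `Window`, `put`, `fam_put`, and `local_put` are hypothetical stand-ins, not the actual message passing procedure's API.

```python
class Window:
    """Sketch of a remotely accessible memory data structure."""
    def __init__(self, fam_region=None):
        self.fam_region = fam_region  # reference to a region in FAM, or None

def fam_put(window, data):
    """Hypothetical FAM-specific variant of the message passing function."""
    window.fam_region.extend(data)
    return "fam"

def local_put(window, data):
    """Hypothetical ordinary (non-FAM) variant."""
    return "local"

def put(window, data):
    """Route the call to the FAM variant when the window references FAM memory."""
    if window.fam_region is not None:
        return fam_put(window, data)
    return local_put(window, data)

fam_window = Window(fam_region=[])
print(put(fam_window, [1, 2]))  # fam
print(put(Window(), [1, 2]))    # local
```

The design point is that applications keep calling the unmodified message passing function name; only the dispatcher inspects the data structure and selects the FAM path.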

    Compiler for implementing neural network accelerator

    Publication number: US12254416B2

    Publication date: 2025-03-18

    Application number: US17229497

    Application date: 2021-04-13

    Abstract: Examples disclosed herein relate to using a compiler for implementing tensor operations in a neural network-based computing system. The compiler defines the tensor operations to be implemented. The compiler identifies a binary tensor operation receiving input operands from a first output tensor of a first tensor operation and a second output tensor of a second tensor operation, arriving along two different paths of the convolutional neural network. For the binary tensor operation, the compiler allocates buffer space for a first input operand based on the difference between the count of instances of the first output tensor and the count of instances of the second output tensor.
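One plausible reading of the sizing rule in the abstract is that when the two operands of a binary operation arrive along paths of different depth, the earlier-produced operand must be buffered for as many extra instances as the difference between the two instance counts. The function below is a hypothetical sketch of that reading, not the patented allocation algorithm.

```python
def buffer_space(count_first, count_second, tensor_bytes):
    """Hypothetical sizing rule: buffer enough instances of the first operand
    to cover the gap between the two paths' instance counts (at least one)."""
    extra = abs(count_first - count_second)
    return max(extra, 1) * tensor_bytes

# E.g., a residual connection where one path produces 3 instances while the
# other produces 1 before the add: the shorter path's operand needs 2 slots.
print(buffer_space(3, 1, 1024))  # 2048
```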

    Avoiding cycles in neural networks

    Publication number: US11379712B2

    Publication date: 2022-07-05

    Application number: US16155036

    Application date: 2018-10-09

    Abstract: Disclosed is a method, system, and computer-readable medium to manage (and possibly replace) cycles in graphs for a computer device. The method includes detecting a compound operation including a first tensor, the compound operation resulting from source code represented in a first graph structure as part of a compilation process from source code to binary executable code. To address a detected cycle, an instance of a proxy class may be created to store a pointer to a proxy instance of the first tensor based on the detection. In some examples, using the instance of the proxy class facilitates implementation of a level of indirection to replace a cyclical portion of the first graph structure with an acyclical portion, yielding a second graph structure that indicates assignment of the result of the compound operation to the proxy instance of the first tensor. Optimization may reduce the total number of indirection replacements.
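The cycle-breaking idea in the abstract (a compound operation such as t = t + u makes the node for t both an input and the output, so the output is redirected to a proxy instance of the tensor) can be sketched as below. The classes and the edge representation are hypothetical illustrations, not the patented graph format.

```python
class Tensor:
    def __init__(self, name):
        self.name = name

class Proxy:
    """Sketch of the proxy class: holds a pointer to a proxy instance of a tensor."""
    def __init__(self, tensor):
        self.target = Tensor(tensor.name + "_proxy")  # the proxy instance

def break_cycles(edges):
    """Replace each cyclic compound operation's output with a proxy instance,
    adding a level of indirection so the graph becomes acyclic."""
    acyclic = []
    for in1, in2, op, out in edges:
        if out is in1 or out is in2:           # output feeds back into an input: a cycle
            out = Proxy(out).target            # assign the result to the proxy instead
        acyclic.append((in1, in2, op, out))
    return acyclic

# Cyclic compound operation: t = t + u (node t is both input and output).
t, u = Tensor("t"), Tensor("u")
fixed = break_cycles([(t, u, "add", t)])
print(fixed[0][3].name)  # t_proxy
```

A later pass could then coalesce redundant proxies, matching the abstract's note that optimization may reduce the total number of indirection replacements.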
