-
Publication No.: US20250021273A1
Publication date: 2025-01-16
Application No.: US18349318
Filing date: 2023-07-10
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Soumitra Chatterjee, Chinmay Ghosh, Mashood Abdulla Kodavanji, Sharad Singhal
IPC: G06F3/06
Abstract: In some examples, a processor receives a first request to allocate a memory region for a collective operation by process entities in a plurality of computer nodes. In response to the first request, the processor creates a virtual address for the memory region and allocates the memory region in a network-attached memory coupled to the plurality of computer nodes over a network. The processor correlates the virtual address to an address of the memory region in mapping information. The processor identifies the memory region in the network-attached memory by obtaining the address of the memory region from the mapping information using the virtual address in a second request. In response to the second request, the processor performs the collective operation.
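The allocate-then-resolve flow in this abstract can be illustrated with a minimal sketch. All names here (`RegionTable`, `NetworkAttachedMemory`, `all_reduce_sum`) are hypothetical and not taken from the patent, the network-attached memory is modeled as a plain Python list, and a sum reduction stands in for the unspecified collective operation.

```python
class NetworkAttachedMemory:
    """Toy stand-in for network-attached memory: a flat list of cells."""

    def __init__(self, size):
        self.cells = [0] * size


class RegionTable:
    """Hypothetical mapping information: virtual address -> (offset, size)."""

    def __init__(self, base_virtual_address=0x1000):
        self._next_va = base_virtual_address
        self._next_offset = 0
        self._mapping = {}

    def allocate(self, size):
        # First request: create a virtual address, reserve a region,
        # and record the correlation between the two.
        va = self._next_va
        self._mapping[va] = (self._next_offset, size)
        self._next_va += size
        self._next_offset += size
        return va

    def resolve(self, va):
        # Second request: look up the region address from the virtual address.
        return self._mapping[va]


def all_reduce_sum(table, nam, va, contributions):
    """Assumed collective: element-wise sum of each process's contribution."""
    offset, size = table.resolve(va)
    for i in range(size):
        nam.cells[offset + i] = sum(c[i] for c in contributions)
    return nam.cells[offset:offset + size]


nam = NetworkAttachedMemory(16)
table = RegionTable()
va_a = table.allocate(4)
va_b = table.allocate(2)
result = all_reduce_sum(table, nam, va_b, [[1, 2], [3, 4]])
```

The caller only ever holds the virtual address; the physical offset stays inside the mapping table, which is the point of the indirection.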
-
Publication No.: US11269973B2
Publication date: 2022-03-08
Application No.: US16860357
Filing date: 2020-04-28
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Mashood Abdulla Kodavanji, Soumitra Chatterjee, Chinmay Ghosh, Mohan Parthasarathy
Abstract: Repeating patterns are identified in a matrix. Based on the identification of the repeating patterns, instructions are generated, which are executable by processing cores of a dot product engine to allocate analog multiplication crossbars of the dot product engine to perform multiplication of the matrix with a vector.
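One way to picture pattern identification is a tile scan that records which tiles of a matrix are identical. The sketch below is an illustration under loud assumptions: the abstract does not say how patterns are detected or how crossbars are shared, so the fixed square tile size, the rule that identical tiles map to the same crossbar index, and the function name `map_tiles_to_crossbars` are all hypothetical.

```python
def map_tiles_to_crossbars(matrix, tile):
    """Map each tile position to a crossbar index, reusing an index
    whenever the same tile contents repeat (assumed sharing rule)."""
    crossbar_of_pattern = {}
    tile_map = {}
    for r in range(0, len(matrix), tile):
        for c in range(0, len(matrix[0]), tile):
            # Freeze the tile contents so they can be used as a dict key.
            pattern = tuple(tuple(row[c:c + tile]) for row in matrix[r:r + tile])
            crossbar = crossbar_of_pattern.setdefault(pattern, len(crossbar_of_pattern))
            tile_map[(r // tile, c // tile)] = crossbar
    return tile_map, len(crossbar_of_pattern)


matrix = [
    [1, 2, 1, 2],
    [3, 4, 3, 4],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
]
tile_map, crossbars_needed = map_tiles_to_crossbars(matrix, 2)
```

In this toy matrix three of the four tiles are identical, so two crossbar indices cover all four tile positions.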
-
Publication No.: US11132423B2
Publication date: 2021-09-28
Application No.: US16176848
Filing date: 2018-10-31
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Soumitra Chatterjee, Mashood Abdulla K, Chinmay Ghosh, Mohan Parthasarathy
IPC: G06F17/16, H03K19/177, G06F9/30
Abstract: According to examples, an apparatus may include a processor and a non-transitory computer readable medium having instructions that when executed by the processor, may cause the processor to partition a matrix of elements into a plurality of sub-matrices of elements. Each sub-matrix of the plurality of sub-matrices may include elements from a set of columns of the matrix of elements that includes a nonzero element. The processor may also assign elements of the plurality of sub-matrices to a plurality of crossbar devices to maximize a number of nonzero elements of the matrix of elements assigned to the crossbar devices.
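The partitioning step can be sketched as follows. This is a simplified illustration, not the claimed method: it keeps only columns that contain a nonzero element and groups them into sub-matrices of at most `crossbar_cols` columns. The function name, the fixed crossbar width, and the left-to-right grouping order are assumptions; the abstract's optimization of the crossbar assignment is not reproduced.

```python
def partition_nonzero_columns(matrix, crossbar_cols):
    """Return (column_indices, sub_matrix) pairs built only from
    columns that hold at least one nonzero element."""
    n_cols = len(matrix[0]) if matrix else 0
    keep = [j for j in range(n_cols) if any(row[j] != 0 for row in matrix)]
    groups = [keep[i:i + crossbar_cols] for i in range(0, len(keep), crossbar_cols)]
    return [(cols, [[row[j] for j in cols] for row in matrix]) for cols in groups]


sparse = [
    [1, 0, 0, 2],
    [0, 0, 3, 0],
]
parts = partition_nonzero_columns(sparse, 2)
```

Column 1 is entirely zero, so it is never assigned to a crossbar; the remaining three columns fill one full sub-matrix and one partial one.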
-
Publication No.: US20240362163A1
Publication date: 2024-10-31
Application No.: US18308953
Filing date: 2023-04-28
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Soumitra Chatterjee, Chinmay Ghosh, Mashood Abdulla Kodavanji, Sharad Singhal
CPC classification number: G06F12/0284, G06F13/16, G06F2213/16
Abstract: Some examples relate to providing a fabric-attached memory (FAM) for applications using message passing procedure. In an example, a remotely accessible memory creation function of a message passing procedure is modified to include a reference to a region of memory in a FAM. A remotely accessible memory data structure representing a remotely accessible memory is created through the remotely accessible memory creation function. When an application calls a message passing function of the message passing procedure, a determination is made whether the remotely accessible memory data structure in the message passing function includes a reference to the region of memory in the FAM. In response to a determination that the remotely accessible memory data structure includes a reference to the region of memory in the FAM, the message passing function call is routed to a FAM message passing function corresponding to the message passing function.
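The routing decision described here can be sketched as a small dispatcher. Everything below is hypothetical: `Window` stands in for the remotely accessible memory data structure, `fam_region` for the FAM reference, and `put_local` / `put_fam` for a message passing function and its FAM counterpart. No real MPI or FAM library API is used.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Window:
    """Stand-in for the remotely accessible memory data structure."""
    local_buffer: list = field(default_factory=list)
    fam_region: Optional[str] = None  # hypothetical FAM region handle


def put_local(window, index, value):
    # Ordinary path: write into the window's own buffer.
    window.local_buffer[index] = value
    return "local"


def put_fam(window, index, value, fam_store):
    # FAM path: write into the region named by the window's reference.
    fam_store.setdefault(window.fam_region, {})[index] = value
    return "fam"


def put(window, index, value, fam_store):
    """Route the call based on whether the window references a FAM region."""
    if window.fam_region is not None:
        return put_fam(window, index, value, fam_store)
    return put_local(window, index, value)


fam_store = {}
plain = Window(local_buffer=[0, 0, 0])
backed = Window(fam_region="region-A")
route_plain = put(plain, 1, 7, fam_store)
route_backed = put(backed, 1, 9, fam_store)
```

The application calls the same `put` in both cases; only the presence of the FAM reference in the window changes which implementation runs.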
-
Publication No.: US12254416B2
Publication date: 2025-03-18
Application No.: US17229497
Filing date: 2021-04-13
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Jitendra Onkar Kolhe, Soumitra Chatterjee, Vaithyalingam Nagendran, Shounak Bandopadhyay
Abstract: Examples disclosed herein relate to using a compiler for implementing tensor operations in a neural network-based computing system. A compiler defines the tensor operations to be implemented. The compiler identifies a binary tensor operation receiving input operands from a first output tensor of a first tensor operation and a second output tensor of a second tensor operation from two different paths of a convolutional neural network. For the binary tensor operation, the compiler allocates a buffer space for a first input operand in the binary tensor operation based on a difference between a count of instances of the first output tensor and a count of instances of the second output tensor.
-
Publication No.: US11379712B2
Publication date: 2022-07-05
Application No.: US16155036
Filing date: 2018-10-09
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
IPC: G06N3/063
Abstract: Disclosed is a method, system, and computer readable medium to manage (and possibly replace) cycles in graphs for a computer device. The method includes detecting a compound operation including a first tensor, the compound operation resulting from source code represented in a first graph structure as part of a compilation process from source code to binary executable code. To address a detected cycle, an instance of a proxy class may be created to store a pointer to a proxy instance of the first tensor based on the detection. In some examples, using the instance of the proxy class facilitates implementation of a level of indirection to replace a cyclical portion of the graph structure with an acyclical portion such that the second graph structure indicates assignment of a result of the compound operation to the proxy instance of the first tensor. Optimization may reduce a total number of indirection replacements.
-
Publication No.: US20210334335A1
Publication date: 2021-10-28
Application No.: US16860357
Filing date: 2020-04-28
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Mashood Abdulla Kodavanji, Soumitra Chatterjee, Chinmay Ghosh, Mohan Parthasarathy
Abstract: Repeating patterns are identified in a matrix. Based on the identification of the repeating patterns, instructions are generated, which are executable by processing cores of a dot product engine to allocate analog multiplication crossbars of the dot product engine to perform multiplication of the matrix with a vector.
-
Publication No.: US11874688B2
Publication date: 2024-01-16
Application No.: US17519179
Filing date: 2021-11-04
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Soumitra Chatterjee, Balasubramanian Viswanathan
CPC classification number: G06F11/3624, G06N3/04, G06N3/08
Abstract: Example techniques for identification of diagnostic messages corresponding to exceptions are described. A determination model may determine whether a set of diagnostic messages generated based on analysis of a source code includes a diagnostic message that likely corresponds to an exception. The determination may be used to identify a set of diagnostic messages including the diagnostic message that likely corresponds to an exception.
-
Publication No.: US11645358B2
Publication date: 2023-05-09
Application No.: US16260331
Filing date: 2019-01-29
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Abstract: In an example, a neural network program corresponding to a neural network model is received. The neural network program includes matrices, vectors, and matrix-vector multiplication (MVM) operations. A computation graph corresponding to the neural network model is generated. The computation graph includes a plurality of nodes, each node representing a MVM operation, a matrix, or a vector. Further, a class model corresponding to the neural network model is populated with a data structure pointing to the computation graph. The computation graph is traversed based on the class model. Based on the traversal, the plurality of MVM operations are assigned to MVM units of a neural network accelerator. Each MVM unit can perform a MVM operation. Based on assignment of the plurality of MVM operations, an executable file is generated for execution by the neural network accelerator.
-
Publication No.: US11361050B2
Publication date: 2022-06-14
Application No.: US16196423
Filing date: 2018-11-20
Applicant: Hewlett Packard Enterprise Development LP
Abstract: Example implementations relate to assigning dependent matrix-vector multiplication (MVM) operations to consecutive crossbars of a dot product engine (DPE). A method can comprise grouping a first MVM operation of a computation graph with a second MVM operation of the computation graph where the first MVM operation is dependent on a result of the second MVM operation, assigning a first crossbar of a DPE to an operand of the first MVM operation, and assigning a second crossbar of the DPE to an operand of the second MVM operation, wherein the first and second crossbars are consecutive.
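The grouping and consecutive assignment can be sketched for the simple pairwise case. The function name `assign_consecutive_crossbars`, the `depends_on` dictionary, and the use of integer crossbar indices are assumptions for illustration. The sketch pairs each dependent operation with its producer at most once and does not handle longer dependency chains.

```python
def assign_consecutive_crossbars(operations, depends_on):
    """Assign crossbar indices so that each (consumer, producer) pair
    lands on adjacent crossbars. `depends_on` maps a consumer operation
    to the operation whose result it needs."""
    assignment = {}
    next_crossbar = 0
    # First pass: group each dependent pair and place it on two
    # consecutive crossbars.
    for consumer in operations:
        producer = depends_on.get(consumer)
        if producer is None or consumer in assignment or producer in assignment:
            continue
        assignment[consumer] = next_crossbar
        assignment[producer] = next_crossbar + 1
        next_crossbar += 2
    # Second pass: place the remaining, ungrouped operations.
    for op in operations:
        if op not in assignment:
            assignment[op] = next_crossbar
            next_crossbar += 1
    return assignment


ops = ["a", "b", "c"]
crossbars = assign_consecutive_crossbars(ops, {"c": "a"})
```

Handling the pairs before the independent operations keeps an unrelated operation such as `b` from being placed between `c` and the `a` it depends on.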