Patent search ap:("SambaNova Systems Page Inc.") AND inv:"Greg DYKEMA"

1.

Invention publication
Auto-Discovery Module for the Discovery of Reconfigurable Processors in a Pool of Heterogeneous Reconfigurable Processors

Status: Pending (published)

Publication No.: US20240273057A1

Publication date: 2024-08-15

Application No.: US18635114

Filing date: 2024-04-15

Applicant: SambaNova Systems, Inc.

Inventor: Greg DYKEMA, Maran WILSON, Guoyao FENG, Kuan ZHOU, Tianyu SUN, Taylor LEE, Kin Hing LEUNG, Arnav GOEL, Conrad Alexander TURLIK, Milad SHARIF

IPC: G06F15/80, G06F8/41, G06F15/78

CPC classification number: G06F15/8038, G06F8/443, G06F8/447, G06F8/45, G06F15/7867, G06F15/80

Abstract: A host system for executing an application on first and/or second reconfigurable processors is presented. The host system is operatively coupled to the first and second reconfigurable processors, whereby the first reconfigurable processors have a first architecture, and the second reconfigurable processors have a second architecture that is different than the first architecture. The host system allocates reconfigurable processors of the first and/or second reconfigurable processors for executing the application and includes an auto-discovery module that is configured to determine whether the allocated reconfigurable processors include at least one of the first reconfigurable processors.
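The check described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the names `Architecture`, `ReconfigurableProcessor`, and `includes_first_architecture` are hypothetical, and each processor is assumed to report its own architecture.

```python
from dataclasses import dataclass
from enum import Enum


class Architecture(Enum):
    # Hypothetical labels for the two architectures named in the abstract.
    FIRST = "first"
    SECOND = "second"


@dataclass(frozen=True)
class ReconfigurableProcessor:
    # Hypothetical record; assumes each processor reports its architecture.
    proc_id: int
    architecture: Architecture


def includes_first_architecture(allocated):
    """Return True if any allocated processor has the first architecture."""
    return any(p.architecture is Architecture.FIRST for p in allocated)


pool = [
    ReconfigurableProcessor(0, Architecture.FIRST),
    ReconfigurableProcessor(1, Architecture.SECOND),
    ReconfigurableProcessor(2, Architecture.SECOND),
]
print(includes_first_architecture(pool))      # whole pool allocated
print(includes_first_architecture(pool[1:]))  # only second-architecture processors
```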

2.

Invention publication
DATA TRANSFER IN DATAFLOW COMPUTING SYSTEMS USING AN INTELLIGENT DYNAMIC TRANSFER ENGINE

Status: Pending (published)

Publication No.: US20240231903A1

Publication date: 2024-07-11

Application No.: US18614639

Filing date: 2024-03-23

Applicant: SambaNova Systems, Inc.

Inventor: Qi ZHENG, Arnav GOEL, Conrad Alexander TURLIK, Guoyao FENG, Joshua Earle POLZIN, Fansheng CHENG, Ravinder KUMAR, Greg DYKEMA, Subhra MAZUMDAR, Milad SHARIF, Jiayu BAI, Neal SANGHVI, Arjun SABNIS, Letao CHEN

IPC: G06F9/48, G06F9/38

CPC classification number: G06F9/4881, G06F9/3877

Abstract: In a computer-implemented method, a Dynamic Transfer Engine (DTE) included in a computing system receives a dynamic stimulus associated with transfer of stage data during execution of a dataflow application by the system. The DTE determines, based on source and destination devices of the transfer, a transfer method and a transfer channel to transfer the stage data between memories coupled to the source and destination devices. The DTE acquires hardware resources of the computing system to transfer the stage data using the channel and initiates the transfer. A computer program product can cause one or more processors to perform the method. A computing system can comprise source and destination processors and memories, hardware channels to transfer data between the memories, a resource manager, and a DTE configured to perform the method.
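The sequence in this abstract (select a method and channel from the endpoints, acquire resources, initiate the transfer) might be sketched as follows. Everything here is an assumption made for illustration: the `Device` and `ResourceManager` classes, the same-node rule, and the `"dma"`/`"rdma"` and channel names do not come from the patent, and a dictionary assignment stands in for the hardware transfer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    # Hypothetical device record: a name and the node that hosts it.
    name: str
    node: int


def select_transfer(src, dst):
    """Choose (method, channel) from the endpoints; the rule is an assumption."""
    if src.node == dst.node:
        return ("dma", "local_bus")
    return ("rdma", "network")


class ResourceManager:
    """Hypothetical manager that tracks free slots per channel."""

    def __init__(self, channels):
        self._free = dict(channels)

    def acquire(self, channel):
        if self._free.get(channel, 0) <= 0:
            return False
        self._free[channel] -= 1
        return True

    def release(self, channel):
        self._free[channel] = self._free.get(channel, 0) + 1


def handle_stimulus(src, dst, stage_data, memories, rm):
    """React to a transfer stimulus: select, acquire, transfer, release."""
    method, channel = select_transfer(src, dst)
    if not rm.acquire(channel):
        raise RuntimeError(f"no free resources on channel {channel!r}")
    try:
        # Stand-in for the real transfer between device memories.
        memories[dst.name] = bytes(stage_data)
    finally:
        rm.release(channel)
    return method, channel


rm = ResourceManager({"local_bus": 1, "network": 1})
memories = {}
same_node = handle_stimulus(Device("p0", 0), Device("p1", 0), b"abc", memories, rm)
cross_node = handle_stimulus(Device("p0", 0), Device("p2", 1), b"xyz", memories, rm)
```

Releasing the channel in a `finally` block keeps the slot count consistent even if the stand-in transfer raises.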

3.

Invention application
SYSTEM AND METHOD FOR OPTIMIZING DATA-TRANSFER AMONG MULTIPLE COMPUTE UNITS IN A DATA-PARALLEL COMPUTING SYSTEM

Status: Granted

Publication No.: US20240394218A1

Publication date: 2024-11-28

Application No.: US18794143

Filing date: 2024-08-05

Applicant: SambaNova Systems, Inc.

Inventor: Greg DYKEMA, Aarti LALWANI

IPC: G06F15/82, G06F13/40

Abstract: System and method for optimizing data-transfer among multiple compute units in a data-parallel computing system. A topological communications configurator (TCC) determines a connections-optimized configuration of processors associated with compute nodes of the computing system. The processors can execute dataflow workers of an application and form intranodal segments of an internodal interconnection topology coupling the intranodal segments. The TCC determines the connections-optimized configuration based on internodal communications costs corresponding to communications routes among the internodal segments via the internodal interconnection fabric.
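One way to picture a cost-based choice of configuration is a brute-force search over candidate orderings. The sketch below is only an illustration under strong assumptions that the abstract does not state: the nodes are arranged in a ring, the pairwise cost table is given, and `ring_cost` and `best_ring` are hypothetical names.

```python
from itertools import permutations


def ring_cost(order, cost):
    """Total cost of the links in a ring that visits nodes in the given order."""
    n = len(order)
    return sum(cost[order[i]][order[(i + 1) % n]] for i in range(n))


def best_ring(nodes, cost):
    """Exhaustively pick the cheapest ring; feasible only for a few nodes."""
    first, rest = nodes[0], nodes[1:]
    # Fixing the first node avoids re-evaluating rotations of the same ring.
    candidates = ((first,) + p for p in permutations(rest))
    return min(candidates, key=lambda order: ring_cost(order, cost))


# Hypothetical symmetric communication costs between four compute nodes.
cost = {
    0: {1: 1, 2: 5, 3: 1},
    1: {0: 1, 2: 1, 3: 5},
    2: {0: 5, 1: 1, 3: 1},
    3: {0: 1, 1: 5, 2: 1},
}
order = best_ring((0, 1, 2, 3), cost)
print(order, ring_cost(order, cost))
```

Exhaustive search grows factorially with the number of nodes, so a practical configurator would presumably need a heuristic; the sketch only shows what "lowest communication cost" means for a small case.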

4.

Invention publication
Overlapping Gradient Synchronization In Machine Learning

Status: Pending (published)

Publication No.: US20230259823A1

Publication date: 2023-08-17

Application No.: US18109080

Filing date: 2023-02-13

Applicant: SambaNova Systems, Inc.

Inventor: Greg DYKEMA, Fansheng CHENG, Kuan ZHOU, Arnav GOEL, Subhra MAZUMDAR, Milad SHARIF, Po-Yu WU, Bowen YANG, Qi ZHENG

IPC: G06N20/00

CPC classification number: G06N20/00

Abstract: In a method, an orchestrator of a computing system determines that results of Machine Learning model computations are available and dispatches a worker to perform model computations that include computing gradients of the results. The orchestrator determines that a set of gradients of the results is available and dispatches a gradient worker to compute a sum of the gradients. The orchestrator determines that a second set of gradients of the results is available and dispatches a second gradient worker to compute a sum of the second set of gradients. The orchestrator determines that the sums of the first and second gradients are available and dispatches a third gradient worker to compute synchronized gradients. The gradient workers compute the sums and synchronized gradients concurrently with training workers computing additional model computation results and/or gradients. A computer program product can include the method and a computing system can include the orchestrator.
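The dispatch pattern in this abstract can be sketched with Python's standard `concurrent.futures` thread pool. This is an illustration rather than the patented method: the function names are hypothetical, gradients are plain lists of floats, and "synchronized gradients" is assumed here to mean the average over all contributing gradients.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_gradients(grads):
    """Gradient worker: element-wise sum of a set of gradient vectors."""
    return [sum(column) for column in zip(*grads)]


def synchronize(partial_sums, count):
    """Third gradient worker: combine partial sums into averaged gradients."""
    total = sum_gradients(partial_sums)
    return [g / count for g in total]


def orchestrate(first_set, second_set, more_training):
    """Dispatch gradient workers while a training worker keeps computing."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        training = pool.submit(more_training)        # overlaps with the sums
        sum1 = pool.submit(sum_gradients, first_set)
        sum2 = pool.submit(sum_gradients, second_set)
        # The third worker is dispatched once both partial sums are available.
        partials = [sum1.result(), sum2.result()]
        synced = pool.submit(synchronize, partials, len(first_set) + len(second_set))
        return synced.result(), training.result()


synced, extra = orchestrate(
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
    lambda: "next batch computed",
)
print(synced, extra)
```

The training task is submitted first, so it runs alongside both summation tasks instead of waiting for synchronization to finish.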
