-
1.
Publication No.: US20240273057A1
Publication date: 2024-08-15
Application No.: US18635114
Filing date: 2024-04-15
Applicant: SambaNova Systems, Inc.
Inventor: Greg DYKEMA, Maran WILSON, Guoyao FENG, Kuan ZHOU, Tianyu SUN, Taylor LEE, Kin Hing LEUNG, Arnav GOEL, Conrad Alexander TURLIK, Milad SHARIF
CPC classification number: G06F15/8038, G06F8/443, G06F8/447, G06F8/45, G06F15/7867, G06F15/80
Abstract: A host system for executing an application on first and/or second reconfigurable processors is presented. The host system is operatively coupled to the first and second reconfigurable processors, whereby the first reconfigurable processors have a first architecture, and the second reconfigurable processors have a second architecture that is different than the first architecture. The host system allocates reconfigurable processors of the first and/or second reconfigurable processors for executing the application and includes an auto-discovery module that is configured to determine whether the allocated reconfigurable processors include at least one of the first reconfigurable processors.
-
2.
Publication No.: US20240231903A1
Publication date: 2024-07-11
Application No.: US18614639
Filing date: 2024-03-23
Applicant: SambaNova Systems, Inc.
Inventor: Qi ZHENG, Arnav GOEL, Conrad Alexander TURLIK, Guoyao FENG, Joshua Earle POLZIN, Fansheng CHENG, Ravinder KUMAR, Greg DYKEMA, Subhra MAZUMDAR, Milad SHARIF, Jiayu BAI, Neal SANGHVI, Arjun SABNIS, Letao CHEN
CPC classification number: G06F9/4881, G06F9/3877
Abstract: In a computer-implemented method, a Dynamic Transfer Engine (DTE) included in a computing system receives a dynamic stimulus associated with transfer of stage data during execution of a dataflow application by the system. The DTE determines, based on source and destination devices of the transfer, a transfer method and a transfer channel to transfer the stage data between memories coupled to the source and destination devices. The DTE acquires hardware resources of the computing system to transfer the stage data using the channel and initiates the transfer. A computer program product can cause one or more processors to perform the method. A computing system can comprise source and destination processors and memories, hardware channels to transfer data between the memories, a resource manager, and a DTE configured to perform the method.
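The flow described in this abstract (pick a transfer method and channel from the source and destination devices, acquire resources, then start the transfer) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the device kinds `host` and `accel`, the `TRANSFER_TABLE` contents, and the `ResourceManager` slot model are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass

# Hypothetical lookup: (source kind, destination kind) -> (method, channel).
# The kinds, methods, and channel names are illustrative assumptions.
TRANSFER_TABLE = {
    ("host", "accel"): ("dma", "pcie"),
    ("accel", "host"): ("dma", "pcie"),
    ("accel", "accel"): ("peer_to_peer", "fabric"),
    ("host", "host"): ("memcpy", "local"),
}


@dataclass
class Transfer:
    method: str
    channel: str
    started: bool = False


class ResourceManager:
    """Toy resource manager that tracks free slots per channel."""

    def __init__(self, slots):
        self.free = dict(slots)

    def acquire(self, channel):
        # Refuse the request when the channel has no free slot left.
        if self.free.get(channel, 0) <= 0:
            raise RuntimeError(f"no free resources on channel {channel!r}")
        self.free[channel] -= 1

    def release(self, channel):
        self.free[channel] += 1


def handle_stimulus(src_kind, dst_kind, resources):
    """Choose method and channel, acquire resources, and mark the transfer started."""
    method, channel = TRANSFER_TABLE[(src_kind, dst_kind)]
    resources.acquire(channel)
    return Transfer(method, channel, started=True)
```

In this sketch the selection is a static table lookup; the abstract only says the choice is based on the source and destination devices, so a real engine may weigh other factors.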
-
3.
Publication No.: US20240394218A1
Publication date: 2024-11-28
Application No.: US18794143
Filing date: 2024-08-05
Applicant: SambaNova Systems, Inc.
Inventor: Greg DYKEMA, Aarti LALWANI
Abstract: System and method for optimizing data-transfer among multiple compute units in a data-parallel computing system. A topological communications configurator (TCC) determines a connections-optimized configuration of processors associated with compute nodes of the computing system. The processors can execute dataflow workers of an application and form intranodal segments of an internodal interconnection topology coupling the intranodal segments. The TCC determines the connections-optimized configuration based on internodal communications costs corresponding to communications routes among the internodal segments via the internodal interconnection fabric.
-
4.
Publication No.: US20230259823A1
Publication date: 2023-08-17
Application No.: US18109080
Filing date: 2023-02-13
Applicant: SambaNova Systems, Inc.
Inventor: Greg DYKEMA, Fansheng CHENG, Kuan ZHOU, Arnav GOEL, Subhra MAZUMDAR, Milad SHARIF, Po-Yu WU, Bowen YANG, Qi ZHENG
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: In a method, an orchestrator of a computing system determines that results of machine learning model computations are available and dispatches a worker to perform model computations that include computing gradients of the results. The orchestrator determines that a set of gradients of the results is available and dispatches a gradient worker to compute a sum of the gradients. The orchestrator determines that a second set of gradients of the results is available and dispatches a second gradient worker to compute a sum of the second set of gradients. The orchestrator determines that the sums of the first and second sets of gradients are available and dispatches a third gradient worker to compute synchronized gradients. The gradient workers compute the sums and synchronized gradients concurrently with training workers computing additional model computation results and/or gradients. A computer program product can include the method and a computing system can include the orchestrator.
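The three-stage gradient flow in this abstract (sum a first set, sum a second set, then combine the two sums) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the function names are hypothetical, "synchronized gradients" is assumed to mean the average over all contributions, and the stages run sequentially here rather than concurrently with training workers.

```python
def sum_gradients(grads):
    """Element-wise sum of a list of equal-length gradient vectors."""
    return [sum(vals) for vals in zip(*grads)]


def synchronize(sum_a, sum_b, count):
    """Combine two partial sums; averaging over `count` is an assumption."""
    return [(a + b) / count for a, b in zip(sum_a, sum_b)]


def orchestrate(first_set, second_set):
    """Run the three gradient-worker stages one after another."""
    partial_a = sum_gradients(first_set)    # first gradient worker
    partial_b = sum_gradients(second_set)   # second gradient worker
    total = len(first_set) + len(second_set)
    return synchronize(partial_a, partial_b, total)  # third gradient worker


synced = orchestrate([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
print(synced)  # [4.0, 5.0]
```

Splitting the reduction into two partial sums lets each sum start as soon as its own set of gradients is available, which is the point the abstract makes about overlapping gradient work with further training computation.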
-