Publication No.: US20230359582A1
Publication Date: 2023-11-09
Application No.: US18222946
Filing Date: 2023-07-17
Applicant: Intel Corporation
Inventor: Vivek KASHYAP, Amedeo SAPIO
IPC: G06F15/173, H04L47/193
CPC classification number: G06F15/17343, H04L47/193
Abstract: Examples described herein relate to a switch comprising circuitry configured to, for packet communications associated with a collective operation to train machine learning (ML) models: utilize a reliable transport protocol for communications from at least one worker node of the collective operation to the switch, wherein utilizing the reliable transport protocol comprises storing packet receipt state for per-packet communications from the at least one worker node to the switch; and utilize a non-reliable transport protocol for communications from the switch to a device that is to perform aggregation of results, wherein the reliable transport protocol is a different protocol than the non-reliable transport protocol.
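The split described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class `EdgeSwitch`, its method `on_worker_packet`, the use of sequence numbers as the receipt state, and the choice to acknowledge but not re-forward duplicates are all assumptions made for illustration. The abstract only states that the switch stores per-packet receipt state on the worker-to-switch leg and uses a different, non-reliable protocol toward the aggregating device.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeSwitch:
    """Hypothetical model of the switch; all names here are illustrative."""

    # Per-packet receipt state for the reliable leg:
    # worker id -> set of sequence numbers already received.
    seen: dict = field(default_factory=dict)
    # Packets handed to the non-reliable leg toward the aggregator.
    # No acknowledgment is expected for these (fire and forget).
    forwarded: list = field(default_factory=list)

    def on_worker_packet(self, worker_id, seq, payload):
        """Handle one packet from a worker and return an acknowledgment."""
        received = self.seen.setdefault(worker_id, set())
        if seq not in received:
            # First arrival: record receipt, then forward unreliably.
            received.add(seq)
            self.forwarded.append((worker_id, seq, payload))
        # Assumption: duplicates (worker retransmissions) are acknowledged
        # again but not forwarded a second time.
        return ("ACK", worker_id, seq)


switch = EdgeSwitch()
ack_first = switch.on_worker_packet("worker-0", 1, [0.5, 0.25])
ack_retry = switch.on_worker_packet("worker-0", 1, [0.5, 0.25])  # retransmission
ack_other = switch.on_worker_packet("worker-1", 1, [1.0, 2.0])
```

In this sketch the retransmitted packet from `worker-0` is acknowledged twice but appears only once in `switch.forwarded`, so the receipt state kept at the switch is what keeps the aggregator from seeing the same gradient fragment twice.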