-
Publication number: US12020063B2
Publication date: 2024-06-25
Application number: US17540123
Application date: 2021-12-01
Applicant: Google LLC
Inventor: Jiafan Zhu, Jianqiao Liu, Xiangyu Dong, Xiao Zhang, Jikai Tang, Kexin Yang, Yong Zhao, Alireza Ghaffarkhah, Arash Rezaei, Dayou Du, Yazhou Zu, Xiangling Kong, Hoang-Vu Dang, Alexander Vadimovich Kolbasov
CPC classification number: G06F9/4843, G06F9/5027, G06F11/3024, G06F11/3433
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing preflight checks of a distributed computing system, are described. In one aspect, a method includes assigning a computing workload to a first subset of hardware accelerator machines each having one or more hardware accelerators. A preflight check on the first subset is performed before performing the computing workload to verify the functionality of each machine in the first subset. For each hardware accelerator machine of the first subset, a program code package is installed, including a task action based at least in part on characteristics of the computing workload. The task action including a sequence of operations is performed on the hardware accelerator machine to determine whether the task action fails. Whenever the task action fails, the computing workload is re-assigned to a second subset of hardware accelerator machines different from the first subset.
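The preflight flow described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `run_preflight` and `assign_with_preflight`, the modelling of a task action as a list of callables, and the machine names are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence

# Assumption: a task action is a sequence of operations, and each operation
# takes a machine name and returns True on success.
Operation = Callable[[str], bool]


def run_preflight(machine: str, task_action: Sequence[Operation]) -> bool:
    """Run the operations in order; the task action fails at the first failure."""
    for op in task_action:
        try:
            if not op(machine):
                return False
        except Exception:
            # An operation that raises is treated as a failed check.
            return False
    return True


def assign_with_preflight(
    subsets: Sequence[Sequence[str]],
    task_action: Sequence[Operation],
) -> Optional[List[str]]:
    """Return the first subset whose machines all pass the preflight check.

    If any machine in a subset fails, the workload is re-assigned to the
    next candidate subset. Returns None when no subset passes.
    """
    for subset in subsets:
        if all(run_preflight(machine, task_action) for machine in subset):
            return list(subset)
    return None


# Hypothetical usage: machine "m2" fails the second operation, so the
# workload moves from the first subset to the second.
broken = {"m2"}
task_action = [lambda m: True, lambda m: m not in broken]
chosen = assign_with_preflight([["m1", "m2"], ["m3", "m4"]], task_action)
print(chosen)  # ['m3', 'm4']
```

In this sketch the candidate subsets are tried in a fixed order; how a real scheduler would pick the second subset is not specified in the abstract.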
-
Publication number: US20240385873A1
Publication date: 2024-11-21
Application number: US18667501
Application date: 2024-05-17
Applicant: Google LLC
Inventor: Jiafan Zhu, Jianqiao Liu, Xiangyu Dong, Xiao Zhang, Jikai Tang, Kexin Yang, Yong Zhao, Alireza Ghaffarkhah, Arash Rezaei, Dayou Du, Yazhou Zu, Xiangling Kong, Hoang-Vu Dang, Alexander Vadimovich Kolbasov
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing preflight checks of a distributed computing system, are described. In one aspect, a method includes assigning a computing workload to a first subset of hardware accelerator machines each having one or more hardware accelerators. A preflight check on the first subset is performed before performing the computing workload to verify the functionality of each machine in the first subset. For each hardware accelerator machine of the first subset, a program code package is installed, including a task action based at least in part on characteristics of the computing workload. The task action including a sequence of operations is performed on the hardware accelerator machine to determine whether the task action fails. Whenever the task action fails, the computing workload is re-assigned to a second subset of hardware accelerator machines different from the first subset.
-
Publication number: US11704110B2
Publication date: 2023-07-18
Application number: US17889508
Application date: 2022-08-17
Applicant: Google LLC
Inventor: Jianqiao Liu, Xiangyu Dong, Pedram Z. Dashti, Kais Belgaied
IPC: G06F8/65
CPC classification number: G06F8/65
Abstract: A uniform and unified firmware in-field upgrade capability for the optics modules may ensure compatibility, security and code quality, and scalability. In some examples, an intermediate representation, which includes vendor firmware upgrade operations and control logic, may be defined, received, and parsed. Read/write operations may be communicated to optical module(s) based on the control logic. In some examples, a unified optics module firmware in-field upgrade framework, which has multiple defined software layers, may ensure a uniform and unified approach to managing optics module(s) from different vendors and used by different projects. The software layers that may properly translate optics module read/write operations, abstract and make uniform the read/write operations, provide libraries of intermediate representations, package the intermediate representations into executables/scripts, monitor optics module status, determine when a new firmware is released, and gradually upgrade the optics module firmware.
-
Publication number: US20230325175A1
Publication date: 2023-10-12
Application number: US18201365
Application date: 2023-05-24
Applicant: Google LLC
Inventor: Jianqiao Liu, Xiangyu Dong, Pedram Z. Dashti, Kais Belgaied
IPC: G06F8/65
CPC classification number: G06F8/65
Abstract: A uniform and unified firmware in-field upgrade capability for the optics modules may ensure compatibility, security and code quality, and scalability. In some examples, an intermediate representation, which includes vendor firmware upgrade operations and control logic, may be defined, received, and parsed. Read/write operations may be communicated to optical module(s) based on the control logic. In some examples, a unified optics module firmware in-field upgrade framework, which has multiple defined software layers, may ensure a uniform and unified approach to managing optics module(s) from different vendors and used by different projects. The software layers that may properly translate optics module read/write operations, abstract and make uniform the read/write operations, provide libraries of intermediate representations, package the intermediate representations into executables/scripts, monitor optics module status, determine when a new firmware is released, and gradually upgrade the optics module firmware.
-
Publication number: US20220121928A1
Publication date: 2022-04-21
Application number: US17110867
Application date: 2020-12-03
Applicant: Google LLC
Inventor: Xiangyu Dong, Jianqiao Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an enhanced reconfigurable interconnect network. The reconfigurable interconnect network can be used to switch between multiple different connection topologies for different sizes of subsets of processing nodes in a cluster. For example, for a given number of processing nodes to be used, different connection topologies can provide different levels of scalability, data transfer bandwidth among processing nodes, and latency for transfers among processing nodes. In some implementations, the connection topologies can assign connections for each of the data ports of the processing nodes used, to maximize utilization of the data ports and provide better performance.
-
Publication number: US20230168919A1
Publication date: 2023-06-01
Application number: US17540123
Application date: 2021-12-01
Applicant: Google LLC
Inventor: Jiafan Zhu, Jianqiao Liu, Xiangyu Dong, Xiao Zhang, Jikai Tang, Kexin Yang, Yong Zhao, Alireza Ghaffarkhah, Arash Rezaei, Dayou Du, Yazhou Zu, Xiangling Kong, Hoang-Vu Dang, Alexander Vadimovich Kolbasov
CPC classification number: G06F9/4843, G06F9/5027, G06F11/3433, G06F11/3024
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing preflight checks of a distributed computing system, are described. In one aspect, a method includes assigning a computing workload to a first subset of hardware accelerator machines each having one or more hardware accelerators. A preflight check on the first subset is performed before performing the computing workload to verify the functionality of each machine in the first subset. For each hardware accelerator machine of the first subset, a program code package is installed, including a task action based at least in part on characteristics of the computing workload. The task action including a sequence of operations is performed on the hardware accelerator machine to determine whether the task action fails. Whenever the task action fails, the computing workload is re-assigned to a second subset of hardware accelerator machines different from the first subset.
-
Publication number: US11467822B2
Publication date: 2022-10-11
Application number: US17201256
Application date: 2021-03-15
Applicant: Google LLC
Inventor: Jianqiao Liu, Xiangyu Dong, Pedram Z. Dashti, Kais Belgaied
IPC: G06F8/65
Abstract: A uniform and unified firmware in-field upgrade capability for the optics modules may ensure compatibility, security and code quality, and scalability. In some examples, an intermediate representation, which includes vendor firmware upgrade operations and control logic, may be defined, received, and parsed. Read/write operations may be communicated to optical module(s) based on the control logic. In some examples, a unified optics module firmware in-field upgrade framework, which has multiple defined software layers, may ensure a uniform and unified approach to managing optics module(s) from different vendors and used by different projects. The software layers that may properly translate optics module read/write operations, abstract and make uniform the read/write operations, provide libraries of intermediate representations, package the intermediate representations into executables/scripts, monitor optics module status, determine when a new firmware is released, and gradually upgrade the optics module firmware.
-
Publication number: US12014167B2
Publication date: 2024-06-18
Application number: US18201365
Application date: 2023-05-24
Applicant: Google LLC
Inventor: Jianqiao Liu, Xiangyu Dong, Pedram Z. Dashti, Kais Belgaied
IPC: G06F8/65
CPC classification number: G06F8/65
Abstract: A uniform and unified firmware in-field upgrade capability for the optics modules may ensure compatibility, security and code quality, and scalability. In some examples, an intermediate representation, which includes vendor firmware upgrade operations and control logic, may be defined, received, and parsed. Read/write operations may be communicated to optical module(s) based on the control logic. In some examples, a unified optics module firmware in-field upgrade framework, which has multiple defined software layers, may ensure a uniform and unified approach to managing optics module(s) from different vendors and used by different projects. The software layers that may properly translate optics module read/write operations, abstract and make uniform the read/write operations, provide libraries of intermediate representations, package the intermediate representations into executables/scripts, monitor optics module status, determine when a new firmware is released, and gradually upgrade the optics module firmware.
-
Publication number: US20220398090A1
Publication date: 2022-12-15
Application number: US17889508
Application date: 2022-08-17
Applicant: Google LLC
Inventor: Jianqiao Liu, Xiangyu Dong, Pedram Z. Dashti, Kais Belgaied
IPC: G06F8/65
Abstract: A uniform and unified firmware in-field upgrade capability for the optics modules may ensure compatibility, security and code quality, and scalability. In some examples, an intermediate representation, which includes vendor firmware upgrade operations and control logic, may be defined, received, and parsed. Read/write operations may be communicated to optical module(s) based on the control logic. In some examples, a unified optics module firmware in-field upgrade framework, which has multiple defined software layers, may ensure a uniform and unified approach to managing optics module(s) from different vendors and used by different projects. The software layers that may properly translate optics module read/write operations, abstract and make uniform the read/write operations, provide libraries of intermediate representations, package the intermediate representations into executables/scripts, monitor optics module status, determine when a new firmware is released, and gradually upgrade the optics module firmware.
-
Publication number: US20220291915A1
Publication date: 2022-09-15
Application number: US17201256
Application date: 2021-03-15
Applicant: Google LLC
Inventor: Jianqiao Liu, Xiangyu Dong, Pedram Z. Dashti, Kais Belgaied
IPC: G06F8/65
Abstract: A uniform and unified firmware in-field upgrade capability for the optics modules may ensure compatibility, security and code quality, and scalability. In some examples, an intermediate representation, which includes vendor firmware upgrade operations and control logic, may be defined, received, and parsed. Read/write operations may be communicated to optical module(s) based on the control logic. In some examples, a unified optics module firmware in-field upgrade framework, which has multiple defined software layers, may ensure a uniform and unified approach to managing optics module(s) from different vendors and used by different projects. The software layers that may properly translate optics module read/write operations, abstract and make uniform the read/write operations, provide libraries of intermediate representations, package the intermediate representations into executables/scripts, monitor optics module status, determine when a new firmware is released, and gradually upgrade the optics module firmware.