-
Publication No.: US20230092343A1
Publication date: 2023-03-23
Application No.: US18071459
Filing date: 2022-11-29
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Da Qi Ren, Liang Peng
IPC: G06F11/14
Abstract: A fault tolerant processing environment wherein multiple processors are configured as worker nodes and redundant nodes, with a failed worker node replaced programmatically by a manager node. Each of the processing nodes may include a processor and memory associated with the processor and communicate with other processing nodes using a network. A manager node creates a message passing interface (MPI) communication group having worker nodes and redundant nodes, instructs the worker nodes to perform lockstep processing of tasks for an application, and monitors execution of the tasks. If a node fails, the manager node creates a replacement worker node from one of the redundant processing nodes and creates a new communications group. It then instructs those nodes in the new communications group to resume processing based on the application state and checkpoint backup data.
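Illustrative sketch (not taken from the patent): the recovery step described in this abstract, where a redundant node takes over a failed worker's slot and a new communication group resumes from checkpoint data, can be modeled in plain Python without an MPI library. The names CommGroup and replace_failed_worker, the node labels, and the checkpoint layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CommGroup:
    """Hypothetical stand-in for an MPI communication group."""
    workers: list
    spares: list


def replace_failed_worker(group, failed, checkpoint):
    """Return a new group in which a spare takes the failed worker's slot,
    plus the state from which the workers should resume."""
    if failed not in group.workers:
        raise ValueError(f"{failed!r} is not a worker in this group")
    if not group.spares:
        raise RuntimeError("no redundant node available")
    replacement = group.spares[0]
    # Keep the slot order so every surviving worker keeps its position.
    new_workers = [replacement if w == failed else w for w in group.workers]
    new_group = CommGroup(workers=new_workers, spares=group.spares[1:])
    # Assumption: all workers in the new group restart from the last checkpoint.
    resume_state = dict(checkpoint)
    return new_group, resume_state


group = CommGroup(workers=["n0", "n1", "n2"], spares=["r0", "r1"])
new_group, state = replace_failed_worker(group, "n1", {"step": 40})
```

The original group is left unmodified here, which mirrors the abstract's description of creating a new communications group rather than patching the old one.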
-
Publication No.: US20220091850A1
Publication date: 2022-03-24
Application No.: US17543096
Filing date: 2021-12-06
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Da Qi Ren, Qian Wang, XingYu Jiang
Abstract: The disclosure relates to branch prediction techniques that can improve the performance of pipelined microprocessors. A microprocessor for branch predictor selection includes a fetch stage configured to retrieve instructions from a memory, a buffer configured to store instructions retrieved by the fetch stage, and one or more pipelined stages configured to execute the instructions stored in the buffer. A branch predictor, communicatively coupled to the buffer and the one or more pipelined stages, is configured to select a branch target predictor from a set of branch target predictors. Each of the branch target predictors comprises a trained model associated with a previously executed instruction to identify a target branch path for the instruction currently being executed based on the selected branch target predictor.
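Illustrative sketch (not taken from the patent): the abstract does not say how a branch target predictor is chosen from the set, so the example below assumes one simple policy, picking the predictor with the most correct predictions so far. The class name PredictorSelector, the two lambda predictors, and the addresses are hypothetical; the lambdas only stand in for trained models.

```python
class PredictorSelector:
    """Chooses among several branch target predictors by running hit count."""

    def __init__(self, predictors):
        # predictors: mapping of name -> callable(pc) returning a target address
        self.predictors = dict(predictors)
        self.hits = {name: 0 for name in self.predictors}

    def select(self):
        # On ties, max() keeps the first name in insertion order.
        return max(self.hits, key=self.hits.get)

    def predict(self, pc):
        name = self.select()
        return name, self.predictors[name](pc)

    def update(self, pc, actual_target):
        # Credit every predictor that would have been right for this branch.
        for name, model in self.predictors.items():
            if model(pc) == actual_target:
                self.hits[name] += 1


selector = PredictorSelector({
    "fallthrough": lambda pc: pc + 4,   # assumes 4-byte instructions
    "loop_head": lambda pc: 0x100,      # always predicts a fixed loop target
})
selector.update(0x120, 0x100)           # the branch actually jumped to 0x100
chosen, target = selector.predict(0x120)
```

A hardware selector would use bounded saturating counters rather than unbounded hit counts; the dictionary here only keeps the selection idea visible.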
-
Publication No.: US12253921B2
Publication date: 2025-03-18
Application No.: US18300642
Filing date: 2023-04-14
Applicant: Huawei Technologies Co., Ltd.
Inventor: Da Qi Ren, Liang Peng
Abstract: A lockstep controller operates a lockstep system of three or more CPU-GPU pairs, comparing the outputs from the CPU-GPU pairs and, by way of a majority vote, provides the output for the lockstep system. Based on comparing the outputs, if one of the CPU-GPU pairs provides outputs that disagree with the majority outputs, it can be switched out of the lockstep system. The removed CPU is replaced by a backup CPU. So that the backup CPU can be part of a CPU-GPU pair, a portion of the address space from the GPU of one of the other CPU-GPU pairs is assigned to the backup CPU to operate as a replacement CPU-GPU pair, while the CPU already associated with this GPU retains another portion of the GPU's address space to continue operating as a CPU-GPU pair.
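Illustrative sketch (not taken from the patent): the majority-vote comparison described in this abstract can be modeled as a function that takes one output per CPU-GPU pair, returns the majority value, and lists the pairs that disagreed and are therefore candidates for being switched out. The function name majority_vote and the pair labels are hypothetical, and outputs are assumed to be hashable values.

```python
from collections import Counter


def majority_vote(outputs):
    """outputs: mapping of pair id -> output value.

    Returns (majority_output, ids_of_disagreeing_pairs).
    """
    if len(outputs) < 3:
        raise ValueError("lockstep voting needs at least three pairs")
    counts = Counter(outputs.values())
    value, votes = counts.most_common(1)[0]
    # A strict majority is required; otherwise no output can be trusted.
    if votes * 2 <= len(outputs):
        raise RuntimeError("no majority among outputs")
    dissenters = [pid for pid, out in outputs.items() if out != value]
    return value, dissenters


result, faulty = majority_vote({"pair0": 7, "pair1": 7, "pair2": 9})
# result is 7 and faulty is ["pair2"]
```

The address-space reassignment that lets a backup CPU share another pair's GPU is not modeled here.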
-
Publication No.: US12197290B2
Publication date: 2025-01-14
Application No.: US18071459
Filing date: 2022-11-29
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Da Qi Ren, Liang Peng
Abstract: A fault tolerant processing environment wherein multiple processors are configured as worker nodes and redundant nodes, with a failed worker node replaced programmatically by a manager node. Each of the processing nodes may include a processor and memory associated with the processor and communicate with other processing nodes using a network. A manager node creates a message passing interface (MPI) communication group having worker nodes and redundant nodes, instructs the worker nodes to perform lockstep processing of tasks for an application, and monitors execution of the tasks. If a node fails, the manager node creates a replacement worker node from one of the redundant processing nodes and creates a new communications group. It then instructs those nodes in the new communications group to resume processing based on the application state and checkpoint backup data.
-
Publication No.: US20230251941A1
Publication date: 2023-08-10
Application No.: US18300642
Filing date: 2023-04-14
Applicant: Huawei Technologies Co., Ltd.
Inventor: Da Qi Ren, Liang Peng
CPC classification number: G06F11/184, G06F11/2033
Abstract: A lockstep controller operates a lockstep system of three or more CPU-GPU pairs, comparing the outputs from the CPU-GPU pairs and, by way of a majority vote, provides the output for the lockstep system. Based on comparing the outputs, if one of the CPU-GPU pairs provides outputs that disagree with the majority outputs, it can be switched out of the lockstep system. The removed CPU is replaced by a backup CPU. So that the backup CPU can be part of a CPU-GPU pair, a portion of the address space from the GPU of one of the other CPU-GPU pairs is assigned to the backup CPU to operate as a replacement CPU-GPU pair, while the CPU already associated with this GPU retains another portion of the GPU's address space to continue operating as a CPU-GPU pair.