FAULT MANAGEMENT IN A DISTRIBUTED COMPUTER SYSTEM

    Publication number: US20230385152A1

    Publication date: 2023-11-30

    Application number: US17804392

    Application date: 2022-05-27

    CPC classification number: G06F11/1407 G06F11/0772 H04L67/1029 G06N20/00

    Abstract: In some examples, a distributed computer system includes a plurality of computer nodes, where the plurality of computer nodes include respective programs to cooperate to perform a workload. A first computer node includes a communication proxy between the program of the first computer node and a communication library that supports communications between the program of the first computer node and the programs of other computer nodes of the plurality of computer nodes, and a fault management service to monitor a health of the other computer nodes, and in response to a detection of a fault of a second computer node of the plurality of computer nodes, relaunch the communication proxy. The relaunched communication proxy selects, from a plurality of states, a common state to which the programs are to roll back.

    Fault management in a distributed computer system

    Publication number: US11966292B2

    Publication date: 2024-04-23

    Application number: US17804392

    Application date: 2022-05-27

    CPC classification number: G06F11/1407 G06F11/0772 H04L67/1029

    Abstract: In some examples, a distributed computer system includes a plurality of computer nodes, where the plurality of computer nodes include respective programs to cooperate to perform a workload. A first computer node includes a communication proxy between the program of the first computer node and a communication library that supports communications between the program of the first computer node and the programs of other computer nodes of the plurality of computer nodes, and a fault management service to monitor a health of the other computer nodes, and in response to a detection of a fault of a second computer node of the plurality of computer nodes, relaunch the communication proxy. The relaunched communication proxy selects, from a plurality of states, a common state to which the programs are to roll back.
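
    The abstract does not say how the relaunched communication proxy chooses the common state. As a minimal sketch only, assuming each program saves numbered states and that the common state is the newest one every program still holds, the selection could look like the following. The function name select_common_state, the node names, and the state numbers are all hypothetical and do not come from the patent.

```python
from typing import Dict, Iterable, Optional, Set


def select_common_state(saved_states: Dict[str, Iterable[int]]) -> Optional[int]:
    """Return the newest state id that every node has saved, or None.

    Assumption: a larger id means a more recent state. The abstract does
    not specify the selection rule; this is one plausible reading.
    """
    common: Optional[Set[int]] = None
    for states in saved_states.values():
        ids = set(states)
        # Intersect with the ids seen so far to keep only shared states.
        common = ids if common is None else common & ids
    return max(common) if common else None


# Hypothetical state ids saved by the programs on three nodes.
saved = {
    "node-a": [1, 2, 3, 4],
    "node-b": [1, 2, 3],
    "node-c": [2, 3, 5],
}
rollback_to = select_common_state(saved)
```

    Intersecting the saved states before picking the newest one means no program is asked to roll back to a state it never saved. When the programs share no state, the function returns None and the caller has to decide what to do, for example restart the workload.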
