PERFORMANCE ANALYSIS OF JOBS IN COMPUTING CLUSTERS

    Publication No.: US20250045136A1

    Publication date: 2025-02-06

    Application No.: US18491845

    Filing date: 2023-10-23

    Abstract: Example implementations relate to performance analysis of jobs in computing clusters. In some examples, a processor detects a trigger event in a computing cluster, and identifies a computing job associated with the trigger event. The processor determines a time window associated with the trigger event, and determines the compute nodes executing the computing job during the time window. The processor determines database attributes associated with the compute nodes, and obtains data values for the determined database attributes in the determined time window. The processor determines whether the data values are correlated to the trigger event according to a diagnostic rule. In response to a determination that the data values are correlated to the trigger event according to the diagnostic rule, the processor provides an indication of a degraded performance for the computing job.
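The steps in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the `Sample` record, the allocation tuples, the window offsets, and the threshold-style diagnostic rule are all assumptions introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One stored data value (hypothetical record layout)."""
    node: str
    attribute: str
    timestamp: float
    value: float


def time_window(trigger_time, before=300.0, after=60.0):
    """Return a (start, end) window around the trigger event.
    The offsets are assumed defaults, not values from the source."""
    return (trigger_time - before, trigger_time + after)


def nodes_for_job(allocations, job_id, window):
    """Return the nodes that ran job_id at any point inside the window.
    allocations: iterable of (job_id, node, start, end) tuples (assumed)."""
    start, end = window
    return {
        node
        for (job, node, t0, t1) in allocations
        if job == job_id and t0 <= end and t1 >= start
    }


def is_degraded(samples, nodes, attributes, window, threshold):
    """Hypothetical diagnostic rule: report degraded performance if any
    in-window value for the selected nodes and attributes exceeds threshold."""
    start, end = window
    values = [
        s.value
        for s in samples
        if s.node in nodes
        and s.attribute in attributes
        and start <= s.timestamp <= end
    ]
    return bool(values) and max(values) > threshold
```

A caller would chain these in the order the abstract gives: build the window from the trigger time, resolve the job's nodes, then evaluate the rule over the stored values for those nodes.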

    HEALTH-BASED MANAGEMENT OF A NETWORK

    Publication No.: US20240256500A1

    Publication date: 2024-08-01

    Application No.: US18160563

    Filing date: 2023-01-27

    Abstract: In some examples, a system stores, in a first database having a first schema, metrics received from a network comprising communication nodes, the metrics relating to operations of the communication nodes, and the first database associating the metrics with metadata corresponding to hierarchical components in a topology of the network. In response to an alert relating to an issue in the network, the system computes a health measure based on the metrics, the health measure indicating a health status of a first component, performs a dynamic runtime mapping of the metadata associated with the metrics in the first database having the first schema with corresponding metadata in a second database having a second schema different from the first schema, where the second database contains information of a topology of the network, and initiates a management action to address the health status based on the health measure and the dynamic runtime mapping.
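As a rough illustration of the two ideas in this abstract, the sketch below computes a health measure from metrics and translates metadata keys from one schema to another at lookup time. The scoring formula, the metric names, and the key mapping are assumptions made for the example; the abstract does not specify any of them.

```python
def health_measure(metrics, limits):
    """Assumed scoring: fraction of checked metrics that are within limits.
    metrics: dict of metric name -> value; limits: dict of name -> max allowed."""
    checked = [(name, value) for name, value in metrics.items() if name in limits]
    if not checked:
        return 1.0  # nothing to check, so treat the component as healthy
    within = sum(1 for name, value in checked if value <= limits[name])
    return within / len(checked)


def map_metadata(record, key_map):
    """Translate metadata keys from the first schema to the second at runtime.
    Keys without an entry in key_map are dropped."""
    return {key_map[key]: value for key, value in record.items() if key in key_map}


def find_in_topology(topology_db, mapped):
    """Return the first topology entry whose fields match the mapped metadata."""
    for entry in topology_db:
        if all(entry.get(key) == value for key, value in mapped.items()):
            return entry
    return None
```

For instance, metadata `{"switch_id": "sw7"}` stored with the metrics could be mapped to `{"device": "sw7"}` and then matched against the topology database to decide which component a management action should target.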

    Method for federating a cluster from a plurality of computing nodes from different fault domains

    Publication No.: US12034600B2

    Publication date: 2024-07-09

    Application No.: US16446288

    Filing date: 2019-06-19

    Abstract: The present disclosure describes a plurality of examples for federating a cluster from a plurality of interconnected computing nodes. The examples disclose a controller receiving network information and enclosure information associated with the computing nodes. The network information is indicative of a network topology between the computing nodes. The enclosure information is indicative of a configuration of an enclosure associated with a corresponding computing node. The controller identifies fault domains based on the network information and the enclosure information. Each fault domain of the fault domains includes one or more computing nodes impacted by at least one of a corresponding network fault event or a corresponding enclosure fault event, and the plurality of fault domains comprise first, second, and third fault domains. The controller selects computing nodes from the fault domains for federating the cluster. The selected computing nodes comprise a first computing node from the first fault domain and a second computing node from the second fault domain. The selecting of the computing nodes favors selecting the first computing node from the first fault domain with a smaller quantity of computing nodes over the third fault domain with a larger quantity of computing nodes. The controller allocates the selected computing nodes in federating the cluster to execute workloads.
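The selection preference described above, favoring fault domains with fewer computing nodes, can be sketched as below. This is an assumed simplification that takes one node per fault domain and breaks ties by domain name; the function name and the input layout are hypothetical.

```python
def select_nodes(fault_domains, count):
    """Pick up to `count` nodes, one per fault domain, visiting smaller
    domains first.
    fault_domains: dict of domain name -> list of node names (assumed layout)."""
    # Sort by domain size, then by name so the result is deterministic.
    ordered = sorted(fault_domains.items(), key=lambda item: (len(item[1]), item[0]))
    selected = []
    for domain, nodes in ordered:
        if len(selected) == count:
            break
        if nodes:  # skip empty domains
            selected.append((domain, nodes[0]))
    return selected


domains = {
    "first": ["a1"],
    "second": ["b1", "b2"],
    "third": ["c1", "c2", "c3", "c4"],
}
print(select_nodes(domains, 2))
```

With these inputs the two smaller domains supply the nodes and the larger third domain is passed over, which mirrors the preference stated in the abstract.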

    Method for deploying an application workload on a cluster

    Publication No.: US11240345B2

    Publication date: 2022-02-01

    Application No.: US16446315

    Filing date: 2019-06-19

    Abstract: The present disclosure describes a plurality of examples for deploying an application workload consisting of micro-service instances. The examples include federating a cluster from a plurality of computing nodes; defining a network overlay policy based on an application policy associated with the application workload; configuring one or more virtual networks in accordance with the defined network overlay policy, where each of the one or more virtual networks connects one or more of the two or more computing nodes of the cluster to provide layer 2 adjacency; and deploying the plurality of micro-service instances on the two or more computing nodes in accordance with the network overlay policy, for executing the application workload.
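One way to picture the relationship between the application policy, the virtual networks, and the placement of micro-service instances is the sketch below. It assumes a policy expressed as named tiers of instances, a round-robin placement, and one virtual network per tier; none of these details come from the abstract.

```python
def build_overlay(app_policy, nodes):
    """Derive virtual networks and an instance placement from a policy.
    app_policy: dict of tier name -> list of micro-service instance names
    (hypothetical format). nodes: list of cluster node names.
    Returns (overlay, placement), where overlay maps a virtual network name
    to the nodes it connects and placement maps each instance to a node."""
    overlay = {}
    placement = {}
    next_index = 0
    for tier, instances in app_policy.items():
        members = []
        for instance in instances:
            # Round-robin placement is an assumption made for illustration.
            node = nodes[next_index % len(nodes)]
            next_index += 1
            placement[instance] = node
            if node not in members:
                members.append(node)
        # One virtual network per tier links the nodes hosting that tier.
        overlay[f"vnet-{tier}"] = members
    return overlay, placement
```

Each virtual network here simply records which nodes it connects; configuring actual layer 2 adjacency between them is outside the scope of the sketch.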
