Monitoring input/output and persistent reservation activity patterns to detect degraded performance of a high availability and fault tolerant application

    Publication No.: US12169445B2

    Publication date: 2024-12-17

    Application No.: US18095271

    Filing date: 2023-01-10

    Applicant: Nutanix, Inc.

    Abstract: A technique monitors input/output (I/O) and Persistent Reservation (PR) activity patterns to detect degraded performance of a highly available and fault tolerant application executing in a multi-site disaster recovery (DR) environment. Multiple instances of the application execute in different virtual machines (VMs) of a compute layer within a guest clustering configuration that extends across clusters of the sites. A storage layer of the clusters provides shared storage to the multiple VMs across the multiple sites. One of the sites is configured as an active storage site configured to receive and service I/O requests from the compute layer. A single instance of the application is active at a time and configured as a “compute owner” of the shared storage to issue the I/O requests to the shared storage. The compute owner and active storage site may not be co-located on the same site, leading to excessive I/O and PR activity patterns indicative of degraded performance. Upon detecting such patterns, the technique automatically triggers a storage failover to ensure that compute owner and active storage site are co-located at the same site.

    MONITORING INPUT/OUTPUT AND PERSISTENT RESERVATION ACTIVITY PATTERNS TO DETECT DEGRADED PERFORMANCE OF A HIGH AVAILABILITY AND FAULT TOLERANT APPLICATION

    Publication No.: US20240143462A1

    Publication date: 2024-05-02

    Application No.: US18095271

    Filing date: 2023-01-10

    Applicant: Nutanix, Inc.

    CPC classification number: G06F11/203 G06F11/076 G06F11/3075

    Abstract: A technique monitors input/output (I/O) and Persistent Reservation (PR) activity patterns to detect degraded performance of a highly available and fault tolerant application executing in a multi-site disaster recovery (DR) environment. Multiple instances of the application execute in different virtual machines (VMs) of a compute layer within a guest clustering configuration that extends across clusters of the sites. A storage layer of the clusters provides shared storage to the multiple VMs across the multiple sites. One of the sites is configured as an active storage site configured to receive and service I/O requests from the compute layer. A single instance of the application is active at a time and configured as a “compute owner” of the shared storage to issue the I/O requests to the shared storage. The compute owner and active storage site may not be co-located on the same site, leading to excessive I/O and PR activity patterns indicative of degraded performance. Upon detecting such patterns, the technique automatically triggers a storage failover to ensure that compute owner and active storage site are co-located at the same site.

    MONITORING INPUT/OUTPUT AND PERSISTENT RESERVATION ACTIVITY PATTERNS TO DETECT DEGRADED PERFORMANCE OF A HIGH AVAILABILITY AND FAULT TOLERANT APPLICATION

    Publication No.: US20250117299A1

    Publication date: 2025-04-10

    Application No.: US18982992

    Filing date: 2024-12-16

    Applicant: Nutanix, Inc.

    Abstract: A technique monitors input/output (I/O) and storage ownership takeover activity patterns to detect degraded performance of a highly available and fault tolerant application executing in a multi-site environment. Multiple instances of the application execute in different containers or pods running on virtual machines (VMs) of a compute layer within a containerized (e.g., Kubernetes) clustering configuration that extends across clusters of the sites. A storage layer of the clusters provides shared storage to the pods running on the VMs across the multiple sites. One of the sites is configured as an active storage site configured to receive and service I/O requests from the compute layer. A single instance of the application is active at a time and configured as a “compute owner” of the shared storage to issue the I/O requests to the shared storage. The compute owner and active storage site may not be co-located on the same site, leading to excessive I/O and storage ownership takeover activity patterns indicative of degraded performance. Upon detecting such patterns, the technique automatically triggers a storage failover to ensure that compute owner and active storage site are co-located at the same site.
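
The abstracts describe the detection loop only at a high level, so the following is a minimal sketch of how it might look, not the patented implementation. It assumes that each I/O request and each ownership-related operation (a Persistent Reservation in the first two publications, a storage ownership takeover in the third) can be attributed to the site that issued it. The class and field names, the sliding window, the two thresholds, and the rule that both counts must be exceeded are all hypothetical; the abstracts do not say how "excessive" is measured.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    timestamp: float   # seconds; any monotonic clock will do
    kind: str          # "io" or "ownership" (PR / ownership takeover)
    origin_site: str   # site whose compute layer issued the request


@dataclass
class ColocationMonitor:
    active_storage_site: str
    window_seconds: float = 60.0      # assumed observation window
    remote_io_threshold: int = 1000   # assumed: remote I/Os per window
    ownership_threshold: int = 5      # assumed: PR/takeover ops per window
    events: deque = field(default_factory=deque)

    def record(self, event: Event) -> None:
        """Add an event and drop those older than the window."""
        self.events.append(event)
        cutoff = event.timestamp - self.window_seconds
        while self.events and self.events[0].timestamp < cutoff:
            self.events.popleft()

    def degraded_site(self):
        """Return a remote site with excessive I/O and ownership activity,
        or None if no such site exists in the current window."""
        io_counts, own_counts = {}, {}
        for e in self.events:
            if e.origin_site == self.active_storage_site:
                continue  # co-located traffic is the healthy case
            counts = io_counts if e.kind == "io" else own_counts
            counts[e.origin_site] = counts.get(e.origin_site, 0) + 1
        for site, n_io in io_counts.items():
            if (n_io >= self.remote_io_threshold
                    and own_counts.get(site, 0) >= self.ownership_threshold):
                return site
        return None

    def maybe_failover(self) -> bool:
        """Move the active storage site to the compute owner's site when the
        pattern is detected. Returns True if a failover was triggered."""
        site = self.degraded_site()
        if site is None:
            return False
        self.active_storage_site = site  # stand-in for the real storage failover
        self.events.clear()              # start a fresh window afterwards
        return True
```

In this sketch the failover is a single assignment; in a real deployment that step would call into the storage layer, and the thresholds would need tuning against observed workloads.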
