MONITORING INPUT/OUTPUT AND PERSISTENT RESERVATION ACTIVITY PATTERNS TO DETECT DEGRADED PERFORMANCE OF A HIGH AVAILABILITY AND FAULT TOLERANT APPLICATION

    Publication number: US20250117299A1

    Publication date: 2025-04-10

    Application number: US18982992

    Filing date: 2024-12-16

    Applicant: Nutanix, Inc.

    Abstract: A technique monitors input/output (I/O) and storage ownership takeover activity patterns to detect degraded performance of a highly available and fault tolerant application executing in a multi-site environment. Multiple instances of the application execute in different containers or pods running on virtual machines (VMs) of a compute layer within a containerized (e.g., Kubernetes) clustering configuration that extends across clusters of the sites. A storage layer of the clusters provides shared storage to the pods running on the VMs across the multiple sites. One of the sites is configured as an active storage site configured to receive and service I/O requests from the compute layer. A single instance of the application is active at a time and configured as a “compute owner” of the shared storage to issue the I/O requests to the shared storage. The compute owner and active storage site may not be co-located on the same site, leading to excessive I/O and storage ownership takeover activity patterns indicative of degraded performance. Upon detecting such patterns, the technique automatically triggers a storage failover to ensure that compute owner and active storage site are co-located at the same site.

    INSTANT RECOVERY AS AN ENABLER FOR UNINHIBITED MOBILITY BETWEEN PRIMARY STORAGE AND SECONDARY STORAGE

    Publication number: US20220309010A1

    Publication date: 2022-09-29

    Application number: US17676013

    Filing date: 2022-02-18

    Applicant: Nutanix, Inc.

    Abstract: In accordance with some aspects of the present disclosure, a non-transitory computer readable medium is disclosed. In some embodiments, the non-transitory computer readable medium includes instructions that, when executed by a processor, cause the processor to receive, from a workload hosted on a host of a cluster, first I/O traffic programmed according to a first I/O traffic protocol supported by a cluster-wide storage fabric exposed to the workload as being hosted on the same host. In some embodiments, the workload is recovered by a hypervisor hosted on the same host. In some embodiments, the non-transitory computer readable medium includes the instructions that, when executed by the processor, cause the processor to adapt the first I/O traffic to generate second I/O traffic programmed according to a second I/O traffic protocol supported by a repository external to the storage fabric and forward the second I/O traffic to the repository.

    Virtual disk grafting and differential based data pulling from external repository

    Publication number: US12141042B2

    Publication date: 2024-11-12

    Application number: US18116413

    Filing date: 2023-03-02

    Applicant: Nutanix, Inc.

    Abstract: A technique utilizes grafting and differential based (diff-based) data seeding to hydrate a special virtual disk (vdisk) on a multi-node cluster with data changes (differences) between a reference vdisk stored on the cluster and a snapshot stored in an external repository to enable failover (including failback) recovery of an application workload in a disaster recovery environment. The application workload is stored as a workload vdisk on local storage of the cluster and snapshots of the workload vdisk are generated and organized as a vdisk chain on the cluster. One or more snapshots of the vdisk chain may be replicated to the external repository using a long-term snapshot service. Each replicated snapshot may be backed up from the cluster to the external repository at the granularity of a vdisk, referred to herein as an external datasource disk. The special vdisk is a thinly provisioned, datasource-backed vdisk that is grafted onto the vdisk chain, e.g., as a child vdisk of the reference vdisk. The differences between the reference vdisk and datasource disk are seeded from the datasource disk to hydrate the datasource-backed vdisk.
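
The diff-based seeding idea in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the `VDisk` class, the block-map model (offset to bytes), and the `seed_differences` helper are all hypothetical names introduced here, and the sketch assumes the datasource disk can be enumerated block by block.

```python
from typing import Dict, Optional


class VDisk:
    """Hypothetical vdisk model: a sparse block map plus an optional parent."""

    def __init__(self, blocks: Optional[Dict[int, bytes]] = None,
                 parent: Optional["VDisk"] = None) -> None:
        self.blocks: Dict[int, bytes] = dict(blocks or {})
        self.parent = parent

    def read(self, offset: int) -> Optional[bytes]:
        # Walk up the vdisk chain until some ancestor holds the block.
        disk: Optional[VDisk] = self
        while disk is not None:
            if offset in disk.blocks:
                return disk.blocks[offset]
            disk = disk.parent
        return None


def seed_differences(reference: VDisk, datasource: Dict[int, bytes]) -> VDisk:
    """Graft a thin child onto `reference` and copy in only the changed blocks."""
    child = VDisk(parent=reference)  # thinly provisioned: starts empty
    for offset, data in datasource.items():
        if reference.read(offset) != data:
            child.blocks[offset] = data  # seed only the difference
    return child


reference = VDisk({0: b"a", 1: b"b"})
snapshot_in_repository = {0: b"a", 1: b"B", 2: b"c"}
hydrated = seed_differences(reference, snapshot_in_repository)
```

In this model, unchanged blocks are never transferred; reads of such blocks fall through to the reference vdisk on the cluster.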

    High frequency snapshot technique for improving data replication in disaster recovery environment

    Publication number: US12259790B2

    Publication date: 2025-03-25

    Application number: US17388735

    Filing date: 2021-07-29

    Applicant: Nutanix, Inc.

    Abstract: A high frequency snapshot technique improves data replication in a disaster recovery (DR) environment. A base snapshot is generated from failover data at a primary site and replicated to a placeholder file at a secondary site. Upon commencement of the base snapshot generation and replication, incremental light weight snapshots (LWSs) of the failover data are captured and replicated to the secondary site. A staging file at the secondary site accumulates the replicated LWSs (“high-frequency snapshots”). The staging file is populated with the LWSs in parallel with the replication of the base snapshot at the placeholder file. At a subsequent predetermined time interval, the accumulated LWSs are synthesized to capture a “checkpoint” snapshot by applying and pruning the accumulated LWSs at the staging file. Once the base snapshot is fully replicated, the pruned LWSs are merged to the base snapshot to synchronize the replicated failover data.
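
A minimal sketch of the apply, prune, and merge steps described in the abstract above, under simplifying assumptions: each snapshot is modeled as a hypothetical map from block offset to contents, and the function names `synthesize_checkpoint` and `merge_into_base` are introduced here for illustration only.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]  # hypothetical model: block offset -> block contents


def synthesize_checkpoint(accumulated_lws: List[Blocks]) -> Blocks:
    """Apply LWSs in capture order; a later write to a block replaces
    (prunes) any earlier write to the same block."""
    checkpoint: Blocks = {}
    for lws in accumulated_lws:
        checkpoint.update(lws)
    return checkpoint


def merge_into_base(base: Blocks, checkpoint: Blocks) -> Blocks:
    """Once the base snapshot is fully replicated, overlay the checkpoint."""
    merged = dict(base)
    merged.update(checkpoint)
    return merged


# Staging file at the secondary site: LWSs accumulated while the base replicates.
staging = [{0: b"a", 1: b"b"}, {1: b"c"}]
checkpoint = synthesize_checkpoint(staging)
synchronized = merge_into_base({0: b"x", 2: b"z"}, checkpoint)
```

The point of pruning before the merge is that only the latest version of each block is written to the base snapshot, however many LWSs touched it.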

    Monitoring input/output and persistent reservation activity patterns to detect degraded performance of a high availability and fault tolerant application

    Publication number: US12169445B2

    Publication date: 2024-12-17

    Application number: US18095271

    Filing date: 2023-01-10

    Applicant: Nutanix, Inc.

    Abstract: A technique monitors input/output (I/O) and Persistent Reservation (PR) activity patterns to detect degraded performance of a highly available and fault tolerant application executing in a multi-site disaster recovery (DR) environment. Multiple instances of the application execute in different virtual machines (VMs) of a compute layer within a guest clustering configuration that extends across clusters of the sites. A storage layer of the clusters provides shared storage to the multiple VMs across the multiple sites. One of the sites is configured as an active storage site configured to receive and service I/O requests from the compute layer. A single instance of the application is active at a time and configured as a “compute owner” of the shared storage to issue the I/O requests to the shared storage. The compute owner and active storage site may not be co-located on the same site, leading to excessive I/O and PR activity patterns indicative of degraded performance. Upon detecting such patterns, the technique automatically triggers a storage failover to ensure that compute owner and active storage site are co-located at the same site.
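
The detection step in the abstract above can be sketched as a simple rule over one monitoring window. Everything specific here is an assumption made for illustration, not taken from the patent: the `WindowStats` fields, the `pick_failover_target` name, the two threshold values, and the choice to require both signals at once.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class WindowStats:
    """Hypothetical counters collected over one monitoring window."""
    io_by_site: Dict[str, int]  # I/O requests issued from each site's compute layer
    pr_takeovers: int           # Persistent Reservation ownership changes observed


def pick_failover_target(stats: WindowStats, active_storage_site: str,
                         io_share_threshold: float = 0.8,  # assumed value
                         pr_threshold: int = 3             # assumed value
                         ) -> Optional[str]:
    """Return the site storage should fail over to, or None if co-located."""
    total_io = sum(stats.io_by_site.values())
    if total_io == 0:
        return None
    busiest = max(stats.io_by_site, key=stats.io_by_site.get)
    if busiest == active_storage_site:
        return None  # compute owner and active storage already share a site
    io_share = stats.io_by_site[busiest] / total_io
    if io_share >= io_share_threshold and stats.pr_takeovers >= pr_threshold:
        return busiest
    return None


window = WindowStats(io_by_site={"site-a": 10, "site-b": 90}, pr_takeovers=5)
target = pick_failover_target(window, active_storage_site="site-a")
```

A caller would trigger the storage failover toward `target` whenever it is not `None`, so that the compute owner and the active storage site end up on the same site.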

    MONITORING INPUT/OUTPUT AND PERSISTENT RESERVATION ACTIVITY PATTERNS TO DETECT DEGRADED PERFORMANCE OF A HIGH AVAILABILITY AND FAULT TOLERANT APPLICATION

    Publication number: US20240143462A1

    Publication date: 2024-05-02

    Application number: US18095271

    Filing date: 2023-01-10

    Applicant: Nutanix, Inc.

    CPC classification number: G06F11/203 G06F11/076 G06F11/3075

    Abstract: A technique monitors input/output (I/O) and Persistent Reservation (PR) activity patterns to detect degraded performance of a highly available and fault tolerant application executing in a multi-site disaster recovery (DR) environment. Multiple instances of the application execute in different virtual machines (VMs) of a compute layer within a guest clustering configuration that extends across clusters of the sites. A storage layer of the clusters provides shared storage to the multiple VMs across the multiple sites. One of the sites is configured as an active storage site configured to receive and service I/O requests from the compute layer. A single instance of the application is active at a time and configured as a “compute owner” of the shared storage to issue the I/O requests to the shared storage. The compute owner and active storage site may not be co-located on the same site, leading to excessive I/O and PR activity patterns indicative of degraded performance. Upon detecting such patterns, the technique automatically triggers a storage failover to ensure that compute owner and active storage site are co-located at the same site.

    HIGH FREQUENCY SNAPSHOT TECHNIQUE FOR IMPROVING DATA REPLICATION IN DISASTER RECOVERY ENVIRONMENT

    Publication number: US20220398163A1

    Publication date: 2022-12-15

    Application number: US17388735

    Filing date: 2021-07-29

    Applicant: Nutanix, Inc.

    Abstract: A high frequency snapshot technique improves data replication in a disaster recovery (DR) environment. A base snapshot is generated from failover data at a primary site and replicated to a placeholder file at a secondary site. Upon commencement of the base snapshot generation and replication, incremental light weight snapshots (LWSs) of the failover data are captured and replicated to the secondary site. A staging file at the secondary site accumulates the replicated LWSs (“high-frequency snapshots”). The staging file is populated with the LWSs in parallel with the replication of the base snapshot at the placeholder file. At a subsequent predetermined time interval, the accumulated LWSs are synthesized to capture a “checkpoint” snapshot by applying and pruning the accumulated LWSs at the staging file. Once the base snapshot is fully replicated, the pruned LWSs are merged to the base snapshot to synchronize the replicated failover data.
