TECHNIQUE FOR EFFICIENTLY INDEXING DATA OF AN ARCHIVAL STORAGE SYSTEM

    Publication No.: US20230029677A1

    Publication date: 2023-02-02

    Application No.: US17487935

    Filing date: 2021-09-28

    Applicant: Nutanix, Inc.

    IPC classification: G06F16/13 G06F16/11

    Abstract: An indexing technique provides an index data structure for efficient retrieval of a snapshot from a long-term storage service (LTSS) of an archival storage system. The snapshot is generated from typed data of a logical entity, such as a virtual disk (vdisk). The data of the snapshot is replicated sequentially to a frontend data service of the LTSS and organized as one or more data objects for storage by a backend data service of the LTSS in an object store of the archival storage system. Metadata associated with the snapshot (i.e., snapshot metadata) is recorded as a log and persistently stored on storage media local to the frontend data service. The snapshot metadata includes information describing the snapshot data, e.g., a logical offset range of a snapshot of the vdisk, and is thus used to construct the index data structure. Notably, construction of the index data structure is deferred until after the entirety of the snapshot data has been replicated and received by the frontend data service.
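
The deferred construction described in this abstract can be illustrated with a minimal sketch. It assumes the metadata log holds non-overlapping extents; the names `Extent`, `SnapshotReceiver`, `receive`, `finalize`, and `lookup` are hypothetical and do not come from the patent. A sorted list with binary search stands in for whatever index structure the LTSS actually uses.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    offset: int         # logical offset within the vdisk snapshot
    length: int         # extent length in bytes
    object_id: str      # hypothetical name of the data object holding the bytes
    object_offset: int  # position of the extent inside that data object


class SnapshotReceiver:
    """Hypothetical frontend: logs metadata first, builds the index later."""

    def __init__(self):
        self.log = []       # append-only snapshot metadata log
        self.index = None   # stays None until replication completes
        self._starts = []

    def receive(self, extent):
        # During replication only the log is appended; no index updates occur.
        self.log.append(extent)

    def finalize(self):
        # Deferred construction: sort the whole log once, after all data arrived.
        self.index = sorted(self.log, key=lambda e: e.offset)
        self._starts = [e.offset for e in self.index]

    def lookup(self, offset):
        if self.index is None:
            raise RuntimeError("index has not been constructed yet")
        i = bisect_right(self._starts, offset) - 1
        if i >= 0:
            e = self.index[i]
            if offset < e.offset + e.length:
                return e.object_id, e.object_offset + (offset - e.offset)
        return None  # offset not covered by this snapshot
```

Building the index in a single pass after replication means the index is written once rather than updated per extent, which is the property the abstract emphasizes.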

    DYNAMICALLY ADAPTIVE TECHNIQUE FOR REFERENCE SNAPSHOT SELECTION

    Publication No.: US20230080691A1

    Publication date: 2023-03-16

    Application No.: US17512217

    Filing date: 2021-10-27

    Applicant: Nutanix, Inc.

    IPC classification: G06F3/06

    Abstract: A reference snapshot selection technique is configured to select a reference snapshot resolution algorithm used to determine an appropriate reference snapshot that may be employed to perform incremental snapshot replication of workload data between primary and secondary sites in a data replication environment. A reference resolution procedure is configured to process a set of constraints from the data replication environment to dynamically select the reference snapshot resolution algorithm based on a figure of merit that satisfies administrative constraints to reduce or optimize resource utilization in the data replication environment.

    LAZY INDEX CONSTRUCTION OF SNAPSHOTS IN A REPLICATION RECEIVER

    Publication No.: US20240362185A1

    Publication date: 2024-10-31

    Application No.: US18243980

    Filing date: 2023-09-08

    Applicant: Nutanix, Inc.

    Abstract: A lazy index construction technique efficiently and cost-effectively manages creation and storage of an index data structure based on characteristics of storage media used by an archival storage system. The index data structure (index) is configured to reference snapshot data of snapshots stored in the archival storage system. The technique is configured to defer creation and storage of the index on the archival storage system in a lazy manner until all snapshot data is received by a replication receiver and stored on the storage media so that updates/changes to the index on the storage media are minimized. The technique may be used with any type or combination of (i) an “overwrite” data structure embodied as an index (i.e., an index data structure with overwrite capabilities) stored in (ii) an archival storage system having storage media (e.g., an object store) that is not conducive to overwrite capabilities.

    EXTENSIVE RECOVERY POINT MANAGEMENT USING TAGS

    Publication No.: US20240330119A1

    Publication date: 2024-10-03

    Application No.: US18237814

    Filing date: 2023-08-24

    Applicant: Nutanix, Inc.

    IPC classification: G06F11/14

    Abstract: A technique enables coordination of unrelated software components to facilitate extensive recovery point management on a snapshot or recovery point (RP) through the use of a flexible tag structure. The tag is organized and arranged as a {key=value,[value] . . . } structure wherein the key denotes an operation that requires coordination between the unrelated software components and the value(s) denote multi-cardinality that provides parameters for coordination of the operation. The multi-cardinality aspect of the flexible tag structure provides a set of values associated with the key of the tag that enables a software component and/or protocol to insert its value(s) into the tag structure for its interpretation. The technique thus provides an extensible model where multiple components/protocols use the tag to coordinate operations on the RP by conveying certain meanings/interpretations of the tag and its values.
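
As an illustration only, the multi-valued tag described above can be modeled as a mapping from one key to an ordered list of values. The class `RecoveryPointTags`, its methods, and the key and value strings in the usage note are hypothetical; the abstract does not specify an API or a serialization beyond the {key=value,[value] . . . } shape.

```python
from collections import defaultdict


class RecoveryPointTags:
    """Hypothetical holder for {key=value,[value]...} tags on one recovery point."""

    def __init__(self):
        self._tags = defaultdict(list)

    def add(self, key, value):
        # Each component inserts its own value under the shared key;
        # duplicates are ignored so repeated inserts are harmless.
        values = self._tags[key]
        if value not in values:
            values.append(value)

    def values(self, key):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._tags.get(key, []))

    def render(self, key):
        # Serialize one tag in the {key=value,value} shape from the abstract.
        return "{" + key + "=" + ",".join(self.values(key)) + "}"
```

For example, two unrelated components could each call `add("retain", ...)` with their own identifier, and either could later read `values("retain")` to see who else has a stake in the operation.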

    GARBAGE COLLECTION FROM ARCHIVAL OF STORAGE SNAPSHOTS

    Publication No.: US20230079621A1

    Publication date: 2023-03-16

    Application No.: US17514603

    Filing date: 2021-10-29

    Applicant: Nutanix, Inc.

    IPC classification: G06F16/11 G06F12/02

    Abstract: A technique improves storage efficiency of an object store configured to maintain numerous snapshots for long-term storage in an archival storage system by efficiently determining data that is exclusively owned by an expiring snapshot to allow deletion of the expiring snapshot from the object store. The technique involves managing index data structures to enable efficient garbage collection across a very large number of data objects. When a snapshot expires, the technique obviates the need to scan the numerous snapshot data objects to determine which index structures are no longer needed and can be reclaimed (garbage collected). The technique is directed to management of underlying storage based on different sets of policies. When certain snapshots expire and are ready for deletion, the technique is directed to finding those data blocks that are no longer referenced (used) by any valid snapshots.
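
The notion of data "exclusively owned" by an expiring snapshot can be shown with a plain set difference. This is a simplified sketch, not the index management the patent claims: it assumes each snapshot's index is available as a set of block identifiers, and the function name `exclusively_owned` is hypothetical.

```python
def exclusively_owned(expiring, snapshots):
    """Return blocks referenced only by the expiring snapshot.

    snapshots maps a snapshot name to the set of block ids its index references.
    """
    still_referenced = set()
    for name, blocks in snapshots.items():
        if name != expiring:
            still_referenced |= blocks
    # Blocks not referenced by any remaining snapshot are safe to reclaim.
    return snapshots[expiring] - still_referenced
```

The sketch works purely on index contents and never reads the data objects themselves, which is the general idea behind avoiding a scan of the object store.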

    SITE AND STORAGE TIER AWARE REFERENCE RESOLUTION

    Publication No.: US20240330118A1

    Publication date: 2024-10-03

    Application No.: US18236160

    Filing date: 2023-08-21

    Applicant: Nutanix, Inc.

    IPC classification: G06F11/14

    CPC classification: G06F11/1464 G06F11/1466

    Abstract: A site and storage tier aware technique replicates data as one or more recovery points (RPs) from a primary site to a secondary site in a multi-site data replication environment. A storage tier aware reference resolver determines (i) an amount of RP data transfer associated with the replication and (ii) location information associated with a cloud storage tier storing the RP data in an object store. The storage tier aware reference resolution aspect provides two additional factors to consider when retrieving data of a reference RP from cloud storage: (iii) the time (duration) needed to retrieve the data and (iv) the cost (financial expense) needed to retrieve the data. In addition, a site aware reference resolution aspect of the technique determines an optimal RP to use as the reference RP and involves consideration of (v) which RPs have been replicated from the primary site to the secondary site and (vi) which RPs have been retained for storage at the sites.
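
One way to picture how the factors listed above might combine is a weighted score over candidate reference RPs that are present at both sites. Everything in this sketch is an assumption made for illustration: the `Candidate` fields, the `pick_reference` function, the linear scoring, and the default weights are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    rp_id: str
    transfer_bytes: int       # factor (i): data to transfer if this RP is the reference
    retrieval_seconds: float  # factor (iii): time to retrieve it from its storage tier
    retrieval_cost: float     # factor (iv): financial cost to retrieve it


def pick_reference(candidates, primary_rps, secondary_rps,
                   w_bytes=1e-9, w_time=1.0, w_cost=1.0):
    # Factors (v) and (vi): only RPs replicated to and retained at both sites qualify.
    eligible = [c for c in candidates
                if c.rp_id in primary_rps and c.rp_id in secondary_rps]
    if not eligible:
        return None  # no common reference; a full replication would be needed

    def score(c):
        # Hypothetical linear combination; lower is better.
        return (w_bytes * c.transfer_bytes
                + w_time * c.retrieval_seconds
                + w_cost * c.retrieval_cost)

    return min(eligible, key=score)
```

The weights let an administrator trade transfer volume against retrieval time and expense; a candidate with the smallest delta can still lose if it sits in a slow or expensive tier.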

    Dynamically adaptive technique for reference snapshot selection

    Publication No.: US11704042B2

    Publication date: 2023-07-18

    Application No.: US17512217

    Filing date: 2021-10-27

    Applicant: Nutanix, Inc.

    IPC classification: G06F12/00 G06F3/06

    Abstract: A reference snapshot selection technique is configured to select a reference snapshot resolution algorithm used to determine an appropriate reference snapshot that may be employed to perform incremental snapshot replication of workload data between primary and secondary sites in a data replication environment. A reference resolution procedure is configured to process a set of constraints from the data replication environment to dynamically select the reference snapshot resolution algorithm based on a figure of merit that satisfies administrative constraints to reduce or optimize resource utilization in the data replication environment.

    Computing an unbroken snapshot sequence

    Publication No.: US11513914B2

    Publication date: 2022-11-29

    Application No.: US17139489

    Filing date: 2020-12-31

    Applicant: Nutanix, Inc.

    IPC classification: G06F12/00 G06F11/14

    Abstract: Methods, systems and computer program products for high-availability computing. In a computing configuration comprising a primary node, a first backup node, and a second backup node, a particular data state is restored to the primary node from a backup snapshot at the second backup node. First, a snapshot coverage gap is identified between a primary node snapshot at the primary node and the backup snapshot at the second backup node. Next, intervening snapshots at the first backup node that fill the snapshot coverage gap are identified and located. Having both the backup snapshot from the second backup node and the intervening snapshots from the first backup node, the particular data state at the primary node is restored by performing differencing operations between the primary node snapshot, the backup snapshot from the second backup node, and the intervening snapshots of the first backup node.
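
The gap identification step can be sketched under a strong simplifying assumption: snapshots carry consecutive integer sequence numbers, so the coverage gap is the run of numbers strictly between the primary node snapshot and the backup snapshot. The function name `intervening_snapshots` and its parameters are hypothetical.

```python
def intervening_snapshots(primary_seq, backup_seq, first_backup_seqs):
    """Return the sequence numbers that bridge the two snapshots.

    Raises LookupError if the first backup node cannot fill the whole gap.
    """
    lo, hi = sorted((primary_seq, backup_seq))
    needed = range(lo + 1, hi)          # the snapshot coverage gap
    available = set(first_backup_seqs)  # snapshots held by the first backup node
    missing = [s for s in needed if s not in available]
    if missing:
        raise LookupError(f"coverage gap not filled, missing: {missing}")
    return list(needed)
```

With an unbroken sequence in hand, the restore can apply differencing operations snapshot by snapshot; the differencing itself is outside the scope of this sketch.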