Efficiency sets for determination of unique data

    Publication No.: US11194506B1

    Publication Date: 2021-12-07

    Application No.: US16940461

    Application Date: 2020-07-28

    Applicant: NetApp, Inc.

    Abstract: A system, method, and machine-readable storage medium for determining an amount of unique data in a distributed storage system are provided. In some embodiments, a combined efficiency set for a first data set stored in the distributed storage system, such as at a volume, may be generated. The first data set may include a first subset of data and a second subset of data in the distributed storage system. Additionally, a set of efficiency sets for the first subset of data may be generated. A set difference based on the combined efficiency set and the set of efficiency sets may be computed. An amount of memory used for storing unique data of the second subset of data may be estimated based on the set difference. The unique data may be present in the second subset of data but absent from the first subset of data.
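The set-difference estimate described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes an efficiency set is simply the set of block fingerprints, that blocks have a fixed size, and that no sampling is applied. The names `fingerprint`, `efficiency_set`, `estimate_unique_bytes`, and `BLOCK_SIZE` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size in bytes (not stated in the abstract)


def fingerprint(block: bytes) -> str:
    """Content hash standing in for a block identifier (assumption)."""
    return hashlib.sha256(block).hexdigest()


def efficiency_set(blocks):
    """Assumed form of an efficiency set: the set of block fingerprints."""
    return {fingerprint(b) for b in blocks}


def estimate_unique_bytes(first_subset, second_subset, block_size=BLOCK_SIZE):
    """Estimate storage used by data present only in the second subset."""
    # Combined efficiency set covers the whole first data set (both subsets).
    combined = efficiency_set(first_subset) | efficiency_set(second_subset)
    # Efficiency set for the first subset alone.
    first_only = efficiency_set(first_subset)
    # Fingerprints left after the difference belong only to the second subset.
    difference = combined - first_only
    return len(difference) * block_size


if __name__ == "__main__":
    first = [b"a", b"b"]
    second = [b"b", b"c", b"c"]
    print(estimate_unique_bytes(first, second))
```

In this toy input only the block `b"c"` is unique to the second subset, so the estimate is one block's worth of storage.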

    Standby copies withstand cascading fails

    Publication No.: US11194501B2

    Publication Date: 2021-12-07

    Application No.: US16752001

    Application Date: 2020-01-24

    Applicant: NetApp, Inc.

    Abstract: A technique is configured to maintain multiple copies of data served by storage nodes of a cluster during upgrade of a storage node to ensure continuous protection of the data served by the nodes. The data is logically organized as one or more volumes on storage devices of the cluster and includes metadata that describe the data of each volume. A data protection system may be configured to maintain at least two copies of the data in the cluster during upgrade to a storage node that is assigned to host one of the copies of the data but that is taken offline during the upgrade. As a result, an original slice service of the node may be rendered unavailable during the upgrade. In response, the technique redirects replicated data targeted to the original slice service to a standby pool of slice services in accordance with a degraded redundant metadata service of the cluster. In the event the standby slice service itself subsequently becomes unavailable, another standby slice service from the standby pool is activated to receive the subsequent data. In this manner, cascading failure of secondary slice services is handled.

    FREEING AND UTILIZING UNUSED INODES
    Type: Patent application

    Publication No.: US20210365187A1

    Publication Date: 2021-11-25

    Application No.: US17396796

    Application Date: 2021-08-09

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for freeing and utilizing unused inodes. For example, an operation, targeting a first storage object of a first node having a replication relationship with a second storage object of a second node, is intercepted. A replication operation is created as a replication of the operation. The operation is implemented upon the first storage object and the replication operation is implemented upon the second storage object. A determination is made that the replication operation uses an inode no longer used by storage objects of the second node. The inode targeted by the replication operation is freed and utilized based upon the inode being a leaf inode. If the inode is a stream directory inode, then data streams of the stream directory inode are moved under a new private inode and the stream directory inode is released.

    Pooling blocks for erasure coding write groups

    Publication No.: US11175989B1

    Publication Date: 2021-11-16

    Application No.: US16858376

    Application Date: 2020-04-24

    Applicant: NetApp, Inc.

    Abstract: A technique provides efficient data protection, such as erasure coding, for data blocks of volumes served by storage nodes of a cluster. Data blocks associated with write requests of unpredictable client workload patterns may be compressed. A set of the compressed data blocks may be selected to form a write group and an erasure code may be applied to the group to algorithmically generate one or more encoded blocks in addition to the data blocks. Due to the unpredictability of the data workload patterns, the compressed data blocks may have varying sizes. A pool of the various-sized compressed data blocks may be established and maintained from which the data blocks of the write group are selected. Establishment and maintenance of the pool enables selection of compressed data blocks that are substantially close to the same size and, thus, that require minimal padding.
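One way to picture selecting similarly sized blocks from the pool is the sketch below. It assumes every block in a write group is padded up to the largest block in that group, and it picks the group with the least total padding by sliding a window over the sorted sizes. The function name `select_write_group` and this padding rule are illustrative assumptions; the abstract does not specify the selection algorithm.

```python
def select_write_group(pool_sizes, group_size):
    """Pick `group_size` block sizes from the pool that need the least padding.

    Assumption: each block is padded to the size of the largest block in the
    group. For any chosen maximum, the cheapest companions are the next-largest
    sizes, so the best group is always contiguous in sorted order.
    """
    if group_size <= 0 or group_size > len(pool_sizes):
        raise ValueError("group_size must be between 1 and the pool size")

    ordered = sorted(pool_sizes)
    best_group, best_padding = None, None
    for start in range(len(ordered) - group_size + 1):
        window = ordered[start:start + group_size]
        # window[-1] is the largest size in this candidate group.
        padding = sum(window[-1] - size for size in window)
        if best_padding is None or padding < best_padding:
            best_group, best_padding = window, padding
    return best_group, best_padding


if __name__ == "__main__":
    pool = [4096, 1000, 3900, 1100, 4000, 2500]  # compressed sizes in bytes
    print(select_write_group(pool, 3))
```

A larger pool gives the selector more candidates of similar size, which is consistent with the abstract's point that maintaining the pool reduces padding.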

    Selective deduplication
    Type: Granted patent

    Publication No.: US11169967B2

    Publication Date: 2021-11-09

    Application No.: US16716759

    Application Date: 2019-12-17

    Applicant: NetApp Inc.

    Abstract: Methods and apparatuses for performing selective deduplication in a storage system are introduced here. Techniques are provided for determining a probability of deduplication for a data object based on a characteristic of the data object and performing a deduplication operation on the data object in the storage system prior to the data object being stored in persistent storage of the storage system if the probability of deduplication for the data object has a specified relationship to a specified threshold.
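The threshold test in this abstract can be illustrated with a small sketch. Everything concrete here is an assumption: the characteristic is taken to be an object type, the probability table `DEDUP_PROBABILITY` holds made-up values, and the "specified relationship" is taken to be greater-than-or-equal.

```python
# Hypothetical per-type deduplication probabilities (illustrative values only).
DEDUP_PROBABILITY = {
    "vm_image": 0.8,
    "log": 0.4,
    "encrypted": 0.05,
}


def should_dedup_before_persist(object_type, threshold=0.5):
    """Return True when deduplication should run before the object is persisted.

    Assumption: unknown object types get probability 0.0, and the
    relationship to the threshold is `>=`.
    """
    probability = DEDUP_PROBABILITY.get(object_type, 0.0)
    return probability >= threshold


if __name__ == "__main__":
    for kind in ("vm_image", "log", "encrypted"):
        print(kind, should_dedup_before_persist(kind))
```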

    Recovery support techniques for storage virtualization environments

    Publication No.: US11169884B2

    Publication Date: 2021-11-09

    Application No.: US16866984

    Application Date: 2020-05-05

    Applicant: NetApp Inc.

    Abstract: Recovery support techniques for storage virtualization environments are described. In one embodiment, for example, a method may be performed that comprises defining, by processing circuitry, a storage container comprising one or more logical storage volumes of a logical storage array of a storage system, associating the storage container with a virtual volume (vvol) datastore, identifying metadata for a vvol of the vvol datastore, and writing the metadata for the vvol to the storage system. Other embodiments are described and claimed.

    ASYNCHRONOUS SEMI-INLINE DEDUPLICATION

    Publication No.: US20210342082A1

    Publication Date: 2021-11-04

    Application No.: US17373820

    Application Date: 2021-07-13

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for asynchronous semi-inline deduplication. A multi-tiered storage arrangement comprises a first storage tier, a second storage tier, etc. An in-memory change log of data recently written to the first storage tier is evaluated to identify a fingerprint of a data block recently written to the first storage tier. A donor data store, comprising fingerprints of data blocks already stored within the first storage tier, is queried using the fingerprint. If the fingerprint is found, then deduplication is performed for the data block to create deduplicated data based upon a potential donor data block within the first storage tier. The deduplicated data is moved from the first storage tier to the second storage tier, such as in response to a determination that the deduplicated data has not been recently accessed. The deduplication is performed before cold data is moved from the first storage tier to the second storage tier.
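The change-log lookup in this abstract might look roughly like the sketch below. It models the first tier as a dictionary of block ID to bytes, the change log as a list of recently written block IDs, and the donor data store as a dictionary of fingerprint to block ID. These structures, the SHA-256 fingerprint, and the byte-for-byte check of the potential donor are all assumptions made for illustration.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Assumed fingerprint function (the abstract does not name one)."""
    return hashlib.sha256(block).hexdigest()


def dedup_from_change_log(change_log, donor_store, tier1):
    """Map recently written blocks to donor blocks already in the first tier.

    change_log:  list of block IDs recently written to the first tier
    donor_store: dict of fingerprint -> donor block ID (updated in place)
    tier1:       dict of block ID -> block bytes
    Returns a dict of deduplicated block ID -> donor block ID.
    """
    references = {}
    for block_id in change_log:
        data = tier1[block_id]
        fp = fingerprint(data)
        donor_id = donor_store.get(fp)
        # The donor is only "potential": confirm the bytes really match.
        if donor_id is not None and donor_id != block_id and tier1[donor_id] == data:
            references[block_id] = donor_id
        else:
            # No usable donor yet; register this block as a future donor.
            donor_store.setdefault(fp, block_id)
    return references


if __name__ == "__main__":
    tier1 = {1: b"A", 2: b"B", 3: b"A"}
    donors = {fingerprint(b"A"): 1}
    print(dedup_from_change_log([2, 3], donors, tier1))
```

Because this pass runs against the in-memory change log, the references exist before any cold data is later moved to the second tier, which is the ordering the abstract emphasizes.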

    Deduplicated host cache flush to remote storage

    Publication No.: US11163690B2

    Publication Date: 2021-11-02

    Application No.: US16679585

    Application Date: 2019-11-11

    Applicant: NetApp Inc.

    Abstract: In addition to caching I/O operations at a host, at least some data management can migrate to the host. With host side caching, data sharing or deduplication can be implemented with the cached writes before those writes are supplied to front end storage elements. When a host cache flush to distributed storage trigger is detected, the host deduplicates the cached writes. The host aggregates data based on the deduplication into a “change set file” (i.e., a file that includes the aggregation of unique data from the cached writes). The host supplies the change set file to the distributed storage system. The host then sends commands to the distributed storage system. Each of the commands identifies a part of the change set file to be used for a target of the cached writes.
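The change set file and the follow-up commands described here can be sketched as below. The sketch assumes each cached write is a `(target, data)` pair, that duplicates are detected by SHA-256 digest, and that a command is a `(target, offset, length)` tuple pointing into the change set. The function name `build_change_set` and these shapes are hypothetical.

```python
import hashlib


def build_change_set(cached_writes):
    """Deduplicate cached writes into one change set plus per-target commands.

    cached_writes: list of (target, data) pairs, where data is bytes
    Returns (change_set_bytes, commands); each command is
    (target, offset, length) identifying a part of the change set.
    """
    change_set = bytearray()
    offsets = {}   # digest -> offset of the first copy in the change set
    commands = []
    for target, data in cached_writes:
        digest = hashlib.sha256(data).digest()
        if digest not in offsets:
            # First time this content is seen: append it exactly once.
            offsets[digest] = len(change_set)
            change_set.extend(data)
        commands.append((target, offsets[digest], len(data)))
    return bytes(change_set), commands


if __name__ == "__main__":
    writes = [("a", b"xx"), ("b", b"yyy"), ("c", b"xx")]
    change_set, commands = build_change_set(writes)
    print(change_set, commands)
```

In this example targets `a` and `c` share the same two bytes in the change set, so the duplicated payload crosses the wire only once.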

    METHODS FOR HANDLING STORAGE DEVICES WITH DIFFERENT ZONE SIZES AND DEVICES THEREOF

    Publication No.: US20210334025A1

    Publication Date: 2021-10-28

    Application No.: US16857919

    Application Date: 2020-04-24

    Applicant: NetApp, Inc.

    Abstract: The disclosed technology relates to determining a first subset of a plurality of drives having a first zone size and a second subset of the plurality of drives having a second zone size different from the first zone size, within a redundant array of independent disks (RAID) group. A prevailing zone size between the first zone size and the second zone size is determined. One or more logical zones within the determined first subset of the plurality of drives and the determined second subset of the plurality of drives are reserved for a received input-output operation based on the determined prevailing zone size. The received input-output operation is completed within the reserved one or more logical zones within the determined first subset of the plurality of drives and the determined second subset of the plurality of drives.

    LOW-OVERHEAD ATOMIC WRITES FOR PERSISTENT MEMORY

    Publication No.: US20210326266A1

    Publication Date: 2021-10-21

    Application No.: US16852589

    Application Date: 2020-04-20

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for atomic writes for persistent memory. In response to receiving a write operation, a new per-page structure with a new page block number is allocated. New data of the write operation is persisted to a new page of the persistent memory having the new page block number, and the new per-page structure is persisted to the persistent memory. If the write operation targets a hole after the new data and the new per-page structure have been persisted, then a new per-page structure identifier of the new per-page structure is inserted into a parent indirect page of a page comprising the new data. If the write operation targets old data after the new data and the new per-page structure have been persisted, then an old per-page structure of the old data is updated with the new page block number.
