ASYNCHRONOUS SEMI-INLINE DEDUPLICATION

    Publication number: US20210342082A1

    Publication date: 2021-11-04

    Application number: US17373820

    Filing date: 2021-07-13

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for asynchronous semi-inline deduplication. A multi-tiered storage arrangement comprises a first storage tier, a second storage tier, etc. An in-memory change log of data recently written to the first storage tier is evaluated to identify a fingerprint of a data block recently written to the first storage tier. A donor data store, comprising fingerprints of data blocks already stored within the first storage tier, is queried using the fingerprint. If the fingerprint is found, then deduplication is performed for the data block to create deduplicated data based upon a potential donor data block within the first storage tier. The deduplicated data is moved from the first storage tier to the second storage tier, such as in response to a determination that the deduplicated data has not been recently accessed. The deduplication is performed before cold data is moved from the first storage tier to the second storage tier.
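The flow described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class `TieredStore`, its method names, the SHA-256 fingerprint, the byte-for-byte check against the potential donor, the logical clock used for "recently accessed", and the write-once restriction are all assumptions made for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hypothetical fingerprint: SHA-256 of the block contents."""
    return hashlib.sha256(data).hexdigest()


class TieredStore:
    """Toy two-tier store (all names are illustrative assumptions)."""

    def __init__(self):
        self.first_tier = {}    # physical id -> bytes
        self.second_tier = {}   # physical id -> bytes
        self.block_map = {}     # logical block id -> physical id
        self.change_log = []    # in-memory log of (logical id, fingerprint)
        self.donor_store = {}   # fingerprint -> physical id of a donor block
        self.last_access = {}   # physical id -> logical timestamp
        self.clock = 0
        self._next_phys = 0

    def write(self, logical_id, data: bytes):
        if logical_id in self.block_map:
            raise ValueError("this sketch assumes write-once blocks")
        self.clock += 1
        phys = self._next_phys
        self._next_phys += 1
        # The write path only stores the block and logs its fingerprint.
        self.first_tier[phys] = data
        self.block_map[logical_id] = phys
        self.last_access[phys] = self.clock
        self.change_log.append((logical_id, fingerprint(data)))

    def read(self, logical_id) -> bytes:
        self.clock += 1
        phys = self.block_map[logical_id]
        if phys in self.first_tier:
            self.last_access[phys] = self.clock
            return self.first_tier[phys]
        return self.second_tier[phys]

    def deduplicate(self):
        """Asynchronous pass over the change log, off the write path."""
        while self.change_log:
            logical_id, fp = self.change_log.pop(0)
            phys = self.block_map[logical_id]
            donor = self.donor_store.get(fp)
            if donor is None or donor not in self.first_tier:
                # No usable donor in the first tier: this block becomes one.
                self.donor_store[fp] = phys
            elif donor != phys and self.first_tier[donor] == self.first_tier[phys]:
                # Fingerprint hit confirmed by comparing bytes (an assumption):
                # share the donor's physical block and free the duplicate.
                self.block_map[logical_id] = donor
                del self.first_tier[phys]
                del self.last_access[phys]

    def tier_cold(self, max_idle: int):
        """Move blocks idle for at least max_idle ticks to the second tier."""
        self.deduplicate()  # deduplicate first, so duplicates are never moved
        for phys in list(self.first_tier):
            if self.clock - self.last_access[phys] >= max_idle:
                self.second_tier[phys] = self.first_tier.pop(phys)
```

Calling `deduplicate()` at the start of `tier_cold()` is one simple way to guarantee the ordering the abstract requires; a real system would presumably run the pass on its own schedule.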

    MIRRORING OBJECTS BETWEEN DIFFERENT CLOUD PROVIDERS WITH DIFFERENT DATA LAYOUT REQUIREMENTS

    Publication number: US20240362124A1

    Publication date: 2024-10-31

    Application number: US18308337

    Filing date: 2023-04-27

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for mirroring objects between object stores hosted by cloud providers that have different data layout requirements. An object may be stored within a first object store that supports a fixed offset format where uncompressed data is stored according to fixed offsets and boundaries within fixed size objects. A mirroring operation may be used to mirror the object to a second object store that supports a unified object format where compressed data can be stored at non-fixed offsets and boundaries within variable sized objects. The mirroring operation selects a compression algorithm and compresses the object on the fly to create a mirrored object having the unified object format. The mirrored object, populated with the compressed data and slot header metadata comprising compression information for how to locate and decompress the data in the mirrored object, is stored into the second object store.
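As a rough illustration of the mirroring step, the sketch below converts a fixed-offset object into a variable-sized object with a slot header. The on-disk layout (a 4-byte length prefix followed by a JSON slot header), the 4096-byte slot size, the candidate codecs, and the "smallest output on the first slot" selection rule are all assumptions; the abstract does not specify any of them.

```python
import bz2
import json
import struct
import zlib

SLOT_SIZE = 4096  # hypothetical fixed slot size in the source object

# Hypothetical codec table: name -> (compress, decompress)
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
}


def select_algorithm(sample: bytes) -> str:
    """Pick the codec that compresses the sample best (an assumed policy)."""
    return min(CODECS, key=lambda name: len(CODECS[name][0](sample)))


def mirror_to_unified(fixed_obj: bytes) -> bytes:
    """Compress each fixed-size slot and pack it at a non-fixed offset."""
    algorithm = select_algorithm(fixed_obj[:SLOT_SIZE])
    compress = CODECS[algorithm][0]
    payload = bytearray()
    slot_header = []
    for start in range(0, len(fixed_obj), SLOT_SIZE):
        compressed = compress(fixed_obj[start:start + SLOT_SIZE])
        # The slot header records where each slot lives and how to decode it.
        slot_header.append({
            "offset": len(payload),
            "length": len(compressed),
            "algorithm": algorithm,
        })
        payload += compressed
    header_bytes = json.dumps(slot_header).encode("utf-8")
    return struct.pack(">I", len(header_bytes)) + header_bytes + bytes(payload)


def read_slot(unified_obj: bytes, index: int) -> bytes:
    """Locate and decompress one slot using only the slot header."""
    (header_len,) = struct.unpack_from(">I", unified_obj, 0)
    slot_header = json.loads(unified_obj[4:4 + header_len])
    entry = slot_header[index]
    start = 4 + header_len + entry["offset"]
    decompress = CODECS[entry["algorithm"]][1]
    return decompress(unified_obj[start:start + entry["length"]])
```

Because every slot carries its own offset, length, and algorithm, a reader can decompress a single slot without scanning the rest of the mirrored object.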

    Asynchronous semi-inline deduplication

    Publication number: US11068182B2

    Publication date: 2021-07-20

    Application number: US16683466

    Filing date: 2019-11-14

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for asynchronous semi-inline deduplication. A multi-tiered storage arrangement comprises a first storage tier, a second storage tier, etc. An in-memory change log of data recently written to the first storage tier is evaluated to identify a fingerprint of a data block recently written to the first storage tier. A donor data store, comprising fingerprints of data blocks already stored within the first storage tier, is queried using the fingerprint. If the fingerprint is found, then deduplication is performed for the data block to create deduplicated data based upon a potential donor data block within the first storage tier. The deduplicated data is moved from the first storage tier to the second storage tier, such as in response to a determination that the deduplicated data has not been recently accessed. The deduplication is performed before cold data is moved from the first storage tier to the second storage tier.

    INLINE DEDUPLICATION
    Document type: Patent application

    Publication number: US20200159432A1

    Publication date: 2020-05-21

    Application number: US16774127

    Filing date: 2020-01-28

    Applicant: NetApp Inc.

    Abstract: One or more techniques and/or computing devices are provided for inline deduplication. For example, a checksum hash table and/or a block number hash table may be maintained within memory (e.g., a storage controller may maintain the hash tables in-core). The checksum hash table may be utilized for inline deduplication to identify potential donor blocks that may comprise the same data as an incoming storage operation. Data within an in-core buffer cache is eligible as potential donor blocks so that inline deduplication may be performed using data from the in-core buffer cache, which may mitigate disk access to underlying storage for which the in-core buffer cache is used for caching. The block number hash table may be used for updating or removing entries from the hash tables, such as for blocks that are no longer eligible as potential donor blocks (e.g., deleted blocks, blocks evicted from the in-core buffer cache, etc.).
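The two hash tables in this abstract can be sketched as a pair of dictionaries kept alongside an in-memory buffer cache. The class and method names below are hypothetical, CRC-32 stands in for whatever checksum a real controller would use, and the byte comparison that confirms a candidate donor is an assumption.

```python
import zlib


class InlineDedupCache:
    """Toy in-core buffer cache with the two lookup tables (names assumed)."""

    def __init__(self):
        self.buffer_cache = {}      # block number -> bytes held in memory
        self.by_checksum = {}       # checksum -> block number (donor lookup)
        self.by_block_number = {}   # block number -> checksum (for removal)

    def cache_block(self, block_number: int, data: bytes):
        """Insert or replace a cached block and make it a potential donor."""
        self.evict(block_number)  # drop any stale entries for this block
        checksum = zlib.crc32(data)
        self.buffer_cache[block_number] = data
        self.by_checksum[checksum] = block_number
        self.by_block_number[block_number] = checksum

    def find_donor(self, data: bytes):
        """Return a cached block number holding identical data, or None."""
        donor = self.by_checksum.get(zlib.crc32(data))
        # Checksums can collide, so confirm against the cached bytes.
        if donor is not None and self.buffer_cache[donor] == data:
            return donor
        return None

    def evict(self, block_number: int):
        """Remove a deleted or evicted block from the cache and both tables."""
        checksum = self.by_block_number.pop(block_number, None)
        if checksum is not None and self.by_checksum.get(checksum) == block_number:
            del self.by_checksum[checksum]
        self.buffer_cache.pop(block_number, None)
```

An incoming write would call `find_donor()` first and, on a hit, reference the donor block instead of writing new data; the donor check reads only from memory, which is the disk-access saving the abstract points to.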

    MIRRORING OBJECTS BETWEEN DIFFERENT CLOUD PROVIDERS

    Publication number: US20240362183A1

    Publication date: 2024-10-31

    Application number: US18308313

    Filing date: 2023-04-27

    Applicant: NetApp Inc.

    CPC classification number: G06F16/125 G06F16/1744

    Abstract: Techniques are provided for mirroring objects between object stores hosted by cloud providers that could have different data layout requirements. An object may be stored within an object store that supports a unified object format where the object is capable of storing compressed data. The object may be mirrored to a destination object store that may also support the unified object format or to a destination object store that does not support the unified object format. If the destination object store does not support the unified object format, then slot header metadata within the object is used to decompress the data within the object into an uncompressed format. The data is packaged from being in the uncompressed format into a fixed offset format supported by the destination object store to create a mirrored object that is stored into the destination object store while retaining compression of the data.

    Asynchronous semi-inline deduplication

    Publication number: US11620064B2

    Publication date: 2023-04-04

    Application number: US17373820

    Filing date: 2021-07-13

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for asynchronous semi-inline deduplication. A multi-tiered storage arrangement comprises a first storage tier, a second storage tier, etc. An in-memory change log of data recently written to the first storage tier is evaluated to identify a fingerprint of a data block recently written to the first storage tier. A donor data store, comprising fingerprints of data blocks already stored within the first storage tier, is queried using the fingerprint. If the fingerprint is found, then deduplication is performed for the data block to create deduplicated data based upon a potential donor data block within the first storage tier. The deduplicated data is moved from the first storage tier to the second storage tier, such as in response to a determination that the deduplicated data has not been recently accessed. The deduplication is performed before cold data is moved from the first storage tier to the second storage tier.

    ASYNCHRONOUS SEMI-INLINE DEDUPLICATION
    Document type: Patent application

    Publication number: US20180173449A1

    Publication date: 2018-06-21

    Application number: US15386544

    Filing date: 2016-12-21

    Applicant: NetApp Inc.

    CPC classification number: G06F3/0641 G06F3/0608 G06F3/0683

    Abstract: Techniques are provided for asynchronous semi-inline deduplication. A multi-tiered storage arrangement comprises a first storage tier, a second storage tier, etc. An in-memory change log of data recently written to the first storage tier is evaluated to identify a fingerprint of a data block recently written to the first storage tier. A donor data store, comprising fingerprints of data blocks already stored within the first storage tier, is queried using the fingerprint. If the fingerprint is found, then deduplication is performed for the data block to create deduplicated data based upon a potential donor data block within the first storage tier. The deduplicated data is moved from the first storage tier to the second storage tier, such as in response to a determination that the deduplicated data has not been recently accessed. The deduplication is performed before cold data is moved from the first storage tier to the second storage tier.
