Methods and systems for scalable deduplication

    Publication No.: US20230350863A1

    Publication date: 2023-11-02

    Application No.: US18347395

    Filing date: 2023-07-05

    IPC classification: G06F16/215

    CPC classification: G06F16/215

    Abstract: Methods, computer program products, computer systems, and the like are disclosed that provide for scalable deduplication. Such methods, computer program products, and computer systems can include, in response to receiving a request to perform a lookup operation, performing the lookup operation and, in response to the signature not being found, forwarding the request to a remote node. Further, in response to receiving an indication that the signature was not found at the remote node, processing the subunit of data as a unique subunit of data.
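
    The lookup flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class `DedupNode`, the method names, the use of SHA-256 as the signature, and the location strings are all assumptions made for illustration. A local miss is forwarded to a remote node, and a miss there causes the subunit to be treated as unique.

```python
import hashlib
from typing import Dict, Optional, Tuple


def signature_of(data: bytes) -> str:
    """Assumed signature function; the abstract does not name one."""
    return hashlib.sha256(data).hexdigest()


class DedupNode:
    """Hypothetical node holding a local signature index."""

    def __init__(self, name: str, remote: Optional["DedupNode"] = None) -> None:
        self.name = name
        self.remote = remote
        self.index: Dict[str, str] = {}  # signature -> storage location

    def lookup(self, signature: str) -> Optional[str]:
        """Look up locally; on a miss, forward the request to the remote node."""
        location = self.index.get(signature)
        if location is None and self.remote is not None:
            location = self.remote.lookup(signature)
        return location

    def ingest(self, data: bytes) -> Tuple[str, bool]:
        """Return (location, is_unique) for a subunit of data."""
        signature = signature_of(data)
        location = self.lookup(signature)
        if location is not None:
            return location, False  # duplicate found locally or remotely
        # Not found on either node: process as a unique subunit.
        location = f"{self.name}/{len(self.index)}"
        self.index[signature] = location
        return location, True
```

    With a `remote` node that already holds a subunit, `DedupNode("local", remote=remote).ingest(...)` on the same bytes returns the remote location and `False`; new bytes are stored locally and reported as unique.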

    Systems and methods for data placement in container-based storage systems

    Publication No.: US11132128B2

    Publication date: 2021-09-28

    Application No.: US15469157

    Filing date: 2017-03-24

    IPC classification: G06F3/06 G06F12/0862

    Abstract: The disclosed computer-implemented method for data placement in container-based storage systems may include (i) identifying a file stored within a container-based storage system, where the container-based storage system stores the file as data segments within containers, (ii) receiving, in response to a write operation directed to the file, a request to store within the container-based storage system a new data segment generated by the write operation, (iii) describing the file in terms of a plurality of consecutive slabs, (iv) determining that the new data segment falls within a specified slab, and (v) fulfilling the request to store the new data segment within the container-based storage system by storing the new data segment in a designated container that corresponds to the specified slab in response to determining that the new data segment falls within the specified slab. Various other methods, systems, and computer-readable media are also disclosed.
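
    Steps (iii) through (v) can be sketched as follows. This is an illustration under stated assumptions, not the disclosed method: slabs are assumed to be fixed-size byte ranges, one container is assumed per slab, and the names `SLAB_SIZE`, `slab_index`, and `SlabPlacement` are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

SLAB_SIZE = 4 * 1024 * 1024  # assumed slab size; the abstract does not give one


def slab_index(offset: int, slab_size: int = SLAB_SIZE) -> int:
    """Map a file offset to the index of the consecutive slab containing it."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return offset // slab_size


class SlabPlacement:
    """Hypothetical store that keeps one designated container per slab."""

    def __init__(self, slab_size: int = SLAB_SIZE) -> None:
        self.slab_size = slab_size
        # slab index -> list of (offset, segment) held in that slab's container
        self.containers: DefaultDict[int, List[Tuple[int, bytes]]] = defaultdict(list)

    def write(self, offset: int, segment: bytes) -> int:
        """Store a new segment in the container for its slab; return the slab index."""
        slab = slab_index(offset, self.slab_size)
        self.containers[slab].append((offset, segment))
        return slab
```

    Under these assumptions, segments written to nearby offsets of the same file land in the same container, regardless of the order in which the writes arrive.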

    Methods and systems for scalable deduplication

    Publication No.: US11741060B2

    Publication date: 2023-08-29

    Application No.: US16698288

    Filing date: 2019-11-27

    IPC classification: G06F16/215

    CPC classification: G06F16/215

    Abstract: Methods, computer program products, computer systems, and the like are disclosed that provide for scalable deduplication in an efficient and effective manner. For example, such methods, computer program products, and computer systems can include receiving a data object at an assigned node, determining whether the data object includes a sub-data object, and processing the sub-data object. The assigned node is a node of a plurality of nodes of a cluster, where the data object includes a data segment, and a signature. The signature is generated based, at least in part, on data of the data segment. The processing includes sending the sub-data object to a remote node. The remote node is another node of the plurality of nodes of the cluster.
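
    A rough sketch of the flow in this abstract follows. The abstract does not say how the remote node is chosen, so the signature-modulo routing in `pick_remote` is purely an assumption, as are the names `DataObject`, `ClusterNode`, and `make_cluster` and the choice of SHA-256 for the signature.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataObject:
    """A data segment, its signature, and optional sub-data objects."""
    segment: bytes
    sub_objects: List["DataObject"] = field(default_factory=list)
    signature: str = ""

    def __post_init__(self) -> None:
        # The signature is derived from the segment's data (SHA-256 assumed).
        if not self.signature:
            self.signature = hashlib.sha256(self.segment).hexdigest()


class ClusterNode:
    """Hypothetical cluster node that records the signatures it receives."""

    def __init__(self, node_id: int, cluster: List["ClusterNode"]) -> None:
        self.node_id = node_id
        self.cluster = cluster
        self.received: List[str] = []

    def pick_remote(self, obj: DataObject) -> "ClusterNode":
        # Assumed routing rule: choose among the other nodes by signature.
        others = [node for node in self.cluster if node is not self]
        return others[int(obj.signature, 16) % len(others)]

    def receive(self, obj: DataObject) -> None:
        """Process the object here and send each sub-data object to a remote node."""
        self.received.append(obj.signature)
        for sub in obj.sub_objects:
            self.pick_remote(sub).receive(sub)


def make_cluster(size: int) -> List[ClusterNode]:
    """Build a cluster of at least two nodes that all share one membership list."""
    if size < 2:
        raise ValueError("a remote node requires at least two nodes")
    cluster: List[ClusterNode] = []
    cluster.extend(ClusterNode(i, cluster) for i in range(size))
    return cluster
```

    In this sketch the assigned node keeps only the parent object's signature, while every sub-data object ends up on some other node of the cluster.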

    Systems and methods for improving the efficiency of recording data to tape

    Publication No.: US10296221B1

    Publication date: 2019-05-21

    Application No.: US15192685

    Filing date: 2016-06-24

    IPC classification: G06F3/06

    Abstract: A computer-implemented method for improving the efficiency of recording data to tape may include (i) identifying a command to duplicate a data unit to tape storage after a previous version of the data unit has already been duplicated to tape storage, (ii) identifying metadata that distinguishes between segments of the data unit that have not changed since the previous version of the data unit and segments that have changed, (iii) reading the previous version of the data unit from tape storage and reading the segments of the data unit that have changed from a data sharing storage rather than tape storage, and (iv) combining, using the metadata, the segments read from tape storage that have not changed and the segments read from the data sharing storage that have changed to duplicate the data unit to tape storage. Various other methods, systems, and computer-readable media are also disclosed.
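
    Step (iv), the metadata-driven combination, can be sketched as below. The representation is an assumption: the metadata is modeled as one boolean per segment, tape contents as a list indexed by segment position, and the data sharing storage as a mapping from segment position to bytes. The function name `combine_segments` is hypothetical.

```python
from typing import List, Mapping, Sequence


def combine_segments(
    tape_segments: Sequence[bytes],
    shared_storage: Mapping[int, bytes],
    changed: Sequence[bool],
) -> List[bytes]:
    """Assemble the new version of a data unit for writing to tape.

    changed[i] is the assumed metadata flag for segment i: True means the
    segment is read from the data sharing storage, False means it is reused
    from the previous version read from tape.
    """
    combined: List[bytes] = []
    for position, is_changed in enumerate(changed):
        if is_changed:
            combined.append(shared_storage[position])
        else:
            combined.append(tape_segments[position])
    return combined
```

    For example, with a previous version `[b"A", b"B", b"C"]` on tape and only the middle segment changed to `b"X"`, the function yields `[b"A", b"X", b"C"]` while reading a single segment from the data sharing storage.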

Systems and methods for data placement in container-based storage systems

    Publication No.: US20180275886A1

    Publication date: 2018-09-27

    Application No.: US15469157

    Filing date: 2017-03-24

    IPC classification: G06F3/06 G06F12/0862

    Abstract: The disclosed computer-implemented method for data placement in container-based storage systems may include (i) identifying a file stored within a container-based storage system, where the container-based storage system stores the file as data segments within containers, (ii) receiving, in response to a write operation directed to the file, a request to store within the container-based storage system a new data segment generated by the write operation, (iii) describing the file in terms of a plurality of consecutive slabs, (iv) determining that the new data segment falls within a specified slab, and (v) fulfilling the request to store the new data segment within the container-based storage system by storing the new data segment in a designated container that corresponds to the specified slab in response to determining that the new data segment falls within the specified slab. Various other methods, systems, and computer-readable media are also disclosed.

    Fingerprint change during data operations

    Publication No.: US09952933B1

    Publication date: 2018-04-24

    Application No.: US14588008

    Filing date: 2014-12-31

    CPC classification: G06F11/1448

    Abstract: Various systems, methods, and processes for caching and referencing multiple fingerprints while data operations are ongoing are disclosed. A first fingerprint is generated based on a first fingerprinting process. The first fingerprint is stored in association with a second fingerprint, which is based on a second fingerprinting process. The first fingerprint and the second fingerprint are associated with the same data segment. Data operations such as a backup operation, a restore operation, or a replication operation can be performed while the conversion of the data segment from the second fingerprint to the first fingerprint is ongoing.
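
    The association between the two fingerprints can be sketched as follows. The abstract does not name the fingerprinting processes, so SHA-1 (as the second, older process) and SHA-256 (as the first, newer process) are assumptions, and `FingerprintStore` with its methods is a hypothetical structure rather than the patented design.

```python
import hashlib
from typing import Dict, Optional


def second_fingerprint(data: bytes) -> str:
    """Assumed older fingerprinting process."""
    return hashlib.sha1(data).hexdigest()


def first_fingerprint(data: bytes) -> str:
    """Assumed newer fingerprinting process."""
    return hashlib.sha256(data).hexdigest()


class FingerprintStore:
    """Hypothetical segment store that accepts either fingerprint during conversion."""

    def __init__(self) -> None:
        self.segments: Dict[str, bytes] = {}  # second fingerprint -> segment data
        self.first_to_second: Dict[str, str] = {}  # cached association

    def add_legacy(self, data: bytes) -> str:
        """Store a segment under its second (older) fingerprint."""
        fingerprint = second_fingerprint(data)
        self.segments[fingerprint] = data
        return fingerprint

    def convert(self, old_fingerprint: str) -> str:
        """Generate the first fingerprint and cache it alongside the second."""
        new_fingerprint = first_fingerprint(self.segments[old_fingerprint])
        self.first_to_second[new_fingerprint] = old_fingerprint
        return new_fingerprint

    def get(self, fingerprint: str) -> Optional[bytes]:
        """Resolve a segment by either fingerprint, so reads work mid-conversion."""
        old_fingerprint = self.first_to_second.get(fingerprint, fingerprint)
        return self.segments.get(old_fingerprint)
```

    Because `get` accepts both forms, a backup or restore in this sketch can reference a segment by whichever fingerprint it holds, whether or not that segment has been converted yet.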

Methods and systems for scalable deduplication

    Publication No.: US20210157777A1

    Publication date: 2021-05-27

    Application No.: US16698288

    Filing date: 2019-11-27

    IPC classification: G06F16/215

    Abstract: Methods, computer program products, computer systems, and the like are disclosed that provide for scalable deduplication in an efficient and effective manner. For example, such methods, computer program products, and computer systems can include receiving a data object at an assigned node, determining whether the data object includes a sub-data object, and processing the sub-data object. The assigned node is a node of a plurality of nodes of a cluster, where the data object includes a data segment, and a signature. The signature is generated based, at least in part, on data of the data segment. The processing includes sending the sub-data object to a remote node. The remote node is another node of the plurality of nodes of the cluster.

    Fingerprint change during data operations

    Publication No.: US10983867B1

    Publication date: 2021-04-20

    Application No.: US15959489

    Filing date: 2018-04-23

    Abstract: Various systems, methods, and processes for caching and referencing multiple fingerprints while data operations are ongoing are disclosed. A first fingerprint is generated based on a first fingerprinting process. The first fingerprint is stored in association with a second fingerprint, which is based on a second fingerprinting process. The first fingerprint and the second fingerprint are associated with the same data segment. Data operations such as a backup operation, a restore operation, or a replication operation can be performed while the conversion of the data segment from the second fingerprint to the first fingerprint is ongoing.