Techniques for efficient data deduplication

    Publication No.: US11416462B2

    Publication date: 2022-08-16

    Application No.: US16927257

    Filing date: 2020-07-13

    Inventors: Peng Wu, Bin Dai, Rong Yu

    Abstract: Data deduplication techniques may use a fingerprint hash table and a backend location hash table in connection with performing operations including fingerprint insertion, fingerprint deletion and fingerprint lookup. Processing I/O operations may include: receiving a write operation that writes data to a target logical address; determining a fingerprint for the data; querying the fingerprint hash table using the fingerprint to determine a matching entry of the fingerprint hash table for the fingerprint; and responsive to determining that the fingerprint hash table does not have the matching entry that matches the fingerprint, performing processing including: inserting a first entry in the fingerprint hash table, wherein the first entry includes the fingerprint for the data and identifies a storage location at which the data is stored; and inserting a second entry in a backend location hash table, wherein the second entry references the first entry.
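
The write path in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `DedupIndex`, the use of SHA-256 as the fingerprint, the counter used as a storage allocator, and the handling of a matching fingerprint (reusing the existing location) are all assumptions made for illustration.

```python
import hashlib


class DedupIndex:
    """Toy model of the two tables named in the abstract (hypothetical names)."""

    def __init__(self):
        self.fingerprint_table = {}  # fingerprint -> entry (the "first entry")
        self.backend_table = {}      # storage location -> same entry (the "second entry")
        self.logical_map = {}        # target logical address -> storage location
        self._next_location = 0      # assumption: trivial location allocator

    def write(self, logical_address, data: bytes):
        """Return (storage_location, was_deduplicated) for one write."""
        fingerprint = hashlib.sha256(data).hexdigest()  # assumption: SHA-256
        entry = self.fingerprint_table.get(fingerprint)
        deduplicated = entry is not None
        if entry is None:
            # No matching entry: record where the data is stored, then add a
            # backend-location entry that references the fingerprint entry.
            location = self._next_location
            self._next_location += 1
            entry = {"fingerprint": fingerprint, "location": location}
            self.fingerprint_table[fingerprint] = entry
            self.backend_table[location] = entry
        # Assumption: on a match, the logical address maps to the existing copy.
        self.logical_map[logical_address] = entry["location"]
        return entry["location"], deduplicated


index = DedupIndex()
first = index.write(0, b"block of data")
second = index.write(8, b"block of data")
```

Keying the second table by storage location lets a fingerprint entry be found from the location alone, which is one plausible reason for keeping both tables. The abstract does not spell that out.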

    Synchronous destage of write data from shared global memory to back-end storage resources

    Publication No.: US11573738B2

    Publication date: 2023-02-07

    Application No.: US17151794

    Filing date: 2021-01-19

    Abstract: A synchronous destage process is used to move data from shared global memory to back-end storage resources. The synchronous destage process is implemented using a client-server model between a data service layer (client) and back-end disk array of a storage system (server). The data service layer initiates a synchronous destage operation by requesting that the back-end disk array move data from one or more slots of global memory to back-end storage resources. The back-end disk array services the request and notifies the data service layer of the status of the destage operation, e.g. a destage success or destage failure. If the destage operation is a success, the data service layer updates metadata to identify the location of the data on back-end storage resources. If the destage operation is not successful, the data service layer re-initiates the destage process by issuing a subsequent destage request to the back-end disk array.
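
The client-server exchange described in this abstract might look roughly like the following sketch. The class names, the `fail_first_n` fault injection, and the `max_attempts` cap are hypothetical. The abstract only says that the data service layer re-initiates the destage after a failure, and does not mention a retry limit.

```python
from enum import Enum


class DestageStatus(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class BackEndDiskArray:
    """Server side: moves slot data to back-end storage and reports status."""

    def __init__(self, fail_first_n=0):
        self.storage = {}                   # back-end location -> data
        self._failures_left = fail_first_n  # simulated faults, for the demo only

    def destage(self, global_memory, slots):
        if self._failures_left > 0:
            self._failures_left -= 1
            return DestageStatus.FAILURE, {}
        locations = {}
        for slot in slots:
            location = len(self.storage)    # assumption: append-only placement
            self.storage[location] = global_memory[slot]
            locations[slot] = location
        return DestageStatus.SUCCESS, locations


class DataServiceLayer:
    """Client side: requests the destage and updates metadata on success."""

    def __init__(self, back_end, global_memory):
        self.back_end = back_end
        self.global_memory = global_memory
        self.metadata = {}                  # slot -> back-end location

    def synchronous_destage(self, slots, max_attempts=3):
        """Return the number of attempts the destage took."""
        for attempt in range(1, max_attempts + 1):
            status, locations = self.back_end.destage(self.global_memory, slots)
            if status is DestageStatus.SUCCESS:
                self.metadata.update(locations)
                return attempt
            # Failure: loop around and issue a subsequent destage request.
        raise RuntimeError(f"destage failed after {max_attempts} attempts")


memory = {0: b"slot zero", 1: b"slot one"}
layer = DataServiceLayer(BackEndDiskArray(fail_first_n=1), memory)
attempts = layer.synchronous_destage([0, 1])
```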

    Methods and devices for data de-duplication

    Publication No.: US10936560B2

    Publication date: 2021-03-02

    Application No.: US15846370

    Filing date: 2017-12-19

    Abstract: Embodiments of the present disclosure disclose methods and devices of data de-duplication. The method of data de-duplication performed at a client comprises: in response to receiving data to be backed up at a client, sampling the data to be backed up to obtain the sampled data; generating a signature for the sampled data; transmitting the signature to a master storage node in a storage cluster including a plurality of storage nodes, to allow the master storage node to select one storage node from the plurality of storage nodes; receiving an indication of the selected storage node from the master storage node; and transmitting, based on the indication, the data to be backed up to the selected storage node. Embodiments of the present disclosure also provide methods of data de-duplication performed at the master storage node and the slave storage node, and corresponding devices.
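
The client-side steps in this abstract (sample, sign, ask the master, send) can be sketched as follows. The sampling rule (fixed-width windows at a fixed stride), the SHA-256 signature, and the master's selection policy are all assumptions. The abstract does not say how data is sampled or how the master storage node picks a node.

```python
import hashlib


def sample(data: bytes, stride: int = 4096, width: int = 64) -> bytes:
    """Take `width` bytes at every `stride` offset (hypothetical sampling rule)."""
    return b"".join(data[i:i + width] for i in range(0, len(data), stride))


def signature(sampled: bytes) -> str:
    """Signature of the sampled data (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(sampled).hexdigest()


class MasterNode:
    """Picks a storage node for a signature (hypothetical policy)."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.seen = {}  # signature -> node chosen earlier

    def select_node(self, sig: str) -> str:
        # Assumption: route a repeated signature to the node that already
        # holds similar data, so duplicates land where they can be removed.
        if sig not in self.seen:
            index = int(sig, 16) % len(self.node_names)
            self.seen[sig] = self.node_names[index]
        return self.seen[sig]


def back_up(data: bytes, master: MasterNode, nodes: dict) -> str:
    """Client flow: sample, sign, ask the master, send data to the chosen node."""
    chosen = master.select_node(signature(sample(data)))
    nodes[chosen].append(data)
    return chosen


nodes = {"node-1": [], "node-2": [], "node-3": []}
master = MasterNode(nodes)
target = back_up(b"x" * 10_000, master, nodes)
```

Only the signature crosses the network before a node is chosen, so the client sends the full data once, to a single node.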

    Techniques performed in connection with an insufficient resource level when processing write data

    Publication No.: US10776290B1

    Publication date: 2020-09-15

    Application No.: US16568576

    Filing date: 2019-09-12

    Abstract: Techniques for processing I/O operations include: determining whether a current amount of unused physical storage is greater than a threshold; and responsive to determining the current amount of unused physical storage is greater than the threshold, performing normal write processing, and otherwise performing alternative write processing. The alternative write processing includes: initializing a counter; determining whether a physical storage allocation is needed or potentially needed for a write I/O operation; responsive to determining that no physical storage allocation is needed for the write I/O operation, performing the normal write processing; responsive to determining that a physical storage allocation is needed or potentially needed for the write I/O operation, determining a first amount of one or more credits needed to service the write I/O operation; and responsive to determining the counter does not include at least the first amount of one or more credits, failing the write I/O operation.
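
The decision flow in this abstract can be sketched as below. The names `WriteProcessor` and `WriteFailed` are hypothetical, and so is the branch in which enough credits are available: the abstract only states what happens when the counter falls short, so deducting the credits and servicing the write is an assumption.

```python
class WriteFailed(Exception):
    """Raised when a write cannot be serviced for lack of credits."""


class WriteProcessor:
    def __init__(self, unused_storage, threshold, initial_credits):
        self.unused_storage = unused_storage
        self.threshold = threshold
        self.initial_credits = initial_credits
        self.counter = None  # credit counter, set on entering alternative mode

    def process_write(self, needs_allocation, credits_needed=1):
        # Enough free physical storage: normal write processing.
        if self.unused_storage > self.threshold:
            self.counter = None
            return "normal"
        # Alternative write processing: initialize the counter once.
        if self.counter is None:
            self.counter = self.initial_credits
        # Writes that need no new physical storage proceed normally.
        if not needs_allocation:
            return "normal"
        # Allocation needed or potentially needed: check the credits.
        if self.counter < credits_needed:
            raise WriteFailed("insufficient credits for write")
        self.counter -= credits_needed  # assumption: consume credits, then write
        return "normal"


processor = WriteProcessor(unused_storage=10, threshold=100, initial_credits=2)
processor.process_write(needs_allocation=True, credits_needed=2)
```

After the call above the counter is at zero, so any further write that needs an allocation raises `WriteFailed`, while writes that need no allocation still succeed.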

    Data protection cluster system supporting multiple data tiers

    Publication No.: US10719417B2

    Publication date: 2020-07-21

    Application No.: US15883832

    Filing date: 2018-01-30

    Inventors: Peng Wu, Yong Zou

    Abstract: A hierarchical multi-level heterogeneous cluster data system having processing nodes at each of a plurality of cluster levels configured for different data tiers having different availability, accessibility and protection requirements. Each cluster level comprises groups of processing nodes arranged into a plurality of failover domains of interconnected nodes that exchange heartbeat signals to indicate that the nodes are alive and functioning. A master node of each failover domain is connected to a master node of a parent failover domain for exchanging heartbeat signals to detect failures of nodes at lower cluster levels. Upon a network partition, the nodes of the failover domain may be merged into another failover domain at the same or a higher cluster level to continue providing data services. The cluster has a global namespace across all cluster levels, so that nodes that are moved to different failover domains can be accessed using the same pathname.
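
Heartbeat-based failure detection and domain merging, as described in this abstract, can be modeled in a few lines. This is a simplified sketch: the class name `FailoverDomain`, the timeout value, and the explicit `now` timestamps (used in place of a real clock) are illustrative assumptions, and the master-to-master heartbeats and the global namespace are not modeled.

```python
class FailoverDomain:
    """A group of nodes that track each other's heartbeats (hypothetical model)."""

    def __init__(self, name, nodes, timeout=3.0):
        self.name = name
        self.timeout = timeout  # assumption: seconds of silence before failure
        self.last_heartbeat = {node: 0.0 for node in nodes}

    def heartbeat(self, node, now):
        """Record that `node` was alive at time `now`."""
        self.last_heartbeat[node] = now

    def failed_nodes(self, now):
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_heartbeat.items()
            if now - seen > self.timeout
        )

    def merge_into(self, other, now):
        """Move surviving nodes into another domain, e.g. after a partition."""
        for node in list(self.last_heartbeat):
            if now - self.last_heartbeat[node] <= self.timeout:
                other.last_heartbeat[node] = now
            del self.last_heartbeat[node]


rack = FailoverDomain("rack-1", ["a", "b"])
site = FailoverDomain("site-1", ["m"])
rack.heartbeat("a", now=5.0)
silent = rack.failed_nodes(now=6.0)  # "b" has sent nothing since time 0
rack.merge_into(site, now=6.0)       # "a" continues service in the other domain
```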

    Techniques for efficient data deduplication

    Publication No.: US11803527B2

    Publication date: 2023-10-31

    Application No.: US17864717

    Filing date: 2022-07-14

    Inventors: Peng Wu, Bin Dai, Rong Yu

    CPC classification number: G06F16/215 G06F16/174 G06F16/2255 G06F16/245

    Abstract: Data deduplication techniques may use a fingerprint hash table and a backend location hash table in connection with performing operations including fingerprint insertion, fingerprint deletion and fingerprint lookup. Processing I/O operations may include: receiving a write operation that writes data to a target logical address; determining a fingerprint for the data; querying the fingerprint hash table using the fingerprint to determine a matching entry of the fingerprint hash table for the fingerprint; and responsive to determining that the fingerprint hash table does not have the matching entry that matches the fingerprint, performing processing including: inserting a first entry in the fingerprint hash table, wherein the first entry includes the fingerprint for the data and identifies a storage location at which the data is stored; and inserting a second entry in a backend location hash table, wherein the second entry references the first entry.

    Data fingerprint distribution on a data storage system

    Publication No.: US10782882B1

    Publication date: 2020-09-22

    Application No.: US16387997

    Filing date: 2019-04-18

    Inventors: Peng Wu, Bin Dai, Rong Yu

    Abstract: Fingerprints of data portions are distributed in a balanced manner across active controllers of a data storage system, and this may be done in such a manner that, when a new active controller is added to the system, fingerprint ownership and movement between pre-existing active controllers, and active controllers overall, is minimized. When a new active controller is added to the system and fingerprints are redistributed, no fingerprint ownership may be re-assigned between pre-existing active controllers and no fingerprints may be moved between pre-existing active controllers, for example, between local memories of the active controllers.
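
The abstract does not say how this property is achieved. Rendezvous (highest-random-weight) hashing is one well-known scheme with the same behavior, and it is used below purely as a stand-in, not as the patented method. The controller names and the SHA-256 scoring function are assumptions.

```python
import hashlib


def owner(fingerprint: str, controllers):
    """Assign a fingerprint to the controller with the highest hash score."""
    def score(controller):
        return hashlib.sha256(f"{controller}:{fingerprint}".encode()).digest()
    return max(controllers, key=score)


def moved_between_existing(fingerprints, old_controllers, new_controller):
    """Fingerprints whose owner changed to anything other than the new controller."""
    expanded = list(old_controllers) + [new_controller]
    return [
        fp for fp in fingerprints
        if owner(fp, expanded) not in (owner(fp, old_controllers), new_controller)
    ]


fingerprints = [f"fp-{i}" for i in range(1000)]
misplaced = moved_between_existing(fingerprints, ["A", "B", "C"], "D")  # empty list
```

Each fingerprint's owner is the controller with the maximum score, so adding a controller can only change an owner when the newcomer's score is the new maximum. Fingerprints therefore move only to the added controller, never between pre-existing ones.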
