TECHNIQUES FOR EFFICIENT DATA DEDUPLICATION

    Publication number: US20220358103A1

    Publication date: 2022-11-10

    Application number: US17864717

    Application date: 2022-07-14

    Inventor: Peng Wu, Bin Dai, Rong Yu

    Abstract: Data deduplication techniques may use a fingerprint hash table and a backend location hash table in connection with performing operations including fingerprint insertion, fingerprint deletion and fingerprint lookup. Processing I/O operations may include: receiving a write operation that writes data to a target logical address; determining a fingerprint for the data; querying the fingerprint hash table using the fingerprint to determine a matching entry of the fingerprint hash table for the fingerprint; and responsive to determining that the fingerprint hash table does not have the matching entry that matches the fingerprint, performing processing including: inserting a first entry in the fingerprint hash table, wherein the first entry includes the fingerprint for the data and identifies a storage location at which the data is stored; and inserting a second entry in a backend location hash table, wherein the second entry references the first entry.
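The write path described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `DedupStore`, the use of SHA-256 as the fingerprint, the in-memory dictionaries, the handling of the hit case, and the `remove_location` helper are all hypothetical and chosen only to make the two-table relationship concrete.

```python
import hashlib


class DedupStore:
    """Toy model of a fingerprint hash table plus a backend location hash table."""

    def __init__(self):
        self.fingerprint_table = {}  # fingerprint -> entry (fingerprint + location)
        self.backend_table = {}      # storage location -> the same entry object
        self.logical_map = {}        # logical address -> storage location (assumed)
        self.blocks = {}             # storage location -> stored data (assumed)
        self._next_location = 0

    def write(self, logical_address, data):
        # Determine a fingerprint for the data (SHA-256 is an assumption).
        fingerprint = hashlib.sha256(data).digest()
        # Query the fingerprint hash table for a matching entry.
        entry = self.fingerprint_table.get(fingerprint)
        if entry is None:
            # No match: store the data and insert an entry in each table.
            location = self._next_location
            self._next_location += 1
            self.blocks[location] = data
            entry = {"fingerprint": fingerprint, "location": location}
            self.fingerprint_table[fingerprint] = entry  # first entry
            self.backend_table[location] = entry         # second entry references the first
        # Assumed hit handling: point the logical address at the stored copy.
        self.logical_map[logical_address] = entry["location"]
        return entry["location"]

    def remove_location(self, location):
        # Assumed use of the backend table: find the fingerprint entry by
        # location so it can be deleted without knowing the fingerprint.
        # Logical addresses are assumed to have been remapped by the caller.
        entry = self.backend_table.pop(location, None)
        if entry is None:
            return False
        del self.fingerprint_table[entry["fingerprint"]]
        self.blocks.pop(location, None)
        return True
```

In this sketch, writing the same bytes to two logical addresses yields one stored block and one entry in each table, and the backend location table lets an entry be found by storage location rather than by fingerprint.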

    TECHNIQUES FOR EFFICIENT DATA DEDUPLICATION

    Publication number: US20220012218A1

    Publication date: 2022-01-13

    Application number: US16927257

    Application date: 2020-07-13

    Inventor: Peng Wu, Bin Dai, Rong Yu

    Abstract: Data deduplication techniques may use a fingerprint hash table and a backend location hash table in connection with performing operations including fingerprint insertion, fingerprint deletion and fingerprint lookup. Processing I/O operations may include: receiving a write operation that writes data to a target logical address; determining a fingerprint for the data; querying the fingerprint hash table using the fingerprint to determine a matching entry of the fingerprint hash table for the fingerprint; and responsive to determining that the fingerprint hash table does not have the matching entry that matches the fingerprint, performing processing including: inserting a first entry in the fingerprint hash table, wherein the first entry includes the fingerprint for the data and identifies a storage location at which the data is stored; and inserting a second entry in a backend location hash table, wherein the second entry references the first entry.

    Data fingerprint distribution on a data storage system

    Publication number: US10303365B1

    Publication date: 2019-05-28

    Application number: US15884519

    Application date: 2018-01-31

    Inventor: Peng Wu, Bin Dai, Rong Yu

    Abstract: Fingerprints of data portions are distributed in a balanced manner across active controllers of a data storage system, and may be done so in such a manner that, when a new active controller is added to the system, fingerprint ownership and movement between pre-existing active controllers, and active controllers overall, is minimized. When a new active controller is added to the system and fingerprints are redistributed, no fingerprint ownership may be re-assigned between pre-existing active controllers and no fingerprints may be moved between pre-existing active controllers, for example, between local memories of the active controller.
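The redistribution property described in this abstract (fingerprints move only to the newly added controller, never between pre-existing ones) can be illustrated with rendezvous (highest-random-weight) hashing. This is a well-known technique substituted here for illustration; the abstract does not state which scheme the patent uses. The function name `owner` and the controller labels are hypothetical.

```python
import hashlib


def owner(fingerprint, controllers):
    """Pick the controller with the highest hash score for this fingerprint."""
    def score(controller):
        # Score depends only on the (controller, fingerprint) pair, so adding
        # a controller never changes the scores of the existing ones.
        return hashlib.sha256(controller.encode() + b"|" + fingerprint).digest()
    return max(controllers, key=score)


# Sample fingerprints (SHA-256 digests of arbitrary strings, for illustration).
fingerprints = [hashlib.sha256(str(i).encode()).digest() for i in range(1000)]

before = {fp: owner(fp, ["A", "B"]) for fp in fingerprints}
after = {fp: owner(fp, ["A", "B", "C"]) for fp in fingerprints}

# Fingerprints whose owner changed after controller "C" was added.
moved = [fp for fp in fingerprints if before[fp] != after[fp]]
```

Because each fingerprint's owner is the maximum over per-controller scores, adding a controller either leaves the owner unchanged or makes the new controller the owner, so every fingerprint in `moved` is now owned by `"C"`.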

    Techniques for efficient data deduplication

    Publication number: US11803527B2

    Publication date: 2023-10-31

    Application number: US17864717

    Application date: 2022-07-14

    Inventor: Peng Wu, Bin Dai, Rong Yu

    CPC classification number: G06F16/215 G06F16/174 G06F16/2255 G06F16/245

    Abstract: Data deduplication techniques may use a fingerprint hash table and a backend location hash table in connection with performing operations including fingerprint insertion, fingerprint deletion and fingerprint lookup. Processing I/O operations may include: receiving a write operation that writes data to a target logical address; determining a fingerprint for the data; querying the fingerprint hash table using the fingerprint to determine a matching entry of the fingerprint hash table for the fingerprint; and responsive to determining that the fingerprint hash table does not have the matching entry that matches the fingerprint, performing processing including: inserting a first entry in the fingerprint hash table, wherein the first entry includes the fingerprint for the data and identifies a storage location at which the data is stored; and inserting a second entry in a backend location hash table, wherein the second entry references the first entry.

    Data fingerprint distribution on a data storage system

    Publication number: US10782882B1

    Publication date: 2020-09-22

    Application number: US16387997

    Application date: 2019-04-18

    Inventor: Peng Wu, Bin Dai, Rong Yu

    Abstract: Fingerprints of data portions are distributed in a balanced manner across active controllers of a data storage system, and may be done so in such a manner that, when a new active controller is added to the system, fingerprint ownership and movement between pre-existing active controllers, and active controllers overall, is minimized. When a new active controller is added to the system and fingerprints are redistributed, no fingerprint ownership may be re-assigned between pre-existing active controllers and no fingerprints may be moved between pre-existing active controllers, for example, between local memories of the active controller.

    Techniques for efficient data deduplication

    Publication number: US11416462B2

    Publication date: 2022-08-16

    Application number: US16927257

    Application date: 2020-07-13

    Inventor: Peng Wu, Bin Dai, Rong Yu

    Abstract: Data deduplication techniques may use a fingerprint hash table and a backend location hash table in connection with performing operations including fingerprint insertion, fingerprint deletion and fingerprint lookup. Processing I/O operations may include: receiving a write operation that writes data to a target logical address; determining a fingerprint for the data; querying the fingerprint hash table using the fingerprint to determine a matching entry of the fingerprint hash table for the fingerprint; and responsive to determining that the fingerprint hash table does not have the matching entry that matches the fingerprint, performing processing including: inserting a first entry in the fingerprint hash table, wherein the first entry includes the fingerprint for the data and identifies a storage location at which the data is stored; and inserting a second entry in a backend location hash table, wherein the second entry references the first entry.
