-
Publication No.: US09933970B2
Publication date: 2018-04-03
Application No.: US14928848
Application date: 2015-10-30
Applicant: NetApp, Inc.
Inventor: Atish Kathpal, Giridhar Yasa
CPC classification number: G06F3/0641, G06F3/0608, G06F3/0686
Abstract: A method and system for deduplicating data for a data storage system using similarity determinations are described. A tape library is arranged in a hierarchy of tape groups and tape plexes. Tape groups are an administrator-visible entity, and each comprises multiple tape plexes (at least as many as the number of replicas in the tape group). Tape plexes in turn comprise multiple tape cartridges. Data files and objects received within a time period are initially staged in a disk cache, where they are logically segregated into cliques based on their expected deduplication ratios. These cliques are then evaluated for the amount of duplication they have with data already existing in the tape plexes. Based on the number of replicas being written, the top few tape plexes are selected from within the tape group. The cliques are deduplicated against the data on the selected tape plexes, compressed, and written to tape.
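The plex-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a clique is already formed and represented as a set of chunk fingerprints, that duplication is measured by exact fingerprint overlap (the abstract speaks only of "similarity determinations"), and that one plex is chosen per replica. The names `TapePlex`, `overlap`, `select_plexes`, and `write_clique` are hypothetical, and compression and the actual tape write are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class TapePlex:
    """Hypothetical model of a tape plex: a name plus the chunk fingerprints it holds."""
    name: str
    fingerprints: set = field(default_factory=set)


def overlap(clique, plex):
    """Count the clique's chunks that already exist on the plex (assumed similarity measure)."""
    return len(clique & plex.fingerprints)


def select_plexes(clique, plexes, replicas):
    """Pick the `replicas` plexes sharing the most chunks with the clique."""
    if replicas > len(plexes):
        raise ValueError("a tape group needs at least as many plexes as replicas")
    # sorted() is stable, so plexes with equal overlap keep their original order.
    ranked = sorted(plexes, key=lambda p: overlap(clique, p), reverse=True)
    return ranked[:replicas]


def write_clique(clique, plexes, replicas):
    """Deduplicate the clique against each selected plex; return the new chunks per plex."""
    written = {}
    for plex in select_plexes(clique, plexes, replicas):
        new_chunks = clique - plex.fingerprints  # only chunks not already on the plex
        plex.fingerprints |= new_chunks          # stand-in for compress-and-write
        written[plex.name] = new_chunks
    return written


group = [
    TapePlex("plex-A", {"a", "b", "c"}),
    TapePlex("plex-B", {"a"}),
    TapePlex("plex-C", {"x", "y"}),
]
result = write_clique({"a", "b", "d"}, group, replicas=2)
```

In this example `plex-A` and `plex-B` are selected because they share the most chunks with the clique, so only the chunks missing from each are written.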
-
Publication No.: US20170123711A1
Publication date: 2017-05-04
Application No.: US14928848
Application date: 2015-10-30
Applicant: NetApp, Inc.
Inventor: Atish Kathpal, Giridhar Yasa
IPC: G06F3/06
CPC classification number: G06F3/0641, G06F3/0608, G06F3/0686
Abstract: A method and system for deduplicating data for a data storage system using similarity determinations are described. A tape library is arranged in a hierarchy of tape groups and tape plexes. Tape groups are an administrator-visible entity, and each comprises multiple tape plexes (at least as many as the number of replicas in the tape group). Tape plexes in turn comprise multiple tape cartridges. Data files and objects received within a time period are initially staged in a disk cache, where they are logically segregated into cliques based on their expected deduplication ratios. These cliques are then evaluated for the amount of duplication they have with data already existing in the tape plexes. Based on the number of replicas being written, the top few tape plexes are selected from within the tape group. The cliques are deduplicated against the data on the selected tape plexes, compressed, and written to tape.
-