METHODS FOR ESTIMATING COST SAVINGS USING DEDUPLICATION AND COMPRESSION IN A STORAGE SYSTEM

    Publication number: US20180246649A1

    Publication date: 2018-08-30

    Application number: US15445890

    Filing date: 2017-02-28

    CPC classification number: G06F3/0605 G06F3/0608 G06F3/0641 G06F3/067

    Abstract: Methods are disclosed for estimating cost savings in a storage system using an external host system. One method includes accessing, over a communication network, data from a unit of storage of a data storage system, wherein each block of the data is uncompressed. A plurality of blocks is parsed from the data. A plurality of fingerprints is generated from the blocks using a hash algorithm. A deduplication ratio is estimated for the plurality of blocks stored in the unit of storage using a HyperLogLog algorithm and a first plurality of buckets compartmentalizing the plurality of blocks, wherein the first plurality of buckets is defined by precision bits of the plurality of fingerprints. An effective compression ratio is estimated for the plurality of blocks stored in the unit of storage using the HyperLogLog algorithm and a second plurality of buckets compartmentalizing the plurality of blocks, wherein the second plurality of buckets is defined by ranges of compression ratios.
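
The two estimates described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the block size, the use of SHA-1 truncated to 64 bits, 12 precision bits, zlib as the compressor, the `RATIO_EDGES` boundaries, and every function name below are assumptions made for illustration. In particular, keeping one HyperLogLog sketch per compression-ratio range is only one possible reading of the "second plurality of buckets".

```python
import bisect
import hashlib
import math
import zlib

HASH_BITS = 64                            # assumption: use 64 bits of each fingerprint
RATIO_EDGES = [1.0, 1.5, 2.0, 4.0, 8.0]   # hypothetical lower bounds of ratio ranges


class HyperLogLog:
    """Textbook HyperLogLog with a linear-counting correction for small counts."""

    def __init__(self, p: int = 12):      # p precision bits -> 2**p registers
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, fp: int) -> None:
        tail_bits = HASH_BITS - self.p
        index = fp >> tail_bits                      # precision bits pick the register
        tail = fp & ((1 << tail_bits) - 1)
        rank = tail_bits - tail.bit_length() + 1     # position of the leftmost 1-bit
        self.registers[index] = max(self.registers[index], rank)

    def estimate(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)             # standard constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)           # small-range correction
        return raw


def parse_blocks(data: bytes, block_size: int):
    """Split the data into fixed-size blocks (the last one may be shorter)."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def fingerprint(block: bytes) -> int:
    """Assumed fingerprint: the first 8 bytes of the block's SHA-1 digest."""
    return int.from_bytes(hashlib.sha1(block).digest()[:8], "big")


def estimate_dedup_ratio(data: bytes, block_size: int = 4096) -> float:
    """Total blocks divided by the estimated number of distinct blocks."""
    sketch = HyperLogLog()
    total = 0
    for block in parse_blocks(data, block_size):
        sketch.add(fingerprint(block))
        total += 1
    unique = sketch.estimate()
    return total / unique if unique else 1.0


def estimate_effective_compression_ratio(data: bytes, block_size: int = 4096) -> float:
    """Count distinct blocks per ratio range, then weight each range by its lower bound."""
    sketches = [HyperLogLog() for _ in RATIO_EDGES]
    for block in parse_blocks(data, block_size):
        ratio = len(block) / len(zlib.compress(block))
        bucket = max(0, bisect.bisect_right(RATIO_EDGES, ratio) - 1)
        sketches[bucket].add(fingerprint(block))
    unique = [s.estimate() for s in sketches]
    logical = sum(unique)                                         # in units of blocks
    physical = sum(u / edge for u, edge in zip(unique, RATIO_EDGES))
    return logical / physical if logical else 1.0
```

Because each range is represented by its lower bound, this sketch deliberately underestimates the compression ratio; a midpoint or a measured per-range average would be an equally plausible choice. Memory use stays fixed at the register arrays regardless of how many blocks are scanned, which is the practical reason to use HyperLogLog for this kind of estimate from an external host.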
