Publication number: US20140181465A1
Publication date: 2014-06-26
Application number: US14190492
Filing date: 2014-02-26
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Duane M. BALDWIN, Nilesh P. BHOSALE, John T. OLSON, Sandeep R. PATIL
IPC: G06F3/06
CPC classification number: G06F3/0641, G06F3/0604, G06F3/0683, G06F17/30159
Abstract: Exemplary embodiments for increased in-line deduplication efficiency in a computing environment are provided. Embodiments include incrementing the size of data samples taken from fixed-size data chunks at each nth iteration until the full size of an object requested for in-line deduplication is reached; calculating, in each nth iteration, hash values on the data samples from the fixed-size data chunks extracted from the object; and matching, in an nth hash index table, the calculated nth-iteration hash values for the data samples with corresponding hash values of existing objects in storage, wherein an nth hash index table is built for each nth iteration of the data samples belonging to the fixed-size data chunks.
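The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `sample_hash`, `sample_sizes`, and `DedupStore`, the 8-byte chunk size, the doubling schedule for the sample size, the use of SHA-256, and the early exit on the first table miss are all assumptions made for illustration; the abstract specifies none of them.

```python
import hashlib
from typing import Dict, List, Tuple

CHUNK_SIZE = 8  # assumed fixed chunk size in bytes (tiny, for illustration only)


def sample_hash(data: bytes, sample_size: int, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash the first `sample_size` bytes of every fixed-size chunk of `data`."""
    h = hashlib.sha256()
    for start in range(0, len(data), chunk_size):
        h.update(data[start:start + min(sample_size, chunk_size)])
    return h.hexdigest()


def sample_sizes(chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Assumed schedule: double the sample each iteration until it covers a whole chunk."""
    sizes, size = [], 1
    while size < chunk_size:
        sizes.append(size)
        size *= 2
    sizes.append(chunk_size)  # final iteration samples the full object
    return sizes


class DedupStore:
    """Hypothetical store holding one hash index table per iteration."""

    def __init__(self, chunk_size: int = CHUNK_SIZE) -> None:
        self.chunk_size = chunk_size
        self.sizes = sample_sizes(chunk_size)
        # tables[n] maps an nth-iteration sample hash to the name of a stored object
        self.tables: List[Dict[str, str]] = [{} for _ in self.sizes]

    def write(self, name: str, data: bytes) -> Tuple[bool, int]:
        """Return (is_duplicate, iterations_used) for an incoming object."""
        digests: List[str] = []
        for n, size in enumerate(self.sizes):
            digest = sample_hash(data, size, self.chunk_size)
            digests.append(digest)
            if digest not in self.tables[n]:
                # No stored object shares this sample hash: treat as unique, stop early.
                self._index(name, data, digests)
                return False, n + 1
        # Every iteration matched, including the full-size one: treat as a duplicate.
        return True, len(self.sizes)

    def _index(self, name: str, data: bytes, digests: List[str]) -> None:
        """Register a unique object in every per-iteration table."""
        for n, size in enumerate(self.sizes):
            if n < len(digests):
                digest = digests[n]  # reuse hashes already computed during lookup
            else:
                digest = sample_hash(data, size, self.chunk_size)
            self.tables[n].setdefault(digest, name)
```

With an 8-byte chunk the sketch runs four iterations (sample sizes 1, 2, 4, and 8). An object whose first byte differs from everything stored is rejected as a duplicate candidate after the first, cheapest hash, while a true duplicate is confirmed only after the full-size iteration. This sketch computes the remaining hashes for a unique object immediately when indexing it; a real system could defer that work.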