Apparatus and method to sequentially deduplicate data
    5.
    Granted invention patent
    Status: in force

    Publication No.: US09275067B2

    Publication date: 2016-03-01

    Application No.: US12404998

    Filing date: 2009-03-16

    IPC classification: G06F7/00 G06F17/30

    Abstract: A method to sequentially deduplicate data, wherein the method receives a plurality of computer files, wherein each of the plurality of computer files comprises a label comprising a file name, a file type, a version number, and file size, and stores that plurality of computer files in a deduplication queue. The method then identifies a subset of the plurality of computer files, wherein each file of the subset comprises the same file name but a different version number, and wherein the subset comprises a maximum count of version numbers, and wherein the subset comprises a portion of the plurality of computer files. The method deduplicates the subset using a hash algorithm, and removes the subset from said deduplication queue.
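The flow in this abstract (queue the labelled files, pick a same-name subset, deduplicate it with a hash, then remove it from the queue) might be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `QueuedFile`, `select_subset`, `deduplicate`, and `process_next` are hypothetical, SHA-256 over fixed-size chunks stands in for the unspecified "hash algorithm", and "maximum count of version numbers" is read here as "the file name with the most queued versions".

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedFile:
    """Label fields named in the abstract, plus the file content."""
    name: str
    file_type: str
    version: int
    size: int
    data: bytes


def select_subset(queue):
    """Return the files that share one name and differ in version number.

    Assumption: the chosen name is the one with the most queued versions.
    """
    if not queue:
        return []
    by_name = defaultdict(dict)
    for f in queue:
        by_name[f.name].setdefault(f.version, f)  # keep one file per version
    name = max(by_name, key=lambda n: len(by_name[n]))
    return [by_name[name][v] for v in sorted(by_name[name])]


def deduplicate(subset, chunk_size=4):
    """Store each distinct chunk once, keyed by its SHA-256 digest."""
    store = {}    # digest -> chunk bytes (each unique chunk kept once)
    recipes = {}  # (name, version) -> ordered digests that rebuild the file
    for f in subset:
        digests = []
        for i in range(0, len(f.data), chunk_size):
            chunk = f.data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)
            digests.append(digest)
        recipes[(f.name, f.version)] = digests
    return store, recipes


def process_next(queue):
    """Deduplicate one subset, then remove that subset from the queue."""
    subset = select_subset(queue)
    store, recipes = deduplicate(subset)
    for f in subset:
        queue.remove(f)
    return store, recipes
```

With two versions of `a.txt` that share their first chunk and one unrelated `b.txt` in the queue, `process_next` would store the shared chunk once and leave only `b.txt` queued for a later pass.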


    Apparatus and method to select a deduplication protocol for a data storage library
    6.
    Granted invention patent
    Status: in force

    Publication No.: US08234444B2

    Publication date: 2012-07-31

    Application No.: US12046315

    Filing date: 2008-03-11

    IPC classification: G06F12/00

    CPC classification: G06F11/1084

    Abstract: A method to select a deduplication protocol for a data storage library comprising a plurality of data storage devices configured as a RAID array, by establishing a normal deduplication protocol, a RAID failure deduplication protocol, and a multiple storage device failure deduplication protocol. The method receives host data comprising a plurality of interleaved data blocks. If the system is operating without any storage device failures, then the method processes the host data using the normal deduplication protocol. If the system is operating with a storage device failure, then the method processes the host data using the RAID failure deduplication protocol. If the system is operating with multiple storage device failures, then the method processes the host data using the multiple storage device failure deduplication protocol.
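The three-way selection described in this abstract can be illustrated with a small dispatch on the number of failed storage devices. The sketch below is an assumption-laden illustration rather than the patented logic: `Protocol` and `select_protocol` are hypothetical names, "a storage device failure" is taken to mean exactly one failed device, and what each protocol actually does with the host data is not stated in the abstract, so it is left out.

```python
from enum import Enum


class Protocol(Enum):
    NORMAL = "normal deduplication protocol"
    RAID_FAILURE = "RAID failure deduplication protocol"
    MULTI_FAILURE = "multiple storage device failure deduplication protocol"


def select_protocol(failed_devices: int) -> Protocol:
    """Choose a protocol from the count of failed devices in the RAID array."""
    if failed_devices < 0:
        raise ValueError("failed_devices must be non-negative")
    if failed_devices == 0:
        return Protocol.NORMAL
    if failed_devices == 1:  # assumption: a single failure
        return Protocol.RAID_FAILURE
    return Protocol.MULTI_FAILURE  # two or more failures
```

Keeping the decision in one function means the host-data path only needs the current failure count and never has to know how the protocols differ.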


    Apparatus and method to sequentially deduplicate groups of files comprising the same file name but different file version numbers
    7.
    Granted invention patent
    Status: lapsed

    Publication No.: US08719240B2

    Publication date: 2014-05-06

    Application No.: US12488489

    Filing date: 2009-06-19

    Abstract: A method to sequentially deduplicate data, wherein the method receives a plurality of computer files, wherein each of the plurality of computer files comprises a label comprising a file name, a file type, a version number, and file size, and stores that plurality of computer files in a deduplication queue. The method then identifies a subset of the plurality of computer files, wherein each file of the subset comprises the same file name but a different version number, and wherein the subset comprises a maximum count of version numbers, and wherein the subset comprises a portion of the plurality of computer files. The method deduplicates the subset using a hash algorithm, and removes the subset from said deduplication queue. During the deduplicating, the method receives new computer files comprising the same file name, stores those new computer files to the deduplication queue, but does not add those new computer files to the subset.
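What distinguishes this abstract is its last sentence: files that arrive during deduplication are queued but do not join the subset already being processed. One way to picture that is a subset frozen as a snapshot when it is identified. The sketch below is hypothetical throughout (`DedupQueue`, `receive`, `snapshot_subset`, and `finish` are illustrative names), tracks only `(file_name, version)` labels, and omits the hash-based deduplication step itself.

```python
from collections import deque


class DedupQueue:
    """Queue of (file_name, version) labels awaiting deduplication."""

    def __init__(self):
        self.pending = deque()

    def receive(self, name, version):
        """New files always land in the queue, even mid-deduplication."""
        self.pending.append((name, version))

    def snapshot_subset(self, name):
        """Freeze the versions of `name` that are queued right now."""
        return tuple(item for item in self.pending if item[0] == name)

    def finish(self, subset):
        """Remove only the frozen subset; later arrivals stay queued."""
        done = set(subset)
        self.pending = deque(i for i in self.pending if i not in done)


q = DedupQueue()
q.receive("report.doc", 1)
q.receive("report.doc", 2)
subset = q.snapshot_subset("report.doc")  # versions 1 and 2 are frozen
q.receive("report.doc", 3)                # arrives while the subset is processed
q.finish(subset)
remaining = list(q.pending)               # [("report.doc", 3)]
```

Because the subset is an immutable tuple, version 3 cannot join it after the fact and would simply be picked up by the next pass over the queue.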
