Apparatus and method to sequentially deduplicate data
    1.
    Granted invention patent
    Status: In force

    Publication No.: US09275067B2

    Publication date: 2016-03-01

    Application No.: US12404998

    Filing date: 2009-03-16

    IPC classification: G06F7/00 G06F17/30

    Abstract: A method to sequentially deduplicate data, wherein the method receives a plurality of computer files, wherein each of the plurality of computer files comprises a label comprising a file name, a file type, a version number, and file size, and stores that plurality of computer files in a deduplication queue. The method then identifies a subset of the plurality of computer files, wherein each file of the subset comprises the same file name but a different version number, and wherein the subset comprises a maximum count of version numbers, and wherein the subset comprises a portion of the plurality of computer files. The method deduplicates the subset using a hash algorithm, and removes the subset from said deduplication queue.
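
The flow described in this abstract (queue the files, pick a same-name group of versions, deduplicate it with a hash, then drop it from the queue) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `QueuedFile`, `pick_subset`, and `dedup_step`, the fixed `chunk_size`, and the choice of SHA-256 are all hypothetical, and "maximum count of version numbers" is read here as "the group with the most distinct versions".

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedFile:
    # Label fields named in the abstract, plus the file contents.
    name: str
    file_type: str
    version: int
    size: int
    data: bytes


def pick_subset(queue):
    """Return the same-name group with the most distinct version numbers (assumed reading)."""
    groups = defaultdict(list)
    for f in queue:
        groups[f.name].append(f)
    return max(groups.values(), key=lambda g: len({f.version for f in g}))


def dedup_step(queue, store, chunk_size=4):
    """Deduplicate one subset into `store`, then remove that subset from `queue`."""
    subset = pick_subset(queue)
    recipes = {}
    for f in sorted(subset, key=lambda f: f.version):
        digests = []
        for i in range(0, len(f.data), chunk_size):
            chunk = f.data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # keep each unique chunk only once
            digests.append(digest)
        recipes[(f.name, f.version)] = digests  # how to rebuild this version
    for f in subset:
        queue.remove(f)
    return recipes
```

Calling `dedup_step` repeatedly until the queue is empty would process one file-name group at a time, which is one plausible meaning of "sequentially" here.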

    Apparatus and method to select a deduplication protocol for a data storage library
    2.
    Granted invention patent
    Status: In force

    Publication No.: US08234444B2

    Publication date: 2012-07-31

    Application No.: US12046315

    Filing date: 2008-03-11

    IPC classification: G06F12/00

    CPC classification: G06F11/1084

    Abstract: A method to select a deduplication protocol for a data storage library comprising a plurality of data storage devices configured as a RAID array, by establishing a normal deduplication protocol, a RAID failure deduplication protocol, and a multiple storage device failure deduplication protocol. The method receives host data comprising a plurality of interleaved data blocks. If the system is operating without any storage device failures, then the method processes the host data using the normal deduplication protocol. If the system is operating with a storage device failure, then the method processes the host data using the RAID failure deduplication protocol. If the system is operating with multiple storage device failures, then the method processes the host data using the multiple storage device failure deduplication protocol.
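
The three-way selection in this abstract amounts to a dispatch on the number of failed storage devices. A minimal sketch follows, assuming that "a storage device failure" means exactly one failed device and "multiple" means two or more. The names `Protocol` and `select_protocol` are hypothetical. The abstract does not say what each protocol does internally, so only the selection step is shown.

```python
from enum import Enum


class Protocol(Enum):
    NORMAL = "normal deduplication protocol"
    RAID_FAILURE = "RAID failure deduplication protocol"
    MULTI_FAILURE = "multiple storage device failure deduplication protocol"


def select_protocol(failed_devices: int) -> Protocol:
    """Pick a deduplication protocol from the count of failed storage devices."""
    if failed_devices < 0:
        raise ValueError("failed_devices must be non-negative")
    if failed_devices == 0:
        return Protocol.NORMAL
    if failed_devices == 1:  # assumed reading of "a storage device failure"
        return Protocol.RAID_FAILURE
    return Protocol.MULTI_FAILURE
```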

    Apparatus and method to sequentially deduplicate groups of files comprising the same file name but different file version numbers
    3.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08719240B2

    Publication date: 2014-05-06

    Application No.: US12488489

    Filing date: 2009-06-19

    Abstract: A method to sequentially deduplicate data, wherein the method receives a plurality of computer files, wherein each of the plurality of computer files comprises a label comprising a file name, a file type, a version number, and file size, and stores that plurality of computer files in a deduplication queue. The method then identifies a subset of the plurality of computer files, wherein each file of the subset comprises the same file name but a different version number, and wherein the subset comprises a maximum count of version numbers, and wherein the subset comprises a portion of the plurality of computer files. The method deduplicates the subset using a hash algorithm, and removes the subset from said deduplication queue. During the deduplicating, the method receives new computer files comprising the same file name, stores those new computer files to the deduplication queue, but does not add those new computer files to the subset.
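
The distinguishing step in this abstract is that the subset is frozen once deduplication starts: files that arrive later with the same name are queued but left out of the in-progress subset. A rough sketch of that behaviour is shown below. The class `DedupQueue` and its methods are hypothetical names, files are reduced to `(name, version)` tuples, and the hashing step itself is omitted.

```python
from collections import deque


class DedupQueue:
    def __init__(self):
        self.pending = deque()  # files waiting to be deduplicated
        self.active = None      # frozen subset currently being deduplicated

    def receive(self, name, version):
        # New arrivals always go to the queue, never into the active subset.
        self.pending.append((name, version))

    def begin(self, name):
        # Freeze the subset: every queued file with this name at this moment.
        self.active = tuple(f for f in self.pending if f[0] == name)
        return self.active

    def finish(self):
        # Remove only the frozen subset; later arrivals stay queued.
        for f in self.active:
            self.pending.remove(f)
        done, self.active = self.active, None
        return done
```

A file received between `begin` and `finish` remains in `pending` and would be picked up by a later pass.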

    HAMMING RADIUS SEPARATED DEDUPLICATION LINKS
    7.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20120210192A1

    Publication date: 2012-08-16

    Application No.: US13453062

    Filing date: 2012-04-23

    IPC classification: H03M13/19 G06F11/10 H03M13/29

    Abstract: A data storage system includes a data storage array configured for de-duplication of duplicate data therein by: identification of a plurality of portions of data; a comparison of each portion of the data to identify duplicate data and identification of a link associated with each duplicate data; a determination of whether a Hamming link-separation-distance of the identified link is greater than twice a Hamming radius of an error correction code in the data storage system; and replacement of the duplicate data with the identified link when it is determined that the Hamming link-separation-distance is greater than twice the Hamming radius.
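
The separation test in this abstract can be illustrated with a short sketch. If every pair of links differs in more than 2r bits, where r is the Hamming radius of the error correction code, then a link corrupted in at most r bits is still closer to its original value than to any other link. The sketch assumes that links are represented as integers and that the link-separation-distance is measured against every already existing link. The function names are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def link_is_safe(link: int, existing_links, hamming_radius: int) -> bool:
    """True if `link` is more than 2 * radius bits away from every existing link."""
    return all(
        hamming_distance(link, other) > 2 * hamming_radius
        for other in existing_links
    )
```

Under this reading, duplicate data would be replaced by the link only when `link_is_safe` returns `True`.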

    Mirrored Storage System and Methods for Operating a Mirrored Storage System
    9.
    Invention patent application
    Status: In force

    Publication No.: US20080168246A1

    Publication date: 2008-07-10

    Application No.: US11959642

    Filing date: 2007-12-19

    IPC classification: G06F12/16

    CPC classification: G06F11/2071

    Abstract: A mirrored storage system for applications is provided, which enables and supports the variation and dynamic adaptation of the Recovery Point Objectives (RPO) based on policies. Furthermore, methods are provided for running such a mirrored storage system. Said mirrored storage system comprises a first storage system and at least one further storage system, wherein said first and said further storage system are connected via at least one mirror link. An application accesses said mirrored storage system via a network. Therewith, the data to be stored as response to a write command of said application can be mirrored according to a configurable time-varying RPO requirement of the application transmitting the corresponding write command.
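
One way to picture a "configurable time-varying RPO requirement" is a policy table that maps time windows to an RPO value, consulted on each write. The sketch below is purely illustrative. The `POLICY` table, its values, `DEFAULT_RPO`, and the mapping of a zero RPO to synchronous mirroring are all assumptions and are not taken from the abstract.

```python
from datetime import time

# Hypothetical policy: (window start, window end, RPO in seconds).
# An RPO of 0 means no acknowledged write may be lost.
POLICY = [
    (time(8, 0), time(18, 0), 0),
    (time(18, 0), time(23, 59, 59), 300),
]
DEFAULT_RPO = 3600  # assumed fallback outside all windows


def rpo_at(now: time) -> int:
    """Look up the RPO (seconds) that applies at the given time of day."""
    for start, end, rpo in POLICY:
        if start <= now < end:
            return rpo
    return DEFAULT_RPO


def mirror_mode(now: time) -> str:
    # Assumption: a zero RPO is served by mirroring before acknowledging the write.
    return "synchronous" if rpo_at(now) == 0 else "asynchronous"
```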

    HAMMING RADIUS SEPARATED DEDUPLICATION LINKS
    10.
    Invention patent application
    Status: In force

    Publication No.: US20100185922A1

    Publication date: 2010-07-22

    Application No.: US12355442

    Filing date: 2009-01-16

    IPC classification: H03M13/19 G06F11/07

    Abstract: A method of de-duplicating duplicate data in a data storage system that includes identifying a plurality of portions of data, comparing each portion of the data to identify duplicate data and identifying a link associated with each duplicate data, determining whether a Hamming link-separation-distance between the identified link and all other existing links is greater than twice the Hamming radius of an error correction code in the data storage system, and then replacing the duplicate data with the identified link.
