Hamming radius separated deduplication links
    1.
    Granted invention patent
    Hamming radius separated deduplication links (status: in force)

    Publication No.: US08196022B2

    Publication date: 2012-06-05

    Application No.: US12355442

    Filing date: 2009-01-16

    IPC classification: H03M13/00

    Abstract: A method of de-duplicating data in a data storage system includes identifying a plurality of portions of data, comparing the portions to identify duplicate data, identifying a link associated with each item of duplicate data, determining whether a Hamming link-separation distance between the identified link and all other existing links is greater than twice the Hamming radius of an error correction code in the data storage system, and then replacing the duplicate data with the identified link.
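
    The separation test in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: links are modeled as equal-width integers, and the names `hamming_distance`, `link_is_safe`, and `hamming_radius` are assumptions introduced here. The underlying idea is that a code with Hamming radius t corrects up to t bit errors, so two links more than 2t bits apart cannot be confused after correction.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two equal-width link values differ."""
    return bin(a ^ b).count("1")


def link_is_safe(candidate: int, existing_links: list[int], hamming_radius: int) -> bool:
    """True if the candidate is more than 2 * radius away from every existing link."""
    return all(
        hamming_distance(candidate, link) > 2 * hamming_radius
        for link in existing_links
    )


existing = [0b00000000, 0b11111111]
# Distance 4 to both existing links, which exceeds 2 * 1.
print(link_is_safe(0b00001111, existing, hamming_radius=1))  # True
# Distance 2 to the first link, which does not exceed 2 * 1.
print(link_is_safe(0b00000011, existing, hamming_radius=1))  # False
```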

    Apparatus and method to store information
    2.
    Granted invention patent
    Apparatus and method to store information (status: lapsed)

    Publication No.: US08311663B2

    Publication date: 2012-11-13

    Application No.: US11219451

    Filing date: 2005-08-31

    IPC classification: G06F7/00

    Abstract: A method to store data is disclosed. The method provides a plurality of data storage media, an automated data library comprising one or more data storage devices, a first plurality of storage cells, and a robotic accessor. The method further provides a storage vault comprising a second plurality of storage cells but no data storage devices. The method selects the (i)th data storage medium and sets the (i)th data state, where that (i)th data state is selected from the group consisting of online, offline, and vault. If the method sets the (i)th data state to online, then the method mounts that (i)th data storage medium in one of the data storage devices. If the method sets the (i)th data state to offline, then the method removably places the (i)th data storage medium in one of the first plurality of storage cells. If the method sets the (i)th data state to vault, then the method places the (i)th data storage medium in one of the second plurality of storage cells.

    Method and System for Command-Ordering and Command-Execution Within a Command Group for a Disk-to-Disk-to-Holographic Data Storage System
    3.
    Invention patent application
    Method and System for Command-Ordering and Command-Execution Within a Command Group for a Disk-to-Disk-to-Holographic Data Storage System (status: under examination, published)

    Publication No.: US20090196144A1

    Publication date: 2009-08-06

    Application No.: US12027045

    Filing date: 2008-02-06

    IPC classification: G11B7/00

    Abstract: A system, method and computer program product for managing command ordering and command execution for a host-Disk-to-intermediate-Disk-to-Holographic (D2D2H) data storage system. Specifically, a command ordering and execution (COE) utility selects a command group from a command queue. A determination is made whether the command group includes a write command for writing an entire hologram segment. Responsive to a determination that the command group does not include the write command for writing the entire hologram segment, the entire hologram segment is read to an intermediate system disk. Conflicting commands are then sorted from non-conflicting commands. Specifically, all conflicting write commands are executed before all conflicting read commands. After execution, the entire hologram segment of the intermediate system disk is closed and written to the holographic medium.
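
    The write-before-read ordering inside a command group might look like the sketch below. Several points are assumptions made only for illustration: a "conflict" is taken to mean a read and a write addressing the same block, non-conflicting commands are placed after the conflicting ones, and the `Command` type and `order_group` function are hypothetical names that do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    op: str     # "read" or "write"
    block: int  # hypothetical block address within the hologram segment


def order_group(commands: list[Command]) -> list[Command]:
    """Place all conflicting writes before all conflicting reads.

    Assumption: a block is contested when the group both reads and writes it.
    Non-conflicting commands keep their arrival order and run last.
    """
    written = {c.block for c in commands if c.op == "write"}
    read = {c.block for c in commands if c.op == "read"}
    contested = written & read
    writes = [c for c in commands if c.op == "write" and c.block in contested]
    reads = [c for c in commands if c.op == "read" and c.block in contested]
    others = [c for c in commands if c.block not in contested]
    return writes + reads + others


group = [Command("read", 1), Command("write", 2), Command("write", 1), Command("read", 3)]
for cmd in order_group(group):
    print(cmd.op, cmd.block)
```

    Running the write to block 1 before the read of block 1 means the read observes the updated data on the intermediate disk before the segment is closed.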

    Method and System for Command-Ordering for a Disk-to-Disk-to-Holographic Data Storage System
    4.
    Invention patent application
    Method and System for Command-Ordering for a Disk-to-Disk-to-Holographic Data Storage System (status: under examination, published)

    Publication No.: US20090196143A1

    Publication date: 2009-08-06

    Application No.: US12026986

    Filing date: 2008-02-06

    IPC classification: G11B7/00

    Abstract: A system, method and computer program product for managing command ordering for a host-Disk-to-intermediate-Disk-to-Holographic (D2D2H) data storage system. Specifically, a command ordering detects a command from a host system. A hologram segment associated with the detected command is identified and a determination is made whether the hologram segment is an open hologram segment or a closed hologram segment. A determination is made whether the detected command is to be prioritized. If the detected command is prioritized, the detected command is added to a prioritized command queue. If the detected command is not prioritized, it is added to a normal command queue. The detected commands addressing the same hologram segment are then grouped. The execution of one or more grouped commands (prioritized or normal) is deferred for a predetermined period to allow additional commands to be received for the same command group.
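
    A rough sketch of the two queues and the per-segment grouping follows. The class name `CommandOrderer`, its methods, and the string commands are hypothetical, and the deferral timer and the open/closed segment check described in the abstract are deliberately left out to keep the example short.

```python
from collections import defaultdict, deque


class CommandOrderer:
    """Toy model: two queues, with commands grouped by hologram segment."""

    def __init__(self) -> None:
        self.prioritized: deque = deque()
        self.normal: deque = deque()

    def add(self, segment_id: int, command: str, prioritized: bool = False) -> None:
        """Queue a detected command on the prioritized or the normal queue."""
        queue = self.prioritized if prioritized else self.normal
        queue.append((segment_id, command))

    def groups(self) -> list:
        """Return (segment_id, commands) groups, prioritized queue first."""
        result = []
        for queue in (self.prioritized, self.normal):
            by_segment = defaultdict(list)
            for segment_id, command in queue:
                by_segment[segment_id].append(command)
            result.extend(by_segment.items())
        return result


orderer = CommandOrderer()
orderer.add(7, "w1")
orderer.add(3, "r1", prioritized=True)
orderer.add(7, "r2")
print(orderer.groups())
```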

    Method of and system for adaptive selection of a deduplication chunking technique
    5.
    Granted invention patent
    Method of and system for adaptive selection of a deduplication chunking technique (status: lapsed)

    Publication No.: US07519635B1

    Publication date: 2009-04-14

    Application No.: US12059874

    Filing date: 2008-03-31

    IPC classification: G06F12/00 G06F17/30

    Abstract: A method of adaptively selecting an optimum data deduplication chunking method receives a request to deduplicate a file, wherein the file has a file type. The method searches a table of file types, wherein the table includes, for each file type, a chunking method, a deduplication ratio, and a deduplication ratio threshold. The method selects a chunking method for the file according to the table. The method chunks the file using the selected chunking method. The method deduplicates the chunked file according to prior art deduplication methods. The method calculates a deduplication ratio for the file type and updates the table with the calculated deduplication ratio for the file type. If the calculated deduplication ratio for the file type is less than the deduplication ratio threshold for the file type, the method selects a new chunking method for the file type and updates the table of file types with the new chunking method for the file type.
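
    The table update at the end of this abstract can be illustrated with a small sketch. The method names in `CHUNKING_METHODS`, the definition of the ratio as bytes before divided by bytes after, and the round-robin choice of the next method are all assumptions made for the example; the abstract does not specify them.

```python
# Hypothetical chunking method names; the abstract does not list any.
CHUNKING_METHODS = ["fixed", "variable", "whole-file"]


def record_result(table: dict, file_type: str, bytes_before: int, bytes_after: int) -> dict:
    """Update the ratio for a file type and switch methods if it falls below threshold.

    Assumptions: ratio = bytes_before / bytes_after (bytes_after must be nonzero),
    and the replacement method is simply the next one in CHUNKING_METHODS.
    """
    entry = table[file_type]
    entry["ratio"] = bytes_before / bytes_after
    if entry["ratio"] < entry["threshold"]:
        index = CHUNKING_METHODS.index(entry["method"])
        entry["method"] = CHUNKING_METHODS[(index + 1) % len(CHUNKING_METHODS)]
    return entry


table = {"txt": {"method": "fixed", "ratio": 0.0, "threshold": 2.0}}
print(record_result(table, "txt", 100, 80))   # ratio 1.25 is below 2.0, so the method changes
print(record_result(table, "txt", 300, 100))  # ratio 3.0 meets the threshold, so it stays
```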

    Data deduplication using CRC-seed differentiation between data and stubs
    6.
    Granted invention patent
    Data deduplication using CRC-seed differentiation between data and stubs (status: in force)

    Publication No.: US08453031B2

    Publication date: 2013-05-28

    Application No.: US12730400

    Filing date: 2010-03-24

    IPC classification: H03M13/00 G06F13/00 G06F17/00

    Abstract: Various embodiments for differentiating between data and stubs pointing to a parent copy of deduplicated data are provided. Undeduplicated data is stored with a first cyclic redundancy check (CRC) seed. A stub pointing to the parent copy of the deduplicated data is stored with a second CRC seed. Subsequent to reading the deduplicated data, the first CRC seed is associated with the undeduplicated data, and the second CRC seed is associated with the stub. A CRC check is performed using one of the first and second CRC seeds. If the CRC check is positive, an I/O operation is allowed to proceed. If the CRC check is negative, an additional CRC check is performed using another one of the first and second CRC seeds.
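
    The two-seed check can be sketched with Python's standard `zlib.crc32`, whose optional second argument is the starting value of the checksum. The seed values and the `store` and `classify` helpers are hypothetical; the abstract does not say which seeds or which CRC polynomial are used.

```python
import zlib

DATA_SEED = 0x00000000  # hypothetical seed for undeduplicated data
STUB_SEED = 0xFFFFFFFF  # hypothetical seed for stubs pointing at a parent copy


def store(payload: bytes, is_stub: bool) -> tuple[bytes, int]:
    """Return the payload with a CRC computed from the seed for its kind."""
    seed = STUB_SEED if is_stub else DATA_SEED
    return payload, zlib.crc32(payload, seed)


def classify(payload: bytes, stored_crc: int) -> str:
    """Check with the first seed, then fall back to the second seed."""
    if zlib.crc32(payload, DATA_SEED) == stored_crc:
        return "data"
    if zlib.crc32(payload, STUB_SEED) == stored_crc:
        return "stub"
    return "corrupt"


block, block_crc = store(b"user data block", is_stub=False)
stub, stub_crc = store(b"pointer-to-parent", is_stub=True)
print(classify(block, block_crc))
print(classify(stub, stub_crc))
```

    Because the same payload yields different checksums under the two seeds, the reader can tell data from a stub without a separate flag field.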

    Method of and system for deduplicating backed up data in a client-server environment
    7.
    Granted invention patent
    Method of and system for deduplicating backed up data in a client-server environment (status: lapsed)

    Publication No.: US07539710B1

    Publication date: 2009-05-26

    Application No.: US12101541

    Filing date: 2008-04-11

    IPC classification: G06F17/30

    Abstract: In a method of and a system for deduplicating backed-up data, backup clients create respective backup tables comprising a list of files and respective file types to be backed up. A backup server receives backup tables from the backup clients. The backup server merges the received backup tables to form a merged backup table. The backup server sorts the merged backup table according to file type from a file type yielding a best deduplication ratio to a file type yielding a worst deduplication ratio, thereby forming a sorted backup table. The backup server requests the files listed in the sorted backup table, in order, from the backup clients. The backup server deduplicates files received from the backup clients, in order, using deduplication parameters optimized according to file type. The method calculates an updated deduplication ratio for each deduplicated file type. Examples of deduplication parameters include chunking techniques and hashing techniques.
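
    The merge-and-sort step on the backup server can be sketched as below. The data shapes (a dict of per-client file lists and a dict of ratios per file type) and the function name `build_sorted_backup_table` are assumptions for illustration, as is the choice to treat a higher ratio as better and to give unknown file types a ratio of 0.

```python
def build_sorted_backup_table(client_tables: dict, ratio_by_type: dict) -> list:
    """Merge per-client backup tables and sort by deduplication ratio, best first.

    client_tables maps a client name to a list of (path, file_type) pairs.
    ratio_by_type maps a file type to its last known deduplication ratio.
    """
    merged = [
        (client, path, file_type)
        for client, table in client_tables.items()
        for path, file_type in table
    ]
    # sorted() is stable, so files of the same type keep their merged order.
    return sorted(merged, key=lambda row: ratio_by_type.get(row[2], 0.0), reverse=True)


clients = {
    "a": [("x.txt", "txt"), ("y.jpg", "jpg")],
    "b": [("z.txt", "txt")],
}
ratios = {"txt": 5.0, "jpg": 1.1}
for row in build_sorted_backup_table(clients, ratios):
    print(row)
```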

    Apparatus and Method to Store and Manage Information and Meta Data
    8.
    Invention patent application
    Apparatus and Method to Store and Manage Information and Meta Data (status: lapsed)

    Publication No.: US20080016128A1

    Publication date: 2008-01-17

    Application No.: US11457087

    Filing date: 2006-07-12

    IPC classification: G06F12/00

    CPC classification: G06F17/30315 G06Q10/00

    Abstract: An apparatus and method to store data are disclosed. The method provides a data storage system comprising a fossilized data management apparatus interconnected with one or more data storage devices. The method provides information, and meta data associated with that information, to the fossilized data management apparatus, wherein the meta data comprises a format field, a context field, a retention field, a data management field, and a storage management field. The fossilized data management apparatus instructs the one or more data storage devices to write the information to the one or more data storage devices based upon the meta data format field.

    FILE SYSTEM WITH INTERNAL DEDUPLICATION AND MANAGEMENT OF DATA BLOCKS
    10.
    Invention patent application
    FILE SYSTEM WITH INTERNAL DEDUPLICATION AND MANAGEMENT OF DATA BLOCKS (status: in force)

    Publication No.: US20100121825A1

    Publication date: 2010-05-13

    Application No.: US12270101

    Filing date: 2008-11-13

    IPC classification: G06F12/02

    CPC classification: G06F17/30156

    Abstract: A method for deduplicating and managing data blocks within a file system includes: adding a deduplication identifier to each pointer pointing to a data block to indicate whether the data block is deduplicated; detecting duplicate data blocks; when duplicates are detected, determining whether one of the duplicate data blocks has already been deduplicated; if one has, determining that this duplicate data block is the master copy; if none has, selecting one of the duplicate data blocks to be the master copy and setting the deduplication identifier of the selected block to indicate deduplication; and determining that the other duplicate data block is a new duplicate data block, setting its deduplication identifier to indicate deduplication, and directing its pointer to the master copy.
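
    The master-copy selection described here can be sketched in a few lines. This is a simplified model under stated assumptions: duplicates are detected by comparing full block contents held in a dict, the `Pointer` class and `deduplicate` function are hypothetical names, and freed blocks are not reclaimed.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    block_id: int
    deduplicated: bool = False  # the deduplication identifier carried by the pointer


def deduplicate(pointers: list, blocks: dict) -> None:
    """Redirect pointers to duplicate blocks toward one master copy, in place."""
    master = {}  # block content -> pointer that holds the master copy
    # A block that is already deduplicated is preferred as the master copy.
    for p in pointers:
        if p.deduplicated:
            master.setdefault(blocks[p.block_id], p)
    for p in pointers:
        # Otherwise the first pointer seen for a given content becomes the master.
        m = master.setdefault(blocks[p.block_id], p)
        if m is p or m.block_id == p.block_id:
            continue
        m.deduplicated = True
        p.deduplicated = True
        p.block_id = m.block_id


blocks = {1: b"A", 2: b"A", 3: b"B"}
pointers = [Pointer(1), Pointer(2), Pointer(3)]
deduplicate(pointers, blocks)
print(pointers)
```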
