MANAGING BACKUPS OF DATA OBJECTS IN CONTAINERS
    1.
    Patent application
    Legal status: In force

    Publication No.: US20130110784A1

    Publication date: 2013-05-02

    Application No.: US13285331

    Filing date: 2011-10-31

    IPC classification: G06F7/00

    Abstract: Containers that store data objects that were written to those containers during a particular backup are accessed. Then, a subset of the containers is identified; the containers in the subset have less than a threshold number of data objects associated with the particular backup. Data objects that are in containers in that subset and that are associated with the backup are copied to one or more other containers. Those other containers are subsequently used to restore data objects associated with the backup.
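
The consolidation step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the in-memory dictionary layout, the function names, and the sample data are all assumptions made for the example.

```python
def find_sparse_containers(containers, backup_id, threshold):
    """Return ids of containers holding fewer than `threshold` objects of the backup.

    `containers` maps a container id to a list of (object_id, backup_id) pairs
    (a hypothetical layout chosen for this sketch).
    """
    sparse = []
    for cid, objects in containers.items():
        count = sum(1 for _, owner in objects if owner == backup_id)
        if 0 < count < threshold:
            sparse.append(cid)
    return sparse


def consolidate(containers, backup_id, threshold):
    """Copy the backup's objects out of sparse containers into one new list."""
    sparse = find_sparse_containers(containers, backup_id, threshold)
    copied = [
        (oid, owner)
        for cid in sparse
        for oid, owner in containers[cid]
        if owner == backup_id
    ]
    return sparse, copied


# Hypothetical sample data: backup "b1" is spread thinly over c2 and c3.
containers = {
    "c1": [("a", "b1"), ("b", "b1"), ("c", "b1")],
    "c2": [("d", "b1"), ("x", "b0"), ("y", "b0")],
    "c3": [("e", "b1"), ("z", "b0")],
}
sparse, copied = consolidate(containers, "b1", threshold=2)
```

With this sample data, a restore of "b1" would read container c1 plus the one consolidated container instead of three separate containers.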

    Managing backups of data objects in containers
    2.
    Granted patent
    Legal status: In force

    Publication No.: US08874522B2

    Publication date: 2014-10-28

    Application No.: US13285331

    Filing date: 2011-10-31

    IPC classification: G06F17/30, G06F7/00, G06F11/14

    Abstract: Containers that store data objects that were written to those containers during a particular backup are accessed. Then, a subset of the containers is identified; the containers in the subset have less than a threshold number of data objects associated with the particular backup. Data objects that are in containers in that subset and that are associated with the backup are copied to one or more other containers. Those other containers are subsequently used to restore data objects associated with the backup.

    De-duplication storage system with improved reference update efficiency
    3.
    Granted patent
    Legal status: In force

    Publication No.: US08914324B1

    Publication date: 2014-12-16

    Application No.: US12580785

    Filing date: 2009-10-16

    IPC classification: G06F17/30

    CPC classification: G06F11/1453, G06F17/30156

    Abstract: A system and method for backing up files to a single-instance storage system are disclosed. The files may be split into segments, and the file data may be stored in the single-instance storage system as individual segments. The single-instance storage system uses the concept of a file region which covers multiple segments of the file. If a region of a file is unchanged from one backup to the next, the system may use a region object to refer to the unchanged region. This avoids the need to update the reference information for each of the segments within the region, thus increasing the efficiency of backing up the new version of the file.
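
The saving from region objects can be illustrated with a rough sketch. The region size, the use of SHA-256 to identify a region, and the way reference updates are counted are assumptions for illustration only; the abstract does not specify them.

```python
import hashlib

SEGMENTS_PER_REGION = 4  # assumed region size, not taken from the patent


def region_id(region):
    """Identify a region by hashing the fingerprints of its segments."""
    digest = hashlib.sha256()
    for segment in region:
        digest.update(hashlib.sha256(segment).digest())
    return digest.hexdigest()


def backup(segments, known_regions):
    """Back up a file; return the number of reference updates performed.

    An unchanged region costs one update (a reference to the region object);
    a new or changed region costs one update per segment it covers.
    """
    updates = 0
    for start in range(0, len(segments), SEGMENTS_PER_REGION):
        region = segments[start:start + SEGMENTS_PER_REGION]
        rid = region_id(region)
        if rid in known_regions:
            updates += 1
        else:
            known_regions.add(rid)
            updates += len(region)
    return updates


known = set()
version1 = [f"s{i}".encode() for i in range(8)]
version2 = version1[:7] + [b"changed"]   # only the last segment differs
updates_first = backup(version1, known)   # both regions are new
updates_second = backup(version2, known)  # first region is unchanged
```

In this example the second backup touches five references rather than eight, because the unchanged first region is referenced as a whole.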

    Systems and methods for providing increased scalability in deduplication storage systems
    4.
    Granted patent
    Legal status: In force

    Publication No.: US08954401B2

    Publication date: 2015-02-10

    Application No.: US13007301

    Filing date: 2011-01-14

    Abstract: A computer-implemented method for providing increased scalability in deduplication storage systems may include (1) identifying a database that stores a plurality of reference objects, (2) determining that at least one size-related characteristic of the database has reached a predetermined threshold, (3) partitioning the database into a plurality of sub-databases capable of being updated independent of one another, (4) identifying a request to perform an update operation that updates one or more reference objects stored within at least one sub-database, and then (5) performing the update operation on less than all of the sub-databases to avoid processing costs associated with performing the update operation on all of the sub-databases. Various other systems, methods, and computer-readable media are also disclosed.
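
Steps (1) through (5) can be sketched as below. The class name, the entry-count threshold as the "size-related characteristic", and CRC32-based placement are assumptions chosen to keep the example small; the abstract does not prescribe them.

```python
import zlib


class ReferenceDB:
    """Toy reference database that splits into sub-databases once it grows."""

    def __init__(self, max_entries, partitions):
        self.max_entries = max_entries  # assumed size threshold
        self.partitions = partitions    # number of sub-databases after the split
        self.subdbs = [{}]              # starts as a single database

    def _slot(self, key):
        # Deterministic placement of a reference object in a sub-database.
        return zlib.crc32(key.encode()) % len(self.subdbs)

    def _maybe_partition(self):
        if len(self.subdbs) == 1 and len(self.subdbs[0]) >= self.max_entries:
            old = self.subdbs[0]
            self.subdbs = [{} for _ in range(self.partitions)]
            for key, value in old.items():
                self.subdbs[self._slot(key)][key] = value

    def update(self, refs):
        """Apply updates and return the indices of the sub-databases touched."""
        touched = set()
        for key, value in refs.items():
            slot = self._slot(key)
            self.subdbs[slot][key] = value
            touched.add(slot)
        self._maybe_partition()
        return touched


db = ReferenceDB(max_entries=4, partitions=4)
db.update({f"seg{i}": 1 for i in range(4)})  # reaches the threshold and splits
touched = db.update({"seg0": 2})             # touches one sub-database only
```

After the split, an update to a single reference object rewrites one sub-database and leaves the other three alone.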

    Systems and Methods for Providing Increased Scalability in Deduplication Storage Systems
    5.
    Patent application
    Legal status: In force

    Publication No.: US20120185447A1

    Publication date: 2012-07-19

    Application No.: US13007301

    Filing date: 2011-01-14

    IPC classification: G06F17/30

    Abstract: A computer-implemented method for providing increased scalability in deduplication storage systems may include (1) identifying a database that stores a plurality of reference objects, (2) determining that at least one size-related characteristic of the database has reached a predetermined threshold, (3) partitioning the database into a plurality of sub-databases capable of being updated independent of one another, (4) identifying a request to perform an update operation that updates one or more reference objects stored within at least one sub-database, and then (5) performing the update operation on less than all of the sub-databases to avoid processing costs associated with performing the update operation on all of the sub-databases. Various other systems, methods, and computer-readable media are also disclosed.

    Systems and methods for validating ownership of deduplicated data
    6.
    Granted patent
    Legal status: In force

    Publication No.: US08769627B1

    Publication date: 2014-07-01

    Application No.: US13314496

    Filing date: 2011-12-08

    IPC classification: G06F7/04, G06F11/14, G06F3/06

    Abstract: A computer-implemented method for validating ownership of deduplicated data may include (1) identifying a request from a remote client to store a data object in a data store that already includes an instance of the data object, (2) in response to the request, verifying that the remote client possesses the data object by (i) issuing a randomized challenge to the remote client, the randomized challenge including a random value which, when combined with at least a portion of the data object, produces an authentication token demonstrating possession of the data object and, in response to the randomized challenge, (ii) receiving the authentication token from the remote client; and, in response to receiving the authentication token from the remote client, (3) storing the data object in the data store on behalf of the remote client. Various other methods and systems are also disclosed.
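
One way to picture the challenge-response exchange is the sketch below. Using HMAC-SHA256 keyed by the random value is an assumption made for illustration; the abstract only says the random value is combined with at least a portion of the data object. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def issue_challenge():
    """Server side: produce a fresh random value for this request."""
    return secrets.token_bytes(16)


def make_token(challenge, data):
    """Client side: combine the random value with the data object."""
    return hmac.new(challenge, data, hashlib.sha256).hexdigest()


def verify_token(challenge, stored_instance, token):
    """Server side: recompute the token from the stored instance and compare."""
    expected = make_token(challenge, stored_instance)
    return hmac.compare_digest(expected, token)


stored = b"contents of the already-stored data object"
challenge = issue_challenge()
honest = verify_token(challenge, stored, make_token(challenge, stored))
cheat = verify_token(challenge, stored, make_token(challenge, b"only the hash"))
```

Because the random value changes on every request, a client that knows only a fingerprint of the object, and not its contents, cannot precompute or replay a valid token.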

    System and method for high performance deduplication indexing
    7.
    Granted patent
    Legal status: In force

    Publication No.: US08370315B1

    Publication date: 2013-02-05

    Application No.: US12790461

    Filing date: 2010-05-28

    IPC classification: G06F7/00, G06F17/00, G06F17/30

    Abstract: A system and method for efficiently reducing latency of accessing an index for a data segment stored on a server. A server both removes duplicate data and prevents duplicate data from being stored in a shared data storage. The file server is coupled to an index storage subsystem holding fingerprint and pointer value pairs corresponding to a data segment stored in the shared data storage. The pairs are stored in a predetermined order. The file server utilizes an ordered binary search tree to identify a particular block of multiple blocks within the index storage subsystem corresponding to a received memory access request. The index storage subsystem determines whether an entry corresponding to the memory access request is located within the identified block. Based on at least this determination, the file server processes the memory access request accordingly. In one embodiment, the index storage subsystem is a solid-state disk (SSD).
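
The two-level lookup can be sketched as follows. Note the substitution: the abstract names an ordered binary search tree, whereas this sketch uses binary search over a sorted list of each block's first fingerprint (Python's `bisect`) as a stand-in with the same ordering behavior. The block size and data are assumptions.

```python
import bisect

BLOCK_SIZE = 3  # assumed number of (fingerprint, pointer) pairs per block


def build_index(pairs):
    """Sort the pairs, cut them into blocks, and record each block's first key."""
    ordered = sorted(pairs)
    blocks = [ordered[i:i + BLOCK_SIZE] for i in range(0, len(ordered), BLOCK_SIZE)]
    first_keys = [block[0][0] for block in blocks]
    return blocks, first_keys


def lookup(blocks, first_keys, fingerprint):
    """Pick the one candidate block, then search only inside that block."""
    i = bisect.bisect_right(first_keys, fingerprint) - 1
    if i < 0:
        return None  # smaller than every stored fingerprint
    for key, pointer in blocks[i]:
        if key == fingerprint:
            return pointer
    return None


pairs = [(f"fp{n:02d}", n * 100) for n in range(8)]
blocks, first_keys = build_index(pairs)
hit = lookup(blocks, first_keys, "fp04")
miss = lookup(blocks, first_keys, "fp99")
```

Only the small list of first keys needs to live in memory; each lookup then reads a single block from the index storage, which is the latency saving the abstract describes.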

    Progressive sampling for deduplication indexing
    8.
    Granted patent
    Legal status: In force

    Publication No.: US08311964B1

    Publication date: 2012-11-13

    Application No.: US12617426

    Filing date: 2009-11-12

    IPC classification: G06F17/00, G06N5/00

    Abstract: A system and method for efficiently reducing a number of duplicate blocks of stored data. A file server both removes duplicate data and prevents duplicate data from being stored in the shared storage. A sampling rate may be used to determine which fingerprints, or hash values, are stored in an index. The sampling rate may be modified in response to changes in characteristics of the system, such as a change in the shared storage size, a change in a utilization of the shared storage, a change in the size of the storage unit, and reaching a threshold corresponding to utilization of the index. Also, a small cache may be maintained for holding fingerprint and pointer pair values prefetched from the shared storage. Each prefetched pair may be associated with data corresponding to a previous hit in the index. The association may be related to spatial locality, temporal locality, or otherwise.
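
A rough sketch of an adjustable sampling rate follows. It covers only one of the triggers listed in the abstract (the index reaching a utilization threshold), and the modulus-based sampling rule, the doubling policy, and the class name are assumptions for illustration. It also assumes a capacity of at least one entry.

```python
import hashlib


class SampledIndex:
    """Toy fingerprint index that samples more sparsely as it fills up."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed utilization threshold, in entries
        self.modulus = 1          # keep roughly 1 in `modulus` fingerprints
        self.entries = {}         # fingerprint -> pointer

    def _sampled(self, fingerprint):
        return int(fingerprint, 16) % self.modulus == 0

    def add(self, data, pointer):
        fingerprint = hashlib.sha256(data).hexdigest()
        if self._sampled(fingerprint):
            self.entries[fingerprint] = pointer
        # When the index is over capacity, halve the sampling rate and
        # evict entries that no longer match the new rate.
        while len(self.entries) > self.capacity:
            self.modulus *= 2
            self.entries = {
                f: p for f, p in self.entries.items() if self._sampled(f)
            }


index = SampledIndex(capacity=8)
for n in range(100):
    index.add(str(n).encode(), n)
```

Because the modulus only ever doubles, every fingerprint that passes the new rule also passed the old one, so the index stays consistent without rescanning the stored data.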

    De-duplication Storage System with Multiple Indices for Efficient File Storage
    9.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20110093439A1

    Publication date: 2011-04-21

    Application No.: US12580697

    Filing date: 2009-10-16

    Applicant: Fanglu Guo, Weibao Wu

    Inventor: Fanglu Guo, Weibao Wu

    IPC classification: G06F12/00, G06F12/16

    CPC classification: G06F11/1453, G06F11/1464

    Abstract: A de-duplication storage system which uses multiple indices is described. A first group of one or more indices may be stored in random access memory (RAM) or another type of fast storage. A second group of one or more indices may be stored on one or more disk drives or another type of storage where large amounts of data can be stored inexpensively. The first group of indices may be used when adding new files to the de-duplication storage system in order to determine whether the file segments of the new files are already stored. The second group of indices may be used when restoring files in order to look up the segments of the files.
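
The division of labor between the two index groups can be sketched as below. Both indices are plain dictionaries here, standing in for the RAM-resident and disk-resident indices; the class and attribute names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy store with one index for ingest and a separate one for restore."""

    def __init__(self):
        self.segments = []       # segment storage
        self.segment_index = {}  # stand-in for the fast index: fingerprint -> location
        self.file_index = {}     # stand-in for the bulk index: file name -> locations

    def add_file(self, name, segments):
        """Ingest path: consult only the fast index to detect stored segments."""
        locations = []
        for segment in segments:
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segment_index:
                self.segment_index[fingerprint] = len(self.segments)
                self.segments.append(segment)
            locations.append(self.segment_index[fingerprint])
        self.file_index[name] = locations

    def restore_file(self, name):
        """Restore path: consult only the per-file index to find the segments."""
        return b"".join(self.segments[loc] for loc in self.file_index[name])


store = DedupStore()
store.add_file("a.txt", [b"x", b"y"])
store.add_file("b.txt", [b"y", b"z"])  # segment b"y" is already stored
restored = store.restore_file("b.txt")
```

The ingest path needs fast random lookups by fingerprint, while the restore path reads one file's entries in order, which is why the abstract places the two groups on different kinds of storage.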

    Systems and methods for removing unreferenced data segments from deduplicated data systems
    10.
    Granted patent
    Legal status: In force

    Publication No.: US08224875B1

    Publication date: 2012-07-17

    Application No.: US12652333

    Filing date: 2010-01-05

    IPC classification: G06F17/30

    CPC classification: G06F17/30303

    Abstract: A computer-implemented method for removing unreferenced data segments from deduplicated data systems may include: 1) identifying a deduplicated data system that contains a plurality of data segments, 2) identifying a plurality of containers within the deduplicated data system, with each container containing a subset of the data segments within the deduplicated data system, 3) identifying at least one container within the plurality of containers that is likely to include a large proportion of data segments that are not referenced by data objects within the deduplicated data system, and then, for each identified container, 4) searching for unreferenced data segments within the identified container and 5) removing the unreferenced data segments from the identified container. Various other methods, systems, and computer-readable media are also disclosed.
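
Steps 3) through 5) can be sketched as follows. The heuristic for "likely to include a large proportion" of unreferenced segments is an assumption: this sketch uses a per-container count of dereference events divided by the container's segment count. The abstract does not say how likelihood is estimated, and all names here are hypothetical.

```python
def collect_garbage(containers, deref_counts, referenced, min_ratio=0.5):
    """Scan only containers that look likely to hold many unreferenced segments.

    containers:   container id -> set of segment fingerprints
    deref_counts: container id -> number of dereference events seen (assumed metric)
    referenced:   set of fingerprints still referenced by some data object
    """
    removed = {}
    for cid, segments in containers.items():
        if not segments:
            continue
        if deref_counts.get(cid, 0) / len(segments) < min_ratio:
            continue  # unlikely to hold much garbage, so skip the costly scan
        dead = {s for s in segments if s not in referenced}
        segments.difference_update(dead)  # remove in place
        removed[cid] = dead
    return removed


containers = {"c1": {"a", "b", "c", "d"}, "c2": {"e", "f"}}
deref_counts = {"c1": 3, "c2": 0}
referenced = {"a", "e"}
removed = collect_garbage(containers, deref_counts, referenced)
```

Container c2 is skipped even though it holds one unreferenced segment; the trade-off is that scanning is spent where it is expected to reclaim the most space.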
