Executing prioritized replication requests for objects in a distributed storage system
    31.
    Granted patent
    Status: In force

    Publication No.: US08285686B2

    Publication date: 2012-10-09

    Application No.: US13024243

    Filing date: 2011-02-09

    IPC classification: G06F7/00

    CPC classification: G06F17/30575

    Abstract: A system and method for executing replication requests for objects in a distributed storage system is provided. A replication queue is identified from a plurality of replication queues corresponding to a replication key. The replication key includes information related to at least a source storage device in a distributed storage system at which objects are located and a destination storage device in the distributed storage system to which the objects are to be replicated. A distributed database is scanned using an identifier of the replication queue to produce a list of replication requests corresponding to the replication queue. The records of the distributed database are distributed across a plurality of nodes of the distributed database. The replication requests in the list of replication requests are executed in priority order. Replication requests are deleted from the distributed database only when the replication requests are complete.
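
    The scan, execute, and delete-on-completion flow described in this abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the in-memory dict standing in for the distributed database, the "queue_id/priority/object_id" row-key layout, the ReplicationRequest fields, the run_queue name, and the convention that a smaller number means a higher priority are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationRequest:
    queue_id: str   # hypothetical: identifies one (source, destination) queue
    priority: int   # assumption: smaller number = higher priority
    object_id: str


def run_queue(db, queue_id, execute):
    """Scan db for one queue's requests and run them in priority order.

    db is a dict of row key -> ReplicationRequest (a stand-in for the
    distributed database). execute(request) returns True on completion.
    A row is deleted only after its request completes, so failed or
    interrupted requests remain in db and can be retried later.
    """
    prefix = queue_id + "/"
    # "Scan" the database using the queue identifier.
    keys = [k for k in db if k.startswith(prefix)]
    # Order the resulting list by priority (ties broken by object id).
    keys.sort(key=lambda k: (db[k].priority, db[k].object_id))
    completed = []
    for key in keys:
        request = db[key]
        if execute(request):
            completed.append(request.object_id)
            del db[key]  # delete only once the request is complete
    return completed
```

    Calling run_queue again on the same queue retries whatever rows are still present, which is the practical effect of deleting requests only on completion.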


    System and Method for Determining the Age of Objects in the Presence of Unreliable Clocks
    32.
    Patent application
    Status: In force

    Publication No.: US20110196901A1

    Publication date: 2011-08-11

    Application No.: US13022551

    Filing date: 2011-02-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30336 G06F17/30551

    Abstract: A system and method for determining an age of an object is provided. A first index for a timestamp entry in a sequence of timestamps corresponding to a time at which an object was created is identified. At least one subsequence of timestamps from the sequence of timestamps having indexes for entries in the sequence of timestamps that are between the first index in the sequence of timestamps and a last index for a last timestamp entry in the sequence of timestamps is identified, wherein the at least one subsequence of timestamps conforms to a function of a time interval between storage of consecutive current timestamps reported by a clock of the computer system. Timestamps from the sequence of timestamps that are not included in the at least one subsequence of timestamps are removed. An age of the object is determined based on the at least one subsequence of timestamps.


    Storage of Data In A Distributed Storage System
    33.
    Patent application
    Status: Under examination (published)

    Publication No.: US20110196900A1

    Publication date: 2011-08-11

    Application No.: US13023482

    Filing date: 2011-02-08

    IPC classification: G06F7/00

    Abstract: A distributed storage system stores data for files. A first blob (binary large object) of data is received. The first blob is split into one or more first chunks of data. Content fingerprints for the first chunks of data are computed. The first chunks of data are stored in a chunk store, while their content fingerprints are stored in a store distinct from the chunk store. A second blob of data is received. The second blob is split into one or more second chunks of data. Content fingerprints for the second chunks of data are computed. Then, for a second chunk of data whose content fingerprint matches a content fingerprint of a first chunk of data, a second reference to the corresponding first chunk of data that has a matching content fingerprint is stored, but the second chunk of data is not stored.
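
    The fingerprint-based deduplication described in this abstract can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: fixed-size chunking, SHA-256 as the content fingerprint, integer chunk ids, and the DedupStore class with its put and get methods. The abstract does not specify any of these choices.

```python
import hashlib


class DedupStore:
    """Toy blob store that keeps each distinct chunk only once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # assumption: fixed-size chunks
        self.chunks = {}        # chunk store: chunk id -> chunk bytes
        self.fingerprints = {}  # separate store: fingerprint -> chunk id
        self.blobs = {}         # blob name -> list of chunk id references

    def put(self, name, data):
        refs = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            # Assumption: SHA-256 serves as the content fingerprint.
            fingerprint = hashlib.sha256(chunk).hexdigest()
            chunk_id = self.fingerprints.get(fingerprint)
            if chunk_id is None:
                # First time this content is seen: store the chunk itself.
                chunk_id = len(self.chunks)
                self.chunks[chunk_id] = chunk
                self.fingerprints[fingerprint] = chunk_id
            # Either way the blob only records a reference to the chunk.
            refs.append(chunk_id)
        self.blobs[name] = refs

    def get(self, name):
        return b"".join(self.chunks[c] for c in self.blobs[name])
```

    Storing b"aaaabbbb" and then b"bbbbcccc" with the default chunk size leaves three chunks in the chunk store, because the second blob's first chunk is recorded as a reference to the already stored b"bbbb".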


    System and Method for Replicating Objects In A Distributed Storage System
    34.
    Patent application
    Status: In force

    Publication No.: US20110196873A1

    Publication date: 2011-08-11

    Application No.: US13022564

    Filing date: 2011-02-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30575

    Abstract: A system and method for inserting an object into a distributed database is provided. An object to be inserted into a priority queue is received, wherein the object includes a unique identifier and a priority. Next, an index for the object is generated. A row name for the object is then generated based on the index, the priority of the object, and the unique identifier of the object, wherein a lexicographical order of the row name for a higher priority object is smaller than the lexicographical order of the row name for a lower priority object. The object is then inserted into a row of a distributed database using the row name.
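
    One way to obtain row names whose lexicographic order follows priority is to zero-pad an inverted priority, as in the Python sketch below. The abstract does not give the encoding, so the details are assumptions: the index is treated as a small integer prefix, a larger number is taken to mean a higher priority, and MAX_PRIORITY, the field widths, and the row_name function are hypothetical.

```python
MAX_PRIORITY = 9999  # assumption: priorities are integers in 0..9999
MAX_INDEX = 9999     # assumption: index is an integer in 0..9999


def row_name(index, priority, unique_id):
    """Build a row name that sorts higher-priority objects first.

    Assumption: a larger priority value means a higher priority, so the
    priority is inverted and zero-padded to make plain string comparison
    agree with priority order within the same index.
    """
    if not 0 <= index <= MAX_INDEX:
        raise ValueError("index out of range")
    if not 0 <= priority <= MAX_PRIORITY:
        raise ValueError("priority out of range")
    inverted = MAX_PRIORITY - priority
    return f"{index:04d}:{inverted:04d}:{unique_id}"
```

    With this encoding, a sorted scan over the rows of one index returns objects from highest to lowest priority without a separate sort step, and the unique identifier keeps rows with equal priority distinct.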


    Storage of Data In A Distributed Storage System
    35.
    Patent application
    Status: In force

    Publication No.: US20110196833A1

    Publication date: 2011-08-11

    Application No.: US13023503

    Filing date: 2011-02-08

    IPC classification: G06F17/30

    Abstract: A distributed storage system has multiple instances. There is a plurality of local instances, and at least some of the local instances are at physically distinct geographic locations. Each local instance is configured to store data for a non-empty set of blobs in a plurality of data stores having a plurality of distinct data store types. In addition, each local instance stores metadata for the respective set of blobs in a metadata store distinct from the data stores. There is also a plurality of global instances. Each global instance is configured to store data for zero or more blobs in zero or more data stores and store metadata for all blobs stored at any local or global instance. The system selects one global instance to run a replication module that replicates blobs between instances according to blob policies. Some systems also include dynamic replication based on user needs.


    Location Assignment Daemon (LAD) For A Distributed Storage System
    36.
    Patent application
    Status: In force

    Publication No.: US20110196832A1

    Publication date: 2011-08-11

    Application No.: US13022258

    Filing date: 2011-02-07

    IPC classification: G06F17/30

    Abstract: A system and method for generating replication requests for objects in a distributed storage system is provided. For a respective object in a distributed storage system the following is performed. Replication policies for the object that have not been satisfied are determined. Replication requests are ranked for the object whose replication policies have not been satisfied based on a number of replicas of the object that need to be created in order to satisfy the replication policies for the object. Replication requests are generated for the object based at least in part on the replication policies for the object that have not been satisfied and on a current state of the distributed storage system. At least a subset of the replication requests for the objects in the distributed storage system are distributed to respective instances of the distributed storage system corresponding to the replication requests for execution.
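
    The policy check and ranking step in this abstract can be sketched in Python as follows. The sketch rests on several assumptions that the abstract does not state: a policy is modeled as a required replica count per location, the current state as an existing replica count per location, and objects missing more replicas are ranked first. The function names and the tuple layout are hypothetical.

```python
def missing_replicas(policy, current):
    """Return {location: replicas still needed} for unsatisfied policy parts.

    policy:  {location: required replica count}   (assumed policy model)
    current: {location: existing replica count}   (assumed system state)
    """
    return {
        location: required - current.get(location, 0)
        for location, required in policy.items()
        if required > current.get(location, 0)
    }


def generate_requests(objects):
    """Build ranked replication requests for all objects.

    objects: {object_id: (policy, current)}
    Returns tuples (total_missing, object_id, location, count), ordered so
    that objects needing the most new replicas come first (an assumption
    about the ranking direction).
    """
    requests = []
    for object_id, (policy, current) in objects.items():
        missing = missing_replicas(policy, current)
        total_missing = sum(missing.values())
        for location, count in missing.items():
            requests.append((total_missing, object_id, location, count))
    requests.sort(key=lambda r: (-r[0], r[1], r[2]))
    return requests
```

    Objects whose policies are already satisfied produce no requests, so the ranked list contains only work that still needs to be handed out to instances.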


    Location Assignment Daemon (LAD) Simulation System and Method
    37.
    Patent application
    Status: In force

    Publication No.: US20110196664A1

    Publication date: 2011-08-11

    Application No.: US13022236

    Filing date: 2011-02-07

    IPC classification: G06F13/10

    Abstract: A system and method for simulating a state of a distributed storage system is provided. A current state of a distributed storage system and replication policies for the objects in the distributed storage system are obtained. Proposed modifications to the current state of the distributed storage system are received. The state of the distributed storage system is simulated over time based on the current state of the distributed storage system, the replication policies for the objects in the distributed storage system, and the proposed modifications to the current state of the distributed storage system. Reports relating to the time evolution of the current state of the distributed storage system are then generated based on the simulation.
