FAULT-TOLERANT METHODS, SYSTEMS AND ARCHITECTURES FOR DATA STORAGE, RETRIEVAL AND DISTRIBUTION
    1.
    Invention application
    Status: under examination (published)

    Publication number: WO2017053321A3

    Publication date: 2017-05-18

    Application number: PCT/US2016052707

    Application date: 2016-09-20

    IPC classification: G06F11/20

    Abstract: The disclosure is directed towards fault-tolerant methods, systems and architectures for data distribution. One method includes generating fault distribution tables. The table entries correspond to a copy of data records. The entry and copy are associated with a fault status, a node, and a group that are based on a position of the entry within the distribution table. The method also includes storing the copy of the data record that corresponds to the entry in a database that is included in a plurality of databases. In response to determining an unavailable node included in the plurality of nodes, the method determines a fault status, a node, and a group. The method provides an available node sequential access to data records that are stored in a particular database that is stored locally on the available node in a tree structure.

    DYNAMICALLY CHANGING MEMBERS OF A CONSENSUS GROUP IN A DISTRIBUTED SELF-HEALING COORDINATION SERVICE
    3.
    Invention application
    Status: under examination (published)

    Publication number: WO2016010972A1

    Publication date: 2016-01-21

    Application number: PCT/US2015/040289

    Application date: 2015-07-14

    Applicant: COHESITY, INC.

    IPC classification: G06F7/00

    Abstract: Systems, methods, and computer program products for managing a consensus group in a distributed computing cluster, by determining that an instance of an authority module executing on a first node, of a consensus group of nodes in the distributed computing cluster, has failed; and adding, by an instance of the authority module on a second node of the consensus group, a new node to the consensus group to replace the first node. The new node is a node in the computing cluster that was not a member of the consensus group at the time the instance of the authority module executing on the first node is determined to have failed.

    INTELLIGENT DISASTER RECOVERY
    4.
    Invention application
    Status: under examination (published)

    Publication number: WO2015179533A1

    Publication date: 2015-11-26

    Application number: PCT/US2015/031796

    Application date: 2015-05-20

    Applicant: COHESITY, INC.

    IPC classification: G06F11/00 G06F11/16

    Abstract: One embodiment of the invention includes a system for performing intelligent disaster recovery. The system includes a processor and a memory. The memory stores a first monitor application that, when executed on the processor, performs an operation. The operation includes communicating with a second monitor application hosted at a secondary data center to determine an availability of one or more computer servers at a primary data center. The operation also includes, upon reaching a consensus with the second monitor application that one or more computer servers at the primary data center are unavailable to process client requests, relative to both the first monitor application and the second monitor application, initiating a failover operation. Embodiments of the invention also include a method and a computer-readable medium for performing intelligent disaster recovery.

    PARTITION TOLERANCE IN CLUSTER MEMBERSHIP MANAGEMENT
    5.
    Invention application
    Status: under examination (published)

    Publication number: WO2015030895A1

    Publication date: 2015-03-05

    Application number: PCT/US2014/041172

    Application date: 2014-06-05

    Applicant: VMWARE, INC.

    IPC classification: G06F11/14 G06F11/20

    Abstract: Techniques are disclosed for managing a cluster of computing nodes following a division of the cluster into at least a first and second partition, where the cluster aggregates local storage resources of the nodes to provide an object store, and objects stored in the object store are divided into data components stored across the nodes. In accordance with one method, it is determined that a majority of data components comprising a first object are stored within nodes in the first partition. It is determined that a majority of data components comprising a second object are stored within nodes in the second partition. Configuration operations are permitted to be performed on the first object in the first partition while denying access to the first object from the second partition, and on the second object in the second partition while denying access to the second object from the first partition.
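
    Illustrative sketch (not part of the patent record): the majority rule described in this abstract can be approximated as follows. The function names, node names, and object layout are hypothetical, and the tie-handling (no partition owns an object without a strict majority) is an assumption, not something the abstract states.

```python
from collections import Counter

def owning_partition(component_nodes, partition_of):
    """Return the partition that holds a strict majority of an object's
    data components, or None if no partition does."""
    if not component_nodes:
        return None
    counts = Counter(partition_of[node] for node in component_nodes)
    partition, votes = counts.most_common(1)[0]
    return partition if votes * 2 > len(component_nodes) else None

def may_access(obj, requesting_partition, placement, partition_of):
    """Allow access only from the partition that owns the object's majority."""
    return owning_partition(placement[obj], partition_of) == requesting_partition

# Hypothetical layout: four nodes split into two partitions.
partition_of = {"n1": "P1", "n2": "P1", "n3": "P2", "n4": "P2"}
placement = {
    "first_object": ["n1", "n2", "n3"],   # two of three components in P1
    "second_object": ["n2", "n3", "n4"],  # two of three components in P2
}
```

    With this layout, `may_access("first_object", "P1", placement, partition_of)` is true while the same request from `"P2"` is denied, mirroring the per-object behaviour the abstract describes.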

    MOBILE HADOOP CLUSTERS
    6.
    Invention application
    Status: under examination (published)

    Publication number: WO2014022674A1

    Publication date: 2014-02-06

    Application number: PCT/US2013/053239

    Application date: 2013-08-01

    Applicant: NETAPP, INC.

    Inventor: HORN, Gustav

    IPC classification: G06F17/00 G06F17/40

    Abstract: Techniques for mobile clusters for collecting telemetry data and processing analytic tasks are disclosed herein. The mobile cluster includes a processor, a plurality of data nodes and an analysis module. The data nodes receive and store a snapshot of at least a portion of data stored in a main Hadoop storage cluster and real-time acquired data received from a data capturing device. The analysis module is operatively coupled to the processor to process analytic tasks based on the snapshot and the real-time acquired data when the storage cluster is not connected to the main storage cluster.

    METHOD FOR OPERATING A CONTROL NETWORK, AND CONTROL NETWORK
    7.
    Invention application
    Status: under examination (published)

    Publication number: WO2013053643A2

    Publication date: 2013-04-18

    Application number: PCT/EP2012069688

    Application date: 2012-10-05

    Applicant: SIEMENS AG

    IPC classification: G06F11/20

    Abstract: The invention relates to a method for operating a control network. It should be possible to perform the method reliably with relatively little complexity. According to the invention, a method for operating a control network (1) is suitable for this purpose, said control network having a single physical connection between a first control computer (ST1) and a second redundant control computer (ST2) by means of a data line network (2), to which several functionally important data processing devices (A, C, D, F, H, K, L) are connected. The data connection between the control computers (ST1, ST2) and the functionally important devices (A, C, D, F, H, K, L) is achieved by means of a redundant and diverse heartbeat, wherein the communication connection between the two control computers (ST1, ST2) is checked in order to start the operation of the control network (1). If the result of the check is positive, a master function is assigned to a control computer (ST1), or if the result of the check is negative, both control computers (ST1, ST2) connect the functionally important devices (A, C, D, F, H, K, L) to themselves according to a defined sequence. If a specified quantity of the functionally important devices (A, C, D, F, H, K, L) is connected to one of the two control computers (ST1), said control computer assumes the master function and the other control computer (ST2) assumes the standby function, or if the number of functionally important devices (A, C, D, F, H, K, L) connected to each of the two control computers (ST1, ST2) lies below the specified quantity, a signal is generated that signals a faulty state of the control network (1). The invention further relates to a control network.
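
    Illustrative sketch (not part of the patent record): the start-up decision in this abstract can be read as a small decision function. The name `assign_roles`, its parameters, and the returned dictionary are hypothetical. Two points are assumptions rather than statements from the abstract: that ST2 becomes standby when the link check is positive, and that ST1 is checked before ST2 when the link check is negative.

```python
def assign_roles(link_ok, devices_on_st1, devices_on_st2, required):
    """Decide master/standby roles for control computers ST1 and ST2.

    link_ok        -- result of checking the ST1<->ST2 communication link
    devices_on_st1 -- functionally important devices connected to ST1
    devices_on_st2 -- functionally important devices connected to ST2
    required       -- the specified quantity of devices needed to take over
    """
    if link_ok:
        # Positive check: the master function is assigned to ST1.
        # (Standby for ST2 in this branch is an assumption.)
        return {"ST1": "master", "ST2": "standby", "fault": False}
    # Negative check: both computers have tried to connect the devices.
    if devices_on_st1 >= required:
        return {"ST1": "master", "ST2": "standby", "fault": False}
    if devices_on_st2 >= required:
        return {"ST1": "standby", "ST2": "master", "fault": False}
    # Neither computer reached the specified quantity: signal a faulty state.
    return {"ST1": None, "ST2": None, "fault": True}
```

    For example, with seven devices and a required quantity of four, `assign_roles(False, 3, 3, 4)` reports a fault because neither computer holds enough devices.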

    METHODS AND SYSTEMS OF MANAGING A DISTRIBUTED REPLICA BASED STORAGE
    8.
    Invention application
    Status: under examination (published)

    Publication number: WO2013024485A2

    Publication date: 2013-02-21

    Application number: PCT/IL2012/050314

    Application date: 2012-08-15

    IPC classification: G06F12/02

    Abstract: A method of managing a distributed storage space. The method comprises mapping a plurality of replica sets to a plurality of storage managing modules installed in a plurality of computing units, where each of the plurality of storage managing modules manages access of at least one storage consumer application to replica data of at least one replica of a replica set from the plurality of replica sets, and the replica data is stored in at least one drive of a respective computing unit; allocating at least one time-based credit to at least one of each storage managing module and the replica data; and iteratively renewing the time-based credit as long as a failure of at least one of the storage managing module, the at least one drive, and the replica data is not detected.
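
    Illustrative sketch (not part of the patent record): a time-based credit that is renewed only while no failure is detected behaves much like a lease. The class `TimeCredit` and its methods are hypothetical. The abstract does not say what happens once a credit lapses; treating a lapsed credit as non-renewable is an assumption made here. Times are passed in explicitly so the example is deterministic.

```python
class TimeCredit:
    """A credit that stays valid for `duration` time units after each renewal."""

    def __init__(self, duration, now):
        self.duration = duration
        self.expires_at = now + duration

    def is_valid(self, now):
        """True while the credit has not yet expired."""
        return now < self.expires_at

    def renew(self, now, failure_detected):
        """Extend the credit unless a failure was detected or it already lapsed."""
        if failure_detected or not self.is_valid(now):
            return False
        self.expires_at = now + self.duration
        return True

# Hypothetical usage: a credit of 5 time units granted at t=0.
credit = TimeCredit(duration=5.0, now=0.0)
renewed_ok = credit.renew(now=4.0, failure_detected=False)      # extended to t=9
renewed_on_fail = credit.renew(now=6.0, failure_detected=True)  # refused
```

    After the refused renewal the credit simply runs out at t=9, which gives the rest of the system a bounded time after which the failed module's replica can no longer be considered current.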

    SYSTEM AND METHOD FOR ENHANCING AVAILABILITY OF A DISTRIBUTED OBJECT STORAGE SYSTEM DURING A PARTIAL DATABASE OUTAGE
    9.
    Invention application
    Status: under examination (published)

    Publication number: WO2012039989A3

    Publication date: 2012-05-18

    Application number: PCT/US2011051316

    Application date: 2011-09-13

    IPC classification: G06F17/30 G06F12/16 G06F15/16

    Abstract: An "operate with missing region" feature of this disclosure allows the cluster to continue servicing reads for available regions even when some regions are missing. In particular, upon a given node failure condition, the cluster is placed in an effective read-only mode for all regions. The node failure condition typically is one where there has been a failure of an authoritative region copy and no backup copy is available. As used herein, "read-only" means that no client write or update requests will succeed while the cluster is in this state. Preferably, such requests are then re-tried. In this mode, all regions are only allowed to perform read operations. During the read-only state, the cluster continues to operate with missing regions, and missing regions are entered on the region map. The cluster then automatically recovers returning missing region(s), after which it leaves the read-only state.
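
    Illustrative sketch (not part of the patent record): the read-only behaviour described in this abstract can be modelled with a region map that tracks missing regions. The class `Cluster` and its methods are hypothetical, and raising `LookupError` for a read against a missing region is an assumption; the abstract only says that reads continue for available regions.

```python
class Cluster:
    """Toy cluster whose regions are plain dictionaries."""

    def __init__(self, regions):
        self.data = {region: {} for region in regions}
        self.missing = set()  # missing regions recorded on the region map

    @property
    def read_only(self):
        """The whole cluster is read-only while any region is missing."""
        return bool(self.missing)

    def mark_missing(self, region):
        self.missing.add(region)

    def recover(self, region):
        self.missing.discard(region)

    def write(self, region, key, value):
        """Return False while read-only so the client can retry later."""
        if self.read_only:
            return False
        self.data[region][key] = value
        return True

    def read(self, region, key):
        """Serve reads for available regions only."""
        if region in self.missing:
            raise LookupError(f"region {region!r} is missing")
        return self.data[region].get(key)

# Hypothetical usage with two regions.
cluster = Cluster(["r0", "r1"])
cluster.write("r0", "k", "v")
cluster.mark_missing("r1")
write_during_outage = cluster.write("r0", "k2", "v2")  # rejected: read-only
read_during_outage = cluster.read("r0", "k")           # still served
cluster.recover("r1")
write_after_recovery = cluster.write("r0", "k2", "v2")  # accepted again
```

    Note that the write to the healthy region `r0` is rejected during the outage: the abstract places all regions in read-only mode, not only the missing ones.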

    DISTRIBUTED COMPUTING SYSTEM
    10.
    Invention application
    Status: under examination (published)

    Publication number: WO2011152117A8

    Publication date: 2012-02-23

    Application number: PCT/JP2011058548

    Application date: 2011-04-04

    Inventor: WATANABE NORITAKA

    IPC classification: G06F11/20

    CPC classification: G06F11/1425 H04L41/06

    Abstract: Disclosed is a distributed computing system that enables autonomous leader selection without relying on a particular server. The distributed computing system (S) is provided with: a leader candidate selection device (63) that, when communication is established with a majority of the initial total number of information processing devices, selects the aforementioned information processing device with the oldest accession time as a leader candidate information processing device, and transmits identification information thereof; and a leader recognition device (64) that investigates identification information of the aforementioned leader candidate information processing device and the transmitted identification information of the aforementioned leader candidate information processing device of the leader recognition device itself, and that, in the case that the information processing device that is the same as that recognized as the leader candidate information processing device is present among the aforementioned majority of the initial total number of information processing devices, recognizes said information processing device as a new leader.
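
    Illustrative sketch (not part of the patent record): one reading of this abstract is a two-step procedure in which each device first picks the oldest reachable member as its candidate, then accepts that candidate as leader only if a majority of the initial membership named the same one. The function names and data shapes are hypothetical, and breaking ties on accession time by device identifier is an assumption.

```python
def select_candidate(reachable, accession_time, initial_total):
    """Pick the reachable device with the oldest accession time.

    `reachable` includes the calling device itself. Returns None unless the
    reachable set is a strict majority of the initial device count.
    """
    if len(reachable) * 2 <= initial_total:
        return None
    return min(reachable, key=lambda dev: (accession_time[dev], dev))

def recognize_leader(own_candidate, reported_candidates, initial_total):
    """Accept the candidate only if a majority of the initial devices agree.

    `reported_candidates` holds the candidates announced by other devices.
    """
    if own_candidate is None:
        return None
    agreeing = 1 + sum(1 for c in reported_candidates if c == own_candidate)
    return own_candidate if agreeing * 2 > initial_total else None

# Hypothetical cluster of five devices; "d" joined first but is unreachable.
accession_time = {"a": 10, "b": 5, "c": 7, "d": 1, "e": 9}
candidate = select_candidate({"a", "b", "c"}, accession_time, initial_total=5)
leader = recognize_leader(candidate, ["b", "b"], initial_total=5)
```

    Here three of five devices can communicate, so a candidate is chosen (`"b"`, the oldest reachable device), and it becomes leader because all three agree. A group of only two devices would select no candidate at all.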
