Method and apparatus for providing fault-tolerant addresses for nodes in a clustered system
    1.
    Granted invention patent
    Legal status: in force

    Publication number: US06535990B1

    Publication date: 2003-03-18

    Application number: US09480148

    Filing date: 2000-01-10

    IPC classification: G06F11/00

    Abstract: One embodiment of the present invention provides a system that facilitates communications between a cluster of nodes within a clustered computing system in a manner that tolerates failures of communication pathways between the nodes. The system operates by configuring a distinct logical pathway between each possible source node and each possible destination node in the cluster, so that each distinct logical pathway is routed across one of at least two disjoint physical pathways between each possible source node and each possible destination node. In doing so, the system configures a first logical pathway between a first node and a second node across a first physical pathway of at least two disjoint physical pathways between the first node and the second node. Upon detecting a failure of the first physical pathway, the system reroutes the first logical pathway across a second physical pathway from the at least two disjoint physical pathways between the first node and the second node. In one embodiment of the present invention, the system associates a distinct per-node logical address with each node in the cluster. For each source node, the system associates the per-node logical address of each possible destination node with a corresponding logical pathway to the destination node. In this way, a communication from a given source node to a per-node logical address of a given destination node is directed across the corresponding logical pathway to the given destination node.
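
    The rerouting behaviour described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names `LogicalPathway` and `Node`, the string labels for physical pathways, and the "first non-failed pathway wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalPathway:
    """One logical pathway from a source node to a destination node (hypothetical model)."""
    physical_paths: list                       # at least two disjoint physical pathways
    active: int = 0                            # index of the pathway currently in use
    failed: set = field(default_factory=set)   # pathways known to be down

    def current(self):
        return self.physical_paths[self.active]

    def report_failure(self, path):
        """Mark a physical pathway as failed and reroute if it was the one in use."""
        self.failed.add(path)
        if self.current() != path:
            return
        for index, candidate in enumerate(self.physical_paths):
            if candidate not in self.failed:
                self.active = index
                return
        raise RuntimeError("no working physical pathway left")


class Node:
    """A source node that maps per-node logical addresses to logical pathways."""

    def __init__(self, name):
        self.name = name
        self.routes = {}  # per-node logical address -> LogicalPathway

    def add_route(self, logical_address, physical_paths):
        if len(physical_paths) < 2:
            raise ValueError("need at least two disjoint physical pathways")
        self.routes[logical_address] = LogicalPathway(list(physical_paths))

    def send(self, logical_address, payload):
        """Return the physical pathway the payload would travel on, with the payload."""
        return self.routes[logical_address].current(), payload


# Usage: the sender keeps addressing "addr-B"; only the physical pathway changes.
node_a = Node("A")
node_a.add_route("addr-B", ["switch-1", "switch-2"])
before_failure = node_a.send("addr-B", "hello")
node_a.routes["addr-B"].report_failure("switch-1")
after_failure = node_a.send("addr-B", "hello")
```

    The point of the sketch is that the logical address stays stable across the failure, so code above the routing layer does not need to know which physical pathway is in use.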

    System and method for managing software version upgrades in a networked computer system
    2.
    Granted invention patent
    Legal status: in force

    Publication number: US07260818B1

    Publication date: 2007-08-21

    Application number: US10449584

    Filing date: 2003-05-29

    IPC classification: G06F9/44

    CPC classification: G06F8/71 G06F9/44536

    Abstract: In a multi-node computer system, a software version management system is described having a version manager for ensuring that cluster nodes running completely incompatible software are unable to communicate with each other. The version manager provides a mechanism for determining when nodes in the cluster are running incompatible software and providing a way for determining the exact version of software that each node must run. The version manager provides support for rolling upgrades to enable the version management software to ensure the chosen version of software that runs the cluster stays constant even though the software installed on individual nodes is changing.
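
    A rough sketch of the rolling-upgrade idea follows. The abstract does not say how versions are represented or chosen, so everything here is an assumption: the hypothetical `VersionManager` class, integer version numbers, the rule that the first node to join fixes the running version, and the explicit `commit_upgrade` step.

```python
class VersionManager:
    """Hypothetical version manager: keeps one running version for the whole cluster."""

    def __init__(self):
        self.supported = {}   # node name -> set of versions its installed software can run
        self.running = None   # version the cluster currently runs

    def join(self, node, versions):
        """Admit a node only if it can run the cluster's current version."""
        versions = set(versions)
        if self.running is None:
            # Assumption: the first node to join fixes the running version.
            self.running = max(versions)
        elif self.running not in versions:
            return False      # incompatible software: the node is kept out
        self.supported[node] = versions
        return True

    def upgrade_node(self, node, versions):
        """Rolling-upgrade step: one node's installed software changes,
        while the running version of the cluster stays constant."""
        versions = set(versions)
        if self.running not in versions:
            raise ValueError("new software cannot run the current cluster version")
        self.supported[node] = versions

    def commit_upgrade(self):
        """Switch to the highest version that every node supports."""
        if not self.supported:
            raise RuntimeError("no nodes in the cluster")
        common = set.intersection(*self.supported.values())
        self.running = max(common)
        return self.running


# Usage: two nodes run version 1, are upgraded one at a time, then commit.
manager = VersionManager()
manager.join("n1", {1})
manager.join("n2", {1})
refused = manager.join("n3", {3})        # cannot run version 1
manager.upgrade_node("n1", {1, 2})
manager.upgrade_node("n2", {1, 2})
during_upgrade = manager.running         # unchanged while nodes are upgraded
after_commit = manager.commit_upgrade()
```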

    Supporting interactions between different versions of software for accessing remote objects
    3.
    Granted invention patent
    Legal status: in force

    Publication number: US07055147B2

    Publication date: 2006-05-30

    Application number: US10376944

    Filing date: 2003-02-28

    IPC classification: G06F9/44

    Abstract: One embodiment of the present invention provides a system that facilitates interactions between different versions of software that support remote object invocations. During operation, the system receives a reference to an object that is implemented on a server. Next, the system identifies one or more versions of the object supported by the reference, wherein each successive version of the object inherits methods from a preceding version of the object. The system then invokes a method on the object that is supported by the one or more versions of the object.
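
    The inheritance relationship between versions can be sketched with ordinary class inheritance. This is only an analogy under stated assumptions: `CounterV1`, `CounterV2`, `ObjectRef`, the `VERSION` attribute, and the `client_version` parameter are hypothetical names, and no real remote-invocation transport is modelled.

```python
class CounterV1:
    """Version 1 of a server-side object."""
    VERSION = 1

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value


class CounterV2(CounterV1):
    """Version 2 inherits every version 1 method and adds reset()."""
    VERSION = 2

    def reset(self):
        self.value = 0
        return self.value


class ObjectRef:
    """Hypothetical reference that records which versions the object supports."""

    def __init__(self, target):
        self.target = target
        # Every class in the inheritance chain that declares VERSION is a supported version.
        self.versions = {
            cls.VERSION: cls
            for cls in type(target).__mro__
            if "VERSION" in vars(cls)
        }

    def invoke(self, method, client_version, *args):
        """Invoke a method only if the newest version the client understands offers it."""
        usable = [v for v in self.versions if v <= client_version]
        if not usable:
            raise LookupError("no version in common with the client")
        interface = self.versions[max(usable)]
        if not hasattr(interface, method):
            raise AttributeError(f"{method} is not part of version {interface.VERSION}")
        return getattr(self.target, method)(*args)


# Usage: an old client (version 1) and a new client (version 2) share one object.
ref = ObjectRef(CounterV2())
old_client_result = ref.invoke("increment", 1)
try:
    ref.invoke("reset", 1)
    old_client_can_reset = True
except AttributeError:
    old_client_can_reset = False
new_client_result = ref.invoke("reset", 2)
```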

    Method and apparatus for reaching agreement between nodes in a distributed system
    4.
    Granted invention patent
    Legal status: in force

    Publication number: US06957254B1

    Publication date: 2005-10-18

    Application number: US09662553

    Filing date: 2000-09-15

    Abstract: One embodiment of the present invention provides a system for selecting a node to host a primary server for a service from a plurality of nodes in a distributed computing system. The system operates by receiving an indication that a state of the distributed computing system has changed. In response to this indication, the system determines if there is already a node hosting the primary server for the service. If not, the system selects a node to host the primary server using the assumption that a given node from the plurality of nodes in the distributed computing system hosts the primary server. The system then communicates rank information between the given node and other nodes in the distributed computing system, wherein each node in the distributed computing system has a unique rank with respect to the other nodes in the distributed computing system. The system next compares the rank of the given node with the rank of the other nodes in the distributed computing system. If one of the other nodes has a higher rank than the given node, the system disqualifies the given node from hosting the primary server.
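
    The rank comparison in this abstract can be sketched in a few lines. The sketch below runs in a single process and replaces the exchange of rank messages with a dictionary lookup, which is a simplification; the function name `select_primary` and its parameters are hypothetical.

```python
def select_primary(ranks, current_primary=None):
    """Pick the node that hosts the primary server.

    ranks: mapping of node name -> unique rank.
    current_primary: node already hosting the primary, if any.
    """
    if not ranks:
        raise ValueError("no nodes available")
    if len(set(ranks.values())) != len(ranks):
        raise ValueError("ranks must be unique")

    # If a live node already hosts the primary, nothing needs to be selected.
    if current_primary is not None and current_primary in ranks:
        return current_primary

    # Every node starts from the assumption that it hosts the primary ...
    candidates = set(ranks)
    for node in ranks:
        for other in ranks:
            # ... and is disqualified as soon as another node has a higher rank.
            if ranks[other] > ranks[node]:
                candidates.discard(node)
                break

    (winner,) = candidates  # unique ranks leave exactly one candidate
    return winner


# Usage: a fresh selection, an existing primary, and a primary that has left.
fresh = select_primary({"a": 1, "b": 3, "c": 2})
unchanged = select_primary({"a": 1, "b": 3, "c": 2}, current_primary="a")
replaced = select_primary({"a": 1, "b": 3, "c": 2}, current_primary="z")
```

    Because ranks are unique, every node reaches the same conclusion from the same rank information, which is what lets the nodes agree without a separate coordinator.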

    System and method for ensuring delivery of a single communication between nodes
    5.
    Granted invention patent
    Legal status: in force

    Publication number: US06662213B1

    Publication date: 2003-12-09

    Application number: US09480010

    Filing date: 2000-01-10

    IPC classification: G06F15/16

    Abstract: A system and method are provided for ensuring delivery of a communication from one computer system or node to another. A first node includes an object handler, such as an ORB (Object Request Broker), that receives object references from higher-level services operating on the first node, wherein the referenced object resides on a second node. The first node's object handler generates a message to an object handler on the second node and attempts to send the message to the second node through a transport module. The message is assigned a unique identifier, such as a sequence number. If the first object handler receives an uncertain status concerning the message (e.g., other than a certain success or failure), it issues a query to the second node to determine if the message was received. If the query is received by the second object handler before the message itself is received, the message is considered lost or rescinded by the first node. The first node stores the identifier so that it will not be re-assigned to another message and the message is then re-sent with a different identifier. The second object handler notes the identifier and status of the rescinded message and will discard any message having that identifier that is received. The second node includes two or more data structures to track the status of communications sent from the first node. The first node, in addition to a collection of identifiers of lost messages, may also record the status of communications it attempts to send and may also note the identifiers of messages that could not be transmitted (e.g., because of communication link errors).
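
    The query-and-rescind exchange can be sketched as follows. This is an illustration under assumptions, not the patented design: `Sender`, `Receiver`, the status strings `"ok"`, `"failed"` and `"uncertain"`, the `transport` callable, and the retry limit are all hypothetical, and a certain failure is simply retried with a new identifier.

```python
import itertools


class Receiver:
    """Second node: tracks delivered messages and rescinded identifiers."""

    def __init__(self):
        self.delivered = {}     # identifier -> payload
        self.rescinded = set()  # identifiers the sender has given up on

    def receive(self, ident, payload):
        """Accept a message unless its identifier was rescinded or already delivered."""
        if ident in self.rescinded or ident in self.delivered:
            return False
        self.delivered[ident] = payload
        return True

    def query(self, ident):
        """Answer the sender's query; an unseen identifier becomes rescinded."""
        if ident in self.delivered:
            return True
        self.rescinded.add(ident)
        return False


class Sender:
    """First node: assigns identifiers and resends after an uncertain status."""

    def __init__(self, receiver, transport):
        self.receiver = receiver
        self.transport = transport    # callable(receiver, ident, payload) -> status string
        self.ids = itertools.count(1) # sequence numbers, never reused
        self.lost = set()             # identifiers of lost or rescinded messages

    def send(self, payload, max_attempts=5):
        for _ in range(max_attempts):
            ident = next(self.ids)
            status = self.transport(self.receiver, ident, payload)
            if status == "ok":
                return ident
            if status == "uncertain" and self.receiver.query(ident):
                return ident          # the message did arrive after all
            self.lost.add(ident)      # resend under a different identifier
        raise RuntimeError("message could not be delivered")


# Usage: the first attempt is delayed in the network and shows up late.
delayed = []

def flaky_transport(receiver, ident, payload):
    if not delayed:
        delayed.append((ident, payload))  # held back; the sender cannot tell
        return "uncertain"
    receiver.receive(ident, payload)
    return "ok"

receiver = Receiver()
sender = Sender(receiver, flaky_transport)
final_id = sender.send("hello")
late_copy_accepted = receiver.receive(*delayed[0])
```

    In this run the delayed first copy is discarded when it finally arrives, so the payload is delivered exactly once, under the second identifier.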
