Abstract:
A data processing system and a method implement a unique push, or streaming, model for communicating time-sensitive encoded data, such as video and audio data, in a communication network. A pacing mechanism is implemented in the data processing system to allow a client to pace a streaming server in a stable way such that the fill level of a client buffer oscillates around a single threshold value. A simple protocol is implemented to protect pacing primitives, allow recovery of pacing primitives, and keep the client and the server synchronized during the pacing operation. To implement the pacing mechanism, the streaming server transmits data at a rate slightly faster than the rate at which it was encoded. A decoder circuit at the client, or receiver, consumes the transmitted data at the encoded rate, so the utilization of buffers in the client gradually increases. When the buffer utilization reaches a threshold level, the client sends a pacing message to the server. When the pacing message is received, the server withholds data for a period of time sufficient to drop the client buffer utilization below the threshold level.
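The pacing loop described above can be illustrated with a small discrete-time simulation. This is only a sketch under stated assumptions: the function name `simulate_pacing`, the tick-based timing, the specific rates, and the fixed pause length are all hypothetical, and the protocol that protects and recovers pacing messages is not modeled.

```python
def simulate_pacing(encoded_rate=100, send_rate=110, threshold=1000,
                    pause_ticks=5, ticks=200):
    """Return the client buffer fill level after each tick (hypothetical model)."""
    fill = 0
    pause_left = 0
    levels = []
    for _ in range(ticks):
        if pause_left > 0:
            pause_left -= 1      # server is withholding data after a pacing message
        else:
            fill += send_rate    # server sends slightly faster than the encoded rate
        # Decoder consumes data at the encoded rate.
        fill = max(0, fill - encoded_rate)
        if fill >= threshold and pause_left == 0:
            # Client sends a pacing message (assumed to arrive instantly, never lost).
            pause_left = pause_ticks
        levels.append(fill)
    return levels


if __name__ == "__main__":
    levels = simulate_pacing()
    print(max(levels), min(levels[100:]))
```

With these assumed numbers the buffer gains 10 units per tick while the server is sending and loses 100 units per tick while it is paused, so the fill level climbs to the threshold, drops, and climbs again.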
Abstract:
A unique device name is assigned to each of a plurality of shared storage devices in a cluster configuration database defining membership of nodes in a cluster. A particular node among the nodes defined by the cluster configuration database as a member of the cluster searches the cluster configuration database for a device identifier matching a device identifier of a shared storage device hosted by the particular node. In response to finding a matching device identifier in the cluster configuration database, the particular node renames, in a local configuration maintained at the particular node, a storage device associated with the matching device identifier with the unique name assigned to that storage device in the cluster configuration database.
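The lookup-and-rename step can be sketched as follows. The dictionary layouts, the function name `rename_shared_devices`, and the example device names are assumptions made for illustration; the abstract does not specify how the cluster configuration database or the local configuration is stored.

```python
def rename_shared_devices(cluster_db, local_config):
    """Rename local devices whose identifier appears in the cluster database.

    cluster_db:   device identifier -> unique cluster-wide device name
    local_config: local device name -> device identifier
    Returns a new local configuration (hypothetical data layout).
    """
    renamed = {}
    for local_name, device_id in local_config.items():
        unique_name = cluster_db.get(device_id)
        if unique_name is not None:
            # Matching identifier found: adopt the cluster-wide unique name.
            renamed[unique_name] = device_id
        else:
            # Not a shared device known to the cluster: keep the local name.
            renamed[local_name] = device_id
    return renamed


if __name__ == "__main__":
    cluster_db = {"id-A": "cldisk1"}
    local_config = {"hdisk3": "id-A", "hdisk0": "id-local"}
    print(rename_shared_devices(cluster_db, local_config))
```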
Abstract:
An event notification method for distributed processing systems provides remote and local node event notification in systems that require local registration of an event consumer in order to produce event notifications. To provide notification of an event occurring on a remote node, one of two approaches is used. In the first, event consumers on all nodes in the cluster register locally to receive event notifications and specify that the event is a cluster event, in which case the nodes send notification of their locally-occurring events to all nodes. In the second, remote registrations are accepted at nodes, and if a local consumer for the event is not present, a listener thread registers as an event consumer. The listener thread sends the event notifications to the remote nodes registered as consumers for the event by observing communication between the event producer and the local consumer, or by receiving the event notifications directly if there is no local consumer.
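The remote-registration variant can be sketched in a single process as below. The `Node` class, its method names, and the event name are hypothetical, and the listener thread is reduced to a placeholder callback plus a forwarding step; real inter-node messaging and threading are not modeled.

```python
class Node:
    """Toy node in which an event is produced only if a local consumer exists."""

    def __init__(self, name):
        self.name = name
        self.consumers = {}  # event -> local callbacks (may include the listener)
        self.remotes = {}    # event -> remote Node objects registered as consumers
        self.received = []   # notifications forwarded to this node by other nodes

    def register_local(self, event, callback):
        self.consumers.setdefault(event, []).append(callback)

    def register_remote(self, event, remote):
        self.remotes.setdefault(event, []).append(remote)
        if not self.consumers.get(event):
            # No local consumer: the listener registers itself so that
            # the producer generates notifications at all.
            self.register_local(event, lambda payload: None)

    def produce(self, event, payload):
        callbacks = self.consumers.get(event, [])
        if not callbacks:
            return False  # no local registration, so no notification is produced
        for callback in callbacks:
            callback(payload)
        # The listener forwards the notification to remote consumers.
        for remote in self.remotes.get(event, []):
            remote.received.append((self.name, event, payload))
        return True


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    a.register_remote("disk_full", b)
    a.produce("disk_full", {"device": "example"})
    print(b.received)
```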
Abstract:
A scheme for monitoring node operational status according to communications transmits messages periodically among the nodes according to a heartbeat rate. The messages, which may be gossip messages containing the status of the other nodes in the pairs, are received at the nodes, and indications of the communications delays of the received messages are stored and used to compute statistics of the communications delays. Parameters of the node status monitoring, which are used for determining the operational status of the nodes, are adjusted according to the statistics. The adjustment may include adjusting the heartbeat rate, the maximum wait time before a message is considered missed, and/or the maximum number of missed messages (e.g., the sequence number deviation) before the node is considered non-operational (down).
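One way to derive monitoring parameters from stored delays is sketched below. The specific formulas (wait time as mean plus `k` standard deviations, heartbeat interval as twice the mean delay) are illustrative assumptions, not the rule used by the scheme described above; `tune_monitoring` and `is_down` are hypothetical names.

```python
import statistics


def tune_monitoring(delays, k=3.0, max_missed=3):
    """Compute monitoring parameters from observed message delays.

    The formulas below are assumptions chosen for illustration only.
    """
    mean = statistics.mean(delays)
    stdev = statistics.pstdev(delays)
    return {
        "max_wait": mean + k * stdev,      # wait this long before counting a miss
        "heartbeat_interval": 2 * mean,    # send heartbeats at this interval
        "max_missed": max_missed,          # misses tolerated before declaring down
    }


def is_down(missed_count, params):
    """A node is considered down once it exceeds the allowed missed messages."""
    return missed_count > params["max_missed"]


if __name__ == "__main__":
    params = tune_monitoring([8, 12])
    print(params, is_down(4, params))
```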
Abstract:
In response to a stimulus indicating configuration of a node into a cluster of a plurality of nodes including the node, the node determines whether or not the node has a universally unique identifier (UUID), and if not, the node provides its own persistent self-assigned UUID. The node searches a cluster configuration database for a temporary identifier associated with the node. In response to the node locating the temporary identifier of the node in the cluster configuration database, the node writes its self-assigned UUID into the cluster configuration database and joins the cluster.
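The join sequence can be sketched as follows. The dictionary-based node and database, the `temp_id` key, and the `joined` flag are hypothetical; persistence of the self-assigned UUID is not modeled.

```python
import uuid


def configure_node(node, cluster_db):
    """Self-assign a UUID if needed, then join via the node's temporary identifier.

    node:       dict with "temp_id" and optional "uuid" (hypothetical layout)
    cluster_db: temporary identifier -> record dict
    Returns True if the node joined the cluster.
    """
    if node.get("uuid") is None:
        # Node has no UUID yet: generate one (a real node would persist it).
        node["uuid"] = str(uuid.uuid4())
    record = cluster_db.get(node["temp_id"])
    if record is None:
        return False  # temporary identifier not found: do not join
    record["uuid"] = node["uuid"]  # write the self-assigned UUID into the database
    record["joined"] = True
    return True


if __name__ == "__main__":
    db = {"nodeA": {}}
    node = {"temp_id": "nodeA", "uuid": None}
    print(configure_node(node, db), db)
```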
Abstract:
Communication ability between nodes in a cluster-based computer system is tracked to inform applications executing on the nodes of the existence and quality of the endpoint-to-endpoint communications available between the nodes. Communications between a node and other nodes are tracked, and a database records the communication ability between the node and the other nodes for each link between the nodes. The tracking and recording are repeated at the other nodes. A registration by an application executing at a particular one of the nodes to receive notifications of changes in the communication ability with another node over a particular link (or in general) will cause notification of the application when the link status changes.
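A minimal sketch of per-link status tracking with application registration follows. The `LinkTracker` class, the string status values, and the callback signature are assumptions made for illustration; how status is actually measured is not modeled.

```python
class LinkTracker:
    """Records per-link communication status and notifies registered applications."""

    def __init__(self):
        self.status = {}     # (remote node, link) -> last recorded status
        self.callbacks = []  # (remote node, link or None, callback)

    def register(self, node, callback, link=None):
        # link=None means "any link to this node" (the "in general" case).
        self.callbacks.append((node, link, callback))

    def record(self, node, link, status):
        previous = self.status.get((node, link))
        self.status[(node, link)] = status
        if previous == status:
            return  # no change, so nothing to notify
        for reg_node, reg_link, callback in self.callbacks:
            if reg_node == node and (reg_link is None or reg_link == link):
                callback(node, link, status)


if __name__ == "__main__":
    tracker = LinkTracker()
    tracker.register("node2", lambda n, l, s: print(n, l, s))
    tracker.record("node2", "eth0", "up")
    tracker.record("node2", "eth0", "down")
```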
Abstract:
An event notification system for distributed processing systems provides remote and local node event notification in systems that require local registration of an event consumer in order to produce event notifications. To provide notification of an event occurring on a remote node, one of two approaches is used. In the first, event consumers on all nodes in the cluster register locally to receive event notifications and specify that the event is a cluster event, in which case the nodes send notification of their locally-occurring events to all nodes. In the second, remote registrations are accepted at nodes, and if a local consumer for the event is not present, a listener thread registers as an event consumer. The listener thread sends the event notifications to the remote nodes registered as consumers for the event by observing communication between the event producer and the local consumer, or by receiving the event notifications directly if there is no local consumer.
Abstract:
In a software partition (SWPAR) environment, a method, system and computer program product enable a SWPAR to be remotely booted, independent of the booting of the OS on the global system environment, using network file system (NFS) services and protocols. A request to mount an NFS, hosted by an external server, into a SWPAR environment is transmitted. The NFS services are automatically transitioned to a first operating state that enables support for user-level NFS services without requiring that the NFS services be active. The SWPAR is automatically booted, and access to the SWPAR is provided during operation of the NFS services in the first operating state. Once the SWPAR has completed booting, the NFS services are transitioned back to a normal operating state in which the SWPAR operates as a standalone device providing its own user-level NFS services.
Abstract:
Systems and methods for implementing recovery processes on failed nodes in a distributed computing environment are described. In accordance with this scheme, one or more migratory recovery modules are launched into the network. The recovery modules migrate from node to node, determine the status of each node, and initiate recovery processes on failed nodes. In this way, scalable recovery processes may be implemented in distributed systems, even with incomplete network topology and membership information. In addition, the complexity and cost associated with manual status monitoring and recovery operations may be avoided.
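The migrate-check-recover loop can be sketched as a traversal in which the module only learns each node's neighbors on arrival. The function name, the neighbor map, and the `is_failed` and `recover` callbacks are hypothetical stand-ins; a single module and a depth-first visiting order are assumptions, not details from the description above.

```python
def run_recovery_module(start, neighbors, is_failed, recover):
    """Move from node to node, recovering failed nodes along the way.

    neighbors: node -> list of nodes reachable from it. The module consults
               only the entry for the node it is currently visiting, so it
               needs no global topology or membership list up front.
    Returns the list of nodes on which recovery was initiated.
    """
    visited = set()
    to_visit = [start]
    recovered = []
    while to_visit:
        node = to_visit.pop()
        if node in visited:
            continue
        visited.add(node)
        if is_failed(node):
            recover(node)          # initiate the recovery process on this node
            recovered.append(node)
        # Discover where to migrate next from this node's local view.
        to_visit.extend(n for n in neighbors.get(node, []) if n not in visited)
    return recovered


if __name__ == "__main__":
    topology = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
    failed = {"c", "d"}
    print(run_recovery_module("a", topology, failed.__contains__, failed.discard))
```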
Abstract:
In one embodiment, an operating system manages virtualized instances of hardware resources and migration enabled applications partitioned into one of multiple partitions with a separate operating system kernel running in each of the partitions. A migration event controller of the operating system manages the checkpoint and restart process during migration of a virtualized instance of at least one migration enabled application from a departure partition to an arrival partition. The migration event controller allows migration enabled applications to separately specify at least one application specific checkpoint script and restart script, which the migration event controller triggers on checkpoint and restart events, so that the at least one migration enabled application can participate in performing the checkpoint and restart process for additional state information during migration of the virtualized instance from the departure partition to the arrival partition.
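The script-registration idea can be sketched as below, with Python callables standing in for the application specific checkpoint and restart scripts. The class and method names, the dictionary state, and the `extra` key are hypothetical; partitions and kernels are not modeled.

```python
class MigrationEventController:
    """Toy controller that triggers per-application checkpoint/restart hooks."""

    def __init__(self):
        self.scripts = {}  # application name -> (checkpoint script, restart script)

    def register(self, app, checkpoint_script, restart_script):
        self.scripts[app] = (checkpoint_script, restart_script)

    def migrate(self, app, core_state):
        checkpoint_script, restart_script = self.scripts.get(app, (None, None))
        # Checkpoint event on the departure side: the application saves
        # additional state that the controller does not capture itself.
        extra = checkpoint_script() if checkpoint_script else None
        arrival_state = dict(core_state)  # state carried over by the controller
        if restart_script:
            # Restart event on the arrival side: the application restores it.
            arrival_state["extra"] = restart_script(extra)
        return arrival_state


if __name__ == "__main__":
    controller = MigrationEventController()
    controller.register("db", lambda: {"cache": [1, 2]}, lambda extra: extra)
    print(controller.migrate("db", {"pid": 7}))
```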