Abstract:
A method and corresponding apparatus for managing transport operations between a first memory cluster and one or more other memory clusters include selecting, at a clock cycle in the first memory cluster, at least one transport operation destined for at least one destination memory cluster from one or more transport operations. The selection is based at least in part on priority information associated with the one or more transport operations or on current states of available processing resources allocated to the first memory cluster in each of a subset of the one or more other memory clusters. The method further includes initiating the transport of the selected at least one transport operation.
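The selection step can be illustrated with a minimal sketch. It is not the claimed arbiter: the names `TransportOp`, `select_op`, and `free_credits` are hypothetical, and the sketch assumes that a larger priority value means more urgent and that both criteria (priority and resource state) are applied together, whereas the abstract allows either one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TransportOp:
    op_id: str
    dest_cluster: int
    priority: int  # assumption: larger value means more urgent


def select_op(pending: List[TransportOp],
              free_credits: Dict[int, int]) -> Optional[TransportOp]:
    """Pick one operation for the current clock cycle.

    free_credits maps a destination cluster id to the number of processing
    resources there that are currently available to this (first) cluster.
    """
    # Keep only operations whose destination still has a free resource.
    eligible = [op for op in pending
                if free_credits.get(op.dest_cluster, 0) > 0]
    if not eligible:
        return None  # nothing can be transported this cycle
    # Among eligible operations, take the one with the highest priority.
    return max(eligible, key=lambda op: op.priority)
```

In this sketch an operation bound for a cluster with no free resources is skipped even when it has the highest priority, so a lower-priority operation can still make progress in the same cycle.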
Abstract:
Techniques for selectively utilizing memory available in a redundant host system of a cluster are described. In one embodiment, a cluster of host systems with at least one redundant host system is provided, each host system having a plurality of virtual machines with associated virtual machine (VM) reservation memory. A portion of a data store is used to store a base file that is accessed by all of the plurality of virtual machines. A portion of the memory available in the redundant host system is assigned as spare VM reservation memory. A copy of the base file is selectively stored in the spare VM reservation memory for access by all of the plurality of virtual machines.
Abstract:
Mechanisms, in a data processing system comprising a first adapter and second adapter, for performing a failover operation from the first adapter to the second adapter are provided. The mechanisms detect that an imminent failure of the first adapter is likely to occur and initiate a failover priming operation in the first adapter and second adapter in response to detecting the imminent failure. The failover priming operation configures ingress and egress buffers of the second adapter to have a similar configuration to ingress and egress buffers of the first adapter. The mechanisms migrate processing of ingress data traffic to the second adapter prior to failure of the first adapter such that the first adapter processes egress data traffic from the data processing system and the second adapter processes ingress data traffic to the data processing system.
Abstract:
Disclosed herein are a system and method for automatically moving an application from one site to another site in the event of a disaster. Before the application comes back online, it is configured with the information it needs to run on the new site, so that no configuration actions have to be performed after it has come online. This gives the user of the application a seamless experience while also reducing the application's downtime.
Abstract:
To prevent a user from initiating potentially dangerous virtual machine migrations, a storage migration engine is configured to be aware of replication properties for a source datastore and a destination datastore. The replication properties are obtained from a storage array configured to provide array-based replication. A recovery manager discovers the replication properties of the datastores stored in the storage array, and assigns custom tags to the datastores indicating the discovered replication properties. When storage migration of a virtual machine is requested, the storage migration engine performs or prevents the storage migration based on the assigned custom tags.
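The tag-based check can be sketched as follows. The abstract does not state which tag combinations are allowed, so the policy here (migrate only when source and destination carry identical replication tags) is an assumption, and the tag strings and the function name `migration_allowed` are hypothetical.

```python
from typing import FrozenSet


def migration_allowed(source_tags: FrozenSet[str],
                      dest_tags: FrozenSet[str]) -> bool:
    """Decide whether a storage migration may proceed.

    Assumed policy, not taken from the abstract: the move is safe only if
    both datastores have the same replication properties, so the virtual
    machine neither gains nor loses replication protection.
    """
    return source_tags == dest_tags


# Hypothetical custom tags, as a recovery manager might assign them.
replicated_cg1 = frozenset({"replicated", "consistency-group:cg1"})
replicated_cg2 = frozenset({"replicated", "consistency-group:cg2"})
not_replicated = frozenset()

print(migration_allowed(replicated_cg1, replicated_cg1))  # same group
print(migration_allowed(replicated_cg1, not_replicated))  # loses protection
```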
Abstract:
Embodiments of the present invention provide systems, methods, and computer program products for optimizing a placement plan. In one embodiment, a method is disclosed in which a request for registration with an external advisor is received. A time to live is received from each external advisor and used to determine an overall timeout period value for a placement engine. After receiving a predictive failure alert, internal and external advisors are ranked according to criteria and advice is received from the qualified advisors. A placement plan is generated based on the advice received from the advisors.
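One way to picture the timeout and ranking steps is the sketch below. The abstract says only that the time-to-live values are "used to determine" the overall timeout and that advisors are ranked "according to criteria". Taking the maximum TTL and ranking by a single numeric `score` field are both assumptions, and all names are hypothetical.

```python
from typing import Dict, Iterable, List


def overall_timeout(ttls_seconds: Iterable[float]) -> float:
    """Assumption: the placement engine waits as long as the slowest
    registered external advisor allows, i.e. the maximum TTL."""
    return max(ttls_seconds, default=0.0)


def qualified_advisors(advisors: List[Dict], top_n: int) -> List[Dict]:
    """Rank advisors by a hypothetical 'score' and keep the best top_n."""
    ranked = sorted(advisors, key=lambda a: a["score"], reverse=True)
    return ranked[:top_n]


advisors = [
    {"name": "internal-capacity", "score": 0.9, "ttl": 2.0},
    {"name": "external-power", "score": 0.4, "ttl": 5.0},
    {"name": "external-network", "score": 0.7, "ttl": 3.5},
]
timeout = overall_timeout(a["ttl"] for a in advisors)
chosen = qualified_advisors(advisors, top_n=2)
```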
Abstract:
This disclosure generally describes methods and systems, including computer-implemented methods, computer-program products, and computer systems, for providing a proactive failure recovery model for distributed computing. One computer-implemented method includes: building a virtual tree-like computing structure of a plurality of computing nodes; for each computing node of the virtual tree-like computing structure, performing, by a hardware processor, a node failure prediction model to calculate a mean time between failures (MTBF) associated with the computing node; determining whether to perform a checkpoint of the computing node based on a comparison between the calculated MTBF and a maximum and minimum threshold; migrating a process from the computing node to a different computing node acting as a recovery node; and resuming execution of the process on the different computing node.
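The threshold comparison can be sketched as below. The abstract does not say how the two thresholds map to actions, so the three-way split is an assumption, and the simple uptime-over-failures ratio stands in for the unspecified node failure prediction model. `Action`, `estimate_mtbf`, and `decide` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    CHECKPOINT = "checkpoint"
    MIGRATE = "migrate"


def estimate_mtbf(uptime_hours: float, failures: int) -> float:
    """Textbook MTBF estimate: total operating time divided by failure
    count. A stand-in for the disclosure's prediction model."""
    if failures == 0:
        return float("inf")  # no observed failures yet
    return uptime_hours / failures


def decide(mtbf_hours: float, min_threshold: float,
           max_threshold: float) -> Action:
    """Assumed policy: below the minimum the node is considered unreliable
    and its process is migrated; between the thresholds a checkpoint is
    taken; above the maximum nothing is done."""
    if min_threshold > max_threshold:
        raise ValueError("min_threshold must not exceed max_threshold")
    if mtbf_hours < min_threshold:
        return Action.MIGRATE
    if mtbf_hours <= max_threshold:
        return Action.CHECKPOINT
    return Action.NONE
```

For example, with thresholds of 100 and 1000 hours, a node that has failed 4 times in 2000 hours has an estimated MTBF of 500 hours and would be checkpointed under this assumed policy.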
Abstract:
Provided are techniques for managing backup operations from a client system to a primary server and secondary server. A determination is made at the client system of whether a state of the data on the secondary server permits a backup operation in response to determining that the primary server is unavailable when a force failover parameter is not set. The client system reattempts to connect to the primary server to perform the backup operation at the primary server in response to determining that the state of the data on the secondary server does not permit the backup operation. The client system performs the backup operation at the secondary server in response to determining that the state of the secondary server permits the backup operation.
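The client-side decision described above can be sketched as a single function. The function and parameter names are hypothetical. The abstract does not say what happens when the force failover parameter is set, so going straight to the secondary server in that case is an assumption.

```python
def choose_backup_target(primary_available: bool,
                         force_failover: bool,
                         secondary_state_permits: bool) -> str:
    """Return where the client should attempt the backup.

    Possible results: "primary", "secondary", or "retry_primary".
    """
    if primary_available:
        return "primary"
    if force_failover:
        # Assumption: with force failover set, the state check is skipped.
        return "secondary"
    if secondary_state_permits:
        # Secondary data is in a state that permits a backup.
        return "secondary"
    # Secondary cannot accept the backup: reattempt the primary connection.
    return "retry_primary"
```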
Abstract:
Mechanisms for controlling access to storage volumes on a secondary storage system are provided. A determination is made as to whether a first site computing device has sent a notification of a failure condition of a first site. In response to a determination that the notification of the failure condition of the first site has not been received, secondary workloads of a second site computing device are permitted to access storage volumes on the secondary storage system. In response to a determination that the notification of the failure condition of the first site has been received, a mode of operation of the second site is modified from a normal mode of operation to a failure mode of operation. In the failure mode of operation, the storage system controller of the second site blocks at least a portion of access requests from secondary workloads of the second site computing device.
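The mode switch can be sketched as a small state machine. The class and method names are hypothetical. The abstract says the controller blocks "at least a portion" of secondary-workload requests in failure mode, and this sketch assumes the simplest case, in which all of them are blocked.

```python
class SecondStorageController:
    """Toy model of the second site's storage system controller."""

    def __init__(self) -> None:
        self.mode = "normal"

    def on_failure_notification(self) -> None:
        # The first site reported a failure condition: change mode.
        self.mode = "failure"

    def allow(self, workload_kind: str) -> bool:
        """Return True if an access request should be let through.

        workload_kind is "secondary" for the second site's own secondary
        workloads; any other value is treated as a non-secondary workload.
        """
        if self.mode == "normal":
            return True
        # Assumption: in failure mode every secondary request is blocked.
        return workload_kind != "secondary"


controller = SecondStorageController()
before = controller.allow("secondary")
controller.on_failure_notification()
after = controller.allow("secondary")
```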
Abstract:
A processing-based bypass “fail open” mode is provided for an intrusion prevention system. A primary process running on a first logical core (lcore) is used as a control plane; it invokes bypass-open run-to-completion threads on other lcores, which comprise a bypass data plane, and spawns a secondary process to fully configure intrusion prevention threads on the other lcores to create an Intrusion Prevention System (IPS) data plane. Upon a ready signal from the secondary process, the primary process quiesces such that the IPS data plane of the secondary process exclusively owns and executes on the other lcores.