Abstract:
Fence randomization with inter-chip fencing constraints, including: receiving a fencing setup comprising one or more parameters for fencing a plurality of chips in a plurality of drawers; selecting, based on the one or more parameters and one or more dependencies for implementing the one or more parameters, a subset of the plurality of chips for fencing, wherein the subset of the plurality of chips is selected at least partially randomly; and generating a testing configuration indicating the selected subset of the plurality of chips.
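The selection step can be illustrated with a minimal Python sketch. The abstract does not define the parameters or the dependencies, so everything named here is hypothetical: the setup is reduced to a single `fence_count` parameter, and the only dependency modeled is an assumed rule that every drawer keeps at least `min_active_per_drawer` chips unfenced.

```python
import random


def select_fenced_chips(drawers, fence_count, min_active_per_drawer=1, seed=None):
    """Pick `fence_count` chips to fence, at random, under an assumed constraint.

    drawers: dict mapping a drawer id to the list of chip ids it holds.
    Returns a testing configuration as a dict (hypothetical layout).
    """
    rng = random.Random(seed)
    # Every (drawer, chip) pair is a candidate; shuffling supplies the randomness.
    candidates = [(d, c) for d, chips in drawers.items() for c in chips]
    rng.shuffle(candidates)

    fenced = []
    fenced_per_drawer = {d: 0 for d in drawers}
    for drawer, chip in candidates:
        if len(fenced) == fence_count:
            break
        remaining = len(drawers[drawer]) - fenced_per_drawer[drawer] - 1
        # Assumed dependency: skip a chip if fencing it would leave its
        # drawer with fewer active chips than allowed.
        if remaining < min_active_per_drawer:
            continue
        fenced.append((drawer, chip))
        fenced_per_drawer[drawer] += 1

    if len(fenced) < fence_count:
        raise ValueError("fencing setup cannot be satisfied under the constraints")
    return {"fenced": sorted(fenced)}
```

Passing a `seed` makes a given configuration reproducible, which may be useful when a failing test run has to be replayed.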
Abstract:
The disclosure relates to a system implemented on the basis of a field broadband bus architecture of the industrial internet. The system is based upon a two-wire data transmission network widely applied in traditional industrial control systems. Multi-carrier orthogonal frequency division multiplexing technology is introduced to provide a large bandwidth above hundreds of megahertz. A special frame structure design, reasonable static and dynamic configurations of physical layer resource blocks, and a scheduling strategy for data services at the medium access control layer achieve proper mapping of transmission services to time slices. A fast-synchronized, real-time, high-speed, and reliable solution is thereby provided with respect to the good performance, high reliability, strict real-time characteristic, and high security required by a field broadband bus architecture of the industrial internet.
Abstract:
A device, system, and/or method includes an internal circuit configured to perform at least one function, an input-output terminal set, and a repair circuit. The input-output terminal set includes a plurality of normal input-output terminals connected to an external device via a plurality of normal signal paths and at least one repair input-output terminal selectively connected to the external device via at least one repair signal path. The repair circuit repairs at least one failed signal path included in the normal signal paths based on a mode signal and a fail information signal, where the mode signal represents whether to use the repair signal path and the fail information signal represents fail information on the normal signal paths. Using the repair circuit, various systems adopting different repair schemes may be repaired, and the cost of designing and manufacturing the various systems may be reduced.
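The role of the two control signals can be sketched in software terms. This is only an illustrative model under stated assumptions: a single repair terminal, a fail information signal reduced to the index of one failed path, and a direct-replacement scheme. The abstract does not commit to any of these, and the names `route_signals`, `normalN`, and `repair0` are hypothetical.

```python
def route_signals(num_normal, mode_use_repair, failed_path=None):
    """Map each logical signal index to the terminal that carries it.

    mode_use_repair: models the mode signal (whether the repair path is used).
    failed_path: models the fail information signal as one failed path index.
    """
    mapping = [f"normal{i}" for i in range(num_normal)]
    if mode_use_repair and failed_path is not None:
        if not 0 <= failed_path < num_normal:
            raise ValueError("failed_path is out of range")
        # Assumed scheme: the failed path's signal moves to the repair terminal.
        mapping[failed_path] = "repair0"
    return mapping
```

With the mode signal deasserted, the mapping is left untouched even if a failure is reported, which reflects the abstract's point that the mode signal decides whether the repair path is used at all.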
Abstract:
In an embodiment of the invention, an apparatus comprises: a plurality of bus masters and a plurality of bus arbiters to support routing and failover, wherein each bus arbiter is coupled to a plurality of bus masters; and a central processing unit (CPU) coupled to at least one of the bus arbiters; wherein the CPU is configured to execute firmware that chooses bus re-routing or failover in response to a bus failure. In another embodiment of the invention, a method comprises: choosing, by a central processing unit (CPU) coupled to a plurality of bus arbiters, bus re-routing or failover in response to a bus failure. In yet another embodiment of the invention, an article of manufacture comprises a non-transient computer-readable medium having stored thereon instructions that permit a method comprising: choosing, by a central processing unit (CPU) coupled to a plurality of bus arbiters, bus re-routing or failover in response to a bus failure.
Abstract:
Individual transport connections within a dual-star fabric connected multi-node storage system are disabled in response to associated failures due to faulty hardware or temporal congestion. Each configured IB transport connection is monitored for viability and, upon failure, removed from the pool of available resources. Following failure restoration, the resource is tested to ensure proper functionality and then restored to the pool of resources. Mappings associated with the transport connections are maintained while the connections are disabled.
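The disable, test, and restore cycle described above might be modeled as follows. This is a minimal sketch, not the disclosed implementation: the class name `TransportPool`, the connection ids, and the caller-supplied `health_check` callable are all assumptions made for illustration.

```python
class TransportPool:
    """Tracks which transport connections are currently usable."""

    def __init__(self, mappings):
        # Mappings are kept for every configured connection, including
        # those that are temporarily disabled.
        self.mappings = dict(mappings)
        self.available = set(self.mappings)

    def report_failure(self, conn_id):
        """Remove a failed connection from the pool; its mapping is retained."""
        self.available.discard(conn_id)

    def try_restore(self, conn_id, health_check):
        """Return a connection to the pool only if it passes the check."""
        if conn_id not in self.mappings:
            raise KeyError(conn_id)
        if health_check(conn_id):
            self.available.add(conn_id)
            return True
        return False
```

Keeping `mappings` separate from `available` is the design choice that lets a connection come back without being reconfigured.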
Abstract:
Various examples of techniques for identifying a corrupt data lane and using a spare data lane are described herein. Some examples include a method of coordinating spare lane usage between link partners. One such example comprises analyzing data from a link partner to identify a corrupt lane, and communicating the corrupt lane to the link partner, wherein the communication does not require a sideband communication channel. In some embodiments, communicating the corrupt lane to the link partner comprises identifying a transmit lane corresponding to the corrupt lane, transmitting a set of data intended for the corresponding transmit lane using a spare data lane, and transmitting bad data to the link partner using the corresponding transmit lane.
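The in-band signaling idea can be sketched in a few lines of Python. The sketch assumes that receive lane i corresponds to transmit lane i, that per-lane check results are already available as booleans, and that a fixed `bad_symbol` stands in for the deliberately bad data. None of these details come from the abstract, and both function names are hypothetical.

```python
def find_corrupt_lane(lane_ok):
    """Return the index of the first lane whose check failed, or None."""
    for index, ok in enumerate(lane_ok):
        if not ok:
            return index
    return None


def build_transmit_frame(payload, corrupt_lane, bad_symbol=0xFF):
    """Return (normal_lanes, spare_lane) for one transmit frame.

    payload: one symbol per normal lane.
    corrupt_lane: receive lane found to be corrupt, or None.
    """
    normal = list(payload)
    spare = None
    if corrupt_lane is not None:
        # Data meant for the matching transmit lane travels on the spare lane.
        spare = normal[corrupt_lane]
        # Bad data on the matching lane tells the partner which lane failed,
        # so no sideband channel is needed.
        normal[corrupt_lane] = bad_symbol
    return normal, spare
```

The link partner can run the same `find_corrupt_lane` check on the frame it receives and will land on the marked lane.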
Abstract:
A system and methodology to monitor system resources for a cluster computer environment and/or an application instance allows users to define failover policies that take appropriate corrective actions when a predefined threshold is met. An engine comprises failover policies and mechanisms to define resource monitoring, consumption, allocation, and one or more thresholds for a computer server environment, to identify capable servers, and thereafter to automatically transition an application between multiple servers so as to ensure the application is continually operating within the defined metrics.
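A threshold-driven failover policy of this kind could look roughly like the sketch below. The monitored metrics (CPU and memory utilization as fractions), the `Server` type, and the rule of moving to the least-loaded capable server are assumptions chosen for illustration; the abstract leaves the actual metrics and policies to the user.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    cpu: float  # utilization as a fraction, 0.0 to 1.0
    mem: float  # utilization as a fraction, 0.0 to 1.0


def choose_server(current, servers, cpu_max, mem_max):
    """Return the server the application should run on.

    The application stays put while `current` is below both thresholds;
    once a threshold is met, it moves to a capable server if one exists.
    """
    def within_limits(server):
        return server.cpu < cpu_max and server.mem < mem_max

    if within_limits(current):
        return current
    capable = [s for s in servers if s is not current and within_limits(s)]
    if not capable:
        return current  # nowhere better to go; a real policy might alert instead
    # Assumed tie-break: prefer the least-loaded capable server.
    return min(capable, key=lambda s: (s.cpu, s.mem))
```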
Abstract:
Data integrity is maintained during failed communications between a member node of a primary cluster and a backup cluster by assigning an assisting member node to run an assisting process that transmits data entered into the member node to the backup cluster. In this way, a replicated database is maintained during a partial communication failure between the primary cluster and the backup cluster.
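The assisting-node idea can be sketched as a simple relay decision. This is a hedged illustration only: the function name `replicate`, the `link_up` status table, and the in-memory `backup_log` standing in for the backup cluster are all hypothetical, and the abstract does not say how the assisting node is chosen.

```python
def replicate(entry, origin, members, link_up, backup_log):
    """Send `entry` to the backup, via an assisting node if necessary.

    link_up: dict mapping each member node to whether its link to the
             backup cluster currently works.
    Returns the node that performed the transmission, or None if no
    member could reach the backup.
    """
    if link_up[origin]:
        sender = origin
    else:
        # Assumed selection rule: the first other member with a working link.
        sender = next((m for m in members if m != origin and link_up[m]), None)
        if sender is None:
            return None
    # The backup records the entry against its true origin, not the relay.
    backup_log.append((origin, entry))
    return sender
```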
Abstract:
Embodiments of the present invention relate to an approach for reconfiguring interrelationships between components of virtual computing networks (e.g., a grid computing network, a local area network (LAN), a cloud computing network, etc.). In a typical embodiment, a set of information pertaining to a set of components associated with a virtual computing network is received in a computer memory medium or the like. Based on the set of information, a graphical representation (e.g., a hierarchical tree) depicting a set of interrelationships between the set of components is generated. When a failure in the virtual computing network is detected, at least one of the set of interrelationships between the set of components is reconfigured based on the graphical representation and a set of rules to address the failure.
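As a rough illustration of reconfiguring a hierarchical tree after a failure, the sketch below stores the tree as a child-to-parent dictionary and applies one hypothetical rule: the children of a failed component are reattached to that component's own parent. The abstract does not specify its rules, so this rule and the function name are assumptions.

```python
def reconfigure_on_failure(parent, failed):
    """Return a new child -> parent mapping with `failed` removed.

    parent: dict mapping each component to its parent (the root maps to None).
    """
    if failed not in parent:
        raise KeyError(failed)
    new_parent = parent[failed]
    if new_parent is None:
        raise ValueError("root failure is not handled in this sketch")
    updated = {}
    for node, node_parent in parent.items():
        if node == failed:
            continue
        # Assumed rule: orphans of the failed node move up one level.
        updated[node] = new_parent if node_parent == failed else node_parent
    return updated
```

Returning a new mapping rather than mutating the input keeps the pre-failure representation available for comparison or rollback.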
Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation and another computing processor for performing data movement and storage. Because data movement and storage are offloaded to the secondary processor, the processors for performing the computation are not interrupted to perform these tasks. A method for use in the HPC system receives instructions in the computing processors and first data in the memory. The method includes receiving second data into the memory while continuing to execute the instructions in the computing processors, without interruption. A computer program product implementing the method is also disclosed.