Abstract:
Methods and systems for controlling fluid coolant flow in cooling systems of computing devices are disclosed. According to an aspect, a method may include determining a temperature of a fluid coolant in a cooling system of a computing device. For example, a temperature of water exiting a cooling system of a server may be determined. The method may also include determining an operational condition of the computing device. For example, a temperature of a processor, memory, or input/output (I/O) component may be determined. Further, the method may include controlling a flow of the fluid coolant through the cooling system based on the temperature of the fluid coolant and/or the operational condition.
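The control step described above could be sketched roughly as follows. This is a minimal illustration, not the disclosed method: the function name `coolant_flow_percent`, the temperature limits, and the linear mapping from temperature to flow are all assumptions made for the example.

```python
def coolant_flow_percent(coolant_out_c, cpu_temp_c,
                         coolant_limit_c=45.0, cpu_limit_c=85.0,
                         min_flow=20.0, max_flow=100.0):
    """Return a pump/valve setting in percent from two temperature readings.

    All limits and the linear mapping are hypothetical values for illustration.
    """
    # Fraction of each limit currently reached, clamped to the range [0, 1].
    coolant_load = min(max(coolant_out_c / coolant_limit_c, 0.0), 1.0)
    cpu_load = min(max(cpu_temp_c / cpu_limit_c, 0.0), 1.0)
    # Drive the flow from whichever signal is closer to its limit
    # (the "and/or" combination in the abstract).
    demand = max(coolant_load, cpu_load)
    return min_flow + (max_flow - min_flow) * demand
```

Taking the larger of the two signals means either a hot coolant outlet or a hot component is enough to raise the flow.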
Abstract:
A driver is provided to manage launching of tasks at different levels of priority and within the parameters of the firmware interface. The driver includes two anchors for managing the tasks, a dispatcher and an agent. The dispatcher operates at a medium priority level and manages communication from a remote administrator. The agent receives communications from the dispatcher by way of a shared data structure and launches lower priority level tasks in response to the communication. The shared data structure stores communications received from the dispatcher. Upon placing the communication in the shared data structure, the dispatcher sends a signal to the agent indicating that a communication is in the data structure for reading by the agent. After reading the communication in the data structure, the agent launches the lower priority level task and sends a signal to the data structure indicating the status of the task. Accordingly, a higher level task maintains its level of operation and spawns lower level tasks through the dispatcher in conjunction with the agent.
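The dispatcher/agent handoff can be approximated in a few lines. The sketch below is only an analogy under stated assumptions: it uses Python threads, which carry no priority levels, and the names `SharedMailbox`, `dispatcher`, `agent`, `run_demo`, and the task names are hypothetical.

```python
import queue
import threading

class SharedMailbox:
    """Hypothetical stand-in for the shared data structure."""
    def __init__(self):
        self.requests = queue.Queue()   # communications from the dispatcher
        self.status = {}                # task status written back by the agent
        self.lock = threading.Lock()

def dispatcher(mailbox, commands):
    # Place each communication in the shared structure; Queue.put() also
    # wakes an agent blocked in get(), which plays the role of the signal.
    for cmd in commands:
        mailbox.requests.put(cmd)
    mailbox.requests.put(None)          # sentinel: no more work

def agent(mailbox, tasks):
    while True:
        cmd = mailbox.requests.get()    # blocks until signalled
        if cmd is None:
            break
        result = tasks[cmd]()           # launch the lower-priority task
        with mailbox.lock:
            mailbox.status[cmd] = result  # report task status

def run_demo():
    mailbox = SharedMailbox()
    tasks = {"collect_logs": lambda: "done", "flash_led": lambda: "done"}
    worker = threading.Thread(target=agent, args=(mailbox, tasks))
    worker.start()
    dispatcher(mailbox, ["collect_logs", "flash_led"])
    worker.join()
    return mailbox.status
```

The dispatcher never runs a task itself, so it stays at its own level of operation while the agent does the work.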
Abstract:
Managing firmware in a computing system storing a plurality of different firmware images for the same firmware includes: calculating, for each firmware image in dependence upon a plurality of predefined factors, a preference score; responsive to a failure of a particular firmware image, selecting a firmware image having a highest preference score; and failing over to the selected firmware image.
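A minimal sketch of score-based failover, assuming two illustrative factors (boot reliability and version number) and an arbitrary weighting. The abstract does not name the predefined factors, so `FirmwareImage`, `preference_score`, and `fail_over` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FirmwareImage:
    name: str
    version: tuple          # (major, minor)
    boot_successes: int
    boot_failures: int
    failed: bool = False

def preference_score(img):
    """Combine the assumed factors into one score (weights are arbitrary)."""
    attempts = img.boot_successes + img.boot_failures
    reliability = img.boot_successes / attempts if attempts else 0.0
    return 10.0 * reliability + img.version[0] + 0.1 * img.version[1]

def fail_over(images, failed_name):
    """Mark the failed image and return the best remaining one."""
    for img in images:
        if img.name == failed_name:
            img.failed = True
    candidates = [img for img in images if not img.failed]
    if not candidates:
        raise RuntimeError("no usable firmware image left")
    return max(candidates, key=preference_score)
```

For example, with images A (v2.1, 9 of 10 boots good), B (v2.0, 10 of 10), and C (v1.5, 5 of 10), a failure of B selects A, and a later failure of A selects C.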
Abstract:
A method of managing the workload in a computer system having one or more semi-redundant hardware components is provided. The method comprises detecting loss or degradation of the level of performance of one or more of the semi-redundant hardware components, identifying hardware components affected by the loss or degradation, migrating a critical job from an affected hardware component to an unaffected hardware component, and performing less-critical jobs on an affected hardware component. Loss or degradation of the semi-redundant component reduces the capacity of affected hardware components in the computer system without entirely disabling the computer system. Jobs identified as critical run on hardware components having the most capacity and reliability, while less-critical jobs use the remaining capacity of affected hardware components. Examples of semi-redundant hardware components include a memory module, a CPU core, an Ethernet port, a power supply, a fan, a disk drive, and an input/output port.
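The migration rule can be illustrated with a small placement function. This is a sketch under assumptions: capacity is modeled as a number where `1.0` means unaffected, and critical jobs move to the unaffected component that currently holds the fewest jobs. The name `rebalance` and the data layout are hypothetical.

```python
def rebalance(placement, critical, capacity):
    """Move critical jobs off degraded components; leave other jobs in place.

    placement: {job: component}, critical: set of job names,
    capacity: {component: fraction of full capacity, 1.0 = unaffected}.
    """
    healthy = sorted(c for c, cap in capacity.items() if cap >= 1.0)
    result = dict(placement)
    if not healthy:
        return result                       # nowhere better to migrate to
    for job in sorted(placement):
        comp = placement[job]
        if job in critical and capacity[comp] < 1.0:
            # Count jobs per healthy component, including earlier moves.
            load = {c: sum(1 for v in result.values() if v == c)
                    for c in healthy}
            result[job] = min(healthy, key=lambda c: load[c])
    return result
```

Less-critical jobs stay on the affected component and use whatever capacity it has left.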
Abstract:
One embodiment provides a method of initializing a federated computer system from a fabric of nodes connected by a federated interface. Each node casts a vote to the federated interface for a candidate firmware version supported by the node casting the vote. The candidate firmware version having received the greatest number of votes is identified, and the computer system is initialized as a federated system of the nodes that support the firmware version identified as having received the greatest number of votes. A process of iterative voting may be used to identify a greater number of nodes supporting a compatible firmware version.
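A single voting round might look like the sketch below. It assumes each node votes for the first entry in its own list of supported versions and that ties go to the higher version; neither rule is stated in the abstract, and the iterative variant is not shown. The name `elect_firmware` is hypothetical.

```python
from collections import Counter

def elect_firmware(supported):
    """Run one voting round.

    supported: {node: list of supported versions, most preferred first},
    where each version is a tuple such as (2, 0).
    Returns (winning_version, nodes that support it).
    """
    votes = Counter(versions[0] for versions in supported.values() if versions)
    if not votes:
        raise ValueError("no votes were cast")
    top = max(votes.values())
    # Assumed tie-break: prefer the higher version among the leaders.
    winner = max(v for v, n in votes.items() if n == top)
    members = sorted(node for node, versions in supported.items()
                     if winner in versions)
    return winner, members
```

Nodes that do not support the winning version are left out of the federated system in this sketch.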
Abstract:
Embodiments include a power-efficient failover system. In one embodiment, a primary server is operable at one or more power states and is configured to dynamically generate a backup for the results of executed program code while in a normal operating state. A redundant server is coupled to the primary server and is operable at the normal operating state or one or more reduced power states. The redundant server is also configured to dynamically receive the backup from the primary server and, in response to a failure of the primary server, to assume the workload of the primary server according to the backup. A controller manages the power state of the redundant server.
Abstract:
Embodiments include a power-efficient failover system and method. In one embodiment, a primary server operating in a normal operating state is configured to dynamically back up device states or transaction logs. A redundant server coupled to the primary server in a failover cluster is operated at a reduced power state. The redundant server dynamically receives the backup from the primary server and is elevated to a normal operating state in response to a failure of the primary server. By enforcing a reduced power state of the redundant server, a failover system provides a desired combination of high power efficiency with low latency.
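The power-state transition can be sketched as a small state machine. The class names, the `heartbeat` method, and the string power states are assumptions for illustration; real failure detection and power control are outside the scope of the sketch.

```python
class RedundantServer:
    """Standby that keeps receiving backups while in a reduced power state."""
    def __init__(self):
        self.power_state = "reduced"
        self.backup = []

    def receive_backup(self, entry):
        self.backup.append(entry)       # e.g. a transaction log record

class FailoverController:
    """Hypothetical controller that manages the standby's power state."""
    def __init__(self, standby):
        self.standby = standby
        self.active = "primary"

    def heartbeat(self, primary_alive):
        # On the first missed heartbeat, raise the standby to normal power
        # and let it take over using the backup it already holds.
        if not primary_alive and self.active == "primary":
            self.standby.power_state = "normal"
            self.active = "redundant"
        return self.active
```

Because the standby already holds the backup when it is woken, takeover needs only the power-state change.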
Abstract:
A level of indirection is utilized when writing to a microprocessor array structure, thereby masking hard faults in the array structure. Among other benefits, this minimizes the use of a backward error recovery mechanism with its inherent delay for recovery. The indirection is used to effectively remove from use faulty portions of the array structure and substitute spare, functioning portions to perform the duties of the faulty portions. Thus, for example, faulty rows in microprocessor array structures are mapped out in favor of substitute, functioning rows.
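A software analogy of the row indirection, assuming a simple logical-to-physical map and a pool of spare rows. The class `RemappedArray` and its methods are hypothetical; in hardware the mapping would be done by the array's addressing logic.

```python
class RemappedArray:
    """Array whose logical rows are reached through a remap table."""
    def __init__(self, rows, spares):
        self.storage = [None] * (rows + spares)
        self.row_map = list(range(rows))              # logical -> physical
        self.free_spares = list(range(rows, rows + spares))

    def mark_faulty(self, logical_row):
        # Map the faulty row out and substitute a functioning spare row.
        if not self.free_spares:
            raise RuntimeError("no spare rows left")
        self.row_map[logical_row] = self.free_spares.pop(0)

    def write(self, logical_row, value):
        self.storage[self.row_map[logical_row]] = value

    def read(self, logical_row):
        return self.storage[self.row_map[logical_row]]
```

Callers keep using the same logical row number, so the fault stays masked from them.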
Abstract:
Multitasking is provided in a hardware-interrupt-free environment. Event indicators are employed to multitask between processes of the environment. Processes to be multitasked register with one another, and then during processing, one of the processes toggles an event indicator to allow another process to execute. The toggling allows the processes to share resources in an interrupt-free environment.
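The toggling scheme resembles cooperative scheduling, which can be sketched with Python generators. This is an analogy only: the shared `indicator` dictionary, the round-robin loop, and the name `run_cooperative` are assumptions, and passing each process the name of its peer stands in for the registration step.

```python
def run_cooperative(rounds):
    """Alternate two processes using a polled event indicator, no interrupts."""
    trace = []
    indicator = {"turn": "A"}           # event indicator: who may run next

    def process(name, peer):
        for i in range(rounds):
            while indicator["turn"] != name:
                yield                   # not our turn: give up the CPU
            trace.append(f"{name}{i}")  # do one unit of work
            indicator["turn"] = peer    # toggle so the peer can execute
            yield

    procs = [process("A", "B"), process("B", "A")]
    # Plain polling loop; nothing preempts a running process.
    while procs:
        for p in list(procs):
            try:
                next(p)
            except StopIteration:
                procs.remove(p)
    return trace
```

Each process runs only when the indicator names it, so the two never touch the shared `trace` at the same time.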