Abstract:
A method, system, and device for managing hardware resources in a cloud scheduling environment include a zone controller. The zone controller can manage groups of node servers in a cloud datacenter using a checkin service. The checkin service allows server groups to be created automatically based on one or more hardware characteristics of the node servers, server health information, workload scheduling or facilities management parameters, and/or other criteria.
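The automatic grouping described above can be illustrated with a minimal sketch. It is not the patented implementation: the NodeCheckin record, its fields (cpu_model, memory_gb, healthy), and the "quarantine" group are hypothetical names chosen for illustration, and the grouping criteria are only one possible choice.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCheckin:
    """Hypothetical record a node server reports when it checks in."""
    node_id: str
    cpu_model: str
    memory_gb: int
    healthy: bool


def group_nodes(checkins):
    """Group nodes by (cpu_model, memory_gb); unhealthy nodes share one group."""
    groups = defaultdict(list)
    for c in checkins:
        # Assumption: health overrides hardware characteristics when grouping.
        key = (c.cpu_model, c.memory_gb) if c.healthy else ("quarantine",)
        groups[key].append(c.node_id)
    return dict(groups)
```

A zone controller built along these lines could re-run group_nodes whenever a node checks in, so that groups follow hardware or health changes without manual assignment.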
Abstract:
A microcode (uCode) hot-upgrade method for bare metal cloud deployment and associated apparatus. Under the uCode hot-upgrade method, a uCode patch is received at an out-of-band controller (e.g., baseboard management controller (BMC)) and buffered in a memory buffer in the out-of-band controller. The out-of-band controller exposes the memory buffer as a Memory-Mapped Input-Output (MMIO) range to a host CPU. A uCode upgrade interrupt service is triggered to upgrade uCode for one or more CPUs in a bare-metal cloud platform during runtime of a tenant host operating system (OS) using an out-of-band process. This innovation enables cloud service providers to deploy uCode hot-patches to bare metal servers for live-patch without touching the tenant operating system environment.
Abstract:
A microcode (uCode) hot-upgrade method for bare metal cloud deployment and associated apparatus. The uCode hot-upgrade method applies a uCode patch to a firmware storage device (e.g., BIOS SPI flash) through an out-of-band controller (e.g., baseboard management controller (BMC)). In conjunction with receiving a uCode patch, a uCode upgrade interrupt service is triggered to upgrade uCode for one or more CPUs in a bare-metal cloud platform during runtime of a tenant host operating system (OS) using an out-of-band process. This innovation enables cloud service providers to deploy uCode hot-patches to bare metal servers for persistent storage and live-patch without touching the tenant operating system environment.
Abstract:
An apparatus and method are described for detecting and correcting data fetch errors within a processor core. For example, one embodiment of an instruction processing apparatus for detecting and recovering from data fetch errors comprises: at least one processor core having a plurality of instruction processing stages including a data fetch stage and a retirement stage; and error processing logic in communication with the processing stages to perform the operations of: detecting an error associated with data in response to a data fetch operation performed by the data fetch stage; and responsively performing one or more operations to ensure that the error does not corrupt an architectural state of the processor core within the retirement stage.
Abstract:
A selectively upgradeable disaggregated server is generally described herein. An example modular server unit includes a processor module coupled to an input/output (I/O) module via a connector. The processor module communicates with the I/O module via the connector to store and retrieve data. The processor module is a separate hardware unit from the I/O module.
Abstract:
Methods and apparatus for highly available rack management in a Rack Scale environment. Rack Management Modules (RMMs) are configured to manage power and thermal zones in a rack including a plurality of pooled system drawers, wherein each pooled system drawer is associated with a respective power zone including power sensors and power control devices and a respective thermal zone including thermal sensors and thermal devices. During operation, one of the RMMs is implemented as a master RMM, and the other is implemented as a slave RMM. The master RMM is used to monitor the power and thermal zones. State information is periodically synchronized between the master RMM and the slave RMM. The RMMs are further configured to perform a fail-over operation in connection with a failed or failing RMM, where after the fail-over operation the slave becomes the new master RMM and the previous master RMM becomes the new slave.
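The master/slave role swap described above can be sketched as a small state model. This is only an illustration under stated assumptions: the RMM and RackManagerPair classes, the healthy flag, and the dictionary used for state are hypothetical, and real failure detection and synchronization transport are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class RMM:
    """Hypothetical model of one Rack Management Module."""
    name: str
    role: str  # "master" or "slave"
    healthy: bool = True
    state: dict = field(default_factory=dict)


class RackManagerPair:
    def __init__(self, master, slave):
        self.master, self.slave = master, slave

    def monitor(self, readings):
        # Only the master records power and thermal zone readings.
        self.master.state.update(readings)

    def sync(self):
        # Periodic synchronization: the slave receives a copy of the master state.
        self.slave.state = dict(self.master.state)

    def fail_over_if_needed(self):
        # Swap roles when the master is marked failed; return True if swapped.
        if self.master.healthy:
            return False
        self.master, self.slave = self.slave, self.master
        self.master.role, self.slave.role = "master", "slave"
        return True
```

Because the slave holds the most recently synchronized state, it can take over monitoring from that point; anything recorded after the last sync is lost in this sketch.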
Abstract:
An apparatus for coherent shared memory across multiple clusters is described herein. The apparatus includes a fabric memory controller and one or more nodes. The fabric memory controller manages access to a shared memory region of each node such that each shared memory region is accessible using load store semantics, even in response to failure of the node. The apparatus also includes a global memory, wherein each shared memory region is mapped to the global memory by the fabric memory controller.
Abstract:
An event management resource monitors a processor environment. In response to detecting occurrence of a trigger event in the processor environment, the event management resource initiates a transfer of processor cache data from volatile storage in the processor environment to non-volatile memory. The event management resource can be configured to produce status information associated with the transfer of cache data to a respective non-volatile memory resource. The event management resource stores the status information in a non-volatile storage resource for later retrieval. Accordingly, status information associated with the event causing the transfer is available for analysis on subsequent power up or reboot of a respective computer system.
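As a rough illustration of the flow above, the following sketch copies simulated cache contents to a simulated non-volatile store and persists a status record for later retrieval. Everything here is an assumption made for illustration: handle_trigger, the dictionaries standing in for cache and non-volatile memory, and the status fields are hypothetical, and no real hardware interface is modeled.

```python
import json


def handle_trigger(event, cache_lines, nvm, status_log):
    """Copy cache contents to non-volatile memory and persist a status record.

    cache_lines and nvm are plain dicts (address -> bytes); status_log is a
    list standing in for a non-volatile status store.
    """
    status = {
        "event": event,
        "lines_total": len(cache_lines),
        "lines_saved": 0,
        "complete": False,
    }
    for addr, data in cache_lines.items():
        nvm[addr] = data
        status["lines_saved"] += 1
    status["complete"] = status["lines_saved"] == status["lines_total"]
    # Stored as JSON so the record can be read back after a reboot.
    status_log.append(json.dumps(status))
    return status
```

On a later power-up, reading the last entry of status_log would show which event caused the transfer and whether it completed.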
Abstract:
Apparatuses and methods associated with memory allocations for virtual machines are disclosed. In embodiments, an apparatus may include a processor; a plurality of memory modules; and a memory controller configured to provide a layout of the memory modules. The apparatus may further include a virtual machine monitor (VMM) configured to be operated by the processor to manage execution of a virtual machine (VM) by the processor, including selective allocation of the memory modules to the VM using the layout of the memory modules provided to the VMM by the memory controller. Other embodiments may be described and claimed.
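One way to picture layout-aware allocation is the sketch below. It is a hypothetical policy, not the method claimed: the MemoryModule record, its socket and free_gb fields, and the "prefer modules on the VM's socket" rule are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryModule:
    """Hypothetical layout entry reported by the memory controller."""
    module_id: str
    socket: int
    free_gb: int


def allocate(layout, vm_socket, needed_gb):
    """Return [(module_id, gb)] for a VM, preferring modules on vm_socket."""
    # Modules on the VM's socket sort first; ties break by module id.
    ordered = sorted(layout, key=lambda m: (m.socket != vm_socket, m.module_id))
    plan, remaining = [], needed_gb
    for m in ordered:
        if remaining == 0:
            break
        take = min(m.free_gb, remaining)
        if take > 0:
            plan.append((m.module_id, take))
            remaining -= take
    if remaining > 0:
        raise MemoryError("not enough free memory for the requested VM size")
    # Commit only after a complete plan exists, so a failure changes nothing.
    by_id = {m.module_id: m for m in layout}
    for module_id, take in plan:
        by_id[module_id].free_gb -= take
    return plan
```

The point of the sketch is that the allocator consults the layout before choosing modules, rather than treating memory as one undifferentiated pool.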