Abstract:
In some implementations, a processor may include a machine check architecture having a plurality of error reporting registers able to receive data for machine check errors. A summary register may include a plurality of settable locations that each represents at least one of the error reporting registers. One or more of the settable locations in the summary register may be set to indicate whether one or more of the error reporting registers maintain data for a machine check error. Accordingly, when a machine check error occurs, the summary register may be accessed to identify if any error reporting registers in a processor's view contain valid error data, rather than having to read each of the error reporting registers in the processor's view.
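The following is a minimal, self-contained sketch of how software might exploit such a summary register: instead of polling every error reporting bank, the handler reads one summary word and visits only the banks whose bits are set. The register names, bank count, and the simulated state are assumptions for illustration, not the patented implementation.

```c
/*
 * Sketch only: summary-register-driven machine check handling.
 * Simulated state stands in for real MSR/MMIO reads; names are invented.
 */
#include <stdint.h>
#include <stdio.h>

#define NUM_ERROR_BANKS 32

/* Simulated hardware state; a real handler would read MSRs or MMIO registers. */
static uint32_t summary_register = (1u << 3) | (1u << 17);  /* banks 3 and 17 flagged */
static uint64_t bank_status[NUM_ERROR_BANKS] = { [3] = 0xDEAD0001ull, [17] = 0xDEAD0002ull };

static void handle_machine_check(void)
{
    uint32_t summary = summary_register;

    /* Visit only the banks whose summary bit is set, rather than reading all of them. */
    while (summary != 0) {
        unsigned bank = (unsigned)__builtin_ctz(summary);   /* index of lowest set bit */
        printf("bank %u holds error status 0x%016llx\n",
               bank, (unsigned long long)bank_status[bank]);
        summary &= summary - 1u;                            /* clear lowest set bit */
    }
}

int main(void)
{
    handle_machine_check();
    return 0;
}
```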
Abstract:
Dynamically configurable server platforms and associated apparatus and methods. A server platform including a plurality of CPUs installed in respective sockets may be dynamically configured as multiple single-socket servers and as a multi-socket server. The CPUs are connected to a platform manager component comprising an SoC including one or more processors and an embedded FPGA. Following a platform reset, an FPGA image is loaded, dynamically configuring functional blocks and interfaces on the platform manager. The platform manager also includes pre-defined functional blocks and interfaces. During platform initialization the dynamically-configured functional blocks and interfaces are used to initialize the server platform, while both the pre-defined and dynamically-configured functional blocks and interfaces are used to support run-time operations. The server platform may be used in conventional rack architectures or implemented in a disaggregated rack architecture under which the single-socket and/or multi-socket servers are dynamically composed to employ disaggregated resources, such as memory, storage, and accelerators.
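A rough sketch of the boot ordering described above follows. Every function name and the topology enum are invented for illustration; the point is only the sequence: platform reset, FPGA image load (which determines the dynamically configured blocks), platform initialization, then run-time operation using both pre-defined and dynamically configured blocks.

```c
/* Hypothetical boot-flow sketch; all names are assumptions, not the platform
 * manager's actual firmware interface. */
#include <stdbool.h>
#include <stdio.h>

typedef enum { SINGLE_SOCKET_SERVERS, MULTI_SOCKET_SERVER } platform_topology;

static bool load_fpga_image(platform_topology topo)
{
    /* The chosen image determines which functional blocks and interfaces the
     * embedded FPGA exposes for this boot. */
    printf("loading FPGA image for %s configuration\n",
           topo == MULTI_SOCKET_SERVER ? "multi-socket" : "single-socket");
    return true;
}

static void initialize_platform(void) { puts("init via dynamically configured blocks"); }
static void enter_runtime(void)       { puts("run-time uses pre-defined and dynamic blocks"); }

int main(void)
{
    puts("platform reset");
    if (!load_fpga_image(SINGLE_SOCKET_SERVERS))
        return 1;
    initialize_platform();
    enter_runtime();
    return 0;
}
```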
Abstract:
Examples may include a basic input/output system (BIOS) for a computing platform communicating with a controller for a non-volatile dual in-line memory module (NVDIMM). Communication between the BIOS and the controller may include a request for the controller to scan and identify error locations in non-volatile memory at the NVDIMM. The non-volatile memory may be capable of providing persistent memory for the NVDIMM.
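The sketch below illustrates the request/response pattern only: the "BIOS" issues a scan command, and the "controller" returns the error locations it found in the non-volatile media. The command code, response layout, and accessor are invented; they are not the patent's protocol or any standard NVDIMM command interface.

```c
/* Illustrative sketch of a BIOS-to-NVDIMM-controller scan request; the
 * mailbox layout and command code are assumptions made for this example. */
#include <stdint.h>
#include <stdio.h>

#define CMD_SCAN_FOR_ERRORS 0x01u
#define MAX_REPORTED_ERRORS 8u

struct scan_response {
    uint32_t error_count;
    uint64_t error_address[MAX_REPORTED_ERRORS];  /* bad non-volatile memory locations */
};

/* Simulated controller; real firmware would use an MMIO or SMBus mailbox. */
static struct scan_response simulate_controller(uint32_t cmd)
{
    struct scan_response r = { 0 };
    if (cmd == CMD_SCAN_FOR_ERRORS) {
        r.error_count = 2;
        r.error_address[0] = 0x0000000010000000ull;
        r.error_address[1] = 0x0000000020000000ull;
    }
    return r;
}

int main(void)
{
    /* "BIOS" asks the NVDIMM controller to scan and identify error locations. */
    struct scan_response resp = simulate_controller(CMD_SCAN_FOR_ERRORS);

    for (uint32_t i = 0; i < resp.error_count; i++)
        printf("error location %u: 0x%016llx\n",
               i, (unsigned long long)resp.error_address[i]);
    return 0;
}
```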
Abstract:
A non-volatile random access memory (NVRAM) is used in a computer system to enhance support to sleep states. The computer system includes a processor, an NVRAM that is byte-rewritable and byte-erasable, and a power management (PM) module. A dynamic random access memory (DRAM) provides a portion of system address space. The PM module intercepts a request initiated by an operating system for entry into a sleep state, copies data from the DRAM to the NVRAM, maps the portion of the system address space from the DRAM to the NVRAM, and turns off the DRAM when transitioning into the sleep state. Upon occurrence of a wake event, the PM module returns control to the operating system such that the computer system resumes working state operations without the operating system knowing that the portion of the system address space has been mapped to the NVRAM.
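Below is a minimal simulation of the copy-and-remap idea: on a sleep request the region's contents move from DRAM to NVRAM, the mapping is redirected, and DRAM is powered off; on wake, accesses still resolve through the remapped region. Arrays and a pointer stand in for physical memory and address decoding; the function names are illustrative, not the patent's PM module interface.

```c
/* Simulated sketch of the DRAM-to-NVRAM copy and remap on sleep entry. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define REGION_SIZE 4096u

static uint8_t dram[REGION_SIZE];      /* volatile system memory */
static uint8_t nvram[REGION_SIZE];     /* byte-addressable non-volatile memory */
static uint8_t *system_region = dram;  /* where the "system address space" currently maps */
static int dram_powered = 1;

static void pm_enter_sleep(void)
{
    memcpy(nvram, dram, REGION_SIZE);  /* preserve contents before removing power */
    system_region = nvram;             /* remap the region from DRAM to NVRAM */
    dram_powered = 0;                  /* DRAM can now be turned off */
}

static void pm_wake(void)
{
    /* Control returns to the OS; the region still resolves to NVRAM, so the
     * OS resumes without knowing the mapping changed. */
    printf("resumed, region mapped to %s, DRAM power = %d\n",
           system_region == nvram ? "NVRAM" : "DRAM", dram_powered);
}

int main(void)
{
    memset(dram, 0xA5, REGION_SIZE);   /* some working-state data */
    pm_enter_sleep();
    pm_wake();
    return 0;
}
```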
Abstract:
Systems and methods of implementing server architectures that can facilitate the servicing of memory components in computer systems. The systems and methods employ nonvolatile memory/storage modules that include nonvolatile memory (NVM) that can be used for system memory and mass storage, as well as firmware memory. The respective NVM/storage modules can be received in front or rear-loading bays of the computer systems. The systems and methods further employ single, dual, or quad socket processors, in which each processor is communicably coupled to at least some of the NVM/storage modules disposed in the front or rear-loading bays by one or more memory and/or input/output (I/O) channels. By employing NVM/storage modules that can be received in front or rear-loading bays of computer systems, the systems and methods provide memory component serviceability heretofore unachievable in computer systems implementing conventional server architectures.
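As a purely illustrative data-structure view of the topology described above, the sketch below models which front- or rear-loading bay slots each socket reaches, and over which kind of channel. Field names, counts, and the example mapping are assumptions, not the described server architecture itself.

```c
/* Illustrative socket-to-bay mapping; all names and values are assumptions. */
#include <stdio.h>

#define MAX_SOCKETS 4
#define MAX_MODULES_PER_SOCKET 8

enum bay_location { FRONT_BAY, REAR_BAY };
enum channel_type { MEMORY_CHANNEL, IO_CHANNEL };

struct nvm_storage_module {
    enum bay_location bay;      /* front- or rear-loading bay */
    enum channel_type channel;  /* how the socket reaches this module */
    int bay_slot;               /* slot index within the bay */
};

struct socket_map {
    int socket_id;
    int module_count;
    struct nvm_storage_module modules[MAX_MODULES_PER_SOCKET];
};

int main(void)
{
    /* Example: a dual-socket platform, each socket reaching two modules. */
    struct socket_map platform[MAX_SOCKETS] = {
        { 0, 2, { { FRONT_BAY, MEMORY_CHANNEL, 0 }, { REAR_BAY, IO_CHANNEL, 1 } } },
        { 1, 2, { { FRONT_BAY, MEMORY_CHANNEL, 2 }, { REAR_BAY, IO_CHANNEL, 3 } } },
    };

    for (int s = 0; s < 2; s++)
        for (int m = 0; m < platform[s].module_count; m++)
            printf("socket %d -> %s bay slot %d via %s channel\n",
                   platform[s].socket_id,
                   platform[s].modules[m].bay == FRONT_BAY ? "front" : "rear",
                   platform[s].modules[m].bay_slot,
                   platform[s].modules[m].channel == MEMORY_CHANNEL ? "memory" : "I/O");
    return 0;
}
```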
Abstract:
Mechanisms for efficient discovery of storage resources in a Rack Scale Architecture (RSA) system and associated methods, apparatus, and systems. A rack is populated with pooled system drawers including pooled compute drawers and pooled storage drawers communicatively coupled via input-output (IO) cables. Compute nodes including one or more processors, memory resources, and optional local storage resources are installed in the pooled compute drawers, and are enabled to be selectively coupled to storage resources in the pooled storage drawers over virtual attachment links. During a discovery process, a compute node determines storage resource characteristics of storage resources it may be selectively coupled to and the attachment links used to access the storage resources. The storage resource characteristics are aggregated by a pod manager that uses corresponding configuration information to dynamically compose compute nodes for rack users based on user needs.
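The following sketch shows the two halves of the discovery flow described above: a compute node reports the storage resource characteristics and attachment links it discovered, and a pod manager aggregates the reports for later composition. The struct layout and function names are invented for illustration and are not the RSA or pod manager APIs.

```c
/* Hedged sketch of compute-node discovery reporting and pod-manager aggregation. */
#include <stdio.h>

#define MAX_RESOURCES 4

struct storage_resource_info {
    int resource_id;
    unsigned long capacity_gb;
    const char *attachment_link;   /* virtual link used to reach the resource */
};

struct compute_node_report {
    int node_id;
    int resource_count;
    struct storage_resource_info resources[MAX_RESOURCES];
};

/* "Pod manager" side: aggregate node reports so compute nodes can later be composed. */
static void pod_manager_aggregate(const struct compute_node_report *r)
{
    for (int i = 0; i < r->resource_count; i++)
        printf("node %d: resource %d, %lu GB via %s\n",
               r->node_id, r->resources[i].resource_id,
               r->resources[i].capacity_gb, r->resources[i].attachment_link);
}

int main(void)
{
    /* "Compute node" side: characteristics discovered over attachment links. */
    struct compute_node_report report = {
        .node_id = 7,
        .resource_count = 2,
        .resources = {
            { 100, 512,  "virtual-link-0" },
            { 101, 1024, "virtual-link-1" },
        },
    };
    pod_manager_aggregate(&report);
    return 0;
}
```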
Abstract:
In some embodiments a request is received to perform an error injection or a memory migration, a mode is entered that blocks requests from agents other than a current processor core or thread, the error is injected or the memory is migrated, and the mode that blocks requests from the agents other than the current processor core or thread is exited. Other embodiments are described and claimed.
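A minimal sketch of the enter-block / act / exit-block sequence appears below. The quiesce is modeled with a flag; real hardware would block requests from other cores, threads, or agents at the fabric level. All names are illustrative assumptions.

```c
/* Sketch of quiescing other agents around an error injection or memory migration. */
#include <stdbool.h>
#include <stdio.h>

static bool other_agents_blocked = false;

static void enter_quiesce_mode(void) { other_agents_blocked = true;  puts("other agents blocked"); }
static void exit_quiesce_mode(void)  { other_agents_blocked = false; puts("other agents released"); }

static void inject_error(void)       { puts("error injected while system is quiesced"); }
static void migrate_memory(void)     { puts("memory migrated while system is quiesced"); }

static void handle_request(bool do_injection)
{
    enter_quiesce_mode();              /* block requests from agents other than this core/thread */
    if (do_injection)
        inject_error();
    else
        migrate_memory();
    exit_quiesce_mode();               /* resume normal operation */
}

int main(void)
{
    handle_request(true);   /* error injection request */
    handle_request(false);  /* memory migration request */
    return 0;
}
```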
Abstract:
Methods and apparatus for highly available rack management in a Rack Scale environment. Rack Management Modules (RMMs) are configured to manage power and thermal zones in a rack including a plurality of pooled system drawers, wherein each pooled system drawer is associated with a respective power zone including power sensors and power control devices and a respective thermal zone including thermal sensors and thermal devices. During operation, one of the RMMs is implemented as a master RMM, and the other is implemented as a slave RMM. The master RMM is used to monitor the power and thermal zones. State information is periodically synchronized between the master RMM and the slave RMM. The RMMs are further configured to perform a fail-over operation in connection with a failed or failing RMM, wherein, after the fail-over operation, the slave becomes the new master RMM and the previous master RMM becomes the new slave.
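Below is a simplified sketch of the synchronization and fail-over behavior: the master periodically pushes zone state to the slave, and the slave promotes itself to master when the master stops responding. The state layout and the heartbeat threshold are assumptions made for illustration.

```c
/* Sketch of master/slave RMM state synchronization and fail-over. */
#include <stdbool.h>
#include <stdio.h>

struct rmm_state {
    bool is_master;
    int  power_zone_readings;    /* stand-in for synchronized power-zone state */
    int  thermal_zone_readings;  /* stand-in for synchronized thermal-zone state */
    int  missed_heartbeats;
};

#define HEARTBEAT_FAILOVER_THRESHOLD 3

/* Master periodically pushes its monitored zone state to the slave. */
static void sync_state(const struct rmm_state *master, struct rmm_state *slave)
{
    slave->power_zone_readings   = master->power_zone_readings;
    slave->thermal_zone_readings = master->thermal_zone_readings;
    slave->missed_heartbeats     = 0;
}

/* Slave promotes itself if the master stops responding; the recovered master
 * would later rejoin as the new slave. */
static void check_failover(struct rmm_state *slave)
{
    if (slave->missed_heartbeats >= HEARTBEAT_FAILOVER_THRESHOLD) {
        slave->is_master = true;
        puts("fail-over: slave RMM is now the master RMM");
    }
}

int main(void)
{
    struct rmm_state master = { true,  42, 65, 0 };
    struct rmm_state slave  = { false,  0,  0, 0 };

    sync_state(&master, &slave);   /* periodic synchronization */
    slave.missed_heartbeats = 3;   /* master stops responding */
    check_failover(&slave);
    return 0;
}
```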