Abstract:
Examples provide an application program interface (API), or manner of negotiating, by which an application, software, or hardware can lock (e.g., pin) or unlock (e.g., unpin) a cache region. A cache region can be part of a level-1 cache, a level-2 cache, a lower-level or last-level cache (LLC), or a translation lookaside buffer (TLB). A cache lock controller can respond to a request to lock or unlock a region of cache or TLB by indicating whether the request is successful. If a request is not successful, the controller can provide feedback indicating one or more aspects of the request that are not permitted. Based on the feedback, the application, software, or hardware can submit another, modified request to attempt to lock a portion of the cache or TLB.
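A minimal sketch of how such a lock negotiation might look from the software side, assuming hypothetical names (cache_lock_request, cache_lock_feedback, cache_lock_try) and a toy controller policy that are not part of the source:

```c
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request/feedback types for negotiating a cache-region lock. */
enum cache_level { CACHE_L1, CACHE_L2, CACHE_LLC, CACHE_TLB };

struct cache_lock_request {
    enum cache_level level;   /* which cache or TLB to lock a region of */
    size_t num_ways;          /* requested number of ways to pin */
};

struct cache_lock_feedback {
    bool granted;             /* true if the controller accepted the request */
    size_t max_ways_allowed;  /* feedback: largest region the controller permits */
};

/* Stand-in for the cache lock controller: permits at most 2 ways in the LLC. */
static struct cache_lock_feedback cache_lock_try(const struct cache_lock_request *req)
{
    struct cache_lock_feedback fb = { .granted = false, .max_ways_allowed = 2 };
    if (req->level == CACHE_LLC && req->num_ways <= fb.max_ways_allowed)
        fb.granted = true;
    return fb;
}

int main(void)
{
    struct cache_lock_request req = { .level = CACHE_LLC, .num_ways = 8 };
    struct cache_lock_feedback fb = cache_lock_try(&req);

    if (!fb.granted) {
        /* Use the controller's feedback to submit a modified request. */
        req.num_ways = fb.max_ways_allowed;
        fb = cache_lock_try(&req);
    }
    printf("lock %s with %zu ways\n", fb.granted ? "granted" : "denied", req.num_ways);
    return 0;
}
```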
Abstract:
Devices and techniques for out-of-band platform tuning and configuration are described herein. A device can include a telemetry interface to a telemetry collection system and a network interface to network adapter hardware. The device can receive platform telemetry metrics from the telemetry collection system, and network adapter hardware statistics over the network interface, which together form the collected statistics. The device can apply a heuristic algorithm using the collected statistics to determine processing core workloads generated by operation of a plurality of software systems communicatively coupled to the device. Responsive to detecting an overload state on at least one processing core, based on the processing core workloads, the device can provide a reconfiguration message to instruct at least one software system to switch operations to a different processing core. Other embodiments are also described.
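The heuristic itself is not specified; the following sketch assumes a simple utilization-threshold rule, with all structure and function names (core_stats, pick_target_core) and the threshold value invented for illustration:

```c
#include <stdio.h>

#define NUM_CORES 4
#define OVERLOAD_THRESHOLD 90.0  /* percent utilization; illustrative value */

/* Collected statistics per core: platform telemetry plus NIC counters. */
struct core_stats {
    double cpu_util_pct;       /* from the telemetry collection system */
    unsigned long nic_pkts;    /* from network adapter hardware statistics */
};

/* Simple heuristic: flag a core as overloaded when utilization crosses a
 * threshold, then pick the least-loaded core as the migration target. */
static int pick_target_core(const struct core_stats stats[], int overloaded)
{
    int best = -1;
    double best_util = 101.0;
    for (int c = 0; c < NUM_CORES; c++) {
        if (c != overloaded && stats[c].cpu_util_pct < best_util) {
            best_util = stats[c].cpu_util_pct;
            best = c;
        }
    }
    return best;
}

int main(void)
{
    struct core_stats stats[NUM_CORES] = {
        { 95.0, 800000 }, { 40.0, 120000 }, { 55.0, 300000 }, { 20.0, 50000 },
    };

    for (int c = 0; c < NUM_CORES; c++) {
        if (stats[c].cpu_util_pct > OVERLOAD_THRESHOLD) {
            int target = pick_target_core(stats, c);
            /* Stand-in for the reconfiguration message to a software system. */
            printf("reconfigure: move work from core %d to core %d\n", c, target);
        }
    }
    return 0;
}
```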
Abstract:
Processors and methods implementing a machine instruction to perform cache line demotion on multiple cache lines, enabling efficient sharing of cache lines between processor cores. One general aspect includes a processor comprising a plurality of hardware processor cores, each including a first cache, and a second cache communicatively coupled to and shared by the plurality of hardware processor cores. The processor supports a first machine instruction that includes a vector register operand identifying a vector register, which contains a plurality of data elements each used to identify a cache line. Execution of the first machine instruction by one of the plurality of hardware processor cores causes the plurality of identified cache lines to be demoted, such that the demoted cache lines are moved from the first cache to the second cache.
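The vector-demote instruction described here is the subject of the abstract rather than an existing instruction. As a rough software approximation, the following C sketch loops the existing scalar CLDEMOTE hint (_mm_cldemote from immintrin.h, which requires a CLDEMOTE-capable CPU and, e.g., -mcldemote with GCC/Clang) over an array of line addresses playing the role of the vector register's data elements:

```c
#include <immintrin.h>
#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE 64

/* Approximate the described vector-demote instruction in software: demote
 * each cache line identified by an element of `lines` from the core's local
 * cache toward the shared cache, using the existing scalar CLDEMOTE hint. */
static void demote_lines(const uintptr_t *lines, size_t n)
{
    for (size_t i = 0; i < n; i++)
        _mm_cldemote((const void *)lines[i]);
}

int main(void)
{
    static char buf[8 * CACHE_LINE];
    uintptr_t lines[8];

    /* Collect the line addresses, one per element, as the vector register
     * operand of the described instruction would. */
    for (size_t i = 0; i < 8; i++)
        lines[i] = (uintptr_t)&buf[i * CACHE_LINE];

    demote_lines(lines, 8);  /* hint: move these lines to the shared cache */
    return 0;
}
```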
Abstract:
Methods and apparatus for accelerating VM-to-VM network traffic using CPU cache. A virtual queue manager (VQM) manages data that is to be kept in VM-VM shared data buffers in CPU cache. The VQM stores a list of VM-VM allow entries identifying data transfers between VMs that may use VM-VM cache "fast-path" forwarding. Packets are sent from VMs to the VQM for forwarding to destination VMs. Indicia in the packets (e.g., in a tag or header) are inspected to determine whether a packet is to be forwarded via the VM-VM cache fast path or via a virtual switch. The VQM tracks which VM data is already in the CPU cache domain while concurrently coordinating data transfers to and from external shared memory, and also ensures coherency between data kept in cache and data kept in shared memory.
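A minimal sketch of the forwarding decision, assuming hypothetical types and names (vm_allow_entry, packet, use_fast_path) and a boolean tag standing in for the packet indicia:

```c
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

/* One allow-list entry: traffic from src VM to dst VM may use the
 * cache fast path. */
struct vm_allow_entry { int src_vm; int dst_vm; };

static const struct vm_allow_entry allow_list[] = {
    { 1, 2 }, { 2, 1 }, { 3, 4 },
};

/* Minimal packet: indicia carried in a tag plus source/destination VMs. */
struct packet { int src_vm; int dst_vm; bool fast_path_tag; };

/* Forward via the cache fast path only if the packet is tagged for it and
 * the (src, dst) pair appears in the allow list. */
static bool use_fast_path(const struct packet *p)
{
    if (!p->fast_path_tag)
        return false;
    for (size_t i = 0; i < sizeof(allow_list) / sizeof(allow_list[0]); i++)
        if (allow_list[i].src_vm == p->src_vm && allow_list[i].dst_vm == p->dst_vm)
            return true;
    return false;
}

int main(void)
{
    struct packet p = { .src_vm = 1, .dst_vm = 2, .fast_path_tag = true };
    printf("forward via %s\n", use_fast_path(&p) ? "VM-VM cache fast path"
                                                 : "virtual switch");
    return 0;
}
```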
Abstract:
In embodiments, apparatuses, methods, and storage media (transitory and non-transitory) are described that are associated with end-to-end datacenter performance control. In various embodiments, an apparatus for computing may receive a datacenter performance target, determine an end-to-end datacenter performance level based at least in part on quality of service data collected from a plurality of nodes, and send a mitigation command based at least in part on a result of a comparison of the determined end-to-end datacenter performance level to the datacenter performance target. In various embodiments, the apparatus for computing may include one or more processors, a memory, a datacenter performance monitor to receive a datacenter performance target corresponding to a service level agreement, and a mitigation module to send a mitigation command based at least in part on a result of a comparison of an end-to-end datacenter performance level to the datacenter performance target.
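A minimal sketch of the comparison-and-mitigation step, assuming a hypothetical model in which the end-to-end performance level is the sum of per-node latencies; all names and numbers are illustrative:

```c
#include <stdio.h>

#define NUM_NODES 3

/* Per-node quality-of-service sample, e.g. request latency in milliseconds. */
struct node_qos { double latency_ms; };

/* Derive an end-to-end performance level from per-node QoS data; here the
 * end-to-end latency is modeled as the sum of per-node latencies. */
static double end_to_end_latency(const struct node_qos nodes[], int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += nodes[i].latency_ms;
    return total;
}

int main(void)
{
    struct node_qos nodes[NUM_NODES] = { { 4.0 }, { 7.5 }, { 3.0 } };
    double target_ms = 12.0;  /* datacenter performance target from the SLA */

    double level = end_to_end_latency(nodes, NUM_NODES);
    if (level > target_ms) {
        /* Stand-in for the mitigation command sent by the mitigation module. */
        printf("mitigate: end-to-end %.1f ms exceeds target %.1f ms\n",
               level, target_ms);
    }
    return 0;
}
```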
Abstract:
Examples may include techniques to coordinate the sharing of resources among virtual elements, including service chains, supported by a shared pool of configurable computing resources, based on the relative priority of the virtual elements and service chains. Information may be received that includes indications of both the performance of the service chains and their relative priority. The allocation of portions of the shared pool of configurable computing resources supporting the service chains can then be adjusted based on the received performance and priority information.
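One way such an adjustment might be computed is sketched below, assuming hypothetical per-chain state (service_chain) and a toy policy that moves a fixed slice of the pool from an over-performing low-priority chain to an under-performing higher-priority one:

```c
#include <stdio.h>

#define NUM_CHAINS 3

/* Per-service-chain state: measured performance, relative priority, and the
 * share of the configurable resource pool currently allocated to it. */
struct service_chain {
    double perf;      /* throughput relative to its requirement (1.0 = met) */
    int    priority;  /* higher value = higher relative priority */
    double share;     /* fraction of the shared resource pool */
};

int main(void)
{
    struct service_chain chains[NUM_CHAINS] = {
        { 0.80, 3, 0.30 },  /* high-priority chain missing its target */
        { 1.10, 1, 0.40 },  /* low-priority chain exceeding its target */
        { 1.00, 2, 0.30 },
    };

    /* Shift a fixed slice of resources from an over-performing low-priority
     * chain to an under-performing higher-priority chain. */
    for (int i = 0; i < NUM_CHAINS; i++) {
        if (chains[i].perf >= 1.0)
            continue;
        for (int j = 0; j < NUM_CHAINS; j++) {
            if (chains[j].perf > 1.0 && chains[j].priority < chains[i].priority) {
                double slice = 0.05;
                chains[j].share -= slice;
                chains[i].share += slice;
                printf("shift %.0f%% of pool from chain %d to chain %d\n",
                       slice * 100.0, j, i);
            }
        }
    }
    return 0;
}
```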
Abstract:
A heterogeneous processor architecture and a method of booting a heterogeneous processor are described. A processor according to one embodiment comprises: a set of large physical processor cores; a set of small physical processor cores having lower performance processing capabilities and lower power usage relative to the large physical processor cores; and a package unit to enable a bootstrap processor. The bootstrap processor initializes the heterogeneous physical processor cores, while the heterogeneous processor presents the appearance of a homogeneous processor to a system firmware interface.
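A toy sketch of the idea, with all names and the core counts invented for illustration: the bootstrap processor brings up every physical core, large or small, while the topology reported to firmware hides the distinction:

```c
#include <stdio.h>
#include <stdbool.h>

enum core_type { CORE_LARGE, CORE_SMALL };

struct phys_core { enum core_type type; bool online; };

#define NUM_CORES 4

int main(void)
{
    /* Two large and two small physical cores; core 0 plays the role of the
     * bootstrap processor (BSP). */
    struct phys_core cores[NUM_CORES] = {
        { CORE_LARGE, false }, { CORE_LARGE, false },
        { CORE_SMALL, false }, { CORE_SMALL, false },
    };

    /* The BSP initializes every physical core regardless of type... */
    for (int i = 0; i < NUM_CORES; i++)
        cores[i].online = true;

    /* ...but the topology reported to the system firmware interface hides
     * the large/small distinction, presenting a homogeneous processor. */
    printf("firmware view: %d identical cores online\n", NUM_CORES);
    return 0;
}
```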
Abstract:
A heterogeneous processor architecture is described. For example, a processor according to one embodiment of the invention comprises: a first set of one or more physical processor cores having first processing characteristics; a second set of one or more physical processor cores having second processing characteristics different from the first processing characteristics; and virtual-to-physical (V-P) mapping logic to expose a plurality of virtual processors to software. The plurality of virtual processors appear to the software as a plurality of homogeneous processor cores, and the software allocates threads to the virtual processors as if the virtual processors were homogeneous processor cores. The V-P mapping logic maps each virtual processor to a physical processor core within the first set or the second set, such that a thread allocated to a first virtual processor by software is executed by the physical processor core mapped to the first virtual processor.
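A minimal sketch of the mapping idea, assuming a hypothetical static table (v_to_p) standing in for the V-P mapping logic; from software's perspective, all four virtual CPUs look alike:

```c
#include <stdio.h>

#define NUM_VIRTUAL 4

/* Physical cores in the first set (0, 1) have the first processing
 * characteristics; cores in the second set (2, 3) have the second. */
static const int v_to_p[NUM_VIRTUAL] = { 0, 2, 1, 3 };

/* Software schedules a thread onto a virtual processor; the mapping logic
 * resolves it to a physical core from either set, invisibly to software. */
static int run_thread_on(int virtual_cpu)
{
    return v_to_p[virtual_cpu];
}

int main(void)
{
    for (int v = 0; v < NUM_VIRTUAL; v++)
        printf("thread on virtual CPU %d executes on physical core %d\n",
               v, run_thread_on(v));
    return 0;
}
```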
Abstract:
In accordance with some embodiments, a cloud service provider may operate a data center in a way that dynamically reallocates resources across nodes within the data center based on both utilization and service level agreements. In other words, the allocation of resources may be adjusted dynamically based on current conditions. The current conditions in the data center may be a function of the nature of all the current workloads. Rather than simply managing the workloads to increase overall execution efficiency, the data center may manage them to meet quality-of-service requirements for particular workloads according to service level agreements.
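One possible shape of such a reallocation loop, with the workload model (utilization, QoS relative to the SLA, allocation units) and the policy thresholds invented for illustration:

```c
#include <stdio.h>

#define NUM_WORKLOADS 3

/* Per-workload view: current utilization of its allocation and the QoS
 * level its service level agreement requires (1.0 = exactly met). */
struct workload {
    double utilization;  /* fraction of allocated resources in use */
    double qos;          /* measured QoS relative to the SLA requirement */
    int    alloc_units;  /* resource units currently allocated */
};

int main(void)
{
    struct workload w[NUM_WORKLOADS] = {
        { 0.95, 0.85, 4 },  /* busy and missing its SLA */
        { 0.30, 1.20, 6 },  /* idle and comfortably above its SLA */
        { 0.70, 1.00, 4 },
    };

    /* Reallocate based on current conditions: take units from workloads with
     * slack and give them to workloads violating their SLA. */
    for (int i = 0; i < NUM_WORKLOADS; i++) {
        if (w[i].qos >= 1.0)
            continue;
        for (int j = 0; j < NUM_WORKLOADS; j++) {
            if (w[j].utilization < 0.5 && w[j].alloc_units > 1) {
                w[j].alloc_units--;
                w[i].alloc_units++;
                printf("move one unit from workload %d to workload %d\n", j, i);
                break;
            }
        }
    }
    return 0;
}
```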
Abstract:
An architecture to perform resource management among multiple network nodes and associated resources is disclosed. Example resource management techniques include those relating to: proactive reservation of edge computing resources; deadline-driven resource allocation; speculative edge QoS pre-allocation; and automatic QoS migration across edge computing nodes. In a specific example, a technique for service migration includes: identifying a service operated with computing resources in an edge computing system, the service providing computing capabilities for a connected edge device at an identified service level; identifying a mobility condition for the service based on a change in network connectivity with the connected edge device; and performing a migration of the service to a second edge computing system based on the identified mobility condition, enabling the service to continue at the second edge computing system while providing computing capabilities for the connected edge device at the identified service level.
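A minimal sketch of the migration decision, assuming a hypothetical latency-based mobility condition and invented names (edge_service, mobility_condition):

```c
#include <stdio.h>
#include <stdbool.h>

/* A service running at an edge node with an identified service level. */
struct edge_service {
    int    node_id;        /* edge computing system currently hosting it */
    double service_level;  /* e.g. required latency bound in ms */
};

/* Mobility condition: the connected edge device's network connectivity has
 * shifted so that another node now offers lower latency. */
static bool mobility_condition(double latency_cur_ms, double latency_alt_ms)
{
    return latency_alt_ms < latency_cur_ms;
}

int main(void)
{
    struct edge_service svc = { .node_id = 1, .service_level = 10.0 };
    double latency_cur = 18.0;  /* to current node, after the device moved */
    double latency_alt = 6.0;   /* to a neighboring edge node */

    if (mobility_condition(latency_cur, latency_alt)) {
        /* Migrate so the service continues at the other edge node while
         * preserving the identified service level. */
        svc.node_id = 2;
        printf("migrated service to node %d (latency %.1f ms <= level %.1f ms)\n",
               svc.node_id, latency_alt, svc.service_level);
    }
    return 0;
}
```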