Abstract:
Converting a stale cache memory unique request to a read unique snoop response in a multiple (multi-) central processing unit (CPU) processor is disclosed. The multi-CPU processor includes a plurality of CPUs that each have access to either private or shared cache memories in a cache memory system. When multiple CPUs issue unique requests to write data to the same coherence granule in a cache memory, one requesting CPU's unique request is serviced, or "wins," allowing that CPU to obtain the coherence granule in a unique state, while the other, unsuccessful unique requests become stale. To avoid retried unique requests being reordered behind other pending, younger requests, which would lead to a lack of forward progress due to starvation or livelock, the snooped stale unique requests are converted to read unique snoop responses so that their request order can be maintained in the cache memory system.
Abstract:
Embodiments of apparatus, method, and storage medium associated with MCCG memory integrity for securing/protecting memory content/data of VM or enclave are described herein. In some embodiments, an apparatus may include one or more encryption engines to encrypt a unit of data to be stored in a memory in response to a write operation from a VM or an enclave of an application, prior to storing the unit of data into the memory in an encrypted form; wherein to encrypt the unit of data, the one or more encryption engines are to encrypt the unit of data using at least a key domain selector associated with the VM or enclave, and a tweak based on a color within a color group associated with the VM or enclave. Other embodiments may be described and/or claimed.
Abstract:
Systems, methods, and apparatuses are directed to requesting access to a memory address; storing an identification of the memory address in a data structure; receiving a first request for access to the memory address, the first request comprising a reference to a second processor core; storing the reference to the second processor core in the data structure; receiving a second request for access to the memory address, the second request comprising a reference to a third processor core; determining, based on the data structure, that the third processor core is different from the second processor core; and responding to the second request without buffering the second request.
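The bookkeeping described in this abstract can be sketched as a small lookup structure. This is only an illustrative sketch: the class name `PendingAddressTracker`, its method names, and the string return values are hypothetical. The behavior for an untracked address and for a repeat request from the same core is an assumption, since the abstract only covers the case where the two cores differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PendingAddressTracker:
    """Hypothetical per-core structure: address -> core recorded from the first request."""
    entries: Dict[int, Optional[int]] = field(default_factory=dict)

    def request_access(self, address: int) -> None:
        # Store an identification of the address this core has requested.
        self.entries[address] = None

    def on_incoming_request(self, address: int, requester_core: int) -> str:
        if address not in self.entries:
            # Assumption: nothing outstanding for this address, so just respond.
            return "respond"
        recorded = self.entries[address]
        if recorded is None:
            # First request for the address: remember which core sent it.
            self.entries[address] = requester_core
            return "record"
        if requester_core != recorded:
            # A different core than the recorded one: respond without buffering.
            return "respond_without_buffering"
        # Assumption: the abstract does not say what happens for the same core.
        return "buffer"
```

For example, after `request_access(0x1000)`, a request from core 2 is recorded, and a later request from core 3 is answered without buffering because core 3 differs from the recorded core 2.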
Abstract:
A multiprocessor data processing system includes multiple vertical cache hierarchies supporting a plurality of processor cores, a system memory, and a system interconnect. In response to a load-and-reserve request from a first processor core, a first cache memory supporting the first processor core issues on the system interconnect a memory access request for a target cache line of the load-and-reserve request. Responsive to the memory access request and prior to receiving a system wide coherence response for the memory access request, the first cache memory receives from a second cache memory in a second vertical cache hierarchy by cache-to-cache intervention the target cache line and an early indication of the system wide coherence response for the memory access request. In response to the early indication and prior to receiving the system wide coherence response, the first cache memory initiates processing to update the target cache line in the first cache memory.
Abstract:
Maintaining cache coherency using conditional intervention among multiple master devices is disclosed. In one aspect, a conditional intervention circuit is configured to receive intervention responses from multiple snooping master devices. To select a snooping master device to provide intervention data, the conditional intervention circuit determines how many snooping master devices have a cache line granule size the same as or larger than that of a requesting master device. If one snooping master device has a same or larger cache line granule size, that snooping master device is selected. If more than one snooping master device has a same or larger cache line granule size, a snooping master device is selected based on an alternate criterion. The intervention responses provided by the unselected snooping master devices are canceled by the conditional intervention circuit, and intervention data from the selected snooping master device is provided to the requesting master device.
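The selection rule in this abstract can be illustrated in software. The sketch below is not the disclosed circuit: the function name `select_snooping_master`, the tuple layout, and the default alternate criterion (first qualifying responder in arrival order) are all hypothetical. The abstract also does not say what happens when no snooping master device qualifies, so the sketch assumes that nothing is selected or canceled in that case.

```python
from typing import Callable, List, Optional, Tuple

# (device id, cache line granule size in bytes); hypothetical representation
Response = Tuple[str, int]


def select_snooping_master(
    requester_granule: int,
    responses: List[Response],
    alternate: Optional[Callable[[List[Response]], Response]] = None,
) -> Tuple[Optional[str], List[str]]:
    """Return (selected device id, ids whose intervention responses are canceled)."""
    if alternate is None:
        # Assumed alternate criterion: first qualifying responder in arrival order.
        alternate = lambda candidates: candidates[0]

    # Devices whose granule size is the same as or larger than the requester's.
    qualifying = [r for r in responses if r[1] >= requester_granule]
    if not qualifying:
        # Assumption: case not covered by the abstract.
        return None, []

    selected = qualifying[0] if len(qualifying) == 1 else alternate(qualifying)
    canceled = [dev for dev, _ in responses if dev != selected[0]]
    return selected[0], canceled
```

With a 64-byte requester and responders `[("A", 32), ("B", 64), ("C", 128)]`, two devices qualify, so the alternate criterion decides between B and C, and the responses from the unselected devices are canceled.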
Abstract:
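A cache memory system is provided. The cache memory system includes a plurality of upper-level caches and a current-level cache. Each upper-level cache includes a plurality of cache lines, and the current-level cache includes an exclusive tag random access memory (Exclusive Tag RAM) and an inclusive tag random access memory (Inclusive Tag RAM). The Exclusive Tag RAM is used to preferentially store the index addresses of cache lines whose state in each upper-level cache is modified-exclusive (UD), and the Inclusive Tag RAM is used to store the index addresses of cache lines whose state in each upper-level cache is exclusive (UC), shared (SC), or modified-shared (SD). Because the cache memory system uses both the Exclusive Tag RAM and the Inclusive Tag RAM, it can reduce the capacity required to store cache line data in the cache memory system, and it can also raise the hit rate for obtaining cache line data in the cache memory system, which reduces the latency caused by reading data from main memory and improves the performance of the cache memory system.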
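The split between the two tag RAMs can be written as a small routing function. This is a minimal sketch: the function name `choose_tag_ram` and the `exclusive_has_room` flag are hypothetical, and falling back to the Inclusive Tag RAM when the Exclusive Tag RAM is full is only one possible reading of "preferentially", not something the abstract states.

```python
def choose_tag_ram(state: str, exclusive_has_room: bool = True) -> str:
    """Pick the tag RAM that records a cache line held by an upper-level cache.

    state is one of the four states named in the abstract: UD, UC, SC, SD.
    """
    if state == "UD":
        # UD lines go to the Exclusive Tag RAM first.
        # Assumption: fall back to the Inclusive Tag RAM when it has no room.
        return "exclusive" if exclusive_has_room else "inclusive"
    if state in {"UC", "SC", "SD"}:
        return "inclusive"
    raise ValueError(f"unknown cache line state: {state!r}")
```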
Abstract:
A method for accessing data in an electronic device is provided. The method includes receiving a request for the data from at least one processor by a first cache memory among a plurality of cache memories, transmitting the requested data to the at least one processor, and transmitting access-related information regarding the request to a second cache memory among the plurality of cache memories.
Abstract:
A processor includes a Level-2 (L2) cache, a first and second cluster of execution units, and a first and second data cache unit (DCU) communicatively coupled to the respective clusters of execution units and to the L2 cache. The DCUs each include a data cache and logic to receive a memory operation from an execution unit, respond to the memory operation with information from the data cache when the information is available in the data cache, and retrieve the information from the L2 cache when the information is unavailable in the data cache. The processor further includes logic to maintain contents of the data cache of the first DCU as equal to contents of the data cache of the second DCU at all clock cycles of operation of the processor.
Abstract:
Cache lines in a computing environment with transactional memory are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. When a transaction accessing a cache line in full-line coherency mode results in a transactional abort, the cache line may be placed in sub-line coherency mode if the cache line is a high-conflict cache line. The cache line may be associated with a counter in a conflict address detection table that is incremented whenever a transaction conflict is detected for the cache line. The cache line may be a high-conflict cache line when the counter satisfies a high-conflict criterion, such as reaching a threshold value. The cache line may be returned to full-line coherency mode when a reset criterion is satisfied.
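The counter-driven mode switch can be modeled as follows. This is a sketch under stated assumptions: the class and method names are hypothetical, the high-conflict criterion is taken to be the threshold example from the abstract, the caller decides when the unspecified reset criterion is met, and clearing the counter on reset is an assumption.

```python
class ConflictAddressDetectionTable:
    """Toy model of per-cache-line conflict counters and coherency modes."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold      # high-conflict criterion: counter reaches this value
        self.counters = {}              # line address -> detected conflict count
        self.sub_line = set()           # line addresses currently in sub-line mode

    def mode(self, line_addr: int) -> str:
        return "sub-line" if line_addr in self.sub_line else "full-line"

    def on_transaction_conflict(self, line_addr: int) -> None:
        # Incremented whenever a transaction conflict is detected for the line.
        self.counters[line_addr] = self.counters.get(line_addr, 0) + 1

    def on_transactional_abort(self, line_addr: int) -> str:
        # On an abort, switch a high-conflict line to sub-line coherency mode.
        if self.counters.get(line_addr, 0) >= self.threshold:
            self.sub_line.add(line_addr)
        return self.mode(line_addr)

    def on_reset(self, line_addr: int) -> None:
        # Called once the reset criterion is satisfied (criterion itself not modeled).
        self.sub_line.discard(line_addr)
        self.counters.pop(line_addr, None)  # assumption: the counter is cleared too
```

With `threshold=2`, a line stays in full-line mode after its first conflict and abort, moves to sub-line mode after the second, and returns to full-line mode when `on_reset` is called.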
Abstract:
A low latency cache intervention mechanism implements a snoop filter to dynamically select an intervener cache for a cache "hit" in a multiprocessor architecture of a computer system. The selection of the intervener is based on variables such as latency, topology, frequency, utilization, load, wear balance, and/or power state of the computer system.
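One way to picture the dynamic selection is a weighted cost over a few of the listed variables. The sketch below is purely illustrative: the `Candidate` fields, the weights, the linear cost, and the rule of preferring powered-on caches are all assumptions, and only latency, utilization, and power state are modeled out of the variables the abstract lists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    cache_id: str
    latency_ns: float     # estimated latency to deliver the line to the requester
    utilization: float    # 0.0 (idle) to 1.0 (saturated)
    powered_on: bool      # simplified power state


def pick_intervener(
    candidates: List[Candidate],
    w_latency: float = 1.0,
    w_util: float = 50.0,
) -> Optional[str]:
    """Return the id of the cache with the lowest assumed cost, or None if no candidates."""
    # Assumption: prefer caches that are powered on; use the rest only if none are.
    awake = [c for c in candidates if c.powered_on]
    pool = awake or list(candidates)
    if not pool:
        return None
    best = min(pool, key=lambda c: w_latency * c.latency_ns + w_util * c.utilization)
    return best.cache_id
```

Changing the weights changes the outcome: with the defaults, a nearby but heavily loaded cache can lose to a slightly more distant idle one, while setting `w_util=0.0` reduces the choice to lowest latency among powered-on caches.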