Abstract:
A method and apparatus for handling low power and high performance loads is herein described. Software, such as a compiler, is utilized to identify producer loads, consumer reuse loads, and consumer forwarded loads. Based on the identification by software, hardware is able to direct performance of the load directly to a load value buffer, a store buffer, or a data cache. As a result, accesses to cache are reduced, through direct loading from load and store buffers, without sacrificing load performance.
Abstract:
Techniques to control power and processing among a plurality of asymmetric cores. In one embodiment, one or more asymmetric cores are power managed to migrate processes or threads among a plurality of cores according to the performance and power needs of the system
Abstract:
Various embodiments of the invention concern methods and apparatuses for power and time efficient load handling. A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads.
Abstract:
An arrangement is provided for compressing microcode ROM (“uROM”) in a processor and for efficiently accessing a compressed “uROM”. A clustering-based approach may be used to effectively compress a uROM. The approach groups similar columns of microcode into different clusters and identifies unique patterns within each cluster. Only unique patterns identified in each cluster are stored in a pattern storage. Indices, which help map an address of a microcode word (“uOP”) to be fetched from a uROM to unique patterns required for the uOP, may be stored in an index storage. Typically it takes a longer time to fetch a uOP from a compressed uROM than from an uncompressed uROM. The compressed uROM may be so designed that the process of fetching a uOP (or uOPs) from a compressed uROM may be fully-pipelined to reduce the access latency.
Abstract:
Techniques to control power and processing among a plurality of asymmetric cores. In one embodiment, one or more asymmetric cores are power managed to migrate processes or threads among a plurality of cores according to the performance and power needs of the system.
Abstract:
A method for resolving data request conflicts in a cache coherency protocol for multiple caching agents using requester-generated data forwards. In one embodiment, a caching agent stores information used to auto-generate a forward of data received in response to a data request.
Abstract:
In one embodiment, the present invention includes a method for receiving a first portion of a first packet at a first agent and determining whether the first portion is an interleaved portion based on a value of an interleave indicator. The interleave indicator may be sent as part of the first portion. In such manner, interleaved packets may be sent within transmission of another packet, such as a lengthy data packet, providing improved processing capabilities. Other embodiments are described and claimed.
Abstract:
Architectures and techniques that allow legacy pin functionality to be replaced with a “virtual wire” that may communicate information that would otherwise be communicated by a wired interface. A message may be passed between a system controller and a processor that includes a virtual wire value and a virtual wire change indicator. The virtual wire value may include a signal corresponding to one or more pins that have been eliminated from the physical interface and the virtual wire change value may include an indication of whether the virtual wire value has changed. The combination of the virtual wire value and the virtual wire change indicator may allow multiple physical pins to be replaced by message values.
Abstract:
A memory controller is described that comprises a compression map cache. The compression map cache is to store information that identifies a cache line's worth of information that has been compressed with another cache line's worth of information. A processor and a memory controller integrated on a same semiconductor die is also described. The memory controller comprises a compression map cache. The compression map cache is to store information that identifies a cache line's worth of information that has been compressed with another cache line's worth of information.