Abstract:
In one embodiment, a processor comprises a data cache configured to store a plurality of cache blocks and a control unit coupled to the data cache. The control unit is configured to flush the plurality of cache blocks from the data cache responsive to an indication that the processor is to transition to a low power state in which one or more clocks for the processor are inhibited.
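By way of illustration only, the flush-then-sleep sequence described above can be sketched in C as follows; the cache_block_t layout and the writeback_block, flush_data_cache, and enter_low_power names are hypothetical, not the claimed circuit.

    #include <stdbool.h>
    #include <stddef.h>

    #define NUM_BLOCKS 512

    typedef struct {
        bool          valid;
        bool          dirty;
        unsigned      tag;
        unsigned char data[64];
    } cache_block_t;

    static cache_block_t data_cache[NUM_BLOCKS];

    /* Hypothetical hook that writes one dirty block back to memory. */
    static void writeback_block(cache_block_t *blk) { (void)blk; }

    /* Flush every valid block so no dirty state is lost while the
     * clocks are inhibited. */
    static void flush_data_cache(void)
    {
        for (size_t i = 0; i < NUM_BLOCKS; i++) {
            if (data_cache[i].valid) {
                if (data_cache[i].dirty)
                    writeback_block(&data_cache[i]);
                data_cache[i].valid = false;
            }
        }
    }

    /* Control-unit behavior: flush first, then let the clocks stop. */
    void enter_low_power(void)
    {
        flush_data_cache();
        /* ... signal the clock-gating logic that the transition may proceed ... */
    }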
Abstract:
In one embodiment, a switch is configured to be coupled to an interconnect. The switch comprises a plurality of storage locations and an arbiter control circuit coupled to the plurality of storage locations. The plurality of storage locations are configured to store a plurality of requests transmitted by a plurality of agents. The arbiter control circuit is configured to arbitrate among the plurality of requests stored in the plurality of storage locations. A selected request is the winner of the arbitration, and the switch is configured to transmit the selected request from one of the plurality of storage locations onto the interconnect. In another embodiment, a system comprises a plurality of agents, an interconnect, and the switch coupled to the plurality of agents and the interconnect. In another embodiment, a method is contemplated.
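The abstract does not fix an arbitration policy, so the sketch below assumes a simple round-robin arbiter over the storage locations; the slot layout and names are invented for illustration.

    #include <stdbool.h>

    #define NUM_SLOTS 8

    typedef struct {
        bool valid;      /* slot holds a pending request        */
        int  agent_id;   /* which agent transmitted the request */
        /* ... request payload elided ... */
    } request_slot_t;

    static request_slot_t slots[NUM_SLOTS];
    static int last_winner = -1;   /* rotates priority for fairness */

    /* Round-robin arbitration: scan starting after the previous winner
     * and pick the first occupied slot; -1 means nothing is pending. */
    int arbitrate(void)
    {
        for (int i = 1; i <= NUM_SLOTS; i++) {
            int idx = (last_winner + i) % NUM_SLOTS;
            if (slots[idx].valid) {
                last_winner = idx;
                return idx;
            }
        }
        return -1;
    }

    /* Transmit the winning request onto the interconnect, freeing its slot. */
    void switch_cycle(void)
    {
        int win = arbitrate();
        if (win >= 0) {
            /* ... drive slots[win] onto the interconnect ... */
            slots[win].valid = false;
        }
    }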
Abstract:
A processor supports a processing mode in which the address size is greater than 32 bits and the operand size may be 32 or 64 bits. The address size may be nominally indicated as 64 bits, although various embodiments of the processor may implement any address size which exceeds 32 bits, up to and including 64 bits, in the processing mode. The processing mode may be established by placing an enable indication in a control register into an enabled state and by setting a first operating mode indication and a second operating mode indication in a segment descriptor to predefined states. Other combinations of the first operating mode indication and the second operating mode indication may be used to provide compatibility modes for 32 bit and 16 bit processing compatible with the x86 processor architecture (with the enable indication remaining in the enabled state).
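The mode-selection logic lends itself to a small decision function. The sketch below mirrors the combinations the abstract names (the enable indication plus the two descriptor indications); the exact bit encodings and the resolve_mode name are assumptions for illustration.

    typedef enum {
        MODE_64BIT,       /* >32-bit addresses, 32- or 64-bit operands */
        MODE_COMPAT_32,   /* 32-bit compatibility mode                 */
        MODE_COMPAT_16,   /* 16-bit compatibility mode                 */
        MODE_LEGACY       /* enable cleared: standard x86 modes        */
    } op_mode_t;

    /* enable : the enable indication from the control register
     * desc_l : first operating mode indication (segment descriptor)
     * desc_d : second operating mode indication (segment descriptor) */
    op_mode_t resolve_mode(int enable, int desc_l, int desc_d)
    {
        if (!enable)
            return MODE_LEGACY;
        if (desc_l)
            return MODE_64BIT;
        return desc_d ? MODE_COMPAT_32 : MODE_COMPAT_16;
    }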
Abstract:
A high speed bus system for use in a shared memory system allows for high speed transmission of commands and data between a number of processors and the memory array of a multi-processor, shared memory system. The high speed bus system includes a central unit and a series of uni-directional buses that connect the plurality of processors to the shared memory. The central unit includes arbitration logic and a series of multiplexers, scheduling logic that works with the arbitration logic and multiplexers to determine which CPUs are granted access to the shared buses, and port logic for combining the CPU transmissions and determining whether such transmissions are valid.
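As an illustration only, the central unit's grant decision might be modeled as below, assuming fixed-priority arbitration and a parity check as the port logic's validity test; the patent does not fix either choice, and all names are hypothetical.

    #include <stdbool.h>
    #include <stdint.h>

    #define NUM_CPUS 4

    typedef struct {
        bool     requesting;   /* CPU asserts a request on its uni-directional bus */
        uint64_t payload;      /* command/data word from the CPU                   */
        uint8_t  parity;       /* per-transmission check bit (assumed scheme)      */
    } cpu_port_t;

    static cpu_port_t ports[NUM_CPUS];

    /* Port logic: treat a transmission as valid if its parity check passes. */
    static bool transmission_valid(const cpu_port_t *p)
    {
        uint64_t v = p->payload;
        uint8_t  par = 0;
        while (v) { par ^= (uint8_t)(v & 1); v >>= 1; }
        return par == p->parity;
    }

    /* Arbitration + scheduling: grant the shared bus to the lowest-numbered
     * requesting CPU whose transmission is valid; the multiplexers would
     * then select that CPU's uni-directional bus. */
    int grant_shared_bus(void)
    {
        for (int cpu = 0; cpu < NUM_CPUS; cpu++)
            if (ports[cpu].requesting && transmission_valid(&ports[cpu]))
                return cpu;
        return -1;   /* no grant this cycle */
    }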
Abstract:
A computer system employs virtual channels and allocates different resources to the virtual channels. Packets which do not have logical/protocol-related conflicts are grouped into a virtual channel. Accordingly, logical conflicts occur between packets in separate virtual channels. The packets within a virtual channel may share resources (and hence experience resource conflicts), but the packets within different virtual channels may not share resources. Since packets which may experience resource conflicts do not experience logical conflicts, and since packets which may experience logical conflicts do not experience resource conflicts, deadlock-free operation may be achieved. Additionally, nodes within the computer system may be configured to preallocate resources to process response packets. Some response packets may have logical conflicts with other response packets, and hence would normally not be allocable to the same virtual channel. However, by preallocating response-processing resources, response packets are accepted by the destination node. Thus, any resource conflicts which may occur are temporary (as the response packets which make forward progress are processable). Viewed in another way, response packets may be logically independent if the destination node is capable of processing the response packets upon receipt. Accordingly, a response virtual channel is formed to which each response packet belongs.
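A minimal C sketch of the buffering discipline, assuming just two virtual channels and one buffer per channel; the key points are that channels never share storage and that response-processing space is reserved before a request leaves the node. Layout and names are illustrative.

    #include <stdbool.h>

    enum vc { VC_COMMAND, VC_RESPONSE, NUM_VCS };

    #define BUF_DEPTH 4

    typedef struct {
        int count;
        int pkts[BUF_DEPTH];   /* packet ids; payload and dequeue elided */
    } vc_buffer_t;

    /* One buffer per virtual channel: a full command buffer can never
     * block a response from landing, and vice versa. */
    static vc_buffer_t buffers[NUM_VCS];
    static int response_reservations;   /* preallocated response slots */

    bool try_accept(enum vc channel, int pkt)
    {
        vc_buffer_t *b = &buffers[channel];
        if (b->count == BUF_DEPTH)
            return false;   /* resource conflict stays within one channel */
        b->pkts[b->count++] = pkt;
        return true;
    }

    /* Preallocation: reserve space to process the eventual response
     * before the request is issued, so every arriving response can be
     * consumed and forward progress is guaranteed. */
    bool issue_request(int pkt)
    {
        if (response_reservations == BUF_DEPTH)
            return false;
        if (!try_accept(VC_COMMAND, pkt))
            return false;
        response_reservations++;
        return true;
    }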
Abstract:
A memory controller provides programmable flexibility, via one or more configuration registers, for the configuration of the memory. The memory may be optimized for a given application by programming the configuration registers. For example, in one embodiment, the portion of the address of a memory transaction used to select a storage location for access in response to the memory transaction may be programmable. In an implementation designed for DRAM, a first portion may be programmably selected to form the row address and a second portion may be programmably selected to form the column address. Additional embodiments may further include programmable selection of the portion of the address used to select a bank. Still further, interleave modes among memory sections assigned to different chip selects and among two or more channels to memory may be programmable, in some implementations. Furthermore, the portion of the address used to select between interleaved memory sections or interleaved channels may be programmable. One particular implementation may include all of the above programmable features, which may provide a high degree of flexibility in optimizing the memory system.
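A minimal C sketch of the programmable decode, modeling each configuration register field as a shift/width pair over the transaction address; real registers could encode the mapping differently, and all names here are assumptions.

    #include <stdint.h>

    /* Hypothetical configuration-register image: each pair names the bit
     * range of the address used for one DRAM address component. */
    typedef struct {
        unsigned row_shift,  row_bits;
        unsigned col_shift,  col_bits;
        unsigned bank_shift, bank_bits;
    } mem_config_t;

    static uint32_t field(uint64_t addr, unsigned shift, unsigned bits)
    {
        return (uint32_t)((addr >> shift) & ((1u << bits) - 1));   /* bits < 32 */
    }

    /* Decode one transaction address using whatever bit ranges software
     * programmed, rather than a hard-wired mapping. */
    void decode_address(const mem_config_t *cfg, uint64_t addr,
                        uint32_t *row, uint32_t *col, uint32_t *bank)
    {
        *row  = field(addr, cfg->row_shift,  cfg->row_bits);
        *col  = field(addr, cfg->col_shift,  cfg->col_bits);
        *bank = field(addr, cfg->bank_shift, cfg->bank_bits);
    }

Interleaving between chip selects or channels would decode one more programmable field the same way and use it to steer the transaction.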
Abstract:
A computer system is presented which implements a system and method for tracking the progress of posted write transactions. In one embodiment, the computer system includes a processing subsystem and an input/output (I/O) subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor preferably executing software instructions. The I/O subsystem includes one or more I/O nodes. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). The multiple processing nodes may include a first processing node and a second processing node, wherein the first processing node includes a host bridge, and wherein a memory is coupled to the second processing node. An I/O node may generate a non-coherent write transaction to store data within the second processing node's memory, wherein the non-coherent write transaction is a posted write transaction. The I/O node may dispatch the non-coherent write transaction directed to the host bridge. The host bridge may respond to the non-coherent write transaction by translating the non-coherent write transaction to a coherent write transaction, and dispatching the coherent write transaction to the second processing node. The second processing node may respond to the coherent write transaction by dispatching a target done response directed to the host bridge.
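As a sketch only, the host bridge's tracking can be reduced to a counter of translated posted writes that drains as target done responses return; the function names are hypothetical and the translation details are elided.

    #include <stdint.h>

    /* Posted writes the host bridge has translated to coherent writes but
     * for which no target done response has yet arrived. */
    static unsigned outstanding_posted;

    static void send_coherent_write(uint64_t addr, uint64_t data)
    { (void)addr; (void)data; /* ... onto the coherent links ... */ }

    /* Non-coherent posted write arrives from an I/O node: translate it
     * and forward it toward the processing node that owns the memory. */
    void host_bridge_posted_write(uint64_t addr, uint64_t data)
    {
        outstanding_posted++;
        send_coherent_write(addr, data);
    }

    /* Target done response from the second processing node: one more
     * posted write has reached its destination. */
    void host_bridge_target_done(void)
    {
        if (outstanding_posted > 0)
            outstanding_posted--;
    }

    /* Ordering point: operations that must wait for posted writes to
     * complete can poll this predicate. */
    int posted_writes_drained(void)
    {
        return outstanding_posted == 0;
    }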
Abstract:
A processor employs a store to load forward (STLF) predictor which may indicate, for dispatching loads, a dependency on a store. The dependency is indicated for a store which, during a previous execution, interfered with the execution of the load. Since a dependency is indicated on the store, the load is prevented from scheduling and/or executing prior to the store. The STLF predictor is trained with information for a particular load and store in response to executing the load and store and detecting the interference. Additionally, the STLF predictor may be untrained (e.g. information for a particular load and store may be deleted) if a load is indicated by the STLF predictor as dependent upon a particular store and the dependency does not actually occur. In one implementation, the STLF predictor records at least a portion of the PC of a store which interferes with the load in a first table indexed by the load PC. A second table maintains a corresponding portion of the store PCs of recently dispatched stores, along with tags identifying the recently dispatched stores. In another implementation, the STLF predictor records a difference between the tags assigned to a load and a store which interferes with the load in a first table indexed by the load PC. The PC of the dispatching load is used to select a difference from the table, and the difference is added to the tag assigned to the load.
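A minimal C sketch of the first table from the first implementation: a direct-mapped structure indexed by load PC holding a partial store PC, with train and untrain operations. Table size, partial-PC width, and names are assumptions; the second table of recently dispatched stores is only noted in a comment.

    #include <stdbool.h>
    #include <stdint.h>

    #define TABLE_SIZE   256      /* must be a power of two */
    #define PARTIAL_BITS 12

    typedef struct {
        bool     valid;
        uint16_t store_pc_lo;     /* partial PC of the interfering store */
    } stlf_entry_t;

    static stlf_entry_t stlf_table[TABLE_SIZE];

    static unsigned idx(uint64_t pc)     { return (unsigned)(pc & (TABLE_SIZE - 1)); }
    static uint16_t partial(uint64_t pc) { return (uint16_t)(pc & ((1u << PARTIAL_BITS) - 1)); }

    /* Train: a store interfered with this load during a prior execution. */
    void stlf_train(uint64_t load_pc, uint64_t store_pc)
    {
        stlf_table[idx(load_pc)] = (stlf_entry_t){ true, partial(store_pc) };
    }

    /* Untrain: the predicted dependency did not actually occur. */
    void stlf_untrain(uint64_t load_pc)
    {
        stlf_table[idx(load_pc)].valid = false;
    }

    /* At dispatch: does this load carry a predicted store dependency?
     * A full implementation would match *store_pc_lo against the second
     * table of recently dispatched stores to recover the store's tag. */
    bool stlf_predict(uint64_t load_pc, uint16_t *store_pc_lo)
    {
        stlf_entry_t e = stlf_table[idx(load_pc)];
        if (e.valid)
            *store_pc_lo = e.store_pc_lo;
        return e.valid;
    }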
Abstract:
A cache is coupled to receive an input address and a corresponding way prediction. The cache provides output bytes in response to the predicted way (instead of performing tag comparisons to select the output bytes). Furthermore, a tag may be read from the predicted way and only partial tags are read from the non-predicted ways. The tag is compared to the tag portion of the input address, and the partial tags are compared to a corresponding partial tag portion of the input address. If the tag matches the tag portion of the input address, a hit in the predicted way is detected and the bytes provided in response to the predicted way are correct. If the tag does not match the tag portion of the input address, a miss in the predicted way is detected. If none of the partial tags match the corresponding partial tag portion of the input address, a miss in the cache is determined. On the other hand, if one or more of the partial tags match the corresponding partial tag portion of the input address, the cache searches the corresponding ways to determine whether the input address hits or misses in the cache.
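A minimal C sketch of the lookup outcome logic: a full tag compare on the predicted way and partial compares on the others. Way count and partial-tag width are illustrative assumptions, and the data arrays are elided.

    #include <stdint.h>

    #define NUM_WAYS     4
    #define PARTIAL_BITS 8

    typedef struct {
        uint32_t tag[NUM_WAYS];   /* full tags, one per way */
        /* ... data arrays elided ... */
    } cache_set_t;

    enum lookup_result { HIT_PREDICTED, MISS, SEARCH_OTHER_WAYS };

    /* Full compare on the predicted way; partial compares elsewhere. */
    enum lookup_result lookup(const cache_set_t *set, uint32_t addr_tag,
                              int predicted_way)
    {
        const uint32_t pmask = (1u << PARTIAL_BITS) - 1;

        if (set->tag[predicted_way] == addr_tag)
            return HIT_PREDICTED;        /* bytes already supplied are correct */

        /* Predicted way missed: consult the partial tags of the other ways. */
        for (int w = 0; w < NUM_WAYS; w++) {
            if (w == predicted_way)
                continue;
            if ((set->tag[w] & pmask) == (addr_tag & pmask))
                return SEARCH_OTHER_WAYS;   /* possible hit: full compare needed */
        }
        return MISS;   /* no partial tag matched: definite cache miss */
    }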