摘要:
A multi-dimension engine, connected to a system TLB, generates sequences of addresses to request page address translation prefetch requests in advance of predictable accesses to elements within data arrays. Prefetch requests are filtered to avoid redundant requests of translations to the same page. Prefetch requests run ahead of data accesses but are tethered to within a reasonable range. The number of pending prefetches are limited. A system TLB stores a number of translations, the number being relative to the dimensions of the range of elements accessed from within the data array.
摘要:
System TLBs are integrated within an interconnect, use a and share a transport network to connect to a shared walker port. Transactions are able to pass STLB allocation information through a second initiator side interconnect, in a way that interconnects can be cascaded, so as to allow initiators to control a shared STLB within the first interconnect. Within the first interconnect, multiple STLBs share an intermediate-level translation cache that improves performance when there is locality between requests to the two STLBs.
摘要:
A multiprocessor switching device substantially implemented on a single CMOS integrated circuit is described in connection with a parallel routing scheme for calculating routing information for incoming packets. Using the programmable hash and route routing scheme, a hash and route circuit can be programmed for a variety of applications, such as routing, flow-splitting or load balancing.
摘要:
Power conservation via DRAM access reduction is provided by a buffer/mini-cache selectively operable in a normal mode and a buffer mode. In the buffer mode, entered when CPUs begin operating in low-power states, non-cacheable accesses (such as generated by a DMA device) matching specified physical address ranges are processed by the buffer/mini-cache, instead of by a memory controller and DRAM. The buffer/mini-cache processing includes allocating lines when references miss, and returning cached data from the buffer/mini-cache when references hit. Lines are replaced in the buffer/mini-cache according to one of a plurality of replacement policies, including ceasing replacement when there are no available free lines. In the normal mode, entered when CPUs begin operating in high-power states, the buffer/mini-cache operates akin to a conventional cache and non-cacheable accesses are not processed therein. In one usage scenario, data retained in the buffer/mini-cache is graphics refresh data maintained in a compressed format metal layer on the second electrically
摘要:
A method for receiving data from a plurality of virtual channels begins by storing a stream of data as a plurality of data segments, wherein the stream of data includes multiplexed data fragments from at least one of the plurality of virtual channels, and wherein a data segment of the plurality of data segments corresponds to one of the multiplexed data fragments. The method continues by decoding at least one of the plurality of data segments in accordance with one of a plurality of data transmission protocols to produce at least one decoded data segment. The method continues by storing the at least one decoded data segment, in a generic format, to reassemble at least a portion of a packet provided by the at least one of the plurality of virtual channels. The method continues by routing the at least one decoded data segment as at least part of the reassembled packet to one of a plurality of destinations in accordance with the at least one of the plurality of virtual channels.
摘要:
Power conservation via DRAM access reduction is provided by a buffer/mini-cache selectively operable in a normal mode and a buffer mode. In the buffer mode, entered when CPUs begin operating in low-power states, non-cacheable accesses (such as generated by a DMA device) matching specified physical address ranges are processed by the buffer/mini-cache, instead of by a memory controller and DRAM. The buffer/mini-cache processing includes allocating lines when references miss, and returning cached data from the buffer/mini-cache when references hit. Lines are replaced in the buffer/mini-cache according to one of a plurality of replacement policies, including ceasing replacement when there are no available free lines. In the normal mode, entered when CPUs begin operating in high-power states, the buffer/mini-cache operates akin to a conventional cache and non-cacheable accesses are not processed therein. In one usage scenario, data retained in the buffer/mini-cache is graphics refresh data maintained in a compressed format.
摘要:
An apparatus includes one or more interface circuits, an interconnect, a memory controller, a memory bridge, a packet DMA circuit, and a switch. The memory controller, the memory bridge, and the packet DMA circuit are coupled to the interconnect. Each interface circuit is coupled to a respective interface to receive packets and/or coherency commands from the interface. The switch is coupled to the interface circuits, the memory bridge, and the packet DMA circuit. The switch is configured to route the coherency commands from the interface circuits to the memory bridge and the packets from the interface circuits to the packet DMA circuit. The memory bridge is configured to initiate corresponding transactions on the interconnect in response to at least some of the coherency commands. The packet DMA circuit is configured to transmit write transactions on the interconnect to the memory controller to store the packets in memory.
摘要:
A node comprises at least one agent and an input/output (I/O) circuit coupled to an interconnect within the node. The I/O circuit is configured to communicate on a global interconnect to which one or more other nodes are coupled during use. Addresses transmitted on the interconnect are in a first local address space of the node, and addresses transmitted on the global interconnect are in a global address space. The first local address space includes at least a first region used to address at least a first resource of the node. The node is programmable, during use, to relocate the first region within the first local address space, whereby a same numerical value in the first local address space and a second local address space corresponding to one of the other nodes coupled to the global interconnect refers to the first resource in the node during use.
摘要翻译:节点包括耦合到节点内的互连的至少一个代理和输入/输出(I / O)电路。 I / O电路被配置为在使用期间一个或多个其他节点耦合到的全局互连上进行通信。 在互连上发送的地址在节点的第一本地地址空间中,并且在全局互连上发送的地址在全局地址空间中。 第一本地地址空间至少包括用于寻址节点的至少第一资源的第一区域。 该节点在使用期间是可编程的,以重新定位第一本地地址空间内的第一区域,由此第一本地地址空间中的相同数值和对应于耦合到全局互连的其他节点之一的第二本地地址空间 到使用中的节点的第一个资源。
摘要:
A multiprocessor switching device substantially implemented on a single CMOS integrated circuit is described in connection with a parallel routing scheme for calculating routing information for incoming packets. Using the programmable hash and route routing scheme, a hash and route circuit can be programmed for a variety of applications, such as routing, flow-splitting or load balancing.
摘要:
A system TLB accepts translation prefetch requests from initiators. Misses generate external translation requests to a walker port. Attributes of the request such as ID, address, and class, as well as the state of the TLB affect the allocation policy of translations within multiple levels of translation tables. Translation tables are implemented with SRAM, and organized in groups.