Abstract:
A method and a cache control circuit for replacing a cache line using an alternate pseudo least-recently-used (PLRU) algorithm with a victim cache coherency state, and a design structure on which the subject cache control circuit resides are provided. When a requirement for replacement in a congruence class is identified, a first PLRU cache line for replacement and an alternate PLRU cache line for replacement in the congruence class are calculated. When the first PLRU cache line for replacement is in the victim cache coherency state, the alternate PLRU cache line is selected for use.
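The selection step above can be modeled in a few lines. Below is a minimal sketch in Python, assuming a 4-way congruence class managed by a 3-bit tree-PLRU and a hypothetical VICTIM coherency state; the names (State, plru_victims, select_way) are illustrative and not taken from the abstract.

from enum import Enum

class State(Enum):
    INVALID = 0
    SHARED = 1
    MODIFIED = 2
    VICTIM = 3        # line already staged for the victim cache; avoid re-selecting it

def plru_victims(bits):
    # bits = (b0, b1, b2) for a 4-way tree-PLRU; each bit points toward the
    # less-recently-used half of its subtree (0 = left, 1 = right).
    b0, b1, b2 = bits
    if b0 == 0:
        first = 0 if b1 == 0 else 1
        alternate = 1 - first            # sibling way in the same pair
    else:
        first = 2 if b2 == 0 else 3
        alternate = 5 - first            # sibling way (2 <-> 3)
    return first, alternate

def select_way(bits, states):
    first, alternate = plru_victims(bits)
    # Override the normal PLRU choice when the first candidate is in the
    # victim cache coherency state.
    return alternate if states[first] == State.VICTIM else first

# Way 1 is the PLRU choice but holds the VICTIM state, so way 0 is used instead.
print(select_way((0, 1, 0), [State.SHARED, State.VICTIM, State.MODIFIED, State.INVALID]))
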
Abstract:
Methods and apparatus for tracking dependencies of commands to be executed by a command processor are provided. By determining the dependency of incoming commands against all commands awaiting execution, dependency information can be stored in a dependency scoreboard. Such a dependency scoreboard may be used to determine if a command is ready to be issued by the command processor. The dependency scoreboard can also be updated with information relating to the issuance of commands, for example, as commands complete.
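As a rough illustration of how such a scoreboard might behave, the sketch below keeps one dependency bitmask per command slot, fills it by comparing an incoming command's address against all commands awaiting execution, and clears the completing command's column so dependents become ready. The slot layout, the address-match rule, and all names are assumptions, not details from the abstract.

class DependencyScoreboard:
    def __init__(self, slots=8):
        self.slots = slots
        self.dep = [0] * slots        # dep[i]: bitmask of slots command i waits on
        self.addr = [None] * slots    # address held by each occupied slot

    def enqueue(self, slot, address):
        mask = 0
        for j in range(self.slots):
            if j != slot and self.addr[j] is not None and self.addr[j] == address:
                mask |= 1 << j        # depend on every matching command already pending
        self.addr[slot] = address
        self.dep[slot] = mask

    def ready(self, slot):
        # A command may issue once everything it depends on has completed.
        return self.addr[slot] is not None and self.dep[slot] == 0

    def complete(self, slot):
        self.addr[slot] = None
        for j in range(self.slots):
            self.dep[j] &= ~(1 << slot)   # update dependents as the command completes

sb = DependencyScoreboard()
sb.enqueue(0, 0x1000)
sb.enqueue(1, 0x1000)         # same address, so slot 1 waits on slot 0
print(sb.ready(1))            # False
sb.complete(0)
print(sb.ready(1))            # True
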
Abstract:
A method handles concurrent address translation cache misses and hits under those misses while maintaining command order based upon virtual channel. Commands are stored in a command processing unit that maintains ordering of the commands. A command buffer index (CBI) is assigned to each address being sent from the command processing unit to an address translation unit. When an address translation cache miss occurs, a memory fetch request is sent. The CBI is passed back to the command processing unit with a signal to indicate that the fetch request has completed. The command processing unit uses the CBI to locate the command and address to be reissued to the address translation unit.
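A behavioral sketch of that handshake is given below; the page-granular TLB, the class and method names, and the print standing in for reissue are illustrative assumptions layered over the flow in the abstract.

class CommandProcessingUnit:
    def __init__(self):
        self.buffer = {}              # CBI -> (virtual_channel, address, command)
        self.next_cbi = 0

    def send(self, vc, address, command, xlate):
        cbi = self.next_cbi           # assign a command buffer index to this address
        self.next_cbi += 1
        self.buffer[cbi] = (vc, address, command)
        xlate.translate(cbi, address, self)

    def fetch_complete(self, cbi):
        # The CBI locates the command and address to reissue to the translation unit.
        vc, address, command = self.buffer[cbi]
        print(f"reissue {command} (VC{vc}) for address {hex(address)}")

class AddressTranslationUnit:
    def __init__(self, tlb):
        self.tlb = dict(tlb)          # page -> frame entries already cached

    def translate(self, cbi, address, cpu):
        page = address >> 12
        if page in self.tlb:
            print(f"hit: {hex(address)} -> frame {self.tlb[page]}")
        else:
            self.tlb[page] = len(self.tlb)    # stand-in for the memory fetch of the entry
            cpu.fetch_complete(cbi)           # pass the CBI back with the completion signal

cpu = CommandProcessingUnit()
atu = AddressTranslationUnit({0x1: 7})
cpu.send(0, 0x1234, "read", atu)      # translation hit
cpu.send(1, 0x5678, "write", atu)     # miss; the command is later reissued via its CBI
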
Abstract:
In a first aspect, a first method of issuing a command on a bus of a system is provided. The first method includes the steps of (1) receiving a first functional memory command in the system; (2) receiving a command to force the system to execute functional memory commands in order; (3) receiving a second functional memory command in the system; and (4) employing a dependency matrix to indicate that the second functional memory command requires access to the same address as the first functional memory command, whether or not the second functional memory command actually has an ordering dependency on the first functional memory command. The dependency matrix is adapted to store data indicating whether a functional memory command received by the system has an ordering dependency on one or more functional memory commands previously received by the system. Numerous other aspects are provided.
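The forced-ordering behavior can be sketched by adding a mode flag to an address-match dependency matrix: once the ordering command arrives, every new functional memory command is recorded as dependent on all outstanding commands, whether or not their addresses actually conflict. The flag name, slot layout, and method names are assumptions made for illustration.

class DependencyMatrix:
    def __init__(self, slots=8):
        self.rows = [0] * slots       # rows[i]: bitmask of older commands slot i waits on
        self.addr = [None] * slots
        self.force_order = False

    def force_in_order(self):
        self.force_order = True       # the "execute functional memory commands in order" command

    def add(self, slot, address):
        mask = 0
        for j in range(len(self.rows)):
            if j == slot or self.addr[j] is None:
                continue
            if self.force_order or self.addr[j] == address:
                mask |= 1 << j        # recorded as a same-address dependency either way
        self.addr[slot] = address
        self.rows[slot] = mask

m = DependencyMatrix()
m.add(0, 0x100)
m.force_in_order()
m.add(1, 0x200)                       # different address, still made dependent on slot 0
print(bin(m.rows[1]))                 # 0b1
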
Abstract:
In a first aspect, a first method of issuing a command on a bus is provided. The first method includes the steps of (1) receiving a first command associated with a first address; (2) delaying the issue of the first command on the bus for a time period; (3) if a second command associated with a second address contiguous with the first address is not received before the time period elapses, issuing the first command on the bus after the time period elapses; and (4) if the second command associated with the second address contiguous with the first address is received before the first command is issued on the bus, combining the first and second commands into a combined command associated with the first address. Numerous other aspects are provided.
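A small model of the delay-and-combine window is sketched below; the cycle-based aging, the (base, length) representation, and the idea that combining simply widens the held command are illustrative assumptions.

class CombineWindow:
    def __init__(self, delay_cycles=4):
        self.delay = delay_cycles
        self.pending = None                    # (base_address, length, age_in_cycles)

    def receive(self, address, length):
        if self.pending:
            base, size, age = self.pending
            if address == base + size:         # contiguous with the held command
                self.pending = (base, size + length, age)
                return None                    # combined; nothing issues yet
            issued = (base, size)              # not contiguous: issue the held command
            self.pending = (address, length, 0)
            return issued
        self.pending = (address, length, 0)    # hold the first command for the window
        return None

    def tick(self):
        if not self.pending:
            return None
        base, size, age = self.pending
        if age + 1 >= self.delay:              # window elapsed; issue what has accumulated
            self.pending = None
            return (base, size)
        self.pending = (base, size, age + 1)
        return None

w = CombineWindow()
print(w.receive(0x100, 64))               # None (held)
print(w.receive(0x140, 64))               # None (combined into one 128-byte command)
print([w.tick() for _ in range(4)][-1])   # (256, 128) issued after the window elapses
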
Abstract:
A first first-in-first-out (FIFO) memory may receive first processor input from a first processor group that includes a first processor. The first processor group is configured to execute program code based on the first processor input that includes a set of input signals, a clock signal, and corresponding data. The first FIFO may store the first processor input and may output the first processor input to a second FIFO memory and to a second processor according to a first delay. The second FIFO memory may store the first processor input and may output the first processor input to a third processor according to a second delay. The second processor may execute at least a first portion of the program code and the third processor may execute at least a second portion of the program code responsive to the first processor input.
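The cascaded delays can be modeled with two fixed-depth FIFOs, as in the sketch below; the depths, the per-cycle push, and the None placeholder for not-yet-valid entries are assumptions made for this model.

from collections import deque

class DelayFifo:
    def __init__(self, depth):
        self.q = deque([None] * depth, maxlen=depth)

    def push(self, item):
        oldest = self.q[0]            # entry that falls out after 'depth' pushes
        self.q.append(item)
        return oldest

first_fifo = DelayFifo(3)             # first delay, feeding the second processor
second_fifo = DelayFifo(5)            # second delay, feeding the third processor

def cycle(first_processor_input):
    to_second_processor = first_fifo.push(first_processor_input)
    to_third_processor = second_fifo.push(to_second_processor)
    return to_second_processor, to_third_processor

for n in range(10):
    print(n, cycle(("signals", n)))   # the same input reappears 3 and then 8 cycles later
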
Abstract:
In a first aspect, a pipelined hardware implementation of a neural network circuit includes an input stage, two or more processing stages and an output stage. Each processing stage includes one or more processing units. Each processing unit includes storage for weighted values, a plurality of multipliers for multiplying input values by the weighted values, an adder for adding the products output by the multipliers, a function circuit for applying a non-linear function to the sum output by the adder, and a register for storing the output of the function circuit.
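A behavioral model of one processing unit and a two-stage pipeline is sketched below; the tanh non-linearity, the weight values, and the stage sizes are assumptions chosen only to mirror the structure described above.

import math

class ProcessingUnit:
    def __init__(self, weights):
        self.weights = weights                  # storage for the weighted values
        self.register = 0.0                     # register holding the unit's output

    def clock(self, inputs):
        products = [w * x for w, x in zip(self.weights, inputs)]   # the multipliers
        total = sum(products)                                      # the adder
        self.register = math.tanh(total)        # the non-linear function circuit
        return self.register

stage1 = [ProcessingUnit([0.5, -0.2]) for _ in range(2)]   # first processing stage
stage2 = [ProcessingUnit([0.3, 0.7])]                      # second processing stage

def step(input_values):
    # Pipelined: stage 2 consumes the values stage 1 latched on the previous cycle,
    # then stage 1 latches new results for the incoming input.
    previous = [pu.register for pu in stage1]
    outputs = [pu.clock(previous) for pu in stage2]
    for pu in stage1:
        pu.clock(input_values)
    return outputs

print(step([1.0, 0.5]))    # first output still reflects the registers' reset values
print(step([1.0, 0.5]))    # now reflects the input applied one cycle earlier
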
Abstract:
An apparatus, program product and method utilize heuristic clustering to generate assignments of circuit elements to clusters or groups to optimize a desired spatial locality metric. For example, circuit elements such as scan-enabled latches may be assigned to individual scan chains using heuristic clustering to optimize the layout of the scan chains in a scan architecture for a circuit design.
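As a toy illustration of assigning latches to scan chains by spatial locality, the sketch below runs a plain k-means pass over latch coordinates; the squared-distance metric and the absence of chain-length balancing are simplifying assumptions, not the heuristic claimed by the patent.

import random

def assign_latches_to_chains(latches, num_chains, iterations=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(latches, num_chains)
    assignment = [0] * len(latches)
    for _ in range(iterations):
        # Assign each scan-enabled latch to the nearest chain center.
        for i, (x, y) in enumerate(latches):
            assignment[i] = min(
                range(num_chains),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2)
        # Move each center to the centroid of its latches, improving the locality metric.
        for c in range(num_chains):
            members = [latches[i] for i, a in enumerate(assignment) if a == c]
            if members:
                centers[c] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return assignment

latches = [(1, 1), (1, 2), (9, 9), (10, 8), (2, 1), (8, 10)]
print(assign_latches_to_chains(latches, num_chains=2))   # two spatially compact chains
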