摘要:
The present invention relates to improvements concerning logic and timing verification as the testability of a hardware circuit comprising embeddings of dynamic logic circuits in a static environment. The clocked macros comprising the dynamic logic circuit are bounded at both input and output by latches, keeping input and output signals to the clocked macro static. The static input signals are processed with wave formatting means in order to generate a wave form usable for an evaluation by the dynamic logic circuit, and the dynamic logic output signal is converted back to a static signal by a set/reset latch such that it can be latched by the clock signal of the static embedding circuit. Thus, the analysis methods for timing and logic simulation during chip design can be the same as those used for static logic and, in particular, the LSSD testing methods can be used.
摘要:
Systems and methods for adaptively mapping system memory address bits into an instruction tag and an index into the cache are disclosed. More particularly, hardware and software are disclosed for observing collisions that occur for a given mapping of system memory bits into a tag and an index. Based on the observations, an optimal mapping may be determined that minimizes collisions.
摘要:
A crossbar (20) circuit with multiplexer (22A, 22B) circuits implemented in a polygonal form on a chip. The crossbar can be used for implementing a permutation of input bits (24A, 24B) controlled by a bit vector (25). Horizontal and vertical wiring lengths in the crossbar (20) are reduced by stacking the operand latches (24A, 24B, 25) and horizontal or vertical multiplexers (22A, 22B). This implementation decreases the latency of the crossbar and avoids latches to store intermediated results, thus reducing area and power consumption.
摘要:
A crossbar (20) circuit with multiplexer (22A, 22B) circuits implemented in a polygonal form on a chip. The crossbar can be used for implementing a permutation of input bits (24A, 24B) controlled by a bit vector (25). Horizontal and vertical wiring lengths in the crossbar (20) are reduced by stacking the operand latches (24A, 24B, 25) and horizontal or vertical multiplexers (22A, 22B). This implementation decreases the latency of the crossbar and avoids latches to store intermediated results, thus reducing area and power consumption.
摘要:
Systems and methods for adaptively mapping system memory address bits into an instruction tag and an index into the cache are disclosed. More particularly, hardware and software are disclosed for observing collisions that occur for a given mapping of system memory bits into a tag and an index. Based on the observations, an optimal mapping may be determined that minimizes collisions.
摘要:
A digital system and method for scanning sequential logic elements are disclosed. The digital system may comprise a plurality of sequential logic elements subdivided into power domains, wherein at least one of the power domains is power gated; a scan chain configured for processing a scan data sequence; a scan enable switch configured for controlling a scan mode; and at least one shadow engine, wherein the at least one shadow engine comprises a control circuit. At least some of the power domains may be interconnected to the scan chain with the scan enable switch, and the scan enable switch may control the scan mode by asserting a scan enable signal. The at least one power gated power domain with one or more sequential logic elements to be power gated may be bypassed via the at least one shadow engine.
摘要:
The present invention relates to a method and system for determining the status of each entry in an instruction window buffer in multi-processor, parallel processing environments. A combinatorial circuit, which automatically generates active instruction window status information, is added to the buffer itself. This status information is used by a plurality of processes like renaming registers and issuing and committing instructions as an output associated with a respective buffer entry.
摘要:
A considerable amount of area can be saved according to the present invention by reducing the number of input ports and the number of output ports to the number n of concurrently intended array accesses. This remarkable reduction of ports and thus an extraordinary associated area saving can be achieved when some knowledge about array utilization is exploited: The array accesses are to be performed with concurrent accesses from at most k particular groups. A group is defined by a plurality of array accesses which have at most one access to the same port at a time. Then, for reading the read results are aligned according to a simple re-wiring scheme to the respective read requesters, whereas for writing the accesses are aligned prior to the array access according to the same or a similar scheme.
摘要:
A method and system for operating a high frequency out-of-order processor with increased pipeline length. A new scheme is disclosed to reduce the pipeline by the detection and exploitation of so called “no dependency” for an instruction. A “no dependency” signal tells that all required source data is available for the instruction at least one cycle before the source data valid bit(s) are inserted into the issue queue. Therefore, one or more stages of the pipeline are bypassed.
摘要:
An improved method and system for operating an out of order processor at a high frequency enabled by an increased pipeline length. It is proposed to shorten the pipeline by a considerable number of stages by accepting that a write after read conflict may occur, when directly after renaming, during the “read ROB” pipeline stage, all the information (tag, validity and data) is read from an Reorder Buffer ROB entry, and is next written, in a following pipeline stage “write RS”, into a reservation station (RS) entry. In order to assure the correctness of processing in particular in cases of dependencies, e.g., write after read conflicts a separate inventional add in logic covers these cases. The logic detects the write after read conflict case of an Instructional Execution Unit (IEU) writing into the particular entry that is selected by the renaming logic during “read ROB”. Then, a separate issue process selects the entries for which a conflict is reported and writes the data into the respective entry of the RS. This increases performance because those conflict cases are rather seldom compared to the broad majority of instructions to be found in a statistically determined average instruction flow.