Abstract:
A cache design is described in which corresponding accesses to tag and information arrays are phased in time, and in which tags are retrieved (typically speculatively) from a tag array without benefit of an effective address calculation subsequently used for a corresponding retrieval from an information array. In some exploitations, such a design may allow cycle times (and throughput) of a memory subsystem to more closely match demands of some processor and computation system architectures. In some cases, phased access can be described as pipelined tag and information array access, though strictly speaking, indexing into the information array need not depend on results of the tag array access. Our techniques seek to allow early (indeed speculative) retrieval from the tag array without delays that would otherwise be associated with calculation of an effective address eventually employed for a corresponding retrieval from the information array. Speculation can be resolved using the eventually calculated effective address or using separate functionality. In some embodiments, we use calculated effective addresses for way selection based on tags retrieved from the tag array.
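A minimal behavioral sketch of this phasing is given below. It is illustrative only: the array sizes, the choice of base-register bits for the speculative index, and the replay-on-mismatch policy are assumptions not taken from the abstract, and names such as PhasedCache are hypothetical.

```python
# Behavioral sketch (not the patented circuit) of phased, speculative tag access:
# the tag array is read using an index taken from the base register alone, while
# the effective address (base + offset) is still being computed.  The information
# array is then read with the computed EA, way selection uses the tags fetched
# earlier, and a mismatched index simply forces a non-speculative replay.

NUM_SETS = 64          # sets in each array (illustrative)
NUM_WAYS = 4           # associativity (illustrative)
LINE_BYTES = 32        # bytes per cache line (illustrative)

def set_index(addr):
    """Set index field of an address (bits above the line offset)."""
    return (addr // LINE_BYTES) % NUM_SETS

def tag_of(addr):
    """Tag field of an address (bits above the set index)."""
    return addr // (LINE_BYTES * NUM_SETS)

class PhasedCache:
    def __init__(self):
        # tags[set][way] and data[set][way] model the two physically separate arrays.
        self.tags = [[None] * NUM_WAYS for _ in range(NUM_SETS)]
        self.data = [[None] * NUM_WAYS for _ in range(NUM_SETS)]

    def fill(self, addr, way, line):
        """Install a line (e.g. on a miss refill)."""
        idx = set_index(addr)
        self.tags[idx][way] = tag_of(addr)
        self.data[idx][way] = line

    def load(self, base, offset):
        # Phase 1: speculative tag-array read, indexed from the base register only,
        # i.e. before the adder has produced base + offset.
        spec_index = set_index(base)
        spec_tags = self.tags[spec_index]

        # Phase 2 (later): the effective address is now available and is used to
        # index the information array and to resolve the speculation.
        ea = base + offset
        ea_index = set_index(ea)
        if ea_index != spec_index:
            # Speculation failed (the offset carried into the index bits);
            # replay the tag access with the true index.
            spec_tags = self.tags[ea_index]

        # Way selection: compare the EA tag against the tags fetched from the tag array.
        for way, t in enumerate(spec_tags):
            if t == tag_of(ea):
                return self.data[ea_index][way]   # hit: return the selected way
        return None                               # miss

if __name__ == "__main__":
    c = PhasedCache()
    c.fill(0x1F40, way=0, line="line@0x1F40")
    print(c.load(base=0x1F40, offset=0x4))   # offset stays within the line: speculation holds
    print(c.load(base=0x1F3F, offset=0x1))   # offset carries into the index bits: replayed, still hits
```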
Abstract:
The present invention provides a latch circuit that is operable to generate a pulse from first and second clock signals to allow gates in a datapath to propagate data with minimal latency. The first clock signal is a version of the system clock and the second clock signal is a time-shifted, inverted version of the system clock signal. Each of the individual latches in a datapath comprises data propagation logic. In one embodiment of the invention, the data propagation logic uses the first and second clock signals to generate an “implicit” pulse. In another embodiment of the invention, the data propagation logic uses the first and second clock signals to generate an “explicit” pulse. The implicit and explicit pulses are used to control the transmission gate of the latch to provide propagation of data through the latch with minimal latency.
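The sketch below illustrates, under assumed timing, how ANDing a system clock with a time-shifted, inverted copy of itself yields a short transparency window at each rising edge; the period, shift amount, and function names are hypothetical, not taken from the abstract.

```python
# Behavioral sketch of deriving a transparency pulse from the two clocks described
# above: clk1 is the system clock and clk2 is a time-shifted, inverted copy of it.
# While both are high (a window of width SHIFT after each rising edge of clk1),
# the latch's transmission gate is open and data propagates with minimal latency.

PERIOD = 10      # system clock period, arbitrary time units (illustrative)
SHIFT = 2        # time shift applied to the inverted clock (sets the pulse width)

def clk1(t):
    """System clock: high for the first half of each period."""
    return 1 if (t % PERIOD) < PERIOD // 2 else 0

def clk2(t):
    """Time-shifted, inverted version of the system clock."""
    return 1 - clk1(t - SHIFT)

def pulse(t):
    """Implicit pulse: high only while both clocks are high."""
    return clk1(t) & clk2(t)

def latch(d, t, state):
    """Pulse latch: transparent while pulse(t) is high, otherwise holds state."""
    return d(t) if pulse(t) else state

if __name__ == "__main__":
    data = lambda t: (t // PERIOD) % 2     # toy data signal toggling each period
    q = 0
    for t in range(3 * PERIOD):
        q = latch(data, t, q)
        print(f"t={t:2d} clk1={clk1(t)} clk2={clk2(t)} pulse={pulse(t)} q={q}")
```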
Abstract:
A method and data processing system for accessing an entry in a memory array by placing a tag memory unit (114) in parallel with an operand adder circuit (112) to enable tag lookup and generation of speculative way hit/miss information (126) directly from the operands (111, 113) without using the output sum of the operand adder. PGZ-encoded address bits (0:51) from the operands (111, 113) are applied with a carry-out value (Cout48) to a content-addressable memory array (114) to generate two speculative hit/miss signals. A sum value (EA51) computed from the least significant base and offset address bits determines which of the speculative hit/miss signals is selected for output (126).
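The carry-select idea behind the two speculative hit/miss signals can be sketched behaviorally as below. The PGZ encoding and the content-addressable array are abstracted away, and the field widths, names, and test addresses are illustrative assumptions rather than details of the abstract.

```python
# Behavioral sketch: the tag portion of base + offset is compared twice in parallel
# with the low-order addition, once assuming no carry into the tag field and once
# assuming a carry.  The carry actually produced by the low-order add then selects
# which speculative hit/miss result is used.

TAG_SHIFT = 12                      # low-order bits that do not reach the tag field (illustrative)
LOW_MASK = (1 << TAG_SHIFT) - 1

def speculative_tag_hit(base, offset, stored_tag):
    tag_base = base >> TAG_SHIFT
    tag_off = offset >> TAG_SHIFT

    # Two speculative comparisons, performed without the full-width sum.
    hit_no_carry = (tag_base + tag_off) == stored_tag          # assumes carry-in 0
    hit_carry    = (tag_base + tag_off + 1) == stored_tag      # assumes carry-in 1

    # The low-order addition produces the carry that resolves the speculation.
    carry_out = ((base & LOW_MASK) + (offset & LOW_MASK)) >> TAG_SHIFT
    return hit_carry if carry_out else hit_no_carry

if __name__ == "__main__":
    base, offset = 0x1234F8, 0x0010                                  # no-carry case
    assert speculative_tag_hit(base, offset, (base + offset) >> TAG_SHIFT)
    assert not speculative_tag_hit(base, offset, ((base + offset) >> TAG_SHIFT) + 1)

    base, offset = 0xFFF, 0x001                                      # carry case
    assert speculative_tag_hit(base, offset, (base + offset) >> TAG_SHIFT)
    print("speculative hit/miss selection behaves as expected")
```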
Abstract:
A method includes storing a first transaction entry to a first software configurable storage location, storing a second transaction entry to a second software configurable storage location, determining that a first transaction indicated by the first transaction entry has occurred, determining that a second transaction indicated by the second transaction entry has occurred subsequent to the first transaction, and, in response to determining that the first transaction occurred and the second transaction occurred, storing at least one transaction attribute captured during at least one clock cycle subsequent to the second transaction. The first and second software configurable storage locations may be located in a trace buffer, where the at least one transaction attribute is stored to the trace buffer and overwrites the first and second transaction entries. Each transaction entry may include a dead cycle field, a consecutive transaction requirement field, and a last entry field.
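A simplified sketch of this two-entry trigger sequence follows. The entry fields mirror those named in the abstract, but the match criterion (an address compare), the buffer layout, and all class names are assumptions; the dead cycle and last entry fields are carried but not exercised by the simplified matching logic.

```python
# Behavioral sketch: two software-configured transaction entries arm the capture
# logic; once the first and then the second transaction are observed, attributes
# from a subsequent clock cycle are captured into the trace buffer.

from dataclasses import dataclass

@dataclass
class TransactionEntry:
    match_addr: int        # transaction to look for (illustrative criterion)
    dead_cycles: int       # dead cycle field
    consecutive: bool      # consecutive transaction requirement field
    last_entry: bool       # last entry field

class TraceCapture:
    def __init__(self, first: TransactionEntry, second: TransactionEntry):
        self.entries = [first, second]   # software configurable storage locations
        self.stage = 0                   # which entry we are waiting to match
        self.buffer = []                 # trace buffer (holds captured attributes)

    def clock(self, addr, attributes):
        """Called once per clock with the observed transaction and its attributes."""
        if self.stage < len(self.entries):
            entry = self.entries[self.stage]
            if addr == entry.match_addr:
                self.stage += 1          # this transaction occurred; advance
            elif self.stage > 0 and self.entries[self.stage].consecutive:
                self.stage = 0           # consecutive requirement violated; rearm
        elif attributes is not None:
            # Both transactions have occurred: capture attributes from this
            # (subsequent) cycle into the trace buffer.
            self.buffer.append(attributes)

if __name__ == "__main__":
    cap = TraceCapture(
        TransactionEntry(0x1000, dead_cycles=0, consecutive=False, last_entry=False),
        TransactionEntry(0x2000, dead_cycles=0, consecutive=False, last_entry=True),
    )
    for addr, attrs in [(0x1000, None), (0x3000, None), (0x2000, None), (0x4000, "burst, 8 beats")]:
        cap.clock(addr, attrs)
    print(cap.buffer)    # ['burst, 8 beats']
```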
Abstract:
A multiplexed data flip-flop circuit (500) is described in which a multiplexer (510) outputs functional or scan data, a master latch (520) generates a master latch output signal at a hold time under control of a master clock signal, a slave latch (540) generates a flip-flop output signal at a launch time under control of a slave clock signal, clock generation circuitry (550) generates a second clock signal that has a DC state during a functional mode and has a switching state during a scan mode, and data propagation logic circuitry (564) uses the first and second clock signals to generate the master and slave clock signals during a scan mode to delay the launch time of the slave latch with respect to the hold time of the master latch.
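A rough behavioral view of that clocking scheme is sketched below: in functional mode the second clock is at a DC level and the slave latch is timed from the first clock, while in scan mode the second clock switches as a delayed copy, pushing the slave launch out relative to the master hold. Clock shapes, the delay value, and all names are illustrative assumptions.

```python
# Behavioral sketch of scan-mode clock generation for a master/slave flip-flop.

PERIOD = 10          # first (system) clock period, arbitrary time units
SCAN_DELAY = 3       # delay applied to the second clock in scan mode (illustrative)

def clk1(t):
    """First clock: high for the first half of each period."""
    return 1 if (t % PERIOD) < PERIOD // 2 else 0

def clk2(t, scan_mode):
    """Second clock: DC state in functional mode, switching (delayed) in scan mode."""
    return clk1(t - SCAN_DELAY) if scan_mode else 0

def master_open(t):
    """Master latch transparency: open while clk1 is low, holds on the rising edge."""
    return 1 - clk1(t)

def slave_open(t, scan_mode):
    """Slave latch transparency: clk1 in functional mode, the delayed clk2 in scan
    mode, which delays the slave's launch time relative to the master's hold time."""
    return clk2(t, scan_mode) if scan_mode else clk1(t)

if __name__ == "__main__":
    for t in range(PERIOD):
        print(f"t={t:2d} master_open={master_open(t)} "
              f"slave_open_functional={slave_open(t, False)} slave_open_scan={slave_open(t, True)}")
```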
Abstract:
A digital scan chain system with test scan capability has a plurality of flip-flop modules, each of the plurality of flip-flop modules having a first data bit input, a second data bit input, a test bit input, a clock input, a first data bit output, a second data bit output, and a test bit output. The test bit output of a first flip-flop module is directly connected to the test bit input of a second flip-flop module with no intervening circuitry. First and second multiplexed master/slave flip-flops are directly serially connected. A clocked latch is coupled to the output of the second multiplexed master/slave flip-flop and provides the test bit output. The clocked latch is clocked only during a test mode to save power.
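The structural sketch below models one such module: two serially connected multiplexed flip-flops followed by an output latch whose clock is gated with the test-mode signal, so it only toggles during test. Class and signal names are illustrative, not from the abstract.

```python
# Structural/behavioral sketch of a two-flip-flop scan chain segment with a
# test-mode-gated output latch.

class MuxDFlipFlop:
    """Multiplexed master/slave flip-flop: captures functional or test data."""
    def __init__(self):
        self.q = 0
    def clock(self, d, test_in, test_mode):
        self.q = test_in if test_mode else d
        return self.q

class ClockedLatch:
    """Latch whose clock is gated off outside of test mode to save power."""
    def __init__(self):
        self.q = 0
    def clock(self, d, test_mode):
        if test_mode:            # gated clock: the latch only updates during test
            self.q = d
        return self.q

class ScanSegment:
    """Two directly serially connected flip-flop modules plus the clocked latch."""
    def __init__(self):
        self.ff1, self.ff2, self.out_latch = MuxDFlipFlop(), MuxDFlipFlop(), ClockedLatch()
    def clock(self, d1, d2, test_bit_in, test_mode):
        q1_old = self.ff1.q                        # value captured on the previous clock edge
        self.ff1.clock(d1, test_bit_in, test_mode)
        self.ff2.clock(d2, q1_old, test_mode)      # test output wired directly to the next test input
        return self.out_latch.clock(self.ff2.q, test_mode)   # test bit output of the module

if __name__ == "__main__":
    seg = ScanSegment()
    for bit in (1, 0, 1, 1):
        out = seg.clock(d1=0, d2=0, test_bit_in=bit, test_mode=True)
        print(f"scan in {bit} -> test bit out {out}")
```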
Abstract:
A method of operating a circuit includes receiving a first data signal at a first node. The first node is coupled to a second node to couple the first data signal to the second node. After coupling the first node to the second node, the second node is coupled to a third node to couple the first data signal to the third node. The first node is decoupled from the second node and a first step of latching the first data signal at the third node is performed, wherein the first step of latching is through the second node while the second node is coupled to the third node. The second node is decoupled from the third node and a second step of latching is performed wherein the first data signal is latched at the third node while the second node is decoupled from the third node.
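A step-by-step behavioral model of this coupling sequence is sketched below, with each coupling modeled as closing a pass gate between two nodes. The node names and the way the held value is represented are illustrative assumptions only.

```python
# Behavioral walk-through of the node-coupling/latching sequence described above.

class Node:
    def __init__(self, name, value=None):
        self.name, self.value = name, value

def couple(src, dst):
    """Close the pass gate between two nodes so the source drives the destination."""
    dst.value = src.value

if __name__ == "__main__":
    first, second, third = Node("first", value=1), Node("second"), Node("third")

    couple(first, second)     # first node coupled to second: data reaches the second node
    couple(second, third)     # second node then coupled to third: data reaches the third node

    # First node decoupled from second; first latching step: the data is held at the
    # third node through the second node, which is still coupled to the third node.
    first.value = None
    print("held through second node:", second.value == third.value == 1)

    # Second node decoupled from third; second latching step: the third node now
    # retains the data on its own, independent of the second node.
    second.value = None
    print("held at third node:", third.value == 1)
```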