Abstract:
A method and processor system that substantially eliminates data bus operations when completing updates of an entire cache line with a full store queue entry. The store queue within a processor chip is designed with a series of AND gates that combine the individual byte enable bits of a corresponding entry. The AND output is fed to the STQ controller and signals when the entry is full. When full entries are selected for dispatch to the RC machines, the RC machine is signaled that the entry updates the entire cache line. The RC machine obtains write permission to the line, and then the RC machine overwrites the entire cache line. Because the entire cache line is overwritten, the data of the cache line is not retrieved when the request for the cache line misses at the cache or when the data goes stale before write permission is obtained by the RC machine.
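
A minimal behavioral sketch of the full-entry detection and dispatch decision described above, assuming a 128-byte cache line; the class and function names are illustrative and not taken from the abstract:

```python
CACHE_LINE_BYTES = 128  # assumed line size for illustration

class StoreQueueEntry:
    def __init__(self):
        # One byte-enable bit per byte of the cache line.
        self.byte_enable = [False] * CACHE_LINE_BYTES
        self.data = bytearray(CACHE_LINE_BYTES)

    def gather(self, offset, payload):
        """Merge a store operation into the entry and mark its bytes valid."""
        self.data[offset:offset + len(payload)] = payload
        for i in range(offset, offset + len(payload)):
            self.byte_enable[i] = True

    def is_full(self):
        """Models the AND of all byte-enable bits fed to the STQ controller."""
        return all(self.byte_enable)

def dispatch_action(entry):
    # A full entry overwrites the whole line, so the RC machine only needs
    # write permission; the existing cache-line data is never fetched.
    if entry.is_full():
        return "write permission only, overwrite entire line"
    return "write permission plus data, merge partial bytes"
```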
Abstract:
A method, system, and processor chip design for reducing the latency between completing a LARX operation and receiving the associated STCX operation that completes the update to the cache line. Each entry of the store queue of the issuing processor is provided with an additional tracking bit (a priority bit). The priority bit is set whenever a STCX operation is placed within the entry. During selection of an entry for dispatch, the arbitration logic scans the value of the priority bit of each eligible entry. An entry with the priority bit set is given priority in the selection process within architectural rules. That entry is then selected for dispatch as early as possible within the established rules.
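
A hedged sketch of the arbitration preference described above; the entry fields and the way the architectural ordering rules are folded into a single "eligible" flag are assumptions made for illustration:

```python
def select_for_dispatch(entries):
    """Pick a dispatch-eligible store queue entry, preferring entries whose
    priority bit is set (i.e., entries holding a STCX).  The 'eligible' flag
    stands in for the architectural ordering rules, which are not modeled."""
    eligible = [e for e in entries if e["eligible"]]
    if not eligible:
        return None
    prioritized = [e for e in eligible if e["priority"]]
    # A priority entry wins whenever the rules allow; otherwise fall back to
    # the oldest eligible entry (list order models age here).
    return (prioritized or eligible)[0]

# Example: the STCX entry is picked ahead of an older ordinary store.
queue = [{"id": 0, "eligible": True, "priority": False},
         {"id": 1, "eligible": True, "priority": True}]
assert select_for_dispatch(queue)["id"] == 1
```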
Abstract:
A processor includes at least one instruction execution unit that executes store instructions to obtain store operations and a store queue coupled to the instruction execution unit. The store queue includes a queue entry in which the store queue gathers multiple store operations during a store gathering window to obtain a data portion of a write transaction directed to lower level memory. In addition, the store queue includes dispatch logic that varies a size of the store gathering window to optimize store performance for different store behaviors and workloads.
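
One plausible way the variable gathering window could behave, sketched below; the thresholds, cycle counts, and 128-byte line size are invented for illustration and are not specified in the abstract:

```python
class GatherWindowControl:
    """Dispatch logic that widens the store gathering window when entries
    dispatch nearly full and shrinks it when they dispatch sparse."""
    def __init__(self, min_cycles=4, max_cycles=64):
        self.min_cycles = min_cycles
        self.max_cycles = max_cycles
        self.window_cycles = min_cycles

    def on_dispatch(self, bytes_gathered, line_bytes=128):
        fill = bytes_gathered / line_bytes
        if fill > 0.75:
            # Gathering is paying off: allow entries to wait longer.
            self.window_cycles = min(self.window_cycles * 2, self.max_cycles)
        elif fill < 0.25:
            # Mostly isolated stores: dispatch sooner to reduce latency.
            self.window_cycles = max(self.window_cycles // 2, self.min_cycles)
        return self.window_cycles
```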
Abstract:
A system and method for cache management in a data processing system having a memory hierarchy of upper memory caches and a lower memory cache. A lower memory cache controller accesses a coherency state table to determine replacement policies of coherency states for cache lines present in the lower memory cache when receiving a cast-in request from one of the upper memory caches. The coherency state table implements a replacement policy that retains the more valuable cache coherency state information between the upper and lower memory caches for a particular cache line contained in both levels of memory at the time of cast-out from the upper memory cache.
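
A simplified sketch of the state-retention idea, assuming MESI-like states and reducing the coherency state table to a "keep the more valuable state" rule; the actual protocol states and table contents are not given in the abstract:

```python
# Illustrative "value" ordering of MESI-like states; the highest is kept.
STATE_VALUE = {"I": 0, "S": 1, "E": 2, "M": 3}

def castin_state(upper_state, lower_state):
    """State retained in the lower level cache when an upper level cache
    casts out a line the lower cache already holds."""
    return max(upper_state, lower_state, key=STATE_VALUE.get)

assert castin_state("M", "S") == "M"   # modified data wins over a shared copy
assert castin_state("S", "E") == "E"   # exclusive ownership is retained
```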
Abstract:
A method for determining the configuration of a digital design first obtains a set of latch values of a plurality of latches within the digital design. A setting of a Dial instance is then determined based upon the set of latch values by reference to a configuration database that specifies a mapping table uniquely associating each of a plurality of different settings of the Dial with a respective one of a plurality of different sets of latch values. The setting of the Dial instance is then output. In one embodiment, the setting of the Dial is contained in a simulation setup file utilized to configure a simulation model to a state approximating the state of the digital design represented by the set of latch values.
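
A minimal sketch of the reverse lookup described above, assuming the mapping table is available as a dictionary keyed by Dial setting; all names and data shapes are illustrative:

```python
def infer_dial_setting(mapping_table, latch_values):
    """mapping_table: {setting: {latch_name: latch_value}} from the
    configuration database; latch_values: {latch_name: latch_value} read
    from the design.  Returns the unique matching setting, or None."""
    for setting, expected in mapping_table.items():
        if all(latch_values.get(name) == value
               for name, value in expected.items()):
            return setting
    return None

# Example with an invented two-latch Dial.
table = {"fast": {"lat_a": 1, "lat_b": 1}, "slow": {"lat_a": 0, "lat_b": 1}}
assert infer_dial_setting(table, {"lat_a": 0, "lat_b": 1}) == "slow"
```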
Abstract:
Methods, data processing systems, and program products supporting multi-cycle simulation are disclosed. According to one method, a configuration database including at least one data structure representing an instance of a Dial entity is received. The instance of the Dial entity has at least an input, an output, and at least one associated latch within a digital design. A value of the output of the instance of the Dial entity controls a value stored within the associated latch. A control file is also received. The control file indicates that at least one associated latch data structure is to be inserted within the configuration database to represent the latch during multi-cycle simulation. In response to receipt of the configuration database and the control file, the configuration database is processed with reference to the control file to insert within the configuration database at least one latch data structure and to associate, within the configuration database, the at least one latch data structure with the instance of the Dial entity.
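
A rough sketch of the database augmentation step, assuming a dictionary-shaped configuration database and a control file reduced to (Dial instance, latch name) pairs; both structures are assumptions made for illustration:

```python
def insert_latch_structures(config_db, control_entries):
    """config_db: {'dials': {dial_instance: {...}}, 'latches': {...}}
    control_entries: iterable of (dial_instance, latch_name) pairs taken
    from the control file."""
    config_db.setdefault("latches", {})
    for dial_instance, latch_name in control_entries:
        # Insert the latch data structure (idempotently) ...
        config_db["latches"].setdefault(latch_name, {"value": None})
        # ... and associate it with the Dial instance whose output drives it.
        dial = config_db["dials"][dial_instance]
        dial.setdefault("latches", []).append(latch_name)
    return config_db
```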
Abstract:
A processing unit for a multiprocessor data processing system includes a processor core including a store-through upper level cache, an instruction sequencing unit that fetches instructions for execution, a data register, and at least one instruction execution unit coupled to the instruction sequencing unit that concurrently executes multiple threads of instructions. The processor core, responsive to the at least one instruction execution unit executing a load-reserve instruction in a first thread that binds to a load target address in the store-through upper level cache during a reservation hazard window associated with a conflicting store-conditional operation of a second thread, causes a subsequent store-conditional operation of the first thread to a store target address matching the load target address to fail if the store-conditional operation of the second thread succeeds.
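
A toy model of the reservation hazard behavior described above; the method names and the way the hazard window is collapsed into a single flag are simplifications for illustration, not the patent's mechanism:

```python
class ReservationLogic:
    """Per-thread reservations.  A load-reserve that binds while a conflicting
    store-conditional from another thread is still in flight is recorded as
    'hazarded'; if that other store-conditional later succeeds, the hazarded
    reservation is cancelled and the first thread's store-conditional fails."""
    def __init__(self):
        self.reservations = {}  # thread_id -> {"addr": ..., "hazarded": ...}

    def load_reserve(self, thread, addr, conflicting_stcx_in_flight):
        self.reservations[thread] = {"addr": addr,
                                     "hazarded": conflicting_stcx_in_flight}

    def remote_stcx_succeeded(self, addr):
        # A successful conflicting STCX kills hazarded reservations on addr.
        for t in [t for t, r in self.reservations.items()
                  if r["addr"] == addr and r["hazarded"]]:
            del self.reservations[t]

    def store_conditional(self, thread, addr):
        res = self.reservations.pop(thread, None)
        return res is not None and res["addr"] == addr
```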
Abstract:
A simulation control program receives a hardware description language (HDL) model including design entities and count event registers. Each count event register is associated with a respective instance of an event. The count event registers include first and second registers for counting occurrences of a same replicated event generated within different instances of a same design entity having a same hierarchical level within the HDL model. The simulation control program also receives a correlation data structure indicating which count event registers are associated with instances of the same replicated event. During simulation processing, each of the count event registers maintains a respective count value representing a number of times an associated event instance occurs. The simulation control program sums the count values of the first and second count event registers in accordance with the correlation data structure and outputs a count event data packet containing the resulting aggregate count value.
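
A small sketch of the summation step, assuming the correlation data structure maps each replicated event to its associated count event registers; the data shapes and register names are illustrative:

```python
def aggregate_replicated_counts(count_registers, correlation):
    """count_registers: {register_id: count} captured from the HDL model.
    correlation: {event_name: [register_id, ...]} grouping registers that
    count the same replicated event in different entity instances.
    Returns one aggregate count per event, ready to be packed into a
    count event data packet."""
    return {event: sum(count_registers[reg] for reg in registers)
            for event, registers in correlation.items()}

counts = {"fxu0.stall": 7, "fxu1.stall": 5}
corr = {"fxu_stall": ["fxu0.stall", "fxu1.stall"]}
assert aggregate_replicated_counts(counts, corr) == {"fxu_stall": 12}
```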
Abstract:
A system configuration database is constructed in volatile memory by first determining which types of integrated circuits are present in a hardware system and the number of each type. In response to the determination, a system configuration database is loaded into volatile memory that includes a respective chip hardware database for each type of integrated circuit in the hardware system. Each chip hardware database defines a Dial entity controlling which of a plurality of different possible latch values is placed in a hardware latch of the associated type of integrated circuit. The system configuration database includes at least a first chip hardware database for a first type of integrated circuit that contains per-instance information for each of multiple instances of the first type of integrated circuit within the hardware system.
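
A compact sketch of the database construction, assuming one chip hardware database is loaded per chip type and per-instance information is layered on top of it; the structure and field names are assumptions:

```python
def build_system_config_db(detected_chips, chip_hw_databases):
    """detected_chips: {chip_type: instance_count} discovered in the system.
    chip_hw_databases: {chip_type: chip_hardware_database} loaded per type.
    One chip hardware database is shared by all instances of a type, with a
    per-instance record holding instance-specific information."""
    system_db = {}
    for chip_type, count in detected_chips.items():
        system_db[chip_type] = {
            "hw_db": chip_hw_databases[chip_type],
            "instances": [{"id": i, "overrides": {}} for i in range(count)],
        }
    return system_db
```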
Abstract:
A method and processor system that substantially enhances the store gathering capabilities of a store queue entry to enable gathering of a maximum number of proximate-in-time store operations before the entry is selected for dispatch. A counter is provided for each entry to track the time since a last gather to the entry. When a new gather does not occur before the counter reaches a threshold saturation point, the entry is signaled ready for dispatch. By defining an optimum threshold saturation point, sufficient time is provided for the entry to gather proximate-in-time store operations before the counter expires. The entry may be deemed eligible for selection when certain conditions occur, including the entry becoming full, issuance of a barrier operation, and saturation of the counter. The use of the counter increases the ability of a store queue entry to complete gathering of enough store operations to update an entire cache line before that entry is dispatched to an RC machine.
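
A minimal sketch of the per-entry counter, assuming a cycle-based saturating count with an invented threshold value:

```python
class GatherTimeout:
    """Per-entry saturating counter; THRESHOLD is an illustrative value."""
    THRESHOLD = 16

    def __init__(self):
        self.count = 0

    def on_gather(self):
        # Each new gather restarts the window, giving later nearby stores
        # another chance to be merged into the same entry.
        self.count = 0

    def tick(self):
        # Called once per cycle; saturation signals the entry ready to dispatch.
        if self.count < self.THRESHOLD:
            self.count += 1
        return self.count >= self.THRESHOLD
```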