摘要:
A processing unit includes a local processor core and a cache memory coupled to the local processor core. The cache memory includes a data array, a directory of contents of the data array. The cache memory further includes one or more state machines that service a first set of memory access requests, an arbiter that directs servicing of a second set of memory access requests by reference to the data array and the directory on a fixed schedule, address collision logic that protects memory access requests in the second set by detecting and signaling address conflicts between active memory access requests in the second set and subsequent memory access requests, and dispatch logic coupled to the address collision logic. The dispatch logic dispatches memory access requests in the first set to the one or more state machines for servicing and signals the arbiter to direct servicing of memory access requests in the second set according to the fixed schedule.
摘要:
According to a method of data processing, a predictor is maintained that indicates a historical scope of broadcast for one or more previous operations transmitted on an interconnect of a data processing system. A scope of broadcast of a subsequent operation is predictively selected by reference to the predictor.
摘要:
A cache, system and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. The snoop request is entered in the first available latch of the stall/reorder unit unless the stall/reorder unit is full in which case the new snoop request is transmitted to a second unit configured to transmit a request to retry resending the new snoop request. Snoop requests have a higher priority than requests from processors and snoop requests are selected by the arbitration mechanism over processor requests unless the arbitration mechanism requests otherwise (“stall request”) to the stall/reorder unit. By snoop requests having a higher priority than processor requests, the number of snoop requests rejected is reduced. By having the arbitration mechanism issue a stall request, the processor will not be starved.
摘要:
A cache coherent data processing system includes at least a first cache memory supporting a first processing unit and a second cache memory supporting a second processing unit. The first cache memory includes a cache array and a cache directory of contents of the cache array. In response to the first cache memory detecting on an interconnect a broadcast operation that specifies a request address, the first cache memory determines from the operation a type of the operation and a coherency state associated with the request address. In response to determining the type and the coherency state, the first cache memory filters out the broadcast operation without accessing the cache directory.
摘要:
A cache, system and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. The snoop request is entered in the first available latch of the stall/reorder unit unless the stall/reorder unit is full in which case the new snoop request is transmitted to a second unit configured to transmit a request to retry resending the new snoop request. Snoop requests have a higher priority than requests from processors and snoop requests are selected by the arbitration mechanism over processor requests unless the arbitration mechanism requests otherwise (“stall request”) to the stall/reorder unit. By snoop requests having a higher priority than processor requests, the number of snoop requests rejected is reduced. By having the arbitration mechanism issue a stall request, the processor will not be starved.
摘要:
According to one embodiment, a method of coherency management in a data processing system includes holding a cache line in an upper level cache memory in an exclusive ownership coherency state and thereafter removing the cache line from the upper level cache memory and transmitting a castout request for the cache line from the upper level cache memory to a lower level cache memory. The castout request includes an indication of a shared ownership coherency state. In response to the castout request, the cache line is placed in the lower level cache memory in a coherency state determined in accordance with the castout request.
摘要:
A data processing system includes a local processor core and a cache memory coupled to the local processor core. The cache memory includes a data array, a directory of contents of the data array, at least one snoop machine that services memory access requests of a remote processor core, and multiple state machines that service memory access requests of the local processor core. The multiple state machines include a first state machine that has a first set of memory access requests of the local processor core that it is capable of servicing and a second state machine that has a different second set of memory access requests of the local processor core that it is capable of servicing.
摘要:
A data processing system includes a processor core and a memory subsystem coupled to the processor core. The memory subsystem includes data storage and a store queue including a plurality of entries for buffering store operations to be performed with reference to the data storage. The memory subsystem further includes a store queue controller that gathers multiple store requests received from the processor core into a single store operation buffered within an entry of the store queue. The store queue controller applies store gathering windows of differing durations to differing ones of the plurality of entries in response to control information received from the processor core.
摘要:
One or more hardware description language (HDL) files describe a plurality of hierarchically arranged design entities defining a digital design to be simulated and a plurality of configuration entities not belonging to the digital design that logically control settings of a plurality of configuration latches in the digital design. The HDL file(s) are compiled to obtain a simulation executable model of the digital design and an associated configuration database. The compiling includes parsing a configuration statement that specifies an association between an instance of a configuration entity and a specified configuration latch, determining whether or not the specified configuration latch is described in the HDL file(s), and if not, creating an indication in the configuration database that the instance of the configuration latch had a specified association to a configuration latch to which it failed to bind.
摘要:
According to a method of simulation processing, an instrumented simulation executable model of a design is built by compiling one or more hardware description language (HDL) files specifying one or more design entities within the design and one or more instrumentation entities and instantiating instances of the one or more instrumentation entities within instances of the one or more design entities. Operation of the design is then simulated utilizing the instrumented simulation executable model. Simulating operation includes each of multiple instantiations of the one or more instrumentation entities generating a respective external phase signal representing an occurrence of a particular phase of operation and instrumentation combining logic generating from external phase signals of the multiple instantiations of the one or more instrumentation entities an aggregate phase signal representing an occurrence of the particular phase.