Abstract:
In a computer system, an apparatus for handling lock conditions wherein a first instruction executed by a first processor processes data that is common to a second processor while the second processor is locked from simultaneously executing a second instruction that also processes this same data. A lock bit is set when the first processor begins execution of the first instruction. Thereupon, the second processor is prevented from executing its instruction until the first processor has completed its processing of the shared data. Hence, the second processor queues its request in a buffer. The lock bit is cleared after the first processor has completed execution of its instruction. The first processor then checks the buffer for any outstanding requests. In response to the second processor's queued request, the first processor transmits a signal to the second processor indicating that the data is now not locked.
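The lock sequence described in this abstract can be sketched in software as follows. This is a toy model under stated assumptions, not the patented circuitry: the class name `SharedDataLock`, its methods, and the processor identifiers are all hypothetical, and the hardware signal to the waiting processor is modeled as an entry appended to a list.

```python
from collections import deque


class SharedDataLock:
    """Toy model of a lock bit plus a buffer of queued requests (hypothetical names)."""

    def __init__(self):
        self.lock_bit = False      # set while one processor works on the shared data
        self.owner = None          # processor that currently holds the lock
        self.pending = deque()     # buffer holding requests from locked-out processors
        self.notifications = []    # (sender, receiver) "data is now unlocked" signals

    def begin(self, cpu):
        """Start a locked instruction, or queue the request if the data is locked."""
        if self.lock_bit:
            self.pending.append(cpu)
            return False
        self.lock_bit = True
        self.owner = cpu
        return True

    def complete(self, cpu):
        """Clear the lock bit, then check the buffer and signal every queued requester."""
        if self.owner != cpu:
            raise RuntimeError("only the owning processor may clear the lock bit")
        self.lock_bit = False
        self.owner = None
        while self.pending:
            waiter = self.pending.popleft()
            self.notifications.append((cpu, waiter))
```

In this sketch a second processor that calls `begin` while the bit is set is queued rather than blocked, and it retries after it appears in `notifications`.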
Abstract:
A computer system comprising a plurality of caching agents with a cache hierarchy, the caching agents sharing memory across a system bus and issuing memory access requests in accordance with a protocol wherein a line of a cache has a present state comprising one of a plurality of line states. The plurality of line states includes a modified (M) state, wherein a line of a first caching agent in M state has data which is more recent than any other copy in the system; an exclusive (E) state, wherein a first caching agent holding a line in E state is the only agent in the system that has a copy of the data in that cache line, so the first caching agent may modify the data in the cache line independently of the other agents coupled to the system bus; a shared (S) state, wherein a line in S state indicates that more than one of the agents has a copy of the data in the line; and an invalid (I) state indicating that the line does not exist in the cache. A read or a write to a line in I state results in a cache miss. The present invention associates states with lines and defines rules governing state transitions. State transitions depend on both processor-generated activities and activities by other bus agents, including other processors. Data consistency is guaranteed in systems having multiple levels of cache and shared memory and/or multiple active agents, such that no agent ever reads stale data and actions are serialized as needed.
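The abstract names the four line states but does not list the transition rules. The sketch below therefore uses the commonly taught MESI transitions as an assumption; the event names (`local_read`, `local_write`, `snoop_read`, `snoop_write`) and the functions `next_state` and `is_miss` are hypothetical and are not taken from the patent.

```python
from enum import Enum


class State(Enum):
    M = "modified"
    E = "exclusive"
    S = "shared"
    I = "invalid"


def is_miss(state):
    """A read or a write to a line in I state results in a cache miss."""
    return state is State.I


def next_state(state, event, others_have_copy=False):
    """Return the next line state under textbook MESI rules (an assumption)."""
    if event == "local_read":
        if state is State.I:
            # The miss is filled as shared or exclusive depending on other copies.
            return State.S if others_have_copy else State.E
        return state
    if event == "local_write":
        # From S or I the other copies must be invalidated first; from E no
        # bus traffic is needed because no other agent holds the line.
        return State.M
    if event == "snoop_read":
        # Another agent reads the line; an M line must supply or write back its data.
        return State.I if state is State.I else State.S
    if event == "snoop_write":
        # Another agent takes ownership, so the local copy becomes invalid.
        return State.I
    raise ValueError(f"unknown event: {event}")
```

For example, `next_state(State.E, "local_write")` yields `State.M` without involving other agents, which matches the abstract's description of the E state.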
Abstract:
A microprocessor having a bus for the transmission of data, an execution unit for processing data and instructions, a memory for storing data and instructions, and a write combining buffer for combining data of at least two write commands into a single data set, wherein the combined data set is transmitted over the bus in one clock cycle rather than two or more clock cycles. Bus traffic is thereby minimized. The write combining buffer comprises a single line having a 32-byte data portion, a tag portion, and a validity portion. The tag entry specifies the address corresponding to the data currently stored in the data portion. There is one valid bit corresponding to each byte of the data portion, which specifies whether that byte currently contains useful data. So long as subsequent write operations to the write combining buffer result in hits, the data is written to the buffer's data portion. But when a miss occurs, the line is reallocated, and the old data is written to the main memory. Thereupon, the valid bits are cleared, and the new data and its address are written to the write combining buffer.
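The hit and miss behavior of the single-line buffer can be illustrated with a short model. This is a sketch under assumptions: the tag is taken to be the line-aligned address divided by 32, main memory is modeled as a list of flushed records, and the class and attribute names are hypothetical.

```python
LINE_SIZE = 32  # bytes in the data portion, as stated in the abstract


class WriteCombiningBuffer:
    """Single-line write combining buffer with one valid bit per byte (toy model)."""

    def __init__(self):
        self.tag = None                      # line address of the buffered data
        self.data = bytearray(LINE_SIZE)     # 32-byte data portion
        self.valid = [False] * LINE_SIZE     # validity portion, one bit per byte
        self.flushed = []                    # stand-in for writes to main memory

    def write(self, address, payload):
        tag, offset = divmod(address, LINE_SIZE)
        if offset + len(payload) > LINE_SIZE:
            raise ValueError("write crosses the 32-byte line boundary")
        if self.tag is not None and tag != self.tag:
            self.flush()                     # miss: old data goes to memory first
        self.tag = tag
        for i, byte in enumerate(payload):   # hit or fresh line: merge the bytes
            self.data[offset + i] = byte
            self.valid[offset + i] = True

    def flush(self):
        """Write the valid bytes out as one combined data set and clear the valid bits."""
        if self.tag is None:
            return
        valid_bytes = {i: self.data[i] for i in range(LINE_SIZE) if self.valid[i]}
        self.flushed.append((self.tag, valid_bytes))
        self.valid = [False] * LINE_SIZE
        self.tag = None
```

Two writes to addresses 0x40 and 0x44 land in the same line and are combined; a later write to 0x80 misses, which flushes the combined data as a single record before the line is reallocated.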
Abstract:
The data cache unit includes a separate fill buffer and a separate write-back buffer. The fill buffer stores one or more cache lines for transfer into data cache banks of the data cache unit. The write-back buffer stores a single cache line evicted from the data cache banks prior to write-back to main memory. Circuitry is provided for transferring a cache line from the fill buffer into the data cache banks while simultaneously transferring a victim cache line from the data cache banks into the write-back buffer. This allows the overall replace operation to be performed in only a single clock cycle. In a particular implementation, the data cache unit is employed within a microprocessor capable of speculative and out-of-order processing of memory instructions. Moreover, the microprocessor is incorporated within a multiprocessor computer system wherein each microprocessor is capable of snooping the cache lines of data cache units of each other microprocessor. The data cache unit is also a non-blocking cache.
Abstract:
A data cache and a plurality of companion fill buffers having corresponding tag matching circuitry are provided to a computer system. Each fill buffer independently stores and tracks a replacement cache line being filled with data returning from main memory in response to a cache miss. When the cache fill is completed, the replacement cache line is output for the cache tag and data arrays of the data cache if the memory locations are cacheable and the cache line has not been snoop hit while the cache fill was in progress. Additionally, the fill buffers are organized and provided with sufficient address and data ports as well as selectors to allow the fill buffers to respond to subsequent processor loads and stores, and external snoops that hit their cache lines while the cache fills are in progress. As a result, the cache tag and data arrays of the data cache can continue to serve subsequent processor loads and stores, and external snoops, while one or more cache fills are in progress, without ever having to stall the processor.
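The non-blocking behavior described in this abstract can be sketched as a small model in which in-flight fills are tracked separately from the cache arrays. The model makes several simplifying assumptions: lines hold four chunks, every address is cacheable, and all class and method names are hypothetical.

```python
class FillBuffer:
    """Tracks one replacement cache line while its data returns from memory."""

    def __init__(self, tag, chunks_per_line=4):
        self.tag = tag
        self.chunks = [None] * chunks_per_line
        self.snoop_hit = False

    def complete(self):
        return all(chunk is not None for chunk in self.chunks)


class NonBlockingCache:
    def __init__(self):
        self.lines = {}          # stand-in for the cache tag and data arrays
        self.fill_buffers = []   # fills currently in progress

    def load(self, tag, index):
        if tag in self.lines:                    # ordinary cache hit
            return self.lines[tag][index]
        for fb in self.fill_buffers:             # tag match against fill buffers
            if fb.tag == tag:
                return fb.chunks[index]          # None if that chunk has not arrived
        self.fill_buffers.append(FillBuffer(tag))  # miss: start a fill, do not stall
        return None

    def snoop(self, tag):
        for fb in self.fill_buffers:
            if fb.tag == tag:
                fb.snoop_hit = True

    def memory_return(self, tag, index, value):
        for fb in list(self.fill_buffers):
            if fb.tag == tag:
                fb.chunks[index] = value
                if fb.complete():
                    self.fill_buffers.remove(fb)
                    if not fb.snoop_hit:         # snooped lines are not installed
                        self.lines[tag] = fb.chunks
```

A load that hits a fill buffer is answered from the partially filled line, and a line that was snoop hit during the fill is dropped instead of being written into the cache arrays.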
Abstract:
A method of preparing a circuit model for simulation comprises decomposing the circuit model having a number of latches into a plurality of extended latch boundary components and partitioning the plurality of extended latch boundary components. Decomposing and partitioning the circuit model may include decomposing hierarchical cells of the circuit model, and using a constructive bin-packing heuristic to partition the plurality of extended latch boundary components. The partitioned circuit model is compiled, and simulated on a uni-processor, a multi-processor, or a distributed processing computer system.
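The abstract does not say which constructive bin-packing heuristic is used. As one illustration, the sketch below applies first-fit decreasing, which is an assumed choice, to components that have already been decomposed and sized. The function name, the size measure, and the capacity parameter are all hypothetical.

```python
def partition_first_fit_decreasing(component_sizes, capacity):
    """Pack components into partitions of at most `capacity` using first-fit decreasing.

    component_sizes maps a component name to its size (for example a gate count).
    Returns a list of partitions, each a list of component names.
    """
    bins = []  # each bin: {"load": total size so far, "members": [names]}
    # Largest components first; ties are broken by name for a deterministic result.
    ordered = sorted(component_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        if size > capacity:
            raise ValueError(f"component {name!r} exceeds partition capacity")
        for b in bins:
            if b["load"] + size <= capacity:   # first partition with enough room
                b["members"].append(name)
                b["load"] += size
                break
        else:
            bins.append({"load": size, "members": [name]})  # open a new partition
    return [b["members"] for b in bins]
```

Each resulting partition could then be compiled and assigned to one processor, although the mapping from partitions to processors is outside what this sketch covers.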
Abstract:
Disclosed are embodiments for seamless, single-step, and speech-triggered transition of a host processor and/or computing device from a low functionality mode to a high functionality mode in which full vocabulary speech recognition can be accomplished. First audio samples are captured by a low power audio processor while the host processor is in a low functionality mode. The low power audio processor may identify a predetermined audio pattern. The low power audio processor, upon identifying the predetermined audio pattern, triggers the host processor to transition to a high functionality mode. An end portion of the first audio samples that follow an end-point of the predetermined audio pattern may be stored in system memory accessible by the host processor. Second audio samples are captured and stored with the end portion of the first audio samples. Once the host processor transitions to a high functionality mode, multi-channel full vocabulary speech recognition can be performed and functions can be executed based on detected speech interaction phrases.
Abstract:
A method and apparatus for performing distributed simulation is presented. According to an embodiment of the present invention, simulators are interfaced to a simulation backplane via simulator-dependent interfaces (SDI's). The simulators exchange messages via the simulation backplane and the SDI's. The SDI's convert the exchanged messages between a data format supported by the backplane and a data format supported by the simulator to which the interface is connected. By interfacing the simulators with the backplane via SDI's, the validation environment may be changed without reconfiguring the backplane.
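The role of the simulator-dependent interfaces can be illustrated with a minimal adapter sketch. Everything here is an assumption made for illustration: the backplane format is taken to be a Python dictionary, the two simulator formats are JSON text and `key=value;` text, and all class names are hypothetical.

```python
import json


class Simulator:
    """Stand-in simulator that only records the raw messages it receives."""

    def __init__(self):
        self.inbox = []

    def receive(self, raw):
        self.inbox.append(raw)


class JsonSDI:
    """Interface for a simulator that speaks JSON text."""

    def to_backplane(self, raw):
        return json.loads(raw)

    def from_backplane(self, message):
        return json.dumps(message, sort_keys=True)


class KeyValueSDI:
    """Interface for a simulator that speaks 'key=value;key=value' text."""

    def to_backplane(self, raw):
        return dict(pair.split("=", 1) for pair in raw.split(";") if pair)

    def from_backplane(self, message):
        return ";".join(f"{k}={v}" for k, v in sorted(message.items()))


class Backplane:
    """Routes messages in one common format; knows nothing about simulator formats."""

    def __init__(self):
        self.ports = {}

    def attach(self, name, simulator, sdi):
        self.ports[name] = (simulator, sdi)

    def send(self, source, target, raw):
        _, source_sdi = self.ports[source]
        target_sim, target_sdi = self.ports[target]
        message = source_sdi.to_backplane(raw)            # simulator format -> backplane
        target_sim.receive(target_sdi.from_backplane(message))  # backplane -> simulator
```

Because all format knowledge lives in the interface classes, attaching a different simulator only requires a new interface, and `Backplane` itself stays unchanged.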
Abstract:
In one embodiment, a memory cell having a first port and a second port is provided. A first word line is associated with the first port, and a second word line is associated with the second port. A first driver is associated with the first word line, and a second driver is associated with the second word line. A decoder is associated with the first and second drivers.