摘要:
A method and system for messaging between processors and co-processors connected through a bus. The method permits a multi-thread system processor to request the services of a processor or co-processor located on the bus. Message control blocks are stored in a memory which identify the physical address of the target processor, as well as a memory location in the memory dedicated to the thread requesting the service. When the system processor requests service of a processor or co-processor, a DCR command is created pointing to the message control block. A message is built from information contained in the message control block or transferred to the processor or co-processor. The return address for the processor or co-processor message is concatenated with the thread number, so that the processor or co-processor can create a return message specifically identifying memory space dedicated to the requesting thread for storage of the response message.
摘要:
Hazard detection is simplified by converting a conditional instruction, operative to perform an operation if a condition is satisfied, into an emissary instruction operative to evaluate the condition and an unconditional base instruction operative to perform the operation. The emissary instruction is executed, while the base instruction is halted. The emissary instruction evaluates the condition and reports the condition evaluation back to the base instruction. Based on the condition evaluation, the base instruction is either launched into the pipeline for execution, or it is discarded (or a NOP, or null instruction, substituted for it). In either case, the dependencies of following instructions may be resolved.
摘要:
A register file is disclosed. The register file includes a plurality of registers and a decoder. The decoder may be configured to receive an address for any one of the registers, and disable a read operation to the addressed register if data in the addressed register is invalid.
摘要:
Data from a source domain operating at a first data rate is transferred to a FIFO in another domain operating at a different data rate. The FIFO buffers data before transfer to a sink for further processing or storage. A source side counter tracks space available in the FIFO. In disclosed examples, the initial counter value corresponds to FIFO depth. The counter decrements in response to a data ready signal from the source domain, without delay. The counter increments in response to signaling from the sink domain of a read of data off the FIFO. Hence, incrementing is subject to the signaling latency between domains. The source may send one more beat of data when the counter indicates the FIFO is full. The last beat of data is continuously sent from the source until it is indicated that a FIFO position became available; effectively providing one more FIFO position.
摘要:
Techniques for ensuring a synchronized predecoding of an instruction string are disclosed. The instruction string contains instructions from a variable length instruction set and embedded data. One technique includes defining a granule to be equal to the smallest length instruction in the instruction set and defining the number of granules that compose the longest length instruction in the instruction set to be MAX. The technique further includes determining the end of an embedded data segment, when a program is compiled or assembled into the instruction string and inserting a padding of length, MAX−1, into the instruction string to the end of the embedded data. Upon predecoding of the padded instruction string, a predecoder maintains synchronization with the instructions in the padded instruction string even if embedded data is coincidentally encoded to resemble an existing instruction in the variable length instruction set.
摘要:
A method for optimizing throughput in a microprocessor that is capable of processing multiple threads of instructions simultaneously. Instruction issue logic is provided between the input buffers and the pipeline of the microprocessor. The instruction issue logic speculatively issues instructions from a given thread based on the probability that the required operands will be available when the instruction reaches the stage in the pipeline where they are required. Issue of an instruction is blocked if the current pipeline conditions indicate that there is a significant probability that the instruction will need to stall in a shared resource to wait for operands. Once the probability that the instruction will stall is below a certain threshold, based on current pipeline conditions, the instruction is allowed to issue.
摘要:
Ways of a cache memory system are designated as being in one of three subsets: a normal subset, a transient subset, and a locked subset. The designation of the respective subsets is provided by a normal subset floor index, a transient subset floor index, and a transient subset ceiling index. The respective indexes are used to select the subset into which new entries are copied from main memory as a result of a cache miss. If the new entry is designated as being characterized by normal program behavior, it is copied into the normal subset in the cache. If the new entry is designated as being characterized by transient program behavior, it is copied into the transient subset in the cache. The relationship between the normal subset and the transient subset is programmable. For example, the normal and the transient subsets may include at least one common way of the cache memory or the transient subset may be completely included in the normal subset or completely separate therefrom.
摘要:
A process and system for transferring data including at least one slave device connected to at least one master device through an arbiter device. The master and slave devices are connected by a single address bus, a write data bus and a read data bus. The arbiter device receives requests for data transfers from the master devices and selectively transmits the requests to the slave devices. The master devices and the slave devices are further connected by a plurality of transfer qualifier signals which may specify predetermined characteristics of the requested data transfers. Control signals are also communicated between the arbiter device and the slave devices to allow appropriate slave devices to latch addresses of requested second transfers during the pendency of current or primary data transfers so as to obviate an address transfer latency typically required for the second transfer. The design is configured to advantageously function in mixed systems which may include address-pipelining and non-address-pipelining slave devices.
摘要:
A system and method for tracing program code within a processor having an embedded cache memory. The non-invasive tracing technique minimizes the need for trace information to be broadcast externally. The tracing technique monitors changes in instruction flow from the normal execution stream of the code. The tracing technique monitors the updating of processor branch target register contents in order to monitor branch target flow of the code. Tracing of the program flow includes tracing instructions both before and after a trace triggering event. The implementation of periodic synchronizing events enables the tracing of instructions occurring before and after a triggering event.
摘要:
A system and method for tracing program code within a processor having an embedded cache memory. The non-invasive tracing technique minimizes the need for trace information to be broadcast externally. The tracing technique monitors changes in instruction flow from the normal execution stream of the code. The tracing technique monitors the updating of processor branch target register contents in order to monitor branch target flow of the code. A FIFO and serial logic circuitry is utilized to minimize the number of chip pins required to broadcast the information from the chip. The tracing technique utilizes instruction and data breakpoint debug functions to signal an external trace tool that a trace event has occurred.