Abstract:
A method for providing endianness control in a data processing system includes initiating an access which accesses a peripheral, providing a first endianness control that corresponds to the peripheral, and completing the access using the endianness control to affect the endianness order of the information transferred during the access. In one embodiment, the first endianness control overrides a default endianness corresponding to the access. The default endianness may be provided by a master endianness control corresponding to a master requesting the current access. A data processing system includes a first bus master, first and second peripherals, first endianness control corresponding to the first peripheral and second endianness control corresponding to the second peripheral, and control circuitry which uses the first endianness control to control endianness for an access between the first bus master and the first peripheral. In one embodiment, the data processing system may include multiple masters.
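As a rough illustration of this mechanism, the C sketch below models an access in which a per-peripheral endianness control, when enabled, overrides the requesting master's default ordering; the structure fields, 32-bit access width, and byte-swap helper are assumptions for illustration only, not the claimed circuitry.

    /* Minimal sketch of per-peripheral endianness override (names assumed). */
    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { LITTLE_ENDIAN_MODE, BIG_ENDIAN_MODE } endianness_t;

    typedef struct {
        bool         override_enable;  /* per-peripheral endianness control present? */
        endianness_t endianness;       /* ordering used when the override is enabled */
    } peripheral_ctrl_t;

    static uint32_t byte_swap32(uint32_t v)
    {
        return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
               ((v << 8) & 0x00FF0000u) | (v << 24);
    }

    /* Complete a 32-bit access: the peripheral's endianness control overrides
     * the master's default endianness when it is enabled. */
    uint32_t complete_access(uint32_t data,
                             endianness_t master_default,
                             const peripheral_ctrl_t *periph)
    {
        endianness_t effective = periph->override_enable ? periph->endianness
                                                         : master_default;
        /* Swap bytes only when the effective ordering differs from the default. */
        return (effective != master_default) ? byte_swap32(data) : data;
    }

    int main(void)
    {
        peripheral_ctrl_t uart = { .override_enable = true,
                                   .endianness = BIG_ENDIAN_MODE };
        printf("0x%08X\n", complete_access(0x11223344u, LITTLE_ENDIAN_MODE, &uart));
        return 0;
    }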
Abstract:
In a data processing system, a processor including processing logic performs data processing. An address translator coupled to the processing logic, and a corresponding method, perform address translation. The address translator receives a logical address and converts it to both a physical address and one or more address attributes. Bypass circuitry coupled to the address translator selectively provides the received logical address itself as the translated address. To speed up memory address translation, the logical address is selectively provided as the translated address before the one or more address attributes associated with it are provided.
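A minimal C model of the bypass idea follows: when the bypass path is selected, the logical address is forwarded unchanged as the translated address right away, and the attributes are resolved as a separate, later step. The function names, example mapping offset, and attribute fields are assumptions made for this sketch.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { bool cacheable; bool writable; } attributes_t;

    /* Early phase: when bypass is selected, the logical address is forwarded
     * unchanged as the translated address, before any attribute lookup. */
    uint32_t translated_address_early(uint32_t logical, bool bypass)
    {
        if (bypass)
            return logical;                /* bypass circuitry path */
        return logical + 0x40000000u;      /* placeholder full translation */
    }

    /* Later phase: the attributes associated with the logical address become
     * available after the translated address has already been driven. */
    attributes_t resolve_attributes(uint32_t logical)
    {
        attributes_t a = { .cacheable = (logical < 0x20000000u),
                           .writable  = true };
        return a;
    }

    int main(void)
    {
        uint32_t pa = translated_address_early(0x00001000u, true);
        attributes_t a = resolve_attributes(0x00001000u);
        printf("PA=0x%08X cacheable=%d\n", pa, a.cacheable);
        return 0;
    }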
Abstract:
Various load and store instructions may be used to transfer multiple vector elements between registers in a register file and memory. A cnt parameter may be used to indicate a total number of elements to be transferred to or from memory, and an rcnt parameter may be used to indicate a maximum number of vector elements that may be transferred to or from a single register within a register file. Also, the instructions may use a variety of different addressing modes. The memory element size may be specified independently from the register element size such that source and destination sizes may differ within an instruction. With some instructions, a vector stream may be initiated and conditionally enqueued or dequeued. Truncation or rounding fields may be provided such that source data elements may be truncated or rounded when transferred. Also, source data elements may be sign- or unsigned-extended when transferred.
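The software model below sketches how cnt and rcnt might interact for a vector load that widens 8-bit memory elements into 16-bit register elements with sign or zero extension; the register-file dimensions, element widths, and function name are illustrative assumptions rather than the actual instruction encoding.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_REGS  8
    #define REG_ELEMS 4   /* 16-bit register elements per register (assumed) */

    static int16_t regfile[NUM_REGS][REG_ELEMS];

    /* Load cnt 8-bit memory elements, widening each to a 16-bit register
     * element (sign- or zero-extended), placing at most rcnt elements per
     * register starting at register first_reg. */
    void vector_load_s8_to_s16(const int8_t *mem, int cnt, int rcnt,
                               int first_reg, int sign_extend)
    {
        int reg = first_reg, slot = 0;
        for (int i = 0; i < cnt; i++) {
            int16_t widened = sign_extend ? (int16_t)mem[i]
                                          : (int16_t)(uint8_t)mem[i];
            regfile[reg][slot] = widened;
            if (++slot == rcnt) {       /* at most rcnt elements per register */
                slot = 0;
                reg++;
            }
        }
    }

    int main(void)
    {
        int8_t src[6] = { -1, 2, -3, 4, -5, 6 };
        vector_load_s8_to_s16(src, 6, 4, 0, 1);   /* cnt=6, rcnt=4 */
        printf("%d %d\n", regfile[0][0], regfile[1][0]);  /* -1, -5 */
        return 0;
    }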
Abstract:
A coprocessor (14) may be used to perform one or more specialized operations that can be off-loaded from a primary or general purpose processor (12). It is important to allow efficient communication and interfacing between the processor (12) and the coprocessor (14). In one embodiment, a coprocessor (14) generates and provides instructions (200, 220) to an instruction pipe (20) in the processor (12). Because the instructions generated by the coprocessor (14) are part of the standard instruction set of the processor (12), coherency of the cache (70) is easy to maintain. Also, circuitry (102) in the coprocessor (14) may perform an operation on data while circuitry (106) in the coprocessor (14) is concurrently generating processor instructions (200, 220).
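The toy C model below illustrates the interface in spirit: a coprocessor pushes instructions drawn from the processor's own instruction set into a small instruction pipe, and the processor drains and executes them. The queue layout and the example opcode encoding are invented for illustration.

    #include <stdint.h>
    #include <stdio.h>

    #define PIPE_DEPTH 8

    typedef struct {
        uint32_t slots[PIPE_DEPTH];
        int      head, tail, count;
    } instr_pipe_t;

    static int pipe_push(instr_pipe_t *p, uint32_t insn)
    {
        if (p->count == PIPE_DEPTH) return 0;     /* pipe full */
        p->slots[p->tail] = insn;
        p->tail = (p->tail + 1) % PIPE_DEPTH;
        p->count++;
        return 1;
    }

    static int pipe_pop(instr_pipe_t *p, uint32_t *insn)
    {
        if (p->count == 0) return 0;
        *insn = p->slots[p->head];
        p->head = (p->head + 1) % PIPE_DEPTH;
        p->count--;
        return 1;
    }

    /* The coprocessor emits instructions from the processor's own instruction
     * set (here a made-up "store result" encoding), so the processor's normal
     * cache-coherency handling applies to them unchanged. */
    void coprocessor_emit_store(instr_pipe_t *pipe, uint32_t addr)
    {
        uint32_t store_insn = 0xE5800000u | (addr & 0xFFFu);  /* illustrative */
        pipe_push(pipe, store_insn);
    }

    int main(void)
    {
        instr_pipe_t pipe = { .head = 0, .tail = 0, .count = 0 };
        coprocessor_emit_store(&pipe, 0x040);
        uint32_t insn;
        while (pipe_pop(&pipe, &insn))
            printf("processor executes 0x%08X\n", insn);
        return 0;
    }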
Abstract:
The present invention relates generally to interfacing a processor with at least one coprocessor. One embodiment relates to a processor having a set of broadcast specifiers which it uses to selectively broadcast an operand that is being written to a register within the processor onto a coprocessor communication bus. Each broadcast specifier may include a broadcast indicator corresponding to each general purpose register of the processor. An alternate embodiment may also use the concept of broadcast regions, where each broadcast region has a corresponding broadcast specifier and one broadcast specifier may correspond to multiple broadcast regions. Alternatively, in one embodiment, the processor may use broadcast regions independent of the broadcast specifiers, where the coprocessor is able to alter its functionality in response to the current broadcast region. In one embodiment, the processor may provide a region specifier via the coprocessor communication bus to indicate the current broadcast region.
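A short C sketch of the broadcast-specifier check follows: one indicator bit per general purpose register decides whether a register write-back is also driven onto the coprocessor communication bus. The 32-register width, mask layout, and bus hook are assumptions for this example.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_GPRS 32

    typedef struct {
        uint32_t broadcast_mask;   /* one broadcast indicator bit per GPR */
    } broadcast_specifier_t;

    static uint32_t gprs[NUM_GPRS];

    /* Hypothetical hook standing in for the coprocessor communication bus. */
    static void coprocessor_bus_write(int reg, uint32_t value)
    {
        printf("broadcast r%d = 0x%08X\n", reg, value);
    }

    /* Register write-back: broadcast to the coprocessor only when the
     * specifier bit for this register is set. */
    void write_gpr(const broadcast_specifier_t *spec, int reg, uint32_t value)
    {
        gprs[reg] = value;
        if (spec->broadcast_mask & (1u << reg))
            coprocessor_bus_write(reg, value);
    }

    int main(void)
    {
        broadcast_specifier_t spec = { .broadcast_mask = (1u << 3) | (1u << 7) };
        write_gpr(&spec, 3, 0xDEADBEEFu);   /* broadcast */
        write_gpr(&spec, 4, 0x12345678u);   /* not broadcast */
        return 0;
    }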
Abstract:
A read allocation indicator (e.g. read allocation signal 30) is provided to storage circuitry (e.g. cache 22) to selectively determine whether read allocation will be performed for a read access. Read allocation may include modification of the information content of the cache (22) and/or modification of the read replacement algorithm state implemented by the read allocation circuitry (70) in the cache (22). For certain types of debug operations, it may be very useful to provide a read allocation indicator that ensures that no unwanted modifications are made to the storage circuitry during a read access. Other types of debug operations may instead want the storage circuitry to be modified in the standard manner when a read access occurs.
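The simplified cache model below shows how a read allocation indicator can suppress both line allocation and replacement-state updates so that a debug read leaves no footprint; the direct-mapped organization and the replacement-state stand-in are assumptions chosen for brevity.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define NUM_LINES 16

    typedef struct {
        bool     valid[NUM_LINES];
        uint32_t tag[NUM_LINES];
        uint32_t data[NUM_LINES];
        uint32_t lru_state;        /* stand-in for replacement algorithm state */
    } cache_t;

    static uint32_t memory_read(uint32_t addr) { return addr ^ 0xA5A5A5A5u; }

    /* read_allocate == false keeps the cache contents and replacement state
     * untouched on a miss, so a debug read leaves no footprint. */
    uint32_t cache_read(cache_t *c, uint32_t addr, bool read_allocate)
    {
        uint32_t index = (addr >> 2) % NUM_LINES;
        uint32_t tag   = addr / (NUM_LINES * 4);

        if (c->valid[index] && c->tag[index] == tag) {
            if (read_allocate)
                c->lru_state ^= (1u << index);   /* update replacement state */
            return c->data[index];               /* hit */
        }

        uint32_t value = memory_read(addr);      /* miss: fetch from memory */
        if (read_allocate) {                     /* allocate only when allowed */
            c->valid[index] = true;
            c->tag[index]   = tag;
            c->data[index]  = value;
            c->lru_state   ^= (1u << index);
        }
        return value;
    }

    int main(void)
    {
        cache_t c = { 0 };
        printf("0x%08X\n", cache_read(&c, 0x100, false)); /* debug read, no allocate */
        printf("valid=%d\n", c.valid[(0x100 >> 2) % NUM_LINES]);
        return 0;
    }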
Abstract:
A system (100) having a plurality of bus masters (111–113) coupled to an arbiter (150) is disclosed. The arbiter (150) is coupled to a first storage location (151) and a second storage location (152), where the first and second storage locations store bus master parking information for a system bus (141). The arbiter (150) receives a parking context indicator (131) that is used to select which of the first and second storage locations (151, 152) provides bus master parking information to the arbiter (150).
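As an illustration, the C fragment below models context-selectable bus parking: two storage locations hold alternative parking choices and a parking context indicator picks which one the arbiter uses when no master is requesting the bus. The field names and the simple fixed-priority grant are assumptions.

    #include <stdio.h>

    typedef struct {
        int parking_master_ctx0;   /* first storage location  (cf. 151) */
        int parking_master_ctx1;   /* second storage location (cf. 152) */
    } parking_regs_t;

    /* With no active requests, grant the bus to the parked master selected
     * by the parking context indicator. */
    int arbitrate(const parking_regs_t *regs, int parking_context,
                  const int *requests, int num_masters)
    {
        for (int m = 0; m < num_masters; m++)
            if (requests[m])
                return m;                          /* normal arbitration */
        return parking_context ? regs->parking_master_ctx1
                               : regs->parking_master_ctx0;
    }

    int main(void)
    {
        parking_regs_t regs = { .parking_master_ctx0 = 0,
                                .parking_master_ctx1 = 2 };
        int requests[3] = { 0, 0, 0 };             /* idle bus */
        printf("parked on master %d\n", arbitrate(&regs, 1, requests, 3));
        return 0;
    }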
Abstract:
The present invention relates generally to data processors and more specifically, to data processors having an adaptive priority controller. One embodiment relates to a method for prioritizing requests in a data processor (12) having a bus interface unit (32). The method includes receiving a first request from a first bus requesting resource (e.g. 30) and a second request from a second bus requesting resource (e.g. 28), and using a threshold corresponding to the first or second bus requesting resource to prioritize the first and second requests. The first and second bus requesting resources may be a push buffer (28) for a cache, a write buffer (30), or an instruction prefetch buffer (24). According to one embodiment, the bus interface unit (32) includes a priority controller (34) that receives the first and second requests, assigns the priority, and stores the threshold in a threshold register (66). The priority controller (34) may also include one or more threshold registers (66), subthreshold registers (68), and control registers (70).
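The sketch below illustrates one way such a threshold could drive the priority decision between two bus-requesting resources, for example a write buffer and a cache push buffer: a resource whose occupancy has reached its programmed threshold is treated as urgent. The register names, the comparison rule, and the tie-break are assumptions, not the controller as claimed.

    #include <stdio.h>

    typedef struct {
        int occupancy;     /* entries currently held by the resource */
        int threshold;     /* programmed threshold (cf. threshold register 66) */
        int base_priority; /* static priority used when below threshold */
    } requestor_t;

    /* Returns 0 to grant the first requestor, 1 to grant the second. */
    int prioritize(const requestor_t *a, const requestor_t *b)
    {
        int a_urgent = a->occupancy >= a->threshold;
        int b_urgent = b->occupancy >= b->threshold;
        if (a_urgent != b_urgent)
            return b_urgent;                       /* urgency wins outright */
        return (b->base_priority > a->base_priority);
    }

    int main(void)
    {
        requestor_t write_buffer = { .occupancy = 6, .threshold = 4, .base_priority = 1 };
        requestor_t push_buffer  = { .occupancy = 1, .threshold = 4, .base_priority = 2 };
        printf("grant %s\n", prioritize(&write_buffer, &push_buffer)
                                 ? "push buffer" : "write buffer");
        return 0;
    }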
Abstract:
Embodiments of the present invention relate generally to data processing systems having instruction folding and methods for controlling execution of a program loop. One embodiment includes detecting execution of a program loop and prefetching data in response to detecting execution of the program loop. Another embodiment includes detecting execution of a program loop and scanning the program loop for remote independent instructions or data dependencies during at least one iteration. Another embodiment includes detecting execution of a program loop and storing intra-loop data dependency information in a dependency bit vector, and using the dependency bit vector to select at least one local independent instruction available for folding. One embodiment includes an instruction folding unit comprising a first controller, a second controller, and a storage unit coupled to the second controller. Another embodiment includes a data processing system comprising a validation counter and a storage unit coupled to the validation counter where the storage unit includes a dependency bit vector corresponding to instructions of a program loop.
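A compact C sketch of the dependency-bit-vector idea follows: a set bit marks an instruction of the detected loop that carries an intra-loop data dependency, and clear bits identify candidates for folding. The register-based dependency test and the selection rule are simplifying assumptions.

    #include <stdint.h>
    #include <stdio.h>

    /* Build the dependency bit vector: instruction i depends on an earlier
     * instruction j if it reads a register that j writes. */
    uint32_t build_dependency_vector(const int *dest_reg, const int *src_reg, int n)
    {
        uint32_t dep = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                if (src_reg[i] == dest_reg[j])
                    dep |= 1u << i;
        return dep;
    }

    /* Pick the first locally independent instruction as a folding candidate;
     * returns -1 if every instruction carries a dependency. */
    int select_foldable(uint32_t dep_vector, int n)
    {
        for (int i = 0; i < n; i++)
            if (!(dep_vector & (1u << i)))
                return i;
        return -1;
    }

    int main(void)
    {
        /* Three-instruction loop body: i1 writes r2, i2 reads r2 (dependent),
         * i3 uses unrelated registers (independent). */
        int dest_reg[3] = { 2, 3, 5 };
        int src_reg[3]  = { 1, 2, 4 };
        uint32_t dep = build_dependency_vector(dest_reg, src_reg, 3);
        printf("dep=0x%X foldable=%d\n", dep, select_foldable(dep, 3));
        return 0;
    }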
Abstract:
Embodiments of the present invention relate to instruction fetching in data processing systems. One aspect involves a data processor (202) to execute instructions and to fetch instructions from a memory (208) according to a fetch size. This data processor (202) comprises a first input (212) to receive instructions, control logic (402) to decode the instructions, and an instruction pipeline (400) coupled to the first input (212) and the control logic (402). The instruction pipeline (400) is responsive to a first signal (214) to set the fetch size to one of a first size and a second size. The data processor (202) therefore allows the instruction fetch policy to be altered based on the characteristics of an accessed device in order to achieve improved performance.
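The small model below shows the fetch-size selection in the abstract: a signal reflecting the characteristics of the accessed device chooses between two fetch sizes before the next fetch is issued. The two sizes and the shape of the signal are assumptions for illustration.

    #include <stdio.h>
    #include <stdbool.h>

    enum { FETCH_SIZE_SMALL = 4, FETCH_SIZE_LARGE = 8 };  /* bytes per fetch (assumed) */

    /* The pipeline samples the device-characteristic signal and sets the
     * fetch size accordingly before issuing the next fetch. */
    int select_fetch_size(bool device_prefers_small)
    {
        return device_prefers_small ? FETCH_SIZE_SMALL : FETCH_SIZE_LARGE;
    }

    int issue_fetch(unsigned pc, bool device_prefers_small)
    {
        int size = select_fetch_size(device_prefers_small);
        printf("fetch %d bytes at 0x%08X\n", size, pc);
        return size;
    }

    int main(void)
    {
        issue_fetch(0x00000000u, false);   /* wide device: large fetch   */
        issue_fetch(0x40000000u, true);    /* narrow device: small fetch */
        return 0;
    }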