Abstract:
A dynamically configured memory controller prevents speculative memory accesses to non-well-behaved memory. Such dynamic configuration of the memory controller may be accomplished by providing the memory controller with guard information associated with memory requests. The memory controller may then prevent a speculative memory access when the guard information indicates that the memory request targets non-well-behaved memory. The guard information provided to the memory controller may include guard information associated with each individual memory request.
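A minimal sketch in C of how such a guard check might be modeled; the names MemRequest, guarded and may_issue are illustrative assumptions, not terms from the abstract:

#include <stdbool.h>
#include <stdint.h>

/* Illustrative model: each memory request carries guard information
   indicating whether its target region is well behaved (e.g. ordinary
   cacheable memory) or not (e.g. memory-mapped I/O with side effects). */
typedef struct {
    uint64_t addr;
    bool     speculative;   /* request issued down a speculative path */
    bool     guarded;       /* guard info: target is non-well-behaved */
} MemRequest;

/* Returns true if the controller may issue the access now. A speculative
   request to guarded (non-well-behaved) memory is held back until the
   speculation is resolved. */
bool may_issue(const MemRequest *req) {
    if (req->speculative && req->guarded)
        return false;   /* prevent speculative access to guarded memory */
    return true;
}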
Abstract:
Bus performance in a computer system having multiple devices accessing a common shared bus may be improved by providing a flexible bus arbiter. Bus access is controlled using a bus arbiter which is operationally connected to each of the devices. Each device has a programmable fixed priority level and a dynamic priority level associated with it. The dynamic priority level comprises an arbiter dynamic priority level and a master dynamic priority level. Access to the bus by a device is controlled based on the combination of the programmable fixed priority level and the dynamic priority level associated with each device. While the programmable fixed priority level and the arbiter dynamic priority level, as set by the arbiter, are not controlled by the master, the master dynamic priority level is controlled by the master. If master dynamic priority is enabled, it overrides the arbiter dynamic priority level. If master dynamic priority is not enabled but arbiter dynamic priority is enabled, the arbiter dynamic priority level overrides the programmable fixed priority level.
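A minimal sketch in C of the priority-resolution rules stated above; the type and field names are illustrative assumptions:

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint8_t fixed_prio;          /* programmable fixed priority     */
    uint8_t arbiter_dyn_prio;    /* dynamic priority set by arbiter */
    uint8_t master_dyn_prio;     /* dynamic priority set by master  */
    bool    arbiter_dyn_enable;
    bool    master_dyn_enable;
} Device;

/* Master dynamic priority, when enabled, overrides the arbiter dynamic
   priority; the arbiter dynamic priority, when enabled, overrides the
   programmable fixed priority; otherwise the fixed priority applies. */
uint8_t effective_priority(const Device *d) {
    if (d->master_dyn_enable)  return d->master_dyn_prio;
    if (d->arbiter_dyn_enable) return d->arbiter_dyn_prio;
    return d->fixed_prio;
}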
Abstract:
A communication system and method of communicating including a slave function connected to a master function by a single address bus, a write data bus and a read data bus so as to allow overlapping multiple-cycle read and write operations between the master function and the slave function. Preferably, the communication system includes a plurality of slave functions connected to a master function by the single address bus, the write data bus and the read data bus. A plurality of master functions may be connected to the slave functions through a bus arbiter connected to the plurality of master functions by an address bus, a write data bus and a read data bus for each master function. The bus arbiter receives requests for communication operations from the plurality of master functions and selectively transmits the communication operations to the slave functions. In a preferred embodiment of the present invention, the master function and the slave function are further connected by a plurality of transfer qualifier signals which may specify whether the operation is a read or a write operation, the size of the transfer, the direction of the transfer or the type of transfer, so as to further facilitate multiple-cycle transfers with a single address specified on the single address bus.
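A minimal sketch in C of the transfer qualifiers that might accompany a single address on the shared address bus, allowing a multiple-cycle operation to be described once; the names BusOperation and XferType are illustrative assumptions:

#include <stdint.h>
#include <stdbool.h>

typedef enum { XFER_SINGLE, XFER_BURST, XFER_LINE } XferType;

/* One address plus qualifiers describes a whole multiple-cycle transfer;
   the separate read and write data buses then let a read overlap a write. */
typedef struct {
    uint32_t addr;        /* one address on the single address bus  */
    bool     is_write;    /* read or write operation                */
    uint8_t  size;        /* size of each data beat, in bytes       */
    bool     ascending;   /* direction of the transfer              */
    XferType type;        /* type of transfer                       */
    uint16_t beats;       /* number of data cycles for this address */
} BusOperation;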
Abstract:
A process and system for transferring data including at least one slave device connected to at least one master device through an arbiter device. The master and slave devices are connected by a single address bus, a write data bus and a read data bus. The arbiter device receives requests for data transfers from the master devices and selectively transmits the requests to the slave devices. The master devices and the slave devices are further connected by a plurality of transfer qualifier signals which may specify predetermined characteristics of the requested data transfers. Control signals are also communicated between the arbiter device and the slave devices to allow appropriate slave devices to latch the addresses of requested secondary transfers during the pendency of current or primary data transfers, so as to obviate the address transfer latency typically required for the secondary transfer. The design is configured to function advantageously in mixed systems which may include both address-pipelining and non-address-pipelining slave devices.
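A minimal sketch in C of the address-pipelining handshake, assuming illustrative names (Slave, present_next_address) not taken from the abstract:

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool     supports_pipelining;
    bool     busy;            /* primary transfer still in progress  */
    uint32_t latched_addr;    /* address latched for second transfer */
    bool     addr_pending;
} Slave;

/* Control-signal handshake: returns true if the slave latched the next
   address during the pendency of the current transfer, so the secondary
   transfer pays no separate address cycle. Non-pipelining slaves in a
   mixed system simply decline until the bus is free. */
bool present_next_address(Slave *s, uint32_t next_addr) {
    if (s->busy && !s->supports_pipelining)
        return false;            /* non-pipelining slave: must wait     */
    s->latched_addr = next_addr; /* latch now; data phase follows later */
    s->addr_pending = true;
    return true;
}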
Abstract:
Bus performance in a computer system having multiple devices accessing a common shared bus may be improved by increasing throughput and decreasing latency while accounting for dynamic changes in bus usage. Devices submit a priority level along with each bus request to a bus controller. Upon receiving multiple requests, an arbiter of the bus controller compares the priority levels associated with the different bus requests and grants control of the bus to the device having the highest priority level. During each cycle that a device has control of the bus, a feedback logic circuit of the bus controller determines whether other bus requests are pending and, if so, determines the highest pending request priority level. Signals corresponding to the results of these determinations are fed back to each device. The device having control of the bus uses the combination of the currently pending request priority level and the device's own latency timer to determine whether it should maintain or relinquish control of the bus.
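A minimal sketch in C of one plausible keep-or-relinquish policy built from the fed-back signals and the device's latency timer; the names and the exact policy are assumptions:

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool    request_pending;   /* fed back by the bus controller */
    uint8_t highest_pending;   /* highest pending priority level */
} Feedback;

/* Called each cycle by the device currently granted the bus. It keeps
   the bus while its latency timer still guarantees it tenure, or while
   no pending request outranks its own priority; otherwise it yields. */
bool keep_bus(uint8_t my_priority, uint32_t latency_timer,
              const Feedback *fb) {
    if (!fb->request_pending)
        return true;                       /* nobody else wants the bus */
    if (latency_timer > 0)
        return true;                       /* guaranteed tenure remains */
    return fb->highest_pending <= my_priority;
}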
Abstract:
A hierarchical instruction set architecture (ISA) provides pluggable instruction set capability and support for array processors. The term pluggable is used from the programmer's viewpoint and refers to groups of instructions that can easily be added to a processor architecture for code density and performance enhancements. One specific aspect addressed herein is a unique compacted instruction set which gives the programmer the ability to dynamically create a set of compacted instructions on a task-by-task basis for the primary purpose of improving control and parallel code density. These compacted instructions are parallelizable in that they are not restricted to control code applications but can be executed in the processing elements (PEs) of an array processor. The ManArray family of processors is designed for this dynamic compacted instruction set capability and also supports a scalable array of one to N PEs. In addition, the ManArray ISA is defined as a hierarchy of ISAs which allows for future growth in instruction capability and supports the packing of multiple instructions within a hierarchy of instructions.
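A minimal sketch in C of how a task-loaded compacted-instruction expansion might work; the 16-bit encoding, table size and all names are purely illustrative assumptions, not the ManArray encoding:

#include <stdint.h>

#define TBL_ENTRIES 256

/* Per-task state: the programmer loads an expansion table before the
   task runs, defining what each compacted opcode means for that task. */
typedef struct {
    uint32_t opcode_tbl[TBL_ENTRIES];
} CompactionState;

/* Expand a 16-bit compacted instruction into a full 32-bit instruction:
   the high byte indexes the task's table, the low byte supplies an
   operand field merged into the expanded encoding. Density improves
   because the common case occupies half the instruction width. */
uint32_t expand(const CompactionState *cs, uint16_t compacted) {
    uint32_t base = cs->opcode_tbl[compacted >> 8];
    return base | (compacted & 0xFFu);
}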
Abstract:
A SIMD machine employing a plurality of parallel processing elements (PEs) in which communication hazards are eliminated in an efficient manner. An indirect Very Long Instruction Word instruction memory (VIM) is employed along with execute and delimiter instructions. A masking mechanism may be employed to control which PEs have their VIMs loaded. Further, a receive model of operation is preferably employed. In one aspect, each PE operates to control a switch that selects which PE it receives from. The present invention provides an improved machine organization for the execution of parallel algorithms that reduces hardware cost and complexity while maintaining the best characteristics of both SIMD and MIMD machines and minimizing communication latency. This invention brings a level of MIMD computational autonomy to SIMD indirect Very Long Instruction Word (iVLIW) processing elements while maintaining the single thread of control used in the SIMD machine organization. Consequently, the term Synchronous-MIMD (SMIMD) is used to describe the present approach.
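A minimal sketch in C of the receive model; the array size and names are illustrative assumptions:

#include <stdint.h>

#define NUM_PES 4

typedef struct {
    uint32_t out_reg;   /* value this PE drives onto the network      */
    int      recv_from; /* switch setting: index of the source PE     */
} PE;

/* One synchronous communication step: every PE latches the output of
   the PE its own switch selects. Because each PE chooses its source
   rather than its destination, no two senders can collide on the same
   receiver, eliminating this class of communication hazard. */
void comm_step(PE pes[NUM_PES], uint32_t in_regs[NUM_PES]) {
    for (int i = 0; i < NUM_PES; i++)
        in_regs[i] = pes[pes[i].recv_from].out_reg;
}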
Abstract:
A ManArray processor pipeline design addresses the indirect VLIW memory access problem without increasing branch latency by providing a dynamically reconfigurable instruction pipeline for SIWs requiring a VLIW to be fetched. By introducing an additional pipeline cycle only when a VLIW fetch is required, the present invention solves the VLIW memory access problem. The pipeline stays in an expanded state, in general, until a branch-type or load-VLIW-memory-type operation is detected, returning the pipeline to compressed operation. By compressing the pipeline when a branch-type operation is detected, the need for an additional cycle for the branch operation is avoided. Consequently, the shorter compressed pipeline provides more efficient performance for branch-intensive control code than a fixed pipeline with an expanded number of pipeline stages. In addition, the dynamically reconfigurable pipeline is scalable, allowing each processing element (PE) in an array of PEs to expand and compress the pipeline in synchronism, which allows iVLIW operations to execute independently in each PE. This is accomplished by having distributed pipelines operating in parallel, one in each PE and one in the controller sequence processor (SP).
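A minimal sketch in C of the expand/compress decision; the state and field names are illustrative assumptions:

#include <stdbool.h>

typedef enum { PIPE_COMPRESSED, PIPE_EXPANDED } PipeState;

typedef struct {
    bool needs_vliw_fetch;   /* SIW indirectly requires a VLIW fetch */
    bool is_branch_type;     /* branch or load-VLIW-memory type op   */
} DecodedSIW;

/* The pipeline adds a stage only when a VLIW must be fetched from VIM,
   and compresses again on a branch-type or load-VLIW-memory-type op so
   that branches never pay for the extra stage. */
PipeState next_state(PipeState cur, const DecodedSIW *siw) {
    if (siw->needs_vliw_fetch)
        return PIPE_EXPANDED;    /* extra cycle to fetch the VLIW    */
    if (siw->is_branch_type)
        return PIPE_COMPRESSED;  /* shorten pipe before the branch   */
    return cur;                  /* otherwise keep the current depth */
}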
Abstract:
A data processing system (10) having a bus controller (5) that uses a communication bus (22) which adapts to various system resources (7) and is capable of burst transfers. In one embodiment, the processor core (2) and system resources (7) supply control signals specifying the required parameters of the next transfer. The bus controller is capable of transferring operands and/or instructions in incremental bursts from these system resources. Each burst data transfer has an associated unique access address, where successive bytes of data are associated with sequential addresses and the burst increment equals the data port size. The burst capability depends on the ability of a system resource (7) to burst data and can be inhibited with a transfer burst inhibit signal. The length of the desired data is controlled by a sizing signal from the core (2) or from the cache, and the increment size is supplied by the resource (7).
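A minimal sketch in C of burst address generation under the stated rule that the burst increment equals the data port size; the function and parameter names are illustrative assumptions:

#include <stdint.h>
#include <stddef.h>

/* Compute the access address of each beat in a burst. port_size is the
   resource's data port width in bytes (the burst increment); total_len
   comes from the sizing signal supplied by the core or cache. */
void burst_addresses(uint32_t start, size_t port_size, size_t total_len,
                     uint32_t *out, size_t max_beats) {
    size_t beats = (total_len + port_size - 1) / port_size; /* round up */
    for (size_t i = 0; i < beats && i < max_beats; i++)
        out[i] = start + (uint32_t)(i * port_size); /* sequential addrs */
}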