Abstract:
Command translation of burst commands is described. A slave processor local bus (“PLB”) bridge, part of a processor block core embedded in a host IC, has a data size threshold to allow access to a crossbar switch device. A master device, coupled to the slave PLB bridge, has any of a plurality of command bus widths. A burst command is issued via a command bus, having a command bus width of the plurality, from the master device for the slave PLB bridge. The burst command is converted to a native bus width of the slave processor logic block if the command bus width is not equal to the native bus width. The burst command is translated if execution of the burst command will exceed the data size threshold and passed without the translating if the execution of the burst command will not exceed the data size threshold.
Abstract:
An ASIC block embedded in a host IC has a first clock domain with a first frequency of operation that is at least equal to a second frequency of operation of a second clock domain in the host IC but external to the ASIC block. FPGA logic in the second clock domain interfaces with the ASIC block; and a PLL located in the host integrated circuit but external to the ASIC block is coupled to receive a reference clock signal and configured to generate clock signals. Two of the clock signals are respectively sent to the FPGA logic and the ASIC block to make one appear to be produced earlier in time than the other with respect to the ASIC block to compensate for a clock insertion delay and for a clock-to-output time associated with the FPGA logic that at least approximates zero.
Abstract:
A method of enabling variable latency data transfers in an electronic device, such as an FPGA with an embedded processor, is described. According to one aspect of the invention, a method comprises steps of providing an address for a data transfer between a memory controller and a peripheral device; coupling an address valid signal to the peripheral device; transferring the data between the memory controller and the peripheral device; and receiving a data transfer complete signal at the memory controller. According to another aspect of the invention, an integrated circuit enabling a variable latency data transfer is described. The integrated circuit comprises peripheral device; a memory controller coupled to the peripheral device; an address valid signal coupled from the memory controller to the peripheral device; and a transfer complete signal coupled from the peripheral device to the memory controller.
Abstract:
Method of informing a processor that a coprocessor instruction is not executable by a coprocessor is described. The coprocessor, instantiated in configurable logic, is configured to execute a subset of coprocessor instructions, excluding user-selected instructions not instantiated. The processor is coupled to the coprocessor via a controller. The coprocessor instruction is sent from the processor to the controller, which queries control logic to determine whether the coprocessor is configured to execute the coprocessor instruction. If a control bit is set to disable an instruction or group of instructions, the coprocessor instruction is not executable by the coprocessor.
Abstract:
Method and apparatus for non-lock-step operation between a processor and a controller is described. An instruction is provided from the processor to the controller. A busy signal is provided from the controller to the processor to indicate that the controller is not ready to execute the instruction. Initiation of execution of the instruction by the controller is done while continuing to indicate to the processor that the controller is not ready to execute the instruction.
Abstract:
A bus arbiter controls the bus frequency in a system that includes a plurality of bus masters and a plurality of slaves. The bus frequency is determined according to the internal frequency of the devices that are part of the transaction. Additionally, the bus frequency is set according to the length of the bus between the devices that are a part of the transaction and, correspondingly, the expected amount of impedance there between. As a part of the present invention, a master seeking bus resources to initiate a transaction generates a bus request and a destination address to the bus arbiter so that it may determine a corresponding bus frequency in advance. Thereafter, the bus arbiter sets the bus frequency to a value that corresponds to the transaction that is about to take place thereon. Next, the bus arbiter issues a grant signal to enable the master to use the bus. Each slave device for a transaction then generates or receives sample cycle signals indicating when a signal should be read on the bus.
Abstract:
The present invention provides a bus architecture for a data processing system that improves transfers of vector data using a vector transfer unit (VTU). An external bus is coupled between the vector transfer unit and the memory. The external bus includes a system command bus that is used to transmit a data transfer command. The command is based on a corresponding vector transfer instruction in the application program, such as load vector data or store vector data. The commands for transferring the data elements include a burst read command and a burst write command. A variable number of data elements may be transferred, according to the user's requirements. The system command bus is also capable of transmitting a packing ratio that indicates the number of data elements that fit in the width of the external bus. This allows the entire bandwidth of the external bus to be used during vector data transfers. The external bus also includes an address bus for transmitting the starting address, the length, and the stride of the vector data to be transferred. This allows an external agent to properly unpack the data elements on their correct boundary. Input and output validity signals provide an indication of the validity of the data elements transferred as well as to indicate when the transfer of the data elements is interrupted. A system clock signal is also included in the external bus to indicate the transfer rate of the data elements.
Abstract:
A vector transfer unit for handling transfers of vector data between a memory and a data processor by one or more application programs in a computer system. A compiler identifies the use of vector data in the application program and implements one or more vector instructions for transferring the vector data between memory and registers used to perform calculations on the vector data. The compiler also schedules transfers of portions of the vector data required in a calculation so that calculations on a portion of the vector data are performed while a subsequent portion of the vector data is transferred. A vector buffer pool is partitioned into one or more vector buffers based on configuration information including the number of vectors buffers required by an application program and the size required for each vector buffer. The vector buffers are allocated for exclusive use by an application program that is executing in the data processor. During a context switch between application programs, a synchronization instruction is used to allow the instructions issued by one application program to finish before any transfer instructions issued by the second application program may begin. Instructions for indicating whether the vector buffer pool is available for use are also included.
Abstract:
A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting. An address error exception occurs when the ending address of the vector data to be transferred is not within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined in parallel with determining the starting address of the vector data to be transferred.
Abstract:
A processor local bus bridge for a processor block ASIC core for embedding in an IC is described. A core logic-to-core logic bridge includes a slave processor local bus interface, a crossbar switch coupled to the slave processor local bus interface and a master processor local bus interface coupled to the crossbar switch. The slave processor local bus interface and the master processor local bus interface are coupled to one another via the crossbar switch for bidirectional communication between a first and a second portion of core logic. The bridge provides rate adaptation for bridging for use of a frequency of operation associated with the crossbar switch which has substantially greater frequencies of operation than those associated with the core logic sides of the master and slave processor local bus interfaces.