Abstract:
Systems, apparatuses, and methods related to a block-based processor core topology register are disclosed. In one example of the disclosed technology, a processor can include a plurality of block-based processor cores for executing a program including a plurality of instruction blocks. A respective block-based processor core can include a sharable resource and a programmable composition topology register. The programmable composition topology register can be used to assign a group of the physical processor cores that share the sharable resource.
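As a rough illustration only, the register described above could be modeled in software as a bitmask with one bit per physical core. The class name, the bit layout, and the core count in this Python sketch are assumptions made for illustration and are not taken from the disclosure.

```python
class CompositionTopologyRegister:
    """Hypothetical model: bit i set means core i shares this core's resource."""

    def __init__(self, num_cores):
        self.num_cores = num_cores
        self.value = 0  # no cores share the resource initially

    def write(self, mask):
        # Reject bits that do not correspond to a physical core.
        if mask < 0 or mask >> self.num_cores:
            raise ValueError("mask names a core that does not exist")
        self.value = mask

    def sharing_cores(self):
        # Decode the bitmask into a sorted list of core indices.
        return [i for i in range(self.num_cores) if (self.value >> i) & 1]


reg = CompositionTopologyRegister(num_cores=8)
reg.write(0b00001111)  # assumed encoding: cores 0-3 form one sharing group
group = reg.sharing_cores()
```

Reprogramming the register with a different mask would regroup the cores without changing the program being executed.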
Abstract:
Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using a hardware structure that generates a relative ordering of memory access instructions in an instruction block. In one example of the disclosed technology, a method of executing an instruction block having a plurality of memory load and/or memory store instructions includes decoding an instruction block encoding a plurality of memory access instructions, generating data indicating a relative order for executing the memory access instructions in the instruction block, and scheduling operation of a portion of the instruction block based at least in part on the relative order data. In some examples, a store vector data register can store the generated relative ordering data for use in subsequent instances of the instruction block.
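The decode-and-schedule steps can be sketched in Python as follows. The opcode strings, the bit-vector encoding of the store vector, and the issue rule (a memory access waits for every earlier store) are illustrative assumptions; the abstract does not specify them.

```python
def build_memory_order(instructions):
    """Return (order, store_vector) for a list of opcode strings.

    order maps instruction index -> relative memory-access position;
    store_vector has bit p set when the access at position p is a store.
    """
    order = {}
    store_vector = 0
    position = 0
    for index, opcode in enumerate(instructions):
        if opcode in ("load", "store"):
            order[index] = position
            if opcode == "store":
                store_vector |= 1 << position
            position += 1
    return order, store_vector


def can_issue(position, store_vector, completed):
    """Assumed rule: an access may issue once every earlier store has completed.

    completed is a bitmask of memory-access positions that have finished.
    """
    earlier_stores = store_vector & ((1 << position) - 1)
    return (earlier_stores & ~completed) == 0


block = ["add", "load", "store", "load", "mul", "store"]
order, store_vector = build_memory_order(block)
```

Because the ordering data depends only on the block's encoding, it could be computed once and reused the next time the same block runs, which is the role the abstract gives to the store vector data register.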
Abstract:
A vector data access unit for accessing data stored within a data store in response to decoded vector data access instructions is disclosed. Each of the vector data access instructions comprises a plurality of elements indicating a data access to be performed, the elements being in an order within the vector data access instruction that the corresponding data access is instructed to be performed in. The vector data access unit comprises data access ordering circuitry for issuing data access requests indicated by the elements to the data store, the data access ordering circuitry being configured, in response to receipt of at least two decoded vector data access instructions, an earlier of the at least two decoded vector data access instructions being received before a later of the at least two decoded vector data access instructions and one of the at least two decoded vector data access instructions being a write instruction, and to an indication that data accesses from the at least two decoded vector data access instructions can be interleaved to a limited extent, to: determine for each of the at least two vector data access instructions, from a position of the elements within the plurality of elements, which of the plurality of data accesses indicated by the plurality of elements is a next data access to be performed for that vector data access instruction, the data accesses being performed in the instructed order; determine an element indicating the next data access for each of said vector data access instructions; and select one of the next data accesses as a next data access to be issued to the data store in dependence upon an order the at least two vector data access instructions were received in and the position of the elements indicating the next data accesses relative to each other within their respective plurality of elements, subject to a constraint that a difference between a numerical position of the element indicating the next data access within the plurality of elements of a later of the vector data access instructions and a numerical position of the element indicating the next data access within the plurality of elements of an earlier of the vector data access instructions is less than a predetermined value.
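To make the position constraint concrete, the following Python sketch interleaves the elements of two instructions. The selection policy used here (the later instruction issues whenever the constraint permits, otherwise the earlier one issues) and the function name are assumptions for illustration; the abstract states only the constraint itself.

```python
def interleave(earlier, later, limit):
    """Return (source, element) pairs in issue order.

    Assumed policy: the later instruction issues its next element only while
    (later position - earlier position) < limit; otherwise the earlier one issues.
    Once the earlier instruction has finished, the later one proceeds freely.
    """
    e = l = 0  # positions of the next element in each instruction
    issued = []
    while e < len(earlier) or l < len(later):
        earlier_done = e >= len(earlier)
        later_done = l >= len(later)
        if not later_done and (earlier_done or l - e < limit):
            issued.append(("later", later[l]))
            l += 1
        else:
            issued.append(("earlier", earlier[e]))
            e += 1
    return issued


sequence = interleave(["a0", "a1", "a2"], ["b0", "b1", "b2"], limit=0)
```

With `limit=0` in this sketch, the later instruction always trails the earlier one, so element `i` of the later instruction is never issued before element `i` of the earlier instruction.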
Abstract:
A microcontroller has a data memory divided into a plurality of memory banks, an address multiplexer for providing an address to the data memory, an instruction register providing a first partial address to a first input of the address multiplexer, a bank select register which is not mapped to the data memory for providing a second partial address to the first input of the address multiplexer, and a plurality of special function registers mapped to the data memory, wherein the plurality of special function registers comprises an indirect access register coupled with a second input of the address multiplexer, and wherein the data memory comprises more than one memory bank of the plurality of memory banks that form a block of linear data memory to which no special function registers are mapped.
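The address multiplexer can be pictured with a small Python model. The field widths and the function name below are assumptions chosen for illustration; the abstract does not give bit widths.

```python
BANK_BITS = 4    # assumed width of the bank select register
OFFSET_BITS = 8  # assumed width of the address field held in the instruction register


def effective_address(instr_offset, bank_select, indirect_reg, use_indirect):
    """Model the address multiplexer: direct (banked) or indirect addressing."""
    if use_indirect:
        # Second mux input: the indirect access register supplies a full linear address.
        return indirect_reg & ((1 << (BANK_BITS + OFFSET_BITS)) - 1)
    # First mux input: bank select bits concatenated with the instruction's partial address.
    bank = bank_select & ((1 << BANK_BITS) - 1)
    offset = instr_offset & ((1 << OFFSET_BITS) - 1)
    return (bank << OFFSET_BITS) | offset


direct = effective_address(0x20, 0x3, 0x000, use_indirect=False)
indirect = effective_address(0x20, 0x3, 0x7FF, use_indirect=True)
```

In this model the indirect path ignores bank boundaries, which is why a run of banks free of special function registers can be treated as one linear block.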
Abstract:
A computer (12) having multiple data paths (38a-d) connecting to other devices, which may be similar computers. A register (40d) is provided that has bits (110) programmatically settable to address each of the data paths such that the computer can communicate via multiple of the data paths based on which bits are concurrently set in the register. Optionally, multiple of the computers can be connected in series (termed a "pipeline") or to form an array (10).
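A minimal Python sketch of the bit-per-path idea follows. The assignment of bit 0 to path 38a through bit 3 to path 38d, and the use of lists as stand-ins for the connected devices, are assumptions for illustration.

```python
# Reference numerals from the abstract, used here only as labels.
PATHS = ("38a", "38b", "38c", "38d")


def selected_paths(register_value):
    """Return the data paths whose bits are set (assumed: bit 0 -> 38a, bit 3 -> 38d)."""
    return [name for bit, name in enumerate(PATHS) if register_value & (1 << bit)]


def send(register_value, word, ports):
    """Deliver one word on every path that is concurrently selected."""
    for name in selected_paths(register_value):
        ports[name].append(word)


ports = {name: [] for name in PATHS}
send(0b0101, 0xBEEF, ports)  # bits 0 and 2 set: write to 38a and 38c together
```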
Abstract:
A method, system, and apparatus are provided for fast access to memory in a stack. The system and apparatus include an address bit, a stack pointer, and fast access random access memory ("RAM"). The method provides that, when a first address mode is used in conjunction with the address bit and the stack pointer, the location of the access RAM can be shifted in order to achieve an index of literal offset address mode.
Abstract:
Technology related to prefetching instruction blocks is disclosed. In one example of the disclosed technology, a processor comprises a block-based processor core for executing a program comprising a plurality of instruction blocks. The block-based processor core can include prefetch logic and a local buffer. The prefetch logic can be configured to receive a reference to a predicted instruction block and to determine a mapping of the predicted instruction block to one or more lines. The local buffer can be configured to selectively store portions of the predicted instruction block and to provide the stored portions of the predicted instruction block when control of the program passes along a predicted execution path to the predicted instruction block.
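The mapping from a predicted instruction block to lines can be sketched in Python as below. The line size, the byte-addressed layout, and the function name are assumptions for illustration; the abstract does not fix any of them.

```python
LINE_SIZE = 64  # assumed line size in bytes


def lines_for_block(block_address, block_size, line_size=LINE_SIZE):
    """Return the base address of every line touched by the instruction block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    first = block_address // line_size
    last = (block_address + block_size - 1) // line_size
    return [line * line_size for line in range(first, last + 1)]


# A hypothetical 80-byte block starting at 0x1030 straddles two lines.
lines = lines_for_block(0x1030, 80)
```

A local buffer could then hold only those portions of the listed lines that fall inside the block, which is one way to read "selectively store portions" in the abstract.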
Abstract:
Technology related to prefetching data associated with predicated stores of programs in block-based processor architectures is disclosed. In one example of the disclosed technology, a processor includes a block-based processor core for executing an instruction block comprising a plurality of instructions. The block-based processor core includes decode logic and prefetch logic. The decode logic is configured to detect a predicated store instruction of the instruction block. The prefetch logic is configured to calculate a target address of the predicated store instruction and initiate a memory operation associated with the calculated target address before a predicate of the predicated store instruction is calculated.
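The following Python sketch shows the idea of computing store targets ahead of predicate resolution. The dictionary-based instruction encoding, the base-plus-offset addressing, and the function names are assumptions for illustration and do not reflect an actual instruction format.

```python
def decode_predicated_stores(block):
    """Decode step: pick out store instructions that carry a predicate."""
    return [ins for ins in block if ins["op"] == "store" and ins.get("pred") is not None]


def prefetch_targets(block, registers):
    """Prefetch step: compute each predicated store's target address.

    The predicate value is never consulted, so the memory operation can be
    initiated before the predicate has been calculated.
    """
    return [registers[ins["base"]] + ins["offset"]
            for ins in decode_predicated_stores(block)]


block = [
    {"op": "add"},
    {"op": "store", "base": "r1", "offset": 8, "pred": "p0"},
    {"op": "store", "base": "r2", "offset": 0, "pred": None},  # unpredicated store
]
targets = prefetch_targets(block, {"r1": 0x2000, "r2": 0x3000})
```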
Abstract:
Apparatus and methods are disclosed for controlling instruction flow in block-based processor architectures. In one example of the disclosed technology, a processor includes an instruction block address register that stores an indexed address to a memory storing a plurality of instructions for an instruction block, the indexed address being inaccessible when the processor is in one or more unprivileged operational modes, one or more execution units configured to execute instructions for the instruction block, and a control unit configured to fetch and decode two or more of the plurality of instructions from the memory based on the indexed address.
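As an illustration of the access restriction, the Python sketch below models a register that software can read only in a privileged mode while the control unit still uses it to fetch instructions. The class name, the boolean mode flag, and the list-based memory are assumptions made for this sketch.

```python
class InstructionBlockAddressRegister:
    """Hypothetical model of a register that is hidden from unprivileged software."""

    def __init__(self, indexed_address=0):
        self._indexed_address = indexed_address

    def read(self, privileged):
        # Software-visible read: refused in unprivileged operational modes.
        if not privileged:
            raise PermissionError("register is inaccessible in unprivileged mode")
        return self._indexed_address


def fetch(memory, register, count=2):
    """Control-unit view: fetch `count` instructions starting at the indexed address."""
    base = register._indexed_address  # internal hardware path, not subject to the mode check
    return memory[base:base + count]


memory = ["nop", "add", "load", "store", "branch"]
ibar = InstructionBlockAddressRegister(indexed_address=1)
fetched = fetch(memory, ibar)
```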