Abstract:
The number of registers required is reduced by overlapping scalar and vector registers, which also gives the compiler more flexibility when mixing scalar and vector instructions. Local register read ports are reduced by restricting read access. Dedicated predicate registers reduce the demand on general registers and allow critical timing paths to be shortened, because the predicate registers can be placed next to the predicate unit.
Abstract:
Various examples disclosed herein relate to allocation of code and data of application software among memory of a microcontroller unit (MCU), and more particularly to allocating portions of the application software to random access memory or flash memory of an MCU based on information associated with each portion of the application software. A method is provided herein that comprises instructing an MCU to execute application software. The method further comprises obtaining information indicative of the performance of portions of the application software on the MCU and the capacity requirements of those portions, and designating, based on the information, each of the portions for execution from either a first memory or a second memory when deployed to one or more MCUs.
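The designation step can be illustrated with a minimal sketch. The abstract does not state how the performance and capacity information is combined, so the greedy benefit-per-byte ranking below is an assumption, as are the names `Portion`, `cycles_saved`, and `designate`; the sketch also assumes every portion has a non-zero size.

```python
from dataclasses import dataclass


@dataclass
class Portion:
    name: str
    size_bytes: int    # capacity requirement of this portion (assumed > 0)
    cycles_saved: int  # hypothetical profiling metric: cycles saved when run from RAM


def designate(portions, ram_capacity):
    """Assign each portion to "ram" or "flash" (assumed greedy policy).

    Portions are ranked by cycles saved per byte; each is placed in RAM
    while it still fits, and every remaining portion falls back to flash.
    """
    placement = {}
    free = ram_capacity
    ranked = sorted(portions, key=lambda p: p.cycles_saved / p.size_bytes, reverse=True)
    for p in ranked:
        if p.size_bytes <= free:
            placement[p.name] = "ram"
            free -= p.size_bytes
        else:
            placement[p.name] = "flash"
    return placement


if __name__ == "__main__":
    profile = [
        Portion("isr", 512, 9000),
        Portion("fft", 2048, 20000),
        Portion("ui", 4096, 1000),
    ]
    print(designate(profile, ram_capacity=3000))
```

A greedy ranking is only one possible policy; an exact knapsack formulation would serve equally well as an illustration.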
Abstract:
A computer system includes a processor and program storage coupled to the processor. The program storage stores a software instruction translator that, when executed by the processor, is configured to receive source code and translate the source code to a low-level language. The source code is restricted to a subset of a high-level language and the low-level language is a specialized instruction set. Each statement of the subset of the high-level language directly maps to an instruction of the low-level language.
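As a rough illustration of a one-statement-to-one-instruction translator, the sketch below accepts a tiny subset of Python as the restricted high-level language. The subset, the mnemonics (`MOV`, `ADD`, `SUB`, `MUL`), and the function name `translate` are all hypothetical; the abstract names neither the high-level language nor the instruction set.

```python
import ast

# Hypothetical mapping from a binary operator to a single instruction mnemonic.
OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def translate(source):
    """Translate each statement into exactly one instruction.

    Supported subset (an assumption): `dst = src` and `dst = a <op> b`,
    where every operand is a plain variable name. Anything else is rejected
    rather than expanded into several instructions.
    """
    out = []
    for stmt in ast.parse(source).body:
        if not (isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            raise SyntaxError("statement is outside the supported subset")
        dst = stmt.targets[0].id
        val = stmt.value
        if isinstance(val, ast.Name):
            out.append(f"MOV {dst}, {val.id}")
        elif (isinstance(val, ast.BinOp)
              and type(val.op) in OPS
              and isinstance(val.left, ast.Name)
              and isinstance(val.right, ast.Name)):
            out.append(f"{OPS[type(val.op)]} {dst}, {val.left.id}, {val.right.id}")
        else:
            raise SyntaxError("expression is outside the supported subset")
    return out


if __name__ == "__main__":
    for line in translate("x = a + b\ny = x\n"):
        print(line)
```

Rejecting nested expressions such as `x = a + b * c` keeps the direct mapping intact: accepting them would require emitting more than one instruction per statement.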
Abstract:
An example accelerator circuit includes a direct memory access (DMA) circuit configured to copy contents of an off-chip memory to an internal memory of a device. In some examples, the off-chip memory is external to the device. The example accelerator circuit also includes a decoder circuit configured to determine that a transaction from a processor circuit of the device is associated with a memory address included in a region of the off-chip memory to be copied to the internal memory. In some examples, the decoder circuit is also configured to direct the transaction to either the off-chip memory or the internal memory based on whether a DMA copy of the region of the off-chip memory to the internal memory has completed.
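The routing decision made by the decoder can be modeled in software as a minimal sketch. The class name `Decoder`, the `copy_done` flag, and the address values are hypothetical; the abstract describes a hardware circuit and does not specify how copy completion is signaled.

```python
class Decoder:
    """Software model of the routing decision (names are illustrative)."""

    def __init__(self, region_start, region_size, internal_base):
        self.region_start = region_start    # start of the off-chip region being copied
        self.region_size = region_size      # size of that region in bytes
        self.internal_base = internal_base  # where the copy lands in internal memory
        self.copy_done = False              # assumed to be set when the DMA copy completes

    def route(self, addr):
        """Return (target, address) for a processor transaction at `addr`."""
        in_region = self.region_start <= addr < self.region_start + self.region_size
        if in_region and self.copy_done:
            # Copy finished: serve the access from the internal copy.
            return ("internal", self.internal_base + (addr - self.region_start))
        # Outside the region, or copy still in flight: go to off-chip memory.
        return ("off_chip", addr)


if __name__ == "__main__":
    d = Decoder(region_start=0x8000_0000, region_size=0x1000, internal_base=0x2000_0000)
    print(d.route(0x8000_0010))
    d.copy_done = True
    print(d.route(0x8000_0010))
```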
Abstract:
Techniques for executing a plurality of instructions by a processor, comprising: receiving a first instruction for execution on an instruction execution pipeline, wherein the instruction execution pipeline is in a first execution mode; beginning execution of the first instruction on the instruction execution pipeline; receiving an execution mode instruction to switch the instruction execution pipeline to a second execution mode; switching the instruction execution pipeline to the second execution mode based on the received execution mode instruction; annulling the first instruction based on the execution mode instruction; receiving a second instruction for execution on the instruction execution pipeline; and executing the second instruction.
Abstract:
In an embodiment, a device including a processor, a plurality of hardware accelerator engines and a hardware scheduler is disclosed. The processor is configured to schedule an execution of a plurality of instruction threads, where each instruction thread includes a plurality of instructions associated with an execution sequence. The plurality of hardware accelerator engines performs the scheduled execution of the plurality of instruction threads. The hardware scheduler is configured to control the scheduled execution such that each hardware accelerator engine is configured to execute a corresponding instruction and the plurality of instructions are executed by the plurality of hardware accelerator engines in a sequential manner. The plurality of instruction threads are executed by the plurality of hardware accelerator engines in a parallel manner based on the execution sequence and an availability status of each of the plurality of hardware accelerator engines.
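The interplay between in-thread ordering and engine availability can be shown with a small cycle-based simulation. This is a sketch under stated assumptions, not the disclosed scheduler: each instruction is assumed to occupy its engine for exactly one cycle, lower-numbered threads are assumed to win ties, and the engine names and the function `schedule` are hypothetical.

```python
def schedule(threads):
    """Simulate issue order for threads given as lists of engine names.

    Each thread issues its instructions in order (sequential within a thread).
    In every cycle a thread issues its next instruction only if the engine it
    needs is not already taken in that cycle (parallel across threads, gated
    by engine availability). Returns one {engine: thread_index} dict per cycle.
    """
    pc = [0] * len(threads)  # next instruction index for each thread
    timeline = []
    while any(pc[i] < len(t) for i, t in enumerate(threads)):
        busy = {}  # engines claimed during this cycle
        for i, t in enumerate(threads):
            if pc[i] < len(t) and t[pc[i]] not in busy:
                busy[t[pc[i]]] = i
                pc[i] += 1
        timeline.append(busy)
    return timeline


if __name__ == "__main__":
    # Two threads that each need the "dma" engine and then the "fft" engine.
    for cycle, issued in enumerate(schedule([["dma", "fft"], ["dma", "fft"]])):
        print(cycle, issued)
```

In the two-thread example, the second thread stalls for one cycle while the first holds the `dma` engine, after which the two threads overlap on different engines.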