Abstract:
Methods and apparatus for a low energy accelerator processor architecture with short parallel instruction word. An integrated circuit includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory coupled to the system bus; and a low energy accelerator processor coupled to the system bus and configured to execute instruction words retrieved from a low energy accelerator code memory, the low energy accelerator processor having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to op-codes decoded from the retrieved instruction words, wherein the width of the instruction words is equal to the data width N. Additional methods and apparatus are disclosed.
Abstract:
An apparatus for a low energy accelerator processor architecture is disclosed. An example arrangement is an integrated circuit that includes a system bus having a data width N, where N is a positive integer; a central processor unit is coupled to the system bus and configured to execute instructions retrieved from a memory; a low energy accelerator processor is configured to execute instruction words received on the system bus and has a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, wherein each of the execution units is configured to perform operations responsive to retrieved instruction words; and a non-orthogonal data register file comprising a set of data registers coupled to the plurality of execution units, wherein the registers coupled to selected ones of the plurality of execution units. Additional methods and apparatus are disclosed.
Abstract:
Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.
Abstract:
Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.
Abstract:
Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.
Abstract:
Described examples include integrated circuits such as microcontrollers with a low energy accelerator processor circuit or other application specific integrated processor circuit including a load store circuit operative to perform load and store operations associated with at least one register and a low gate count shift circuit to selectively shift the data of the register by only an integer number bits less than the register data width without using a barrel shifter for low power operation to support vector operations for FFT or filtering functions.
Abstract:
Described examples include integrated circuits such as microcontrollers with a low energy accelerator processor circuit or other application specific integrated processor circuit including a load store circuit operative to perform load and store operations associated with at least one register and a low gate count shift circuit to selectively shift the data of the register by only an integer number of bits less than the register data width without using a barrel shifter for low power operation to support vector operations for FFT or filtering functions.
Abstract:
An apparatus for a low energy accelerator processor architecture is disclosed. An example arrangement is an integrated circuit that includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory; a low energy accelerator processor configured to execute instruction words received on the system bus and having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, wherein each of the execution units is configured to perform operations responsive to retrieved instruction words; and a data register file comprising a set of data registers coupled to the plurality of execution units, wherein the registers are coupled to selected ones of the plurality of execution units. Additional methods and apparatus are disclosed.
Abstract:
Methods and apparatus for a low energy accelerator processor architecture with short parallel instruction word. An integrated circuit includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory coupled to the system bus; and a low energy accelerator processor coupled to the system bus and configured to execute instruction words retrieved from a low energy accelerator code memory, the low energy accelerator processor having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to op-codes decoded from the retrieved instruction words, wherein the width of the instruction words is equal to the data width N. Additional methods and apparatus are disclosed.
Abstract:
Instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition without a barrel shifter. Illustrative instructions include operations that include receiving a first 32-bit operand, receiving a second 32-bit operand, shifting the second 32-bit operand right 16 or 15 bits to obtain a shifted second 32-bit operand, and adding the shifted second 32-bit operand and the first 32-bit operand to generate a 32-bit sum.