Abstract:
Methods and apparatus for a low energy accelerator processor architecture with short parallel instruction word. An integrated circuit includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory coupled to the system bus; and a low energy accelerator processor coupled to the system bus and configured to execute instruction words retrieved from a low energy accelerator code memory, the low energy accelerator processor having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to op-codes decoded from the retrieved instruction words, wherein the width of the instruction words is equal to the data width N. Additional methods and apparatus are disclosed.
Abstract:
Instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition without a barrel shifter. Illustrative instructions include operations that include receiving a first 32-bit operand, receiving a second 32-bit operand, shifting the second 32-bit operand right 16 or 15 bits to obtain a shifted second 32-bit operand, and adding the shifted second 32-bit operand and the first 32-bit operand to generate a 32-bit sum.
Abstract:
Methods and apparatus for a low energy accelerator processor architecture with short parallel instruction word. An integrated circuit includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory coupled to the system bus; and a low energy accelerator processor coupled to the system bus and configured to execute instruction words retrieved from a low energy accelerator code memory, the low energy accelerator processor having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to op-codes decoded from the retrieved instruction words, wherein the width of the instruction words is equal to the data width N. Additional methods and apparatus are disclosed.
Abstract:
A method for a processor computing a first trigonometric function to use an alternative trigonometric function for certain ranges of the operand. A modulo function may be used to provide an operand with a reduced range, and the modulo function may subtract in multiple steps in a manner that preserves low-order bits.
Abstract:
Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.
Abstract:
Methods and apparatus for a low energy accelerator processor architecture with short parallel instruction word. An integrated circuit includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory coupled to the system bus; and a low energy accelerator processor coupled to the system bus and configured to execute instruction words retrieved from a low energy accelerator code memory, the low energy accelerator processor having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to op-codes decoded from the retrieved instruction words, wherein the width of the instruction words is equal to the data width N. Additional methods and apparatus are disclosed.
Abstract:
An apparatus for a low energy accelerator processor architecture is disclosed. An example arrangement is an integrated circuit that includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory; a low energy accelerator processor configured to execute instruction words received on the system bus and having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, wherein each of the execution units is configured to perform operations responsive to retrieved instruction words; and a data register file comprising a set of data registers coupled to the plurality of execution units, wherein the registers are coupled to selected ones of the plurality of execution units. Additional methods and apparatus are disclosed.
Abstract:
Methods and apparatus for a low energy accelerator processor architecture with short parallel instruction word. An integrated circuit includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory coupled to the system bus; and a low energy accelerator processor coupled to the system bus and configured to execute instruction words retrieved from a low energy accelerator code memory, the low energy accelerator processor having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to op-codes decoded from the retrieved instruction words, wherein the width of the instruction words is equal to the data width N. Additional methods and apparatus are disclosed.
Abstract:
Apparatus for a low energy accelerator processor architecture. An example arrangement is an integrated circuit that includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory ; a low energy accelerator processor configured to execute instruction words coupled to the system bus and having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to retrieved instruction words; and a non-orthogonal data register file comprising a set of data registers coupled to the plurality of execution units, the registers coupled to selected ones of the plurality of execution units. Additional methods and apparatus are disclosed.