Abstract:
In at least one example embodiment, a microprocessor circuit is provided that includes: a microprocessor core coupled to a data memory via a data memory bus comprising a predetermined integer number of data wires (J); the data memory, which is single-ported and configured for storage of vector input elements of an N element vector in a predetermined vector element order and for storage of matrix input elements of an M×N matrix comprising M columns of matrix input elements and N rows of matrix input elements; and a vector matrix product accelerator comprising a datapath configured for multiplying the N element vector and the matrix to compute an M element result vector. The vector matrix product accelerator comprises an input/output port interfacing the data memory bus to the vector matrix product accelerator, and a plurality of vector input registers for storage of respective input vector elements received through the input/output port.
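For illustration only, the arithmetic that the datapath performs can be sketched in software. The following is a minimal sketch, assuming the N element vector is treated as a row vector and the matrix is held as N rows of M elements each; the function name `vector_matrix_product` and the list-of-lists layout are hypothetical and do not describe the accelerator's registers, the memory layout, or the J-wire bus.

```python
def vector_matrix_product(vector, matrix):
    """Multiply an N element vector by a matrix with N rows and M columns.

    Hypothetical software model of the arithmetic only: result[m] is the
    sum over n of vector[n] * matrix[n][m], giving an M element result.
    """
    n_rows = len(vector)
    if len(matrix) != n_rows:
        raise ValueError("matrix must have one row per vector element")
    n_cols = len(matrix[0]) if n_rows else 0
    if any(len(row) != n_cols for row in matrix):
        raise ValueError("all matrix rows must have the same length")

    result = [0] * n_cols
    for col in range(n_cols):
        acc = 0  # accumulator for one element of the result vector
        for row in range(n_rows):
            acc += vector[row] * matrix[row][col]
        result[col] = acc
    return result


# Example with N = 3 and M = 2.
example = vector_matrix_product([1, 2, 3], [[1, 2], [3, 4], [5, 6]])
```

Here each of the M result elements is one multiply-accumulate pass over the N vector elements, which is the operation a hardware datapath of this kind would be expected to carry out on operands fetched from the data memory.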