Storage organization for transposing a matrix using a streaming engine
Abstract:
Software instructions are executed on a processor within a computer system to configure a steaming engine to operate in either a linear mode or a transpose mode. A stream of addresses is generated using an address generator, in which the stream of addresses includes consecutive nested loop iterations for at least a first loop and a second loop. While in the linear mode, the first loop is treated as an inner loop. While in the transpose mode, the second loop is treated as the inner loop. A matrix can be fetched from memory in the linear mode to provide row-wise vectors. A matrix can be fetched from the memory in the transpose mode to provide column wise vectors. Local storage on the streaming engine is organized as sectors based on the number of rows in the matrix to allow overlapping transposition processing and to minimize memory accesses.
Information query
Patent Agency Ranking
0/0