Abstract:
Disclosed embodiments include an electronic device having a processor core, a memory, a register, and a data load unit to receive a plurality of data elements stored in the memory in response to an instruction. All of the data elements hare the same data size, which is specified by one or more coding bits. The data load unit includes an address generator to generate addresses corresponding to locations in the memory at which the data elements are located, and a formatting unit to format the data elements. The register is configured to store the formatted data elements, and the processor core is configured to receive the formatted data elements from the register.
Abstract:
A stream of data is accessed from a memory system using a stream of addresses generated in a first mode of operating a streaming engine in response to executing a first stream instruction. A block cache management operation is performed on a cache in the memory using a block of addresses generated in a second mode of operating the streaming engine in response to executing a second stream instruction.
Abstract:
A stream of data is accessed from a memory system using a stream of addresses generated in a first mode of operating a streaming engine in response to executing a first stream instruction. A block cache preload operation is performed on a cache in the memory using a block of addresses generated in a second mode of operating the streaming engine in response to executing a second stream instruction.
Abstract:
A stream of data is accessed from a memory system using a stream of addresses generated in a first mode of operating a streaming engine in response to executing a first stream instruction. A block cache management operation is performed on a cache in the memory using a block of addresses generated in a second mode of operating the streaming engine in response to executing a second stream instruction.
Abstract:
This invention addresses implements a range of interesting technologies into a single block. Each DSP CPU has a streaming engine. The streaming engines include: a SE to L2 interface that can request 512 bits/cycle from L2; a loose binding between SE and L2 interface, to allow a single stream to peak at 1024 bits/cycle; one-way coherence where the SE sees all earlier writes cached in system, but not writes that occur after stream opens; full protection against single-bit data errors within its internal storage via single-bit parity with semi-automatic restart on parity error.
Abstract:
This invention addresses implements a range of interesting technologies into a single block. Each DSP CPU has a streaming engine. The streaming engines include: a SE to L2 interface that can request 512 bits/cycle from L2; a loose binding between SE and L2 interface, to allow a single stream to peak at 1024 bits/cycle; one-way coherence where the SE sees all earlier writes cached in system, but not writes that occur after stream opens; full protection against single-bit data errors within its internal storage via single-bit parity with semi-automatic restart on parity error.
Abstract:
An example apparatus includes: bandwidth estimator circuitry configured to: obtain a first memory transaction; and determine a consumed bandwidth associated with the memory transaction; and gate circuitry configured to: permit transmission of the memory transaction to a memory controller circuitry; determine whether to gate a second memory transaction generated by a source of the first memory transaction based on the consumed bandwidth of the first memory transaction; and when it is determined to gate the second memory transaction, prevent transmission of the second memory transaction for an amount of time based on the consumed bandwidth.
Abstract:
A device includes a data path, a first interface configured to receive a first memory access request from a first peripheral device, and a second interface configured to receive a second memory access request from a second peripheral device. The device further includes an arbiter circuit configured to, in a first clock cycle, a pre-arbitration winner between a first memory access request and a second memory access request based on a first number of credits allocated to a first destination device and a second number of credits allocated to a second destination device. The arbiter circuit is further configured to, in a second clock cycle select a final arbitration winner from among the pre-arbitration winner and a subsequent memory access request based on a comparison of a priority of the pre-arbitration winner and a priority of the subsequent memory access request.
Abstract:
Techniques for loading data, comprising receiving a memory management command to perform a memory management operation to load data into the cache memory before execution of an instruction that requests the data, formatting the memory management command into one or more instruction for a cache controller associated with the cache memory, and outputting an instruction to the cache controller to load the data into the cache memory based on the memory management command.
Abstract:
A device includes a data path, a first interface configured to receive a first memory access request from a first peripheral device, and a second interface configured to receive a second memory access request from a second peripheral device. The device further includes an arbiter circuit configured to, in a first clock cycle, a pre-arbitration winner between a first memory access request and a second memory access request based on a first number of credits allocated to a first destination device and a second number of credits allocated to a second destination device. The arbiter circuit is further configured to, in a second clock cycle select a final arbitration winner from among the pre-arbitration winner and a subsequent memory access request based on a comparison of a priority of the pre-arbitration winner and a priority of the subsequent memory access request.