Abstract:
This invention optimizes non-shared accesses and avoids dependencies across coherent endpoints to ensure bandwidth across the system even when sharing occurs. The coherence controller is distributed across all coherent endpoints. The coherence controller for each memory endpoint maintains state for each coherent access to ensure the proper ordering of events. The coherence controller of this invention uses First-In-First-Out allocation to ensure both full utilization of resources before stalling and simplicity of implementation. The coherence controller provides Snoop Command/Response ID allocation per memory endpoint.
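As a rough illustration of the First-In-First-Out ID allocation idea, the following C sketch models a per-endpoint pool of snoop command/response IDs. The pool size, structure, and function names are assumptions made for illustration, not details of the described design.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: FIFO allocation of snoop command/response IDs at one
 * memory endpoint.  Sizes and names are illustrative. */
#define NUM_SNOOP_IDS 16

struct snoop_id_fifo {
    uint8_t  ids[NUM_SNOOP_IDS];  /* free IDs, in issue order */
    unsigned head, tail, count;
};

static void fifo_init(struct snoop_id_fifo *f)
{
    for (unsigned i = 0; i < NUM_SNOOP_IDS; i++)
        f->ids[i] = (uint8_t)i;
    f->head = 0;
    f->tail = 0;
    f->count = NUM_SNOOP_IDS;
}

/* Allocate the oldest free ID; returns false (stall) only when every ID is
 * in flight, so all resources are used before any request stalls. */
static bool fifo_alloc(struct snoop_id_fifo *f, uint8_t *id)
{
    if (f->count == 0)
        return false;                      /* stall: all IDs outstanding */
    *id = f->ids[f->head];
    f->head = (f->head + 1) % NUM_SNOOP_IDS;
    f->count--;
    return true;
}

/* Return an ID to the pool when its snoop response completes. */
static void fifo_free(struct snoop_id_fifo *f, uint8_t id)
{
    f->ids[f->tail] = id;
    f->tail = (f->tail + 1) % NUM_SNOOP_IDS;
    f->count++;
}

int main(void)
{
    struct snoop_id_fifo f;
    uint8_t id;
    fifo_init(&f);
    while (fifo_alloc(&f, &id))        /* every ID is handed out ...      */
        printf("issued snoop with ID %u\n", (unsigned)id);
    fifo_free(&f, 0);                  /* ... before any request stalls   */
    return 0;
}
```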
Abstract:
This invention is a data processing apparatus and method. Data is protected from corruption by generating an error correction code corresponding to the data. In this invention the data and the corresponding error correction code are carried forward to another set of registers without regenerating the error correction code or using it for error detection or correction. Only later are error detection and correction actions taken. The differing data/error correction code registers may be in differing pipeline phases in the data processing apparatus. This invention forwards the error correction code with the data through the entire datapath that carries the data. This invention provides error protection to the whole datapath without requiring extensive hardware or additional time.
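The deferred-checking idea can be illustrated in software. In the sketch below a toy byte-wise XOR checksum stands in for a real error correction code, and an array of registers models pipeline phases; the checksum and all names are illustrative assumptions, not the described hardware.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: the data/ECC pair is latched forward through pipeline
 * phase registers with no regeneration and no checking; detection happens
 * only at the end of the datapath. */
struct protected_word {
    uint32_t data;
    uint8_t  ecc;
};

static uint8_t ecc_generate(uint32_t data)   /* toy stand-in for SEC-DED */
{
    return (uint8_t)(data ^ (data >> 8) ^ (data >> 16) ^ (data >> 24));
}

static struct protected_word stage[3];       /* pipeline phase registers */

static void pipeline_advance(struct protected_word in)
{
    stage[2] = stage[1];     /* each cycle the pair simply moves forward, */
    stage[1] = stage[0];     /* the ECC is never recomputed along the way */
    stage[0] = in;
}

/* Only at the end of the datapath is the code used for error detection. */
static bool check_at_end(struct protected_word w)
{
    return ecc_generate(w.data) == w.ecc;
}

int main(void)
{
    struct protected_word w = { 0xDEADBEEF, 0 };
    struct protected_word bubble = { 0, 0 };   /* ecc_generate(0) == 0 */
    w.ecc = ecc_generate(w.data);              /* generated once, at the source */
    pipeline_advance(w);
    pipeline_advance(bubble);
    pipeline_advance(bubble);
    printf("data 0x%08X intact: %s\n", stage[2].data,
           check_at_end(stage[2]) ? "yes" : "no");
    return 0;
}
```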
Abstract:
A predictive adder produces the result of incrementing and/or decrementing a sum of A and B by a one-bit constant of the form 2^k, where k is the bit position at which the sum is to be incremented or decremented. The predictive adder predicts the ripple portion of bits in the potential sum of the first operand A and the second operand B that would be toggled by incrementing or decrementing the sum A+B by the one-bit constant, to generate an indication of the ripple portion of bits in the potential sum. The predictive adder uses the indication of the ripple portion of bits in the potential sum and the carry output generated by evaluating A+B to produce the result of at least one of A+B+2^k and A+B−2^k.
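The ripple idea can be shown with a small worked example. The C sketch below is not the predictive adder circuit itself (which derives the ripple indication in parallel with evaluating A+B); it only demonstrates, on an already computed sum S, that the bits toggled by adding 2^k form a contiguous run that can be applied with an XOR instead of a second full add.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative sketch only: adding 2^k to a sum S toggles exactly the run of
 * 1 bits in S that starts at bit k, plus the first 0 bit above that run.
 * Once that "ripple" mask is known, the incremented sum is S XOR mask. */
static uint32_t incr_by_pow2(uint32_t s, unsigned k)
{
    /* Length of the run of 1 bits in s starting at bit position k. */
    unsigned run = 0;
    while ((k + run) < 32 && ((s >> (k + run)) & 1u))
        run++;

    /* Toggle mask: bits k .. k+run (the run of 1s plus the terminating 0). */
    uint32_t mask = (k + run + 1 >= 32) ? (~0u << k)
                                        : ((1u << (run + 1)) - 1u) << k;
    return s ^ mask;   /* equals s + (1u << k), without re-adding A and B */
}

int main(void)
{
    uint32_t a = 0x1234ABCD, b = 0x00FF0FF0, s = a + b;
    unsigned k = 4;
    assert(incr_by_pow2(s, k) == s + (1u << k));
    printf("A+B+2^%u = 0x%08X\n", k, (unsigned)incr_by_pow2(s, k));
    return 0;
}
```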
Abstract:
This invention mitigates deadlocking issues by adding a separate non-blocking pipeline for snoop returns. This separate pipeline is not blocked behind coherent requests. This invention also repartitions the master-initiated traffic to move cache evictions (both with and without data) and non-coherent writes to the new non-blocking channel. This non-blocking pipeline removes the need for any coherent request to complete before the snoop request can reach the memory controller. Repartitioning cache-initiated evictions to the non-blocking pipeline prevents deadlock when a snoop and an eviction occur concurrently. The non-blocking channel of this invention combines snoop responses from memory-controller-initiated requests and master-initiated evictions/non-coherent writes.
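A minimal sketch of the repartitioning, assuming illustrative transaction kinds and channel names, might route traffic as follows.

```c
#include <stdio.h>

/* Hypothetical sketch of repartitioning master traffic onto a separate
 * non-blocking channel; the transaction kinds and channel names are
 * illustrative, not taken from the described design. */
enum txn_kind {
    COHERENT_READ,
    COHERENT_WRITE,
    SNOOP_RESPONSE,       /* return data/state for a snoop          */
    CACHE_EVICTION,       /* victim writeback, with or without data */
    NONCOHERENT_WRITE
};

enum channel { BLOCKING_CHANNEL, NONBLOCKING_CHANNEL };

/* Snoop returns, evictions, and non-coherent writes use the non-blocking
 * pipeline, so they can never be stuck behind an outstanding coherent
 * request; ordinary coherent requests keep the blocking pipeline. */
static enum channel route(enum txn_kind kind)
{
    switch (kind) {
    case SNOOP_RESPONSE:
    case CACHE_EVICTION:
    case NONCOHERENT_WRITE:
        return NONBLOCKING_CHANNEL;
    default:
        return BLOCKING_CHANNEL;
    }
}

int main(void)
{
    printf("eviction -> %s\n",
           route(CACHE_EVICTION) == NONBLOCKING_CHANNEL ? "non-blocking"
                                                        : "blocking");
    return 0;
}
```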
Abstract:
In described examples, a processor system includes a processor core generating memory transactions, a lower level cache memory with a lower memory controller, and a higher level cache memory with a higher memory controller having a memory pipeline. The higher memory controller is connected to the lower memory controller by a bypass path that skips the memory pipeline. The higher memory controller: determines whether a memory transaction is a bypass write, which is a memory write request indicated not to result in a corresponding write being directed to the higher level cache memory; if the memory transaction is determined to be a bypass write, determines whether a memory transaction that prevents passing is in the memory pipeline; and if no transaction that prevents passing is determined to be in the memory pipeline, sends the memory transaction to the lower memory controller using the bypass path.
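The bypass decision can be sketched as follows; the structure fields, the check for a transaction that prevents passing, and the send functions are illustrative stubs, not the patented controller logic.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical sketch of the bypass-write dispatch decision. */
struct mem_txn {
    bool is_write;
    bool allocates_in_higher_cache;  /* cleared for writes marked to bypass */
};

/* A bypass write is a write request indicated not to result in a
 * corresponding write directed to the higher level cache memory. */
static bool is_bypass_write(const struct mem_txn *t)
{
    return t->is_write && !t->allocates_in_higher_cache;
}

/* Stand-in: a real controller would scan the memory pipeline for an older
 * transaction that the bypass write is not allowed to pass. */
static bool pipeline_has_blocking_txn(const struct mem_txn *t)
{
    (void)t;
    return false;
}

static void send_via_bypass_path(const struct mem_txn *t)  /* to lower ctrl */
{
    (void)t;
    printf("sent on bypass path, skipping the memory pipeline\n");
}

static void send_into_pipeline(const struct mem_txn *t)
{
    (void)t;
    printf("sent into the memory pipeline\n");
}

static void dispatch(const struct mem_txn *t)
{
    if (is_bypass_write(t) && !pipeline_has_blocking_txn(t))
        send_via_bypass_path(t);
    else
        send_into_pipeline(t);
}

int main(void)
{
    struct mem_txn t = { .is_write = true, .allocates_in_higher_cache = false };
    dispatch(&t);
    return 0;
}
```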
Abstract:
Disclosed embodiments include an electronic device having a processor core, a memory, a register, and a data load unit to receive a plurality of data elements stored in the memory in response to an instruction. All of the data elements share the same data size, which is specified by one or more coding bits. The data load unit includes an address generator to generate addresses corresponding to locations in the memory at which the data elements are located, and a formatting unit to format the data elements. The register is configured to store the formatted data elements, and the processor core is configured to receive the formatted data elements from the register.
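A rough software model of the address generator and formatting unit is shown below; the size-coding table and function names are assumptions for illustration, not the encoding used by the described device.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: an address generator produces the element addresses
 * and a formatting unit packs the same-sized elements into a register image.
 * Size coding (0=1B, 1=2B, 2=4B, 3=8B) is illustrative. */
static size_t decode_elem_size(unsigned coding_bits)
{
    return (size_t)1 << (coding_bits & 0x3u);
}

static void load_and_format(const uint8_t *memory, size_t base,
                            unsigned coding_bits, unsigned num_elems,
                            uint8_t *reg_image)
{
    size_t elem_size = decode_elem_size(coding_bits);  /* same size for all */
    for (unsigned i = 0; i < num_elems; i++) {
        size_t addr = base + (size_t)i * elem_size;    /* address generator */
        /* formatting unit: pack each element contiguously into the register */
        memcpy(reg_image + i * elem_size, memory + addr, elem_size);
    }
}

int main(void)
{
    uint8_t memory[32];
    uint8_t reg_image[16];
    for (unsigned i = 0; i < sizeof memory; i++)
        memory[i] = (uint8_t)i;
    load_and_format(memory, 8, /*coding_bits=*/1, /*num_elems=*/4, reg_image);
    printf("first formatted element: %02x %02x\n", reg_image[0], reg_image[1]);
    return 0;
}
```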
Abstract:
A stream of data is accessed from a memory system using a stream of addresses generated in a first mode of operating a streaming engine in response to executing a first stream instruction. A block cache management operation is performed on a cache in the memory system using a block of addresses generated in a second mode of operating the streaming engine in response to executing a second stream instruction.
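The two operating modes might be modeled as follows; the stride pattern, cache line size, and callback names are illustrative assumptions, not the streaming engine's actual interface.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: mode 1 generates a stream of addresses used to fetch
 * data; mode 2 generates one address per cache line over a block to drive a
 * block cache management operation (e.g., invalidate). */
#define CACHE_LINE_BYTES 64u

typedef void (*addr_consumer)(uint64_t addr);

/* Mode 1: data-access stream (a simple fixed-stride pattern for brevity). */
static void stream_data(uint64_t base, uint64_t stride, unsigned count,
                        addr_consumer fetch)
{
    for (unsigned i = 0; i < count; i++)
        fetch(base + (uint64_t)i * stride);
}

/* Mode 2: block cache management over [base, base + bytes). */
static void stream_block_cache_op(uint64_t base, uint64_t bytes,
                                  addr_consumer cache_op)
{
    uint64_t line = base & ~(uint64_t)(CACHE_LINE_BYTES - 1);
    for (; line < base + bytes; line += CACHE_LINE_BYTES)
        cache_op(line);
}

static void print_addr(uint64_t addr)
{
    printf("0x%016llx\n", (unsigned long long)addr);
}

int main(void)
{
    stream_data(0x1000, 8, 4, print_addr);            /* mode 1: fetches   */
    stream_block_cache_op(0x2000, 200, print_addr);   /* mode 2: cache ops */
    return 0;
}
```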