Abstract:
A novel, cost-effective mechanism for detecting livelock or starvation of transactions in a ring-shaped interconnect while using minimal logic resources. Rather than monitoring all transactions in the ring concurrently, the mechanism monitors only a single transaction at a time. A sampling point is located at one point in the ring, which contains a set of N latches. If the monitored transaction is not being starved, it is released and the detection logic moves on to the next candidate transaction in round-robin fashion. If the monitored transaction passes the sampling point a threshold number of times, it is deemed to be starved and a starvation prevention handling procedure is activated. By traversing the entire ring a single transaction at a time, all starving transactions will eventually be detected, with an upper limit on the detection time of O(N^2).
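The single-transaction monitoring loop described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented logic: the class name, the `on_revolution` interface, and the idea of calling the detector once per full ring revolution with a snapshot of the slots are all hypothetical, and the starvation prevention procedure is reduced to appending the transaction id to a list.

```python
from typing import List, Optional


class SingleSlotStarvationDetector:
    """Toy model: watches one ring slot at a time (hypothetical, for illustration)."""

    def __init__(self, num_slots: int, threshold: int) -> None:
        self.num_slots = num_slots
        self.threshold = threshold
        self.slot = 0                    # slot currently being monitored
        self.txn: Optional[int] = None   # id of the transaction being tracked
        self.passes = 0                  # times it has passed the sampling point
        self.starved: List[int] = []     # stand-in for the prevention handler

    def _advance(self) -> None:
        # Round robin: move to the next slot and forget the old transaction.
        self.slot = (self.slot + 1) % self.num_slots
        self.txn = None
        self.passes = 0

    def on_revolution(self, ring: List[Optional[int]]) -> None:
        """Call once per ring revolution; ring[i] is the id in slot i, or None."""
        current = ring[self.slot]
        if self.txn is None:
            if current is None:
                self._advance()          # empty slot: nothing to monitor
            else:
                self.txn = current       # start tracking a new candidate
                self.passes = 1
        elif current == self.txn:
            self.passes += 1             # same transaction came around again
        else:
            self._advance()              # it left the ring: not starved

        if self.txn is not None and self.passes >= self.threshold:
            self.starved.append(self.txn)  # deemed starved
            self._advance()


detector = SingleSlotStarvationDetector(num_slots=3, threshold=3)
for snapshot in ([7, 8, None], [7, None, None], [7, None, None]):
    detector.on_revolution(snapshot)
print(detector.starved)
```

In this model only one counter and one tracked id exist regardless of ring size, which is the resource saving the abstract emphasizes; the cost is that a starving transaction in another slot is not noticed until the round-robin pointer reaches it.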
Abstract:
Methods and apparatuses are disclosed for direct access to cache memory. Embodiments include receiving, by a direct access manager that is coupled to a cache controller for a cache memory, a region scope zero command describing a region scope zero operation to be performed on the cache memory; in response to receiving the region scope zero command, generating a direct memory access region scope zero command, the direct memory access region scope zero command having an operation code and an identification of the physical addresses of the cache memory on which the operation is to be performed; sending the direct memory access region scope zero command to the cache controller for the cache memory; and performing, by the cache controller, the direct memory access region scope zero operation in dependence upon the operation code and the identification of the physical addresses of the cache memory.
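The command flow above (manager receives a command, builds a direct memory access command carrying an operation code and physical addresses, and hands it to the cache controller) can be sketched as follows. The abstract does not define "region scope zero"; this sketch assumes it means zeroing a contiguous range of addresses. All class names, the `OP_ZERO` value, and the byte-array cache are hypothetical.

```python
from dataclasses import dataclass

OP_ZERO = 0x1  # hypothetical operation code, not taken from the source


@dataclass(frozen=True)
class RegionScopeZeroCommand:
    start: int   # first physical address of the region (assumed encoding)
    length: int  # number of addresses in the region


@dataclass(frozen=True)
class DmaRegionScopeZeroCommand:
    opcode: int
    addresses: range  # identification of the physical addresses


class CacheController:
    def __init__(self, size: int) -> None:
        self.memory = bytearray(b"\xff" * size)  # toy stand-in for cache memory

    def perform(self, cmd: DmaRegionScopeZeroCommand) -> None:
        # Act in dependence upon the opcode and the identified addresses.
        if cmd.opcode != OP_ZERO:
            raise ValueError(f"unsupported opcode {cmd.opcode:#x}")
        for addr in cmd.addresses:
            self.memory[addr] = 0


class DirectAccessManager:
    def __init__(self, controller: CacheController) -> None:
        self.controller = controller

    def receive(self, cmd: RegionScopeZeroCommand) -> DmaRegionScopeZeroCommand:
        # Translate the incoming command into a DMA command and send it on.
        dma = DmaRegionScopeZeroCommand(
            OP_ZERO, range(cmd.start, cmd.start + cmd.length)
        )
        self.controller.perform(dma)
        return dma


controller = CacheController(size=8)
manager = DirectAccessManager(controller)
manager.receive(RegionScopeZeroCommand(start=2, length=3))
print(controller.memory.hex())
```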
Abstract:
Systems and methods are disclosed for enhancing the throughput of a processor by minimizing the number of data transfers between a register file and a memory stack. The register file used by a processor running an application is partitioned into a number of blocks. A subset of the blocks of the register file is defined in an application binary interface, enabling the subset to be pre-allocated and exposed to the application binary interface. Optionally, blocks other than the subset are not exposed to the application binary interface, so that the data relating to an application function switch or a context switch is not transferred between the unexposed blocks and a memory stack.
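The saving described above can be made concrete with a toy model in which only the blocks exposed to the application binary interface are spilled on a switch. This is a sketch under assumptions: the function name, the dictionary representation of the register file, and the list used as a memory stack are all hypothetical.

```python
from typing import Dict, List, Set


def save_on_switch(blocks: Dict[int, List[int]], exposed: Set[int]) -> List[int]:
    """Return the values pushed to the memory stack on a function or context switch.

    Only blocks exposed to the ABI are transferred; unexposed blocks stay put.
    """
    stack: List[int] = []
    for block_id in sorted(exposed):
        stack.extend(blocks[block_id])
    return stack


# Register file partitioned into three blocks of two registers each.
register_file = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
spilled = save_on_switch(register_file, exposed={0, 2})
print(len(spilled), "of 6 registers transferred")
```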
Abstract:
A method and system for improving performance and latency of instruction execution within an execution pipeline in a processor. The method includes finding, while decoding an instruction, a pointer register used by the instruction; reading the pointer register; validating a pointer register entry; reading, if the pointer register entry is valid, a register file entry; validating a register file entry; validating, if the register file entry is invalid, a valid register file entry wherein the valid register file entry is in the register file's future file; bypassing, if the valid register file entry is valid, a valid register file value from the register file's future file to the execution pipeline wherein the valid register file value is in the valid register file entry; and executing the instruction using the valid register file value; wherein at least one of the steps is carried out using a computer device.
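The sequence of validity checks above can be sketched as a small lookup function. This is an illustrative model only: the `Entry` type, the dictionary-based register files, and the function name are hypothetical. The abstract does not say what happens when the pointer register is invalid or when no valid value exists, so the sketch returns `None` in those cases; it also assumes that a valid register file entry is used directly.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    value: int
    valid: bool


def resolve_operand(
    pointer_name: str,
    pointer_regs: Dict[str, Entry],
    reg_file: Dict[int, Entry],
    future_file: Dict[int, Entry],
) -> Optional[int]:
    """Return the operand value to feed the execution pipeline, or None."""
    pointer = pointer_regs.get(pointer_name)
    if pointer is None or not pointer.valid:
        return None                      # assumption: nothing to bypass yet

    rf_entry = reg_file.get(pointer.value)
    if rf_entry is not None and rf_entry.valid:
        return rf_entry.value            # assumption: valid entry is used as is

    ff_entry = future_file.get(pointer.value)
    if ff_entry is not None and ff_entry.valid:
        return ff_entry.value            # bypass from the future file

    return None


pointers = {"p0": Entry(value=3, valid=True)}
regs = {3: Entry(value=0, valid=False)}     # stale architectural entry
future = {3: Entry(value=42, valid=True)}   # newer value in the future file
print(resolve_operand("p0", pointers, regs, future))
```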