摘要:
One embodiment of the present invention sets forth a technique for scheduling thread execution in a multi-threaded processing environment. A two-level scheduler maintains a small set of active threads called strands to hide function unit pipeline latency and local memory access latency. The strands are a sub-set of a larger set of pending threads that is also maintained by the two-leveler scheduler. Pending threads are promoted to strands and strands are demoted to pending threads based on latency characteristics. The two-level scheduler selects strands for execution based on strand state. The longer latency of the pending threads is hidden by selecting strands for execution. When the latency for a pending thread is expired, the pending thread may be promoted to a strand and begin (or resume) execution. When a strand encounters a latency event, the strand may be demoted to a pending thread while the latency is incurred.
摘要:
One embodiment of the present invention sets forth a technique for scheduling thread execution in a multi-threaded processing environment. A two-level scheduler maintains a small set of active threads called strands to hide function unit pipeline latency and local memory access latency. The strands are a sub-set of a larger set of pending threads that is also maintained by the two-leveler scheduler. Pending threads are promoted to strands and strands are demoted to pending threads based on latency characteristics. The two-level scheduler selects strands for execution based on strand state. The longer latency of the pending threads is hidden by selecting strands for execution. When the latency for a pending thread is expired, the pending thread may be promoted to a strand and begin (or resume) execution. When a strand encounters a latency event, the strand may be demoted to a pending thread while the latency is incurred.
摘要:
A server on a packet network is protected from attack by flooding SYN messages that request a connection by comparing the number of SYN messages received within a preselected time interval N, where N is a number SYN messages within said preselected time that, with a predetermined probability, can be considered to be bona fide. When the number of received SYN messages within the preselected time interval is greater than N, corrective action is taken, such as discarding all SYN messages above the received N messages.
摘要:
A multi-threaded processor system, method, and computer program product capable of utilizing a register file cache are provided for simultaneously processing a plurality of threads. A processor capable of simultaneously processing a plurality of threads is provided. The processor includes a register file and a register file cache in communication with the register file.
摘要:
A system for allocating and de-allocating registers of a processor. The system includes a register file having plurality of physical registers and a first table coupled to the register file for mapping virtual register IDs to physical register IDs. A second table is coupled to the register file for determining whether a virtual register ID has a physical register mapped to it in a cycle. The first table and the second table enable physical registers of the register file to be allocated and de-allocated on a cycle-by-cycle basis to support execution of instructions by the processor.
摘要:
A system, method, and computer program product are provided for removing a register of a processor from an active state. In operation, an aspect of a portion of a processor capable of simultaneously processing a plurality of threads is identified. Additionally, a register of the processor is conditionally removed from an active state, based on the aspect.
摘要:
Dynamic warp subdivision (DWS), which allows a single warp to occupy more than one slot in the scheduler without requiring extra register file space, is described. Independent scheduling entities also allow divergent branch paths to interleave their execution, and allow threads that hit in the cache or otherwise have divergent memory-access latency to run ahead. The result is improved latency hiding and memory level parallelism (MLP).