Abstract:
Techniques for processing requests from a processing thread for a shared resource shared among threads on one or more processors include receiving a bundle of requests from a portion of a thread that is executed during a single wake interval on a particular processor. The bundle includes multiple commands for one or more shared resources. The bundle is processed at the shared resource(s) to produce a bundle result. The bundle result is sent to the particular processor. The thread undergoes no more than one wake interval to sleep interval cycle while the bundle commands are processed at the shared resource(s). These techniques allow a lock for the shared resource(s) to be obtained, used, and released all while the particular thread is sleeping, so that locks are held for shorter times than in conventional approaches. Using these techniques, line-rate packet processing is more readily achieved in routers with multiple multi-threaded processors.
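A minimal C sketch of the bundling idea follows, under stated assumptions: the bundle layout, the names (bundle_cmd, resource_process_bundle, thread_await_bundle), and the use of a pthread condition variable to model the sleep interval are all illustrative, not taken from the abstract. The point it shows is that the resource-side loop obtains, uses, and releases the lock entirely while the requesting thread sleeps, and the thread wakes exactly once, when the bundle result is ready.

```c
/* Hedged sketch: all names and the bundle layout are assumptions. */
#include <pthread.h>
#include <stddef.h>

enum cmd_op { CMD_READ, CMD_WRITE, CMD_UPDATE };

struct bundle_cmd {
    enum cmd_op op;      /* operation on the shared resource */
    size_t      offset;  /* location within the resource     */
    long        value;   /* operand for writes/updates       */
};

#define MAX_CMDS 8

struct bundle {                           /* assembled during one wake interval */
    struct bundle_cmd cmds[MAX_CMDS];
    int               ncmds;
    long              results[MAX_CMDS];  /* the bundle result */
    pthread_mutex_t   done_mtx;           /* caller initializes mtx/cv */
    pthread_cond_t    done_cv;
    int               done;
};

/* Resource side: the lock is obtained, used, and released entirely
 * while the requesting thread sleeps. */
void resource_process_bundle(struct bundle *b, long *resource,
                             pthread_mutex_t *resource_lock)
{
    pthread_mutex_lock(resource_lock);   /* lock held only for this loop */
    for (int i = 0; i < b->ncmds; i++) {
        struct bundle_cmd *c = &b->cmds[i];
        switch (c->op) {
        case CMD_READ:   b->results[i] = resource[c->offset];  break;
        case CMD_WRITE:  resource[c->offset] = c->value;
                         b->results[i] = c->value;             break;
        case CMD_UPDATE: resource[c->offset] += c->value;
                         b->results[i] = resource[c->offset];  break;
        }
    }
    pthread_mutex_unlock(resource_lock);

    pthread_mutex_lock(&b->done_mtx);    /* wake the sleeping thread */
    b->done = 1;
    pthread_cond_signal(&b->done_cv);
    pthread_mutex_unlock(&b->done_mtx);
}

/* Thread side: after submitting the bundle, sleep through a single
 * wake-to-sleep cycle until the bundle result arrives. */
void thread_await_bundle(struct bundle *b)
{
    pthread_mutex_lock(&b->done_mtx);
    while (!b->done)
        pthread_cond_wait(&b->done_cv, &b->done_mtx);
    pthread_mutex_unlock(&b->done_mtx);
}
```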
Abstract:
In one embodiment, a method includes receiving at a thread scheduler data that indicates a first thread is to execute next a particular instruction path in software to access a particular portion of a shared computational resource. The thread scheduler determines whether a different second thread is exclusively eligible to execute the particular instruction path on any processor of a set of one or more processors to access the particular portion of the shared computational resource. If so, then the thread scheduler prevents the first thread from executing any instruction from the particular instruction path on any processor of the set of one or more processors. This enables several threads of the same software to share a resource without obtaining locks on the resource or holding a lock on a resource while a thread is not running.
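The following C sketch illustrates one way the eligibility check could be organized, assuming the scheduler itself runs serialized (as a hardware thread scheduler would) and assuming a small fixed table; the names (sched_request_path, owner, and the path/portion indices) are invented for illustration. A thread that is refused simply stays parked and retries later; no lock on the resource itself is ever taken.

```c
/* Hedged sketch of scheduler-enforced exclusivity; table layout and
 * names are assumptions, not the patent's own structures. */
#include <stdbool.h>

#define NPATHS    8
#define NPORTIONS 8
#define NO_OWNER  (-1)

/* owner[p][r]: thread currently exclusively eligible to execute
 * instruction path p against resource portion r, or NO_OWNER. */
static int owner[NPATHS][NPORTIONS];

void sched_init(void)
{
    for (int p = 0; p < NPATHS; p++)
        for (int r = 0; r < NPORTIONS; r++)
            owner[p][r] = NO_OWNER;
}

/* Called when a thread indicates it will next execute 'path' to access
 * resource 'portion'.  Returns false, keeping the thread parked, while
 * a different thread is exclusively eligible for the same pair. */
bool sched_request_path(int thread_id, int path, int portion)
{
    if (owner[path][portion] != NO_OWNER && owner[path][portion] != thread_id)
        return false;                  /* another thread is exclusively
                                          eligible: do not enter the path */
    owner[path][portion] = thread_id;  /* grant exclusive eligibility */
    return true;
}

/* Called when the thread leaves the instruction path, so another thread
 * can be made eligible.  No lock on the resource was ever held. */
void sched_release_path(int thread_id, int path, int portion)
{
    if (owner[path][portion] == thread_id)
        owner[path][portion] = NO_OWNER;
}
```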
Abstract:
An apparatus for routing data packets includes a network interface, a memory, a general purpose processor and a flow classifier. The memory stores a flow structure. Every packet in one flow has identical values for a set of data fields in the packet. The memory stores instructions that cause the processor to receive missing flow data and to add the missing flow to the flow structure. The apparatus forwards a packet based on its flow. The flow classifier determines a particular flow and whether it is already stored in the flow structure. If not, then the classifier determines whether that flow has already been sent to the processor as missing data. If not, then the classifier stores, into a different data structure, data that indicates the flow has been sent to the processor but is not yet included in the flow data structure, and sends the missing data to the processor.
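A sketch of the classifier's two-table decision, in C, with invented names throughout (flow_tbl, pending_tbl, classify_packet); the fast-path and CPU-path hooks are left as declarations, and plain linear tables stand in for whatever structures the apparatus actually uses.

```c
/* Hedged sketch: linear tables and all names are assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct flow_key {              /* fields identical for every packet in a flow;
                                  assume keys are zero-initialized so padding
                                  does not disturb memcmp() */
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  protocol;
};

#define TBL_SIZE 1024

/* Stand-ins for the flow structure and for the "different data structure"
 * that tracks flows sent to the processor but not yet installed. */
static struct flow_key flow_tbl[TBL_SIZE];    static int nflows;
static struct flow_key pending_tbl[TBL_SIZE]; static int npending;

static bool in_table(const struct flow_key *t, int n, const struct flow_key *k)
{
    for (int i = 0; i < n; i++)
        if (memcmp(&t[i], k, sizeof *k) == 0) return true;
    return false;
}

void send_missing_flow_to_processor(const struct flow_key *k); /* CPU path  */
void forward_packet_on_flow(const struct flow_key *k);         /* fast path */

void classify_packet(const struct flow_key *k)
{
    if (in_table(flow_tbl, nflows, k)) {   /* flow already installed */
        forward_packet_on_flow(k);
        return;
    }
    if (in_table(pending_tbl, npending, k))
        return;                            /* already reported; don't repeat */

    if (npending < TBL_SIZE)
        pending_tbl[npending++] = *k;      /* mark: sent but not installed */
    send_missing_flow_to_processor(k);
}

/* Invoked on behalf of the general purpose processor once it adds the flow. */
void install_flow(const struct flow_key *k)
{
    if (nflows < TBL_SIZE)
        flow_tbl[nflows++] = *k;
    /* removal of the matching pending entry omitted for brevity */
}
```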
Abstract:
A technique efficiently searches a hash table. Conventionally, a predetermined set of “signature” information is hashed to generate a hash-table index which, in turn, is associated with a corresponding linked list accessible through the hash table. The indexed list is sequentially searched, beginning with the first list entry, until a “matching” list entry containing the signature information is located. For long list lengths, this conventional approach may search a substantial number of list entries. In contrast, the inventive technique reduces, on average, the number of list entries that are searched to locate the matching list entry. To that end, list entries are partitioned into different groups within each linked list. Thus, by searching only a selected group (e.g., a subset) of entries in the indexed list, the technique consumes fewer resources, such as processor bandwidth and processing time, than previous implementations.
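The grouping idea can be sketched briefly in C. Here a second, independent hash selects one of a bucket's groups, so a lookup walks roughly 1/NGROUPS of the bucket's entries on average; the two-level layout and all names are assumptions for illustration.

```c
/* Hedged sketch: each bucket keeps per-group head pointers so a lookup
 * walks only one group of the bucket's list. */
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 256
#define NGROUPS  4      /* each bucket's list is partitioned 4 ways */

struct entry {
    uint64_t      sig;  /* the "signature" information */
    struct entry *next;
};

/* bucket -> group -> singly linked list of entries */
static struct entry *table[NBUCKETS][NGROUPS];

static unsigned bucket_of(uint64_t sig) { return sig % NBUCKETS; }
/* a second, independent hash selects the group within the bucket */
static unsigned group_of(uint64_t sig)  { return (sig >> 32) % NGROUPS; }

void insert_entry(struct entry *e)
{
    struct entry **head = &table[bucket_of(e->sig)][group_of(e->sig)];
    e->next = *head;
    *head = e;
}

/* Only the selected group is searched, versus a full-list scan in the
 * conventional approach. */
struct entry *find_entry(uint64_t sig)
{
    for (struct entry *e = table[bucket_of(sig)][group_of(sig)];
         e != NULL; e = e->next)
        if (e->sig == sig)
            return e;
    return NULL;
}
```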
Abstract:
A system and method is provided for automatically identifying and removing malicious data packets, such as denial-of-service (DoS) packets, in an intermediate network node before the packets can be forwarded to a central processing unit (CPU) in the node. The CPU's processing bandwidth is therefore not consumed identifying and removing the malicious packets from the system memory. As such, processing of the malicious packets is essentially “off-loaded” from the CPU, thereby enabling the CPU to process non-malicious packets in a more efficient manner. Unlike prior implementations, the invention identifies malicious packets having complex encapsulations that cannot be identified using traditional techniques, such as ternary content addressable memories (TCAMs) or lookup tables.
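The abstract gives little of the matching mechanism itself, so the following C fragment is only a loose, hypothetical illustration of why a programmable pre-CPU filter can follow complex encapsulations where a fixed-width TCAM lookup cannot: it walks a chain of nested headers iteratively. The header layout and the "too deeply encapsulated" rule are invented for the sketch.

```c
/* Hedged sketch: header layout and match rule are invented. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct hdr {                 /* simplified generic encapsulation header */
    uint8_t proto;           /* protocol of the next header             */
    uint8_t len;             /* length of this header in bytes          */
};

#define PROTO_PAYLOAD 0xff
#define MAX_LAYERS    8

/* Returns true if the packet matches a hypothetical malicious profile:
 * here, a malformed or overly deep encapsulation chain. */
bool is_malicious(const uint8_t *pkt, size_t len)
{
    size_t off = 0;
    for (int layer = 0; layer < MAX_LAYERS; layer++) {
        if (off + sizeof(struct hdr) > len)
            return true;                     /* truncated chain: drop  */
        const struct hdr *h = (const struct hdr *)(pkt + off);
        if (h->proto == PROTO_PAYLOAD)
            return false;                    /* well-formed packet     */
        if (h->len < sizeof(struct hdr))
            return true;                     /* malformed header: drop */
        off += h->len;                       /* descend one layer      */
    }
    return true;   /* too deeply encapsulated: treat as DoS traffic */
}

/* The drop decision is made here, before the packet is written to the
 * system memory the CPU services, so the CPU never spends cycles on it. */
```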
Abstract:
The present invention provides a system and method for a plurality of independent processors to simultaneously assemble requests in a context memory coupled to a coprocessor. A write manager coupled to the context memory organizes segments received from multiple processors to form requests for the coprocessor. Each received segment indicates a location in the context memory, such as an indexed memory block, where the segment should be stored. Illustratively, the write manager parses the received segments into their appropriate blocks of the context memory, and detects when the last segment for a request has been received. The last segment may be identified according to a predetermined address bit, e.g., an upper-order bit, that is set. When the write manager receives the last segment for a request, the write manager (1) finishes assembling the request in a block of the context memory, (2) enqueues an index associated with the memory block in an index FIFO, and (3) sets a valid bit associated with the memory block. By setting the valid bit, the write manager prevents newly received segments from overwriting an assembled request that has not yet been forwarded to the coprocessor. When an index reaches the head of the index FIFO, the request is dequeued from the indexed block of the context memory and forwarded to the coprocessor.
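A C sketch of the write manager's bookkeeping follows, with assumed parameters (block size, FIFO depth, and the position of the last-segment address bit): segments are parsed into indexed blocks, the final segment enqueues the block's index and sets the valid bit, and the valid bit blocks overwrites until the request is dispatched.

```c
/* Hedged sketch: sizes and the LAST_SEG_BIT position are assumptions. */
#include <stdbool.h>
#include <stdint.h>

#define NBLOCKS       64
#define WORDS_PER_BLK 16
#define LAST_SEG_BIT  (1u << 31)   /* upper-order address bit marks the
                                      final segment of a request        */

static uint32_t context_mem[NBLOCKS][WORDS_PER_BLK];
static bool     valid[NBLOCKS];    /* set: assembled, awaiting dispatch */

static unsigned index_fifo[NBLOCKS];        /* indexes of ready blocks  */
static unsigned head, tail;

/* A segment arriving from any of the independent processors. */
void write_segment(uint32_t addr, uint32_t data)
{
    unsigned blk  = (addr & ~LAST_SEG_BIT) / WORDS_PER_BLK % NBLOCKS;
    unsigned word = (addr & ~LAST_SEG_BIT) % WORDS_PER_BLK;

    if (valid[blk])
        return;                    /* block holds an undelivered request:
                                      refuse to overwrite it            */
    context_mem[blk][word] = data; /* parse segment into its block      */

    if (addr & LAST_SEG_BIT) {     /* last segment: request complete    */
        valid[blk] = true;
        index_fifo[tail] = blk;    /* enqueue the block's index         */
        tail = (tail + 1) % NBLOCKS;
    }
}

/* Dequeue the request at the head of the index FIFO and hand it to the
 * coprocessor; clearing the valid bit frees the block for reuse. */
bool dispatch_to_coprocessor(uint32_t (*req)[WORDS_PER_BLK])
{
    if (head == tail)
        return false;              /* nothing assembled yet */
    unsigned blk = index_fifo[head];
    head = (head + 1) % NBLOCKS;
    for (int w = 0; w < WORDS_PER_BLK; w++)
        (*req)[w] = context_mem[blk][w];
    valid[blk] = false;
    return true;
}
```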
Abstract:
An apparatus and technique off-loads responsibility for maintaining order among requests directed to a same address on a split transaction bus from a processor to a split transaction bus controller, thereby increasing the performance of the processor. The present invention comprises an ordering circuit that enables the controller to defer issuing a subsequent (write) request directed to an address on the bus until a previous (read) request directed to the same address completes. By off-loading responsibility for maintaining order among requests from the processor to the controller, the invention enhances performance of the processor since the processor may proceed with program execution without having to stall to ensure such ordering. The ordering circuit maintains ordering in an efficient manner that is transparent to the processor.
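A small C sketch of the ordering check, assuming a fixed-size table of outstanding reads (the names note_read_issued, write_may_issue, and the table itself are illustrative): the controller consults the table before issuing a write, deferring it while a read to the same address is still outstanding, so the processor never stalls to enforce the order itself.

```c
/* Hedged sketch: the pending-read table stands in for the ordering
 * circuit; all names are assumptions. */
#include <stdbool.h>
#include <stdint.h>

#define NPENDING 16

/* Addresses of read requests issued on the split transaction bus whose
 * responses have not yet returned. */
static uint32_t pending_read_addr[NPENDING];
static bool     pending_read_busy[NPENDING];

void note_read_issued(uint32_t addr)
{
    for (int i = 0; i < NPENDING; i++)
        if (!pending_read_busy[i]) {
            pending_read_addr[i] = addr;
            pending_read_busy[i] = true;
            return;
        }
}

void note_read_completed(uint32_t addr)
{
    for (int i = 0; i < NPENDING; i++)
        if (pending_read_busy[i] && pending_read_addr[i] == addr) {
            pending_read_busy[i] = false;
            return;
        }
}

/* The controller calls this before issuing a write: the write is
 * deferred (returns false) while a read to the same address is still
 * outstanding, so ordering is enforced transparently to the processor. */
bool write_may_issue(uint32_t addr)
{
    for (int i = 0; i < NPENDING; i++)
        if (pending_read_busy[i] && pending_read_addr[i] == addr)
            return false;   /* defer until the read completes */
    return true;
}
```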