摘要:
A method, system and computer program product for performing pattern matching in parallel for a plurality of input streams. The method includes calculating a memory address in a translation table responsive to a current input value, a current state and current state information. A transition rule is retrieved from the transition rule table at the memory address, the transition rule including a test input value, a test current state, and next state information. It is determined if the current input value and the current state match the test input value and the test current state. The current state information is updated with the next state information in response to determining that the current input value and the current state match the test input value and the test current state. The current state information is updated with contents of a default transition rule in response to determining that the current input value and the current state do not match the test input value and the test current state.
摘要:
An interfacing apparatus and related method is provided for configuring to couple a plurality of memory devices being addressable by means of an address space to a processing unit. In one embodiment, the apparatus comprises a first memory access unit being adapted for receiving a memory address from said processing unit and for accessing said memory devices accordingly based on the address provided. It also comprises a second memory access unit being adapted for receiving content data from the processing unit and for controlling a search or update function accordingly for the received content data in one or more of the memory devices. In addition, an allocation unit is also provided for allocating a first part of the address space of the memory devices to said first memory access unit and allocating a second part of the address space of said memory devices to the second memory access unit, each of the memory access units being assigned to corresponding memory devices of the plurality of memory devices.
摘要:
A pattern matching accelerator (PMA) for assisting software threads to find the presence and location of strings in an input data stream that match a given pattern. The patterns are defined using regular expressions that are compiled into a data structure comprised of rules subsequently processed by the PMA. The patterns to be searched in the input stream are defined by the user as a set of regular expressions. The patterns to be searched are grouped in pattern context sets. The sets of regular expressions which define the pattern context sets are compiled to generate a rules structure used by the PMA hardware. The rules are compiled before search run time and stored in main memory, in rule cache memory within the PMA or a combination thereof. For each input character, the PMA executes the search and returns the search results.
摘要:
A computer program product comprising a computer readable storage medium containing computer code that, when executed by a computer, implements a method for transforming a finite state automaton (FSA) of a regular expression, wherein the method includes determining, by a computer, a first subexpression R1 and a second subexpression R2 in the regular expression; calculating an overlap FSA, the overlap FSA configured to determine the existence of a partial overlap or a full overlap between the first subexpression R1 and the second subexpression R2; determining whether the overlap FSA has an accepting state; and in the event the overlap FSA is determined not to have an accepting state, determining that the transformation of the regular expression is safe, and constructing a transformed FSA of the regular expression comprising a first FSA for the first subexpression R1 and a second FSA for the second subexpression R2.
摘要:
A pattern matching accelerator (PMA) for assisting software threads to find the presence and location of strings in an input data stream that match a given pattern. The patterns are defined using regular expressions that are compiled into a data structure comprised of rules subsequently processed by the PMA. The patterns to be searched in the input stream are defined by the user as a set of regular expressions. The patterns to be searched are grouped in pattern context sets. The sets of regular expressions which define the pattern context sets are compiled to generate a rules structure used by the PMA hardware. The rules are compiled before search run time and stored in main memory, in rule cache memory within the PMA or a combination thereof. For each input character, the PMA executes the search and returns the search results.
摘要:
A data processing method comprises receiving an electronically parseable document, scanning the document according to at least one predefined rule to determine if the document is suspicious, and, if the document is determined not to be suspicious, parsing the document with a first parser, and, if the document is determined to be suspicious, parsing the document with a second parser.
摘要:
The invention relates to a method of optimizing a state transition function specification for a state machine engine based on a probability distribution for the state transitions. For the preferred embodiment of the invention, a B-FSM state machine engine accesses a transition rule memory using a processor cache. The invention allows improving the cache hit rate by exploiting the probability distribution. The N transition rules that comprise a hash table entry will be loaded in a burst mode from the main memory, from which the N transition rules are transferred to the processor cache. Because the comparison of the actual state and input values against each of the transition rules can immediately start after each of these rules has been received, the overall performance is improved as the transition rule that is most likely to be selected is the first to be transferred as part of the burst access.
摘要:
An XML parsing system includes a pattern-matching system 1 that receives an input stream 2 of characters corresponding to the XML document to be parsed, and provides an output 3 for subsequent processing in software by a processor 4. The pattern matching system 1 includes two main components, a controller in the form of a programmable state machine 5, which is programmed with an appropriate state transition diagram 6, and a character processing unit 7 in the form of a token and character handler. The programmable state machine 5 controls the character processing unit 7 to, e.g., compare characters in the input character stream 2 with other received or stored characters. The character processing unit 7 then provides feedback to the programmable state machine controller 5, e.g., as to whether the compared characters match, so that the programmable state machine controller 5 can then parse the received document accordingly.
摘要:
Methods and apparatus are provided for classifying data packets in data processing systems. A first packet classification method determines which of a plurality of predefined processing rules applies to a data packet, where each rule is associated with a range of possible data values in each of a plurality of dimensions (X,Y) corresponding to respective data items in the packet format. For each dimension (X,Y), it is determined which of a set of predefined basic ranges contains the corresponding data value (I1, I2) from the packet, where the basic ranges correspond to respective non-overlapping value ranges between successive rule range boundaries in the dimension. For the basic range so determined for each dimension, a corresponding basic range identifier is selected from a set of predefined basic range identifiers corresponding to respective basic ranges in that dimension. For each of at least two dimensions (X,Y), the basic range identifiers comprise respective pD-bit strings generated independently for that dimension by a process of deriving a primitive range hierarchy based on the rule ranges in that dimension. The resulting basic range identifiers, one for each dimension, are then combined to produce a search key which is supplied to a ternary content-addressable memory (5). In the memory (5), the search key is compared with a set of ternary rule vectors, each associated with a particular rule and derived for that rule from the aforementioned hierarchies, to identify at least one rule which applies to the data packet. A second method classifies data packets according to the values in respective data packets of a single, predetermined data item (DA) in the data packet format, where a plurality of classification results are predefined for respective ranges of values of the data item (DA). Here the data item (DA) in the packet is first segmented. The resulting segments are then equated to different dimensions (X,Y) of a multidimensional packet classification problem and are processed in a similar manner to identify a classification result for the packet.
摘要:
An XML parsing system includes a pattern-matching system 1 that receives an input stream 2 of characters corresponding to the XML document to be parsed, and provides an output 3 for subsequent processing in software by a processor 4. The pattern matching system 1 includes two main components, a controller in the form of a programmable state machine 5, which is programmed with an appropriate state transition diagram 6, and a character processing unit 7 in the form of a token and character handler. The programmable state machine 5 controls the character processing unit 7 to, e.g., compare characters in the input character stream 2 with other received or stored characters. The character processing unit 7 then provides feedback to the programmable state machine controller 5, e.g., as to whether the compared characters match, so that the programmable state machine controller 5 can then parse the received document accordingly.