Abstract:
A data processing method comprises receiving an electronically parseable document, scanning the document according to at least one predefined rule to determine if the document is suspicious, and, if the document is determined not to be suspicious, parsing the document with a first parser, and, if the document is determined to be suspicious, parsing the document with a second parser.
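The scan-then-dispatch flow can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say what the rules are or how the two parsers differ, so the rule set, the thresholds, the use of JSON as the document format, and the names `is_suspicious`, `first_parser`, `second_parser` and `process` are all hypothetical.

```python
import json

# Hypothetical predefined rules: each maps a name to a predicate over the raw text.
RULES = {
    "oversized": lambda text: len(text) > 1_000_000,
    "many_brackets": lambda text: text.count("[") + text.count("{") > 50,
}

def is_suspicious(text):
    """Scan the document against every rule; any hit marks it suspicious."""
    return any(rule(text) for rule in RULES.values())

def first_parser(text):
    """Assumed fast path for documents that pass the scan."""
    return json.loads(text)

def second_parser(text):
    """Assumed cautious path: enforce a hard size cap before parsing."""
    if len(text) > 10_000_000:
        raise ValueError("document too large to parse safely")
    return json.loads(text)

def process(text):
    """Return (parser_used, parsed_document)."""
    if is_suspicious(text):
        return "second", second_parser(text)
    return "first", first_parser(text)
```

With these assumed rules, `process('{"a": 1}')` takes the first parser, while a document with more than 50 opening brackets is routed to the second.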
Abstract:
The invention relates to a method of optimizing a state transition function specification for a state machine engine based on a probability distribution for the state transitions. In the preferred embodiment of the invention, a B-FSM state machine engine accesses a transition rule memory through a processor cache. The invention improves the cache hit rate by exploiting the probability distribution. The N transition rules that make up a hash table entry are loaded from main memory in burst mode and transferred to the processor cache. Because the comparison of the actual state and input values against each transition rule can start as soon as that rule has been received, overall performance improves when the transition rule most likely to be selected is the first one transferred in the burst access.
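The effect of ordering rules by probability inside one hash table entry can be illustrated with a small model. This is a sketch, not the B-FSM implementation: the `Rule` tuple, the example probabilities, and the cost measure (number of rules received before the match) are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical rule format: match on (state, symbol), with an estimated probability.
Rule = namedtuple("Rule", "state symbol next_state probability")

def order_entry(rules):
    """Place the most likely rule first so it arrives first in the burst."""
    return sorted(rules, key=lambda r: r.probability, reverse=True)

def lookup(entry, state, symbol):
    """Compare rules in transfer order; return (next_state, rules_received)."""
    for received, rule in enumerate(entry, start=1):
        if rule.state == state and rule.symbol == symbol:
            return rule.next_state, received
    return None, len(entry)

def expected_rules_received(entry):
    """Average number of rules received before the matching one is found."""
    return sum(pos * r.probability for pos, r in enumerate(entry, start=1))

entry = [
    Rule("S0", "a", "S1", 0.1),
    Rule("S0", "b", "S2", 0.2),
    Rule("S0", "c", "S3", 0.7),
]
optimized = order_entry(entry)
```

In this example the expected number of rules received before a match drops from about 2.6 for the original order to about 1.4 for the reordered entry.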
Abstract:
An XML parsing system includes a pattern-matching system 1 that receives an input stream 2 of characters corresponding to the XML document to be parsed, and provides an output 3 for subsequent processing in software by a processor 4. The pattern matching system 1 includes two main components, a controller in the form of a programmable state machine 5, which is programmed with an appropriate state transition diagram 6, and a character processing unit 7 in the form of a token and character handler. The programmable state machine 5 controls the character processing unit 7 to, e.g., compare characters in the input character stream 2 with other received or stored characters. The character processing unit 7 then provides feedback to the programmable state machine controller 5, e.g., as to whether the compared characters match, so that the programmable state machine controller 5 can then parse the received document accordingly.
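The division of labour between the programmable state machine and the character processing unit can be sketched in software. This is a toy model only: the two-state transition table, the action names, and the `CharacterUnit` class are hypothetical, and it handles nothing beyond splitting tags from text.

```python
class CharacterUnit:
    """Stand-in for the token and character handler: compares and buffers characters."""

    def __init__(self):
        self.buffer = []

    def matches(self, ch, expected):
        # Feedback to the controller: does the input character match?
        return ch == expected

    def collect(self, ch):
        self.buffer.append(ch)

    def flush(self):
        token, self.buffer = "".join(self.buffer), []
        return token

# Hypothetical transition table: state -> ordered rules (expected_char, action, next_state).
# An expected_char of None is a default rule that matches any character.
TABLE = {
    "TEXT": [("<", "emit_text", "TAG"), (None, "collect", "TEXT")],
    "TAG": [(">", "emit_tag", "TEXT"), (None, "collect", "TAG")],
}

def parse(stream):
    """Drive the character unit from the table; return a list of (kind, token)."""
    unit, state, out = CharacterUnit(), "TEXT", []
    for ch in stream:
        for expected, action, next_state in TABLE[state]:
            if expected is None or unit.matches(ch, expected):
                if action == "collect":
                    unit.collect(ch)
                else:
                    token = unit.flush()
                    if token:
                        out.append(("text" if action == "emit_text" else "tag", token))
                state = next_state
                break
    if state == "TEXT":
        token = unit.flush()
        if token:
            out.append(("text", token))
    return out
```

For the input `<a>hi</a>` this yields `[("tag", "a"), ("text", "hi"), ("tag", "/a")]`. Changing the parsing behaviour means editing `TABLE`, which mirrors the idea of reprogramming the state machine rather than the character handler.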
Abstract:
Methods and apparatus are provided for classifying data packets in data processing systems. A first packet classification method determines which of a plurality of predefined processing rules applies to a data packet, where each rule is associated with a range of possible data values in each of a plurality of dimensions (X,Y) corresponding to respective data items in the packet format. For each dimension (X,Y), it is determined which of a set of predefined basic ranges contains the corresponding data value (I1, I2) from the packet, where the basic ranges correspond to respective non-overlapping value ranges between successive rule range boundaries in the dimension. For the basic range so determined for each dimension, a corresponding basic range identifier is selected from a set of predefined basic range identifiers corresponding to respective basic ranges in that dimension. For each of at least two dimensions (X,Y), the basic range identifiers comprise respective pD-bit strings generated independently for that dimension by a process of deriving a primitive range hierarchy based on the rule ranges in that dimension. The resulting basic range identifiers, one for each dimension, are then combined to produce a search key which is supplied to a ternary content-addressable memory (5). In the memory (5), the search key is compared with a set of ternary rule vectors, each associated with a particular rule and derived for that rule from the aforementioned hierarchies, to identify at least one rule which applies to the data packet. A second method classifies data packets according to the values in respective data packets of a single, predetermined data item (DA) in the data packet format, where a plurality of classification results are predefined for respective ranges of values of the data item (DA). Here the data item (DA) in the packet is first segmented. The resulting segments are then equated to different dimensions (X,Y) of a multidimensional packet classification problem and are processed in a similar manner to identify a classification result for the packet.
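The notion of basic ranges and a per-dimension search key from the first method can be sketched as below. This is a simplified illustration and not the disclosed encoding: it uses plain integer identifiers instead of pD-bit strings from a primitive range hierarchy, and per-rule sets of identifiers stand in for the ternary rule vectors and the TCAM comparison. The rules, the value limit of 255, and all function names are assumptions.

```python
from bisect import bisect_right

# Hypothetical rules: inclusive (low, high) range per dimension.
RULES = {
    "R1": {"X": (0, 9), "Y": (5, 15)},
    "R2": {"X": (5, 20), "Y": (0, 7)},
}
MAX_VALUE = 255  # assumed upper bound of every dimension

def basic_range_starts(dim):
    """Start points of the non-overlapping ranges between successive rule boundaries."""
    starts = {0}
    for ranges in RULES.values():
        low, high = ranges[dim]
        starts.add(low)
        if high + 1 <= MAX_VALUE:
            starts.add(high + 1)
    return sorted(starts)

def basic_range_id(starts, value):
    """Identifier (here: position) of the basic range containing value."""
    return bisect_right(starts, value) - 1

STARTS = {dim: basic_range_starts(dim) for dim in ("X", "Y")}

# For each rule and dimension, the basic range ids the rule covers.
VECTORS = {
    name: {
        dim: {i for i, s in enumerate(STARTS[dim]) if low <= s <= high}
        for dim, (low, high) in ranges.items()
    }
    for name, ranges in RULES.items()
}

def classify(packet):
    """Build the key from one id per dimension and return all matching rules."""
    key = {dim: basic_range_id(STARTS[dim], packet[dim]) for dim in STARTS}
    return [n for n, vec in VECTORS.items() if all(key[d] in vec[d] for d in key)]
```

With these assumed rules, the X dimension splits into basic ranges starting at 0, 5, 10 and 21, and the packet `{"X": 7, "Y": 6}` matches both `R1` and `R2`, while `{"X": 12, "Y": 3}` matches only `R2`.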
Abstract:
The invention relates to a system in which given search keys are evaluated, segment by segment, to search through tree-structured tables for the output information corresponding to the longest matching prefix. For at least one of the segments, only selected bits of the search key segment are used as an index for accessing an associated table, which stores test values to be compared with the respective search key segment. The bits to be selected are determined by an index mask that reflects the distribution of the valid test values in the table entries (and of the valid search key segment values). This allows table compression, minimizing storage requirements and search time. A procedure is disclosed for generating an optimum index mask in response to the set of valid test values.
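A small sketch may help show how an index mask compresses a table. It assumes 8-bit segments and uses a brute-force search for the smallest mask that keeps all test values in separate entries; the disclosed mask-generation procedure is not described in the abstract, so `smallest_mask` is only a stand-in, and all names here are hypothetical.

```python
from itertools import combinations

def extract_bits(value, mask):
    """Pack the bits of value selected by mask into a compact table index."""
    index, out_pos, bit = 0, 0, 0
    while mask >> bit:
        if (mask >> bit) & 1:
            index |= ((value >> bit) & 1) << out_pos
            out_pos += 1
        bit += 1
    return index

def smallest_mask(test_values, width=8):
    """Brute force: fewest mask bits that map every test value to its own entry."""
    for nbits in range(width + 1):
        for positions in combinations(range(width), nbits):
            mask = sum(1 << p for p in positions)
            if len({extract_bits(v, mask) for v in test_values}) == len(test_values):
                return mask
    raise ValueError("test values must be distinct and fit in the given width")

def build_table(entries, mask):
    """entries maps test value -> output; the table is indexed by the masked bits."""
    table = [None] * (1 << bin(mask).count("1"))
    for test_value, output in entries.items():
        table[extract_bits(test_value, mask)] = (test_value, output)
    return table

def lookup(table, mask, segment):
    """Index with the selected bits, then confirm against the stored test value."""
    entry = table[extract_bits(segment, mask)]
    if entry is not None and entry[0] == segment:
        return entry[1]
    return None

ENTRIES = {0x10: "A", 0x30: "B", 0x50: "C", 0x70: "D"}
MASK = smallest_mask(list(ENTRIES))
TABLE = build_table(ENTRIES, MASK)
```

Here the four test values differ only in bits 5 and 6, so the mask is `0x60` and the table needs 4 entries instead of 256. The stored test value is still compared in full, which is why `lookup(TABLE, MASK, 0x51)` returns `None` even though it indexes the same entry as `0x50`.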
Abstract:
A system includes a register; a first logical function portion operative to receive a first numerical value from the register, perform a first logical function with the first numerical value, and output a second numerical value; a second logical function portion operative to receive the first numerical value from the register, perform a second logical function with the first numerical value, and output a third numerical value; and a control logic portion operative to receive the first numerical value from the register, determine whether the first numerical value includes a code associated with either the first logical function or the second logical function, and, responsive to determining that the code is associated with the first logical function, direct the output of the second numerical value to an input of the register.
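The register feedback arrangement can be modelled in a few lines. This is a sketch under several assumptions: an 8-bit register, a code held in the top two bits, and two arbitrary example functions. The abstract only states what happens when the code selects the first function; routing the third value back for the second function's code is assumed here by symmetry.

```python
CODE_SHIFT = 6     # assumed: the top two bits of an 8-bit register hold the code
CODE_FIRST = 0b01  # hypothetical code for the first logical function
CODE_SECOND = 0b10 # hypothetical code for the second logical function

def first_function(value):
    """Example first logical function: toggle the low nibble."""
    return value ^ 0x0F

def second_function(value):
    """Example second logical function: bitwise NOT within 8 bits."""
    return ~value & 0xFF

def step(register):
    """One cycle: both portions compute, control logic picks what feeds the register."""
    second_value = first_function(register)
    third_value = second_function(register)
    code = register >> CODE_SHIFT
    if code == CODE_FIRST:
        return second_value
    if code == CODE_SECOND:  # assumption: symmetric handling, not stated in the abstract
        return third_value
    return register          # assumption: no recognised code leaves the register unchanged
```

For example, `step(0x43)` carries the first code and returns `0x4C`, while `step(0x83)` carries the second code and returns `0x7C`.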
Abstract:
An apparatus for classifying a data packet includes an interface for receiving the data packet; a classification controller for parsing the data packet to identify a plurality of data items required for classifying the data packet; memory for storing a set of range identifiers for each data item in the data packet corresponding to a rule range defined in the rule sets; and a controller for performing a preliminary test of at least one of the data items to determine whether any of the data item's values match known frequently-occurring values for that data item.
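The preliminary test for frequently occurring values can be pictured as a fast path in front of the general range lookup. The sketch below is an assumption-laden illustration: the abstract does not say what happens when the test hits, so returning a precomputed range identifier is a guess at the intent, and the port boundaries and names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical rule range boundaries for one data item (a destination port).
PORT_RANGE_STARTS = [0, 80, 81, 443, 444, 1024]

def full_range_lookup(port):
    """General path: locate the range identifier by searching the boundaries."""
    return bisect_right(PORT_RANGE_STARTS, port) - 1

# Assumed fast path: identifiers precomputed for values known to occur often.
FREQUENT_PORTS = {port: full_range_lookup(port) for port in (80, 443)}

def range_id(port):
    """Return (range_identifier, path_taken) for one data item value."""
    hit = FREQUENT_PORTS.get(port)
    if hit is not None:
        return hit, "preliminary"
    return full_range_lookup(port), "full"
```

With these boundaries, `range_id(80)` returns `(1, "preliminary")` and `range_id(8080)` returns `(5, "full")`.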
Abstract:
The invention relates to an apparatus for analysing a network flow, comprising: a parser for extracting flow identification information from the network flow; a flow metering unit for metering the network flow; and a programmable controller for controlling the flow metering unit and the parser.
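The three components can be sketched as cooperating objects. This is a minimal model, assuming packets arrive as already-decoded dictionaries and that "programming" means choosing which fields identify a flow and whether metering is active; the class names, field names and counters are all hypothetical.

```python
from collections import defaultdict

class Parser:
    """Extracts flow identification information from a packet."""

    def __init__(self):
        self.fields = ("src", "dst")  # assumed default flow key

    def flow_id(self, packet):
        return tuple(packet[f] for f in self.fields)

class FlowMeter:
    """Keeps per-flow packet and byte counters."""

    def __init__(self):
        self.enabled = True
        self.stats = defaultdict(lambda: {"packets": 0, "bytes": 0})

    def meter(self, flow_id, length):
        if not self.enabled:
            return
        entry = self.stats[flow_id]
        entry["packets"] += 1
        entry["bytes"] += length

class Controller:
    """Programmable controller: configures and drives the parser and the meter."""

    def __init__(self, parser, meter):
        self.parser, self.meter = parser, meter

    def program(self, fields=None, metering=None):
        if fields is not None:
            self.parser.fields = tuple(fields)
        if metering is not None:
            self.meter.enabled = metering

    def handle(self, packet):
        self.meter.meter(self.parser.flow_id(packet), packet["length"])
```

Calling `program(fields=("src", "dst", "dport"))` on the controller changes how later packets are grouped into flows without touching the parser or meter code.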