Abstract:
Methods and systems to identify the domain names that can potentially be used for delivering instructions to a bot, before bots on a computer network succeed in obtaining the instructions. The system maintains a device rating for each device that reflects a likelihood that the device is infected by malware. The system also maintains a domain-name rating for each device that reflects a likelihood that the domain name is malicious. When a device attempts to access a particular domain name, the domain-name rating of the domain name is updated in light of the device rating of the device, and/or update the device rating of the device in light of the domain-name rating.
Abstract:
Methods and systems for keyword spotting, i.e., for identifying textual phrases of interest in input data. The input data may be communication packets exchanged in a communication network. A keyword spotting system holds a dictionary (or dictionaries) of textual phrases for searching input data. The input data and the patterns are assigned to multiple different pattern matching algorithms. For example, a share of the traffic is handled by one algorithm and smaller traffic shares may be handled by the others. The system monitors the algorithms performance as they process the data to search for a match. The ratio of traffic splitting among the algorithms is dynamically reassigned or adjusted to maximize the overall performance.
Abstract:
Embodiments that are described herein provide improved methods and systems for analyzing network traffic. The disclosed embodiments enable an analytics system to perform complex processing to only new, first occurrences of received content, while refraining from processing duplicate instances of that content. In a typical embodiment, the analytics results regarding the first occurring content are reported and cached in association with the content. For any duplicate instance of the content, the analytics results are retrieved from the cache without re-processing of the duplicate content. When using the disclosed techniques, the system still processes all first occurring content but not duplicate instances of content that was previously received and processed. In the embodiments described herein, input data comprises communication packets exchanged in a communication network.
Abstract:
Systems and methods for spotting keywords in data packets are provided. In particular, input data is received to be searched for occurrences of a set of patterns, the input data being divided into multiple segments. Then the input data and the patterns are assigned to first and second pattern matching algorithms, the first pattern matching algorithm is configured to search only within each of the segments, and the second pattern matching algorithm is configured to search across boundaries between adjacent segments. Then the input data is searched using the first and second pattern matching algorithms.
Abstract:
Methods and systems to identify the domain names that can potentially be used for delivering instructions to a bot, before bots on a computer network succeed in obtaining the instructions. The system maintains a device rating for each device that reflects a likelihood that the device is infected by malware. The system also maintains a domain-name rating for each device that reflects a likelihood that the domain name is malicious. When a device attempts to access a particular domain name, the domain-name rating of the domain name is updated in light of the device rating of the device, and/or update the device rating of the device in light of the domain-name rating.
Abstract:
A system for storing document collections in a manner that facilitates efficient querying. Each document vector is hashed, by applying a suitable hash function to the components of the vector. The hash function maps the vector to a particular hash value, corresponding to a particular hyperbox in the multidimensional space to which the vectors belong. The vector, or a pointer to the vector, is then stored in a hash table in association with the vector's hash value. Subsequently, given a document of interest, documents similar to the document of interest may be found by hashing the vector of the document of interest, and then returning the vectors that are associated, in the hash table, with the resulting hash value.
Abstract:
An apparatus and techniques for constructing and utilizing a “dynamic dictionary” that is not a compiled dictionary, and therefore does not need to be recompiled in order to be updated. The dynamic dictionary includes respective data structures that represent (i) a management automaton that includes a plurality of management nodes, and (ii) a runtime automaton that is derived from the management automaton and includes a plurality of runtime nodes. The runtime automaton may be used to search input data, such as communication traffic over a network, for keywords of interest, while the management automaton manages the addition of keywords to the dynamic dictionary. Typically, at least two (e.g., exactly two) such dynamic dictionaries are used in combination with a static dictionary.
Abstract:
Systems and methods for passive monitoring of computer communication that does not require performing any decryption. A monitoring system receives the traffic exchanged with each relevant application server, and identifies, in the traffic, sequences of messages—or “n-grams”—that appear to belong to a communication session between a pair of users. Subsequently, based on the numbers and types of identified n-grams, the system identifies each pair of users that are likely to be related to one another via the application, in that these users used the application to communicate (actively and/or passively) with one another. The system may identify those sequences of messages that, by virtue of the sizes of the messages in the sequence, and/or other properties of the messages that are readily discernable, indicate a possible user-pair relationship.
Abstract:
A traffic-monitoring system that monitors encrypted traffic exchanged between IP addresses used by devices and a network, and further receives the user-action details that are passed over the network. By correlating between the times at which the encrypted traffic is exchanged and the times at which the user-action details are received, the system associates the user-action details with the IP addresses. In particular, for each action specified in the user-action details, the system identifies one or more IP addresses that may be the source of the action. Based on the IP addresses, the system may identify one or more users who may have performed the action. The system may correlate between the respective action-times of the encrypted actions and the respective approximate action-times of the indicated actions. The system may hypothesize that the indicated action may correspond to one of the encrypted actions having these action-times.
Abstract:
Methods and systems to identify the domain names that can potentially be used for delivering instructions to a bot, before bots on a computer network succeed in obtaining the instructions. The system maintains a device rating for each device that reflects a likelihood that the device is infected by malware. The system also maintains a domain-name rating for each device that reflects a likelihood that the domain name is malicious. When a device attempts to access a particular domain name, the domain-name rating of the domain name is updated in light of the device rating of the device, and/or update the device rating of the device in light of the domain-name rating.