Abstract:
Methods and systems for analyzing network traffic. An analysis system receives network traffic, which complies with a certain protocol. The received network traffic carries a data item, which may be of value to an analyst. In order to access the data item in question, the analysis system automatically identifies the media type of the data item, by processing the network traffic without decoding the protocol. The analysis system identifies the media type irrespective of the protocol in order to avoid the computational complexity involved in decoding the protocol.
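One way to picture protocol-agnostic media type identification is a scan of the raw payload for well-known file signatures ("magic numbers"). The sketch below is illustrative only: the abstract does not state how the media type is identified, and the `SIGNATURES` table and the `identify_media_type` function are hypothetical names introduced for this example.

```python
# Hypothetical signature table: a few well-known file magic numbers.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def identify_media_type(payload: bytes):
    """Return (media_type, offset) for the earliest signature found, or None.

    The payload is treated as opaque bytes: no protocol headers are parsed.
    """
    best = None
    for magic, media_type in SIGNATURES:
        pos = payload.find(magic)
        if pos != -1 and (best is None or pos < best[1]):
            best = (media_type, pos)
    return best


# Usage: five bytes of unknown protocol framing precede an embedded PDF.
framed = b"HDR\x00\x01" + b"%PDF-1.7 rest of document"
result = identify_media_type(framed)
```

Scanning for a signature at any offset, rather than only at the start, is what lets the sketch skip over framing bytes it does not understand.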
Abstract:
Methods and systems for range matching. The system holds a definition of one or more ranges of Internet Protocol (IP) addresses. The definition may specify any desired number of ranges of any suitable size, and some ranges may overlap one another or be contained in one another. The definition may also specify certain returned values and/or relative priorities for the various ranges. In a pre-processing phase, the system uses the range definition to produce a hash table that is subsequently queried with addresses to be range-matched. The hash table may be updated at run-time. During operation, the system receives addresses (e.g., extracts addresses from monitored communication traffic) and, for each address, queries the hash table to identify whether the address falls within any of the ranges.
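A minimal sketch of hash-based range matching follows. The abstract does not say how the hash table is organized, so the scheme here is an assumption: each range is decomposed into aligned prefixes during pre-processing, each prefix becomes one hash key, and a lookup probes one key per prefix length. The names `build_table` and `lookup`, and the rule that a larger priority number wins, are hypothetical.

```python
import ipaddress


def build_table(ranges):
    """Pre-processing: turn (first_ip, last_ip, value, priority) tuples
    into a dict keyed by (prefix_length, network_address_as_int)."""
    table = {}
    for first, last, value, priority in ranges:
        nets = ipaddress.summarize_address_range(
            ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)
        )
        for net in nets:
            key = (net.prefixlen, int(net.network_address))
            # Assumed tie-break: keep the higher-priority entry per prefix.
            if key not in table or priority > table[key][1]:
                table[key] = (value, priority)
    return table


def lookup(table, address):
    """Query: probe every prefix length and return the value of the
    highest-priority matching range, or None if no range matches."""
    addr = int(ipaddress.IPv4Address(address))
    best = None
    for prefixlen in range(33):
        mask = (0xFFFFFFFF << (32 - prefixlen)) & 0xFFFFFFFF
        hit = table.get((prefixlen, addr & mask))
        if hit is not None and (best is None or hit[1] > best[1]):
            best = hit
    return None if best is None else best[0]


# Usage: the "lab" range is contained in the "office" range.
table = build_table([
    ("10.0.0.0", "10.0.0.255", "office", 1),
    ("10.0.0.16", "10.0.0.31", "lab", 5),
])
```

Each query costs at most 33 hash probes regardless of how many ranges are defined, and a run-time update only adds or removes the keys of the affected range.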
Abstract:
Machine learning-based methods improve the knowledge extraction process in a specific domain or business environment, and then provide that extracted knowledge in a word cloud user interface display capable of summarizing and conveying a vast amount of information to a user very quickly. Based on the self-training mechanism developed by the inventors, the ontology programming automatically trains itself to understand the domain or environment of the communication data by processing and analyzing a defined corpus of communication data. The developed ontology can be applied to process a dataset of communication information to create a word cloud that can provide a quick view into the content of the dataset, including information about the language used by participants in the communications, such as identifying for a user key phrases and terms, the frequency of those phrases, the originator of the terms or phrases, and the confidence levels of such identifications.
Abstract:
The disclosed solution uses machine learning-based methods to improve the knowledge extraction process in a specific domain or business environment. By formalizing a specific company's internal knowledge and terminology, the ontology programming accounts for linguistic meaning to surface relevant and important content for analysis. Based on the self-training mechanism developed by the inventors, the ontology programming automatically trains itself to understand the business environment by processing and analyzing a defined corpus of communication data. For example, the disclosed ontology programming adapts to the language used in a specific domain, including linguistic patterns and properties, such as word order, relationships between terms, and syntactical variations. The disclosed system and method further relate to leveraging the ontology to assess a dataset and conduct a funnel analysis to identify patterns, or sequences of events, in the dataset.
Abstract:
Currently, systems for assessing traffic, such as retail traffic, only output counting results in numeric measures. Current systems do not assess problems and/or provide solutions to problems using traffic analysis methods or applications. The present disclosure is directed to methods, systems and media for applying retail traffic analysis statistics to provide actionable intelligence, such as solutions to particular inquiries or problems.
Abstract:
Methods and systems for keyword spotting, i.e., for identifying textual phrases of interest in input data. The input data may be communication packets exchanged in a communication network. A keyword spotting system holds a dictionary (or dictionaries) of textual phrases for searching input data. The input data and the patterns are assigned to multiple different pattern matching algorithms. For example, a share of the traffic is handled by one algorithm and smaller traffic shares may be handled by the others. The system monitors the algorithms' performance as they process the data to search for a match. The ratio of traffic splitting among the algorithms is dynamically reassigned or adjusted to maximize the overall performance.
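The dynamic adjustment of the splitting ratio can be sketched as a function that maps measured performance to new traffic shares. The abstract does not give the adjustment rule, so the proportional policy below is an assumption, as are the `rebalance` name, the `floor` parameter, and the algorithm labels in the usage example. The floor keeps a small share flowing to every algorithm so that its performance can still be measured.

```python
def rebalance(throughput, floor=0.05):
    """Return new traffic shares, one per algorithm, that sum to 1.

    throughput: dict mapping an algorithm name to its measured rate
    (for example, bytes scanned per second in the last interval).
    Each algorithm gets `floor` plus a part of the remainder that is
    proportional to its measured throughput.
    """
    names = list(throughput)
    if floor * len(names) > 1:
        raise ValueError("floor is too large for this many algorithms")
    total = sum(throughput.values())
    if total == 0:
        # No measurements yet: split the traffic evenly.
        return {name: 1.0 / len(names) for name in names}
    spare = 1.0 - floor * len(names)
    return {name: floor + spare * throughput[name] / total for name in names}


# Usage: the faster matcher is given the larger share of the next interval.
shares = rebalance({"aho_corasick": 300.0, "regex": 100.0}, floor=0.1)
```

Calling the function once per measurement interval gives the feedback loop the abstract describes: measure, adjust the shares, and measure again.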
Abstract:
Systems and methods for extracting identifiers from traffic of an unknown protocol are provided herein. An example method can include receiving communication traffic transferred over a communication network in accordance with a communication protocol. A data item that matches a predefined pattern can be identified in the communication traffic, irrespective of the communication protocol. The identified data item can then be extracted from the communication traffic.
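As an illustration of pattern-based extraction that ignores the protocol, the sketch below runs a regular expression over raw payload bytes. The choice of an e-mail address as the identifier, the simplified `EMAIL` pattern, and the `extract_identifiers` name are assumptions made for this example; the abstract does not name a specific pattern.

```python
import re

# Simplified e-mail pattern over bytes; an assumption for illustration,
# not a full address grammar.
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_identifiers(payload: bytes):
    """Return every e-mail-like identifier found in the raw payload.

    The payload is never parsed as a protocol message; the pattern is
    matched wherever it occurs in the byte stream.
    """
    return [m.group().decode("ascii") for m in EMAIL.finditer(payload)]


# Usage: identifiers are embedded in binary framing of an unknown protocol.
payload = b"\x01\x02user=alice@example.com\x00\xffbob@test.org;"
found = extract_identifiers(payload)
```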
Abstract:
A method of operating a speech processing system is provided. The method includes translating a portion of a speech record into a plurality of possible words associated with a plurality of contexts, and determining a plurality of correctness values based on a plurality of probabilities that each of the plurality of possible words is correct for each of the plurality of contexts. The method also includes determining which of the plurality of possible words is a correct translation of the portion of the speech record based on the plurality of correctness values.
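The selection step can be sketched as a weighted score per candidate word. The abstract does not state how the correctness values are computed from the probabilities, so the weighted sum used here is an assumption, and `pick_word`, `word_probs`, and `context_weights` are hypothetical names.

```python
def pick_word(word_probs, context_weights):
    """Choose the candidate word with the highest correctness value.

    word_probs: {word: {context: probability the word is correct there}}
    context_weights: {context: how likely that context is to apply}
    Assumed scoring: correctness = sum of weight * probability over contexts.
    """
    correctness = {
        word: sum(context_weights.get(ctx, 0.0) * p for ctx, p in per_ctx.items())
        for word, per_ctx in word_probs.items()
    }
    best = max(correctness, key=correctness.get)
    return best, correctness


# Usage: two candidate words for one portion of a speech record.
word, scores = pick_word(
    {
        "two": {"counting": 0.9, "travel": 0.2},
        "to": {"counting": 0.1, "travel": 0.8},
    },
    {"counting": 0.25, "travel": 0.75},
)
```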
Abstract:
Embodiments that are described herein provide improved methods and systems for analyzing network traffic. The disclosed embodiments enable an analytics system to perform complex processing on only new, first occurrences of received content, while refraining from processing duplicate instances of that content. In a typical embodiment, the analytics results regarding the first occurring content are reported and cached in association with the content. For any duplicate instance of the content, the analytics results are retrieved from the cache without re-processing of the duplicate content. When using the disclosed techniques, the system still processes all first occurring content but not duplicate instances of content that was previously received and processed. In the embodiments described herein, input data comprises communication packets exchanged in a communication network.
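The caching behavior described above can be sketched with a dictionary keyed by a content digest. The use of SHA-256 as the content key and the `DedupAnalyzer` class are assumptions for illustration; the abstract does not say how duplicate content is recognized.

```python
import hashlib


class DedupAnalyzer:
    """Run an expensive analysis once per distinct piece of content."""

    def __init__(self, analyze):
        self._analyze = analyze   # the costly analytics function
        self._cache = {}          # content digest -> cached result
        self.analyzed = 0         # how many times analysis actually ran

    def process(self, content: bytes):
        key = hashlib.sha256(content).digest()
        if key not in self._cache:
            # First occurrence: analyze and cache the result.
            self._cache[key] = self._analyze(content)
            self.analyzed += 1
        # Duplicate: the result comes from the cache without re-processing.
        return self._cache[key]


# Usage: the second copy of the same content is served from the cache.
analyzer = DedupAnalyzer(lambda content: len(content))
first = analyzer.process(b"attachment bytes")
again = analyzer.process(b"attachment bytes")
```

A production cache would also need an eviction policy to bound memory, which this sketch omits.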
Abstract:
A method executed by a computer system for detecting edges comprises receiving an image comprising a plurality of pixels, determining a phase congruency value for a pixel, where the phase congruency value comprises a plurality of phase congruency components, and determining if the phase congruency value satisfies a phase congruency criteria. If the phase congruency value satisfies the phase congruency criteria, the computer system categorizes the pixel as an edge pixel. If the phase congruency value does not satisfy the phase congruency criteria, the computer system compares a first phase congruency component of the plurality of phase congruency components to a phase congruency component criteria. If the first phase congruency component satisfies the phase congruency component criteria, the computer system categorizes the pixel as an edge pixel, and if the first phase congruency component does not satisfy the phase congruency component criteria, categorizes the pixel as a non-edge pixel.
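The two-stage decision described above can be sketched as follows. The abstract does not define how the components combine into the phase congruency value or what form the criteria take, so this example assumes the value is the sum of its components and that both criteria are simple thresholds. The `classify_pixel` name and its parameters are hypothetical.

```python
def classify_pixel(pc_components, value_threshold, component_threshold):
    """Classify one pixel as "edge" or "non-edge".

    pc_components: the pixel's phase congruency components.
    Assumptions: the overall value is the sum of the components, and
    each criterion is satisfied when a threshold is met or exceeded.
    """
    pc_value = sum(pc_components)
    if pc_value >= value_threshold:
        # Stage 1: the overall value satisfies its criterion.
        return "edge"
    if pc_components[0] >= component_threshold:
        # Stage 2: fall back to the first component alone.
        return "edge"
    return "non-edge"


# Usage: the overall value is too low, but the first component is
# strong enough to pass the second check.
label = classify_pixel([0.5, 0.1], value_threshold=0.8, component_threshold=0.45)
```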