Abstract:
Detecting unauthorized or excessive use of a resource is disclosed. The value of a metric is updated based at least in part on a first data associated with a current event associated with the metric and a second data associated with a most recent prior event associated with the metric. Responsive action is taken if the updated value of the metric exceeds a threshold.
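The update rule described above can be illustrated with a small sketch. The abstract does not say which metric or update formula is used; the exponentially decayed counter below is only one plausible choice, and the names `DecayedMetric`, `half_life`, and `weight` are hypothetical. The "first data" is taken to be the timestamp of the current event and the "second data" the timestamp of the most recent prior event.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecayedMetric:
    """Usage metric that decays between events (illustrative assumption)."""
    half_life: float                   # seconds for the metric to halve; hypothetical tuning knob
    threshold: float                   # value above which responsive action is taken
    value: float = 0.0
    last_time: Optional[float] = None  # timestamp of the most recent prior event

    def update(self, event_time: float, weight: float = 1.0) -> bool:
        """Fold in the current event; return True if the threshold is exceeded."""
        if self.last_time is not None:
            # Decay the old value by the time elapsed since the prior event.
            elapsed = max(0.0, event_time - self.last_time)
            self.value *= 0.5 ** (elapsed / self.half_life)
        self.value += weight
        self.last_time = event_time
        return self.value > self.threshold


burst = DecayedMetric(half_life=10.0, threshold=2.5)
burst_flags = [burst.update(t) for t in (0.0, 0.0, 0.0)]     # three events at once

steady = DecayedMetric(half_life=10.0, threshold=2.5)
steady_flags = [steady.update(t) for t in (0.0, 10.0, 20.0)]  # events spread out
```

Only the current value and the prior timestamp are stored per metric, so no event history has to be kept. A burst of events crosses the threshold, while the same number of events spread over time does not.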
Abstract:
Performing accelerated validation of a set of data is disclosed. A structure associated with the set of data is identified. It is determined whether the structure matches a previously learned structure. If a match is found, an accelerated validation of the set of data is performed using validation information associated with the previously learned structure.
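A minimal sketch of this idea follows, assuming the data are JSON-like Python objects with string keys and that the full validation depends only on structure (field names and value types), so its verdict can be reused. The helper `structure_of` and the class `AcceleratedValidator` are hypothetical names, not taken from the source.

```python
from typing import Any, Callable, Dict, Hashable


def structure_of(data: Any) -> Hashable:
    """Return a hashable signature of the shape of data, ignoring its values."""
    if isinstance(data, dict):
        # Assumes string keys so they can be sorted into a canonical order.
        return ("dict", tuple((k, structure_of(data[k])) for k in sorted(data)))
    if isinstance(data, list):
        return ("list", tuple(structure_of(v) for v in data))
    return type(data).__name__


class AcceleratedValidator:
    def __init__(self, full_validate: Callable[[Any], bool]) -> None:
        self._full_validate = full_validate
        self._learned: Dict[Hashable, bool] = {}  # signature -> cached verdict
        self.full_runs = 0                        # how often the slow path ran

    def validate(self, data: Any) -> bool:
        signature = structure_of(data)
        if signature in self._learned:
            # Fast path: reuse validation information from the learned structure.
            return self._learned[signature]
        self.full_runs += 1
        verdict = self._full_validate(data)
        self._learned[signature] = verdict
        return verdict


def check_record(record: Any) -> bool:
    """Example structural check (hypothetical schema)."""
    return isinstance(record.get("user"), str) and isinstance(record.get("count"), int)


validator = AcceleratedValidator(check_record)
first = validator.validate({"user": "a", "count": 1})   # slow path, structure learned
second = validator.validate({"count": 7, "user": "b"})  # same structure, fast path
third = validator.validate({"user": "a"})               # new structure, slow path
```

If the full validation also inspects values, the cached verdict alone is not enough; in that case the learned entry would instead hold the value-level checks to run for that structure.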
Abstract:
A query (e.g., an extensible markup language (XML) Path or XPath query) for one or more components of a document (e.g., an XML document) may be received. A forward axis graph including a plurality of nodes with edges connecting the nodes may be generated based on the query, the graph corresponding to a traversal of the document as represented by events (e.g., XML SAX events) corresponding to the document. A plurality of matching states of the forward axis graph, including at least one final state, may be identified, each matching state including a subset of the nodes wherein each incoming edge to the subset originates from one of the nodes of the subset. Whether the one or more components exist within the document may be determined based on which events correspond to transitions between the matching states and whether the final state is reached.
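To make the event-driven matching concrete, the sketch below evaluates a very small XPath subset (element names joined by `/` and `//`) over SAX events. It is a simplified automaton-style matcher, not the forward axis graph construction of the abstract: each "matching state" here is just the set of query prefixes matched at the current depth, and the final state is reached when the whole query has been matched. The names `parse_steps`, `PathMatcher`, and `query_matches` are hypothetical.

```python
import xml.sax
from typing import List, Tuple


def parse_steps(query: str) -> List[Tuple[str, str]]:
    """Split '/a//b/c' into [('child','a'), ('descendant','b'), ('child','c')]."""
    steps, i = [], 0
    while i < len(query):
        if query.startswith("//", i):
            axis, i = "descendant", i + 2
        elif query[i] == "/":
            axis, i = "child", i + 1
        else:
            raise ValueError("query must start each step with / or //")
        j = i
        while j < len(query) and query[j] != "/":
            j += 1
        steps.append((axis, query[i:j]))
        i = j
    return steps


class PathMatcher(xml.sax.ContentHandler):
    def __init__(self, steps: List[Tuple[str, str]]) -> None:
        super().__init__()
        self.steps = steps
        # One state set per open element; a state is the number of steps matched.
        self.stack = [frozenset({0})]
        self.matched = False

    def startElement(self, name, attrs):
        next_states = set()
        for state in self.stack[-1]:
            if state == len(self.steps):
                continue  # already final, nothing further to match
            axis, tag = self.steps[state]
            if tag == name or tag == "*":
                next_states.add(state + 1)  # this event advances the match
            if axis == "descendant":
                next_states.add(state)      # keep waiting at deeper levels
        if len(self.steps) in next_states:
            self.matched = True             # final state reached
        self.stack.append(frozenset(next_states))

    def endElement(self, name):
        self.stack.pop()  # return to the parent element's state set


def query_matches(query: str, xml_text: str) -> bool:
    handler = PathMatcher(parse_steps(query))
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.matched


doc = "<a><x><b><c/></b></x></a>"
found_descendant = query_matches("/a//b/c", doc)
found_child_only = query_matches("/a/b/c", doc)
```

The document is never held in memory as a tree; only the stack of state sets for the currently open elements is kept, which is what makes this style of matching suitable for streamed input.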
Abstract:
Documents are classified into one or more clusters corresponding to predefined classification categories by building a knowledge base comprising matrices of vectors which indicate the significance of terms within a corpus of text formed by the documents and classified in the knowledge base to each cluster. The significance of terms is determined assuming a standard normal probability distribution, and terms are determined to be significant to a cluster if their probability of occurrence being due to chance is low. For each cluster, statistical signatures comprising sums of weighted products and intersections of cluster terms to corpus terms are generated and used as discriminators for classifying documents. The knowledge base is built using prefix and suffix lexical rules which are context-sensitive and applied selectively to improve the accuracy and precision of classification.
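The significance test can be sketched as follows. The abstract states only that a standard normal distribution is assumed and that a term is significant when its occurrence is unlikely to be due to chance; the specific z-score below (a normal approximation to the binomial, comparing a term's count in a cluster with the count expected from its corpus-wide rate) is an assumption, as are the name `significant_terms` and the cutoff `alpha`.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def significant_terms(
    cluster_tokens: Sequence[str],
    corpus_tokens: Sequence[str],
    alpha: float = 0.01,  # hypothetical cutoff for "due to chance"
) -> Dict[str, float]:
    """Return {term: z-score} for terms over-represented in the cluster."""
    corpus_counts = Counter(corpus_tokens)
    cluster_counts = Counter(cluster_tokens)
    n = len(cluster_tokens)
    total = len(corpus_tokens)
    result = {}
    for term, observed in cluster_counts.items():
        p = corpus_counts[term] / total  # corpus-wide rate of the term
        if p <= 0.0 or p >= 1.0:
            continue                     # z-score undefined, skip the term
        expected = n * p
        std_dev = math.sqrt(n * p * (1.0 - p))
        z = (observed - expected) / std_dev
        # One-sided upper-tail probability under the standard normal.
        p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
        if p_value < alpha:
            result[term] = z
    return result


corpus = ["loan"] * 10 + ["the"] * 500 + ["goal"] * 490
cluster = ["loan"] * 8 + ["the"] * 22 + ["goal"] * 20
scores = significant_terms(cluster, corpus)
```

In this toy corpus "loan" is rare overall but frequent in the cluster, so it is the only term flagged; common words such as "the" occur at roughly their expected rate and are not. The weighted signatures and lexical rules mentioned in the abstract are not modeled here.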
Abstract:
A method for improving search quality by quantitative analysis of enterprise web access traffic is disclosed. The invention relates to the field of data processing systems, and more particularly to knowledge management in a corporation or enterprise. Searching heterogeneous data in an enterprise is complex and challenging. Present-day technologies deploy costly and time-consuming methods involving manual operation of data integration, pre-processing, mining, and interpretation tools. Further, these methods are inefficient at retrieving relevant data. The proposed method provides exhaustive monitoring and analysis of intranet traffic to identify and retrieve relevant data in enterprise search. A traffic analyzer reveals resource relevance based on an empirical, content-independent metric. Further, analysis of intranet traffic provides an effective, timely, and personalized information resource to the user for selective information discovery, cross-linking of disjoint data repositories, one-click navigation to popular applications, index trimming, and the like.