Abstract:
A method, system and computer program product for confirming the validity of data returned from a data store. A data store contains a primary data set encrypted using a first encryption and a secondary data set encrypted using a second encryption. The secondary data set is a subset of the primary data set. A client issues a substantive query against the data store to retrieve a primary data result belonging to the primary data set. A query interface issues at least one validating query against the data store. Each validating query returns a secondary data result belonging to the secondary data set. The query interface receives the secondary data result and provides a data-invalid notification if the unencrypted form of the secondary data result includes data that satisfies the substantive query but is not contained in the unencrypted form of the primary data result.
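The validity check described above can be sketched roughly as follows. This is a minimal illustration that assumes both results have already been decrypted; the record format, the `is_result_valid` helper, and the sample data are hypothetical and do not come from the abstract.

```python
from typing import Callable, Iterable


def is_result_valid(primary_result: Iterable[str],
                    secondary_result: Iterable[str],
                    satisfies_query: Callable[[str], bool]) -> bool:
    """Return False if a decrypted secondary record that satisfies the
    substantive query is missing from the decrypted primary result."""
    primary = set(primary_result)
    return all(rec in primary
               for rec in secondary_result
               if satisfies_query(rec))


# Hypothetical decrypted data: the secondary set is a subset of the primary set.
primary_set = ["alice:42", "bob:17", "carol:55", "dave:61"]
secondary_set = ["carol:55", "bob:17"]


def age_over_40(record: str) -> bool:
    # Hypothetical substantive query: keep records whose numeric field exceeds 40.
    return int(record.split(":")[1]) > 40


# An honest data store returns every matching primary record.
honest = [r for r in primary_set if age_over_40(r)]
# A faulty or malicious data store silently drops one matching record.
tampered = [r for r in honest if not r.startswith("carol")]

print(is_result_valid(honest, secondary_set, age_over_40))
print(is_result_valid(tampered, secondary_set, age_over_40))
```

In this sketch a `False` return value stands in for the data-invalid notification.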
Abstract:
A computer implemented method, system, and computer usable program code for classifying a data stream using high-order models. The data stream is divided into a plurality of data segments. A classifier is selected for each of the plurality of data segments. Each of a plurality of classifiers is clustered into states. A state transition matrix is computed for the states. The states of the state transition matrix specify one of the high-order models for classifying the data stream.
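As a rough illustration of the state transition matrix step, the sketch below counts transitions between the states assigned to consecutive data segments and normalizes each row. The abstract does not say how the matrix is computed, so the counting approach, the `transition_matrix` helper, and the sample state labels are assumptions.

```python
def transition_matrix(states, n_states):
    """Estimate transition probabilities from a sequence of state labels.

    states   -- one integer state label per data segment, in stream order
    n_states -- total number of distinct states
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for states that never occur as a source stay all zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical state labels for seven consecutive segments.
segment_states = [0, 0, 1, 1, 0, 2, 0]
P = transition_matrix(segment_states, 3)
for row in P:
    print(row)
```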
Abstract:
The present invention provides a system and method for optimizing routes that include multiple stops. This is accomplished by allowing users to identify a starting point, a destination, and types of businesses or other locations to be visited along the way. A route processor then provides users with a list of stores or other requested detour choices that yield an optimal itinerary. The detour choices may be either an ordered sequence or an unordered set of points to be visited and may include constraints that make it possible to optimize utility functions according to user preferences.
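For the unordered case, one simple way to picture the search is a brute-force enumeration over candidate locations and visiting orders. The sketch below is only that: it assumes straight-line distance as the cost, and the `best_route` helper, the coordinates, and the store names are hypothetical rather than taken from the abstract.

```python
from itertools import permutations, product
from math import dist


def best_route(start, dest, candidates):
    """Pick one location per category and a visiting order that minimizes
    total straight-line distance from start to dest.

    candidates maps a category name to a list of (name, (x, y)) tuples.
    Returns (total_length, [stop names in visiting order]).
    """
    best = None
    for order in permutations(candidates):
        for picks in product(*(candidates[c] for c in order)):
            points = [start] + [xy for _, xy in picks] + [dest]
            length = sum(dist(a, b) for a, b in zip(points, points[1:]))
            if best is None or length < best[0]:
                best = (length, [name for name, _ in picks])
    return best


# Hypothetical trip with two requested kinds of detour.
stops = {
    "grocery": [("G1", (2, 0)), ("G2", (5, 5))],
    "pharmacy": [("P1", (8, 0)), ("P2", (1, 9))],
}
length, itinerary = best_route((0, 0), (10, 0), stops)
print(length, itinerary)
```

Exhaustive enumeration grows quickly with the number of categories and candidates, so it is suitable only for small examples like this one.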
Abstract:
A method for measuring time series relevance using state transition points, including inputting time series data and relevance threshold data. All time series values are then converted to ranks within the [0,1] interval, and the valid range of the transition point in [0,1] is calculated. Afterwards, for each pair of time series X and Y, the method verifies whether a time series Z exists such that the relevances between X and Z, and between Y and Z, are known. The relevance of X and Y is then deduced; it must be either higher or lower than the given threshold. If such a Z is found, all remaining calculations for X and Y are terminated. Otherwise, if no such Z exists, the time series are segmented and the segmented time series are used to estimate the relevance. A hill climbing algorithm is applied in the valid range to find the true relevance.
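The rank conversion step can be illustrated with a short sketch. The abstract does not specify the ranking scheme, so this version assumes ordinal ranks scaled by n - 1, with ties broken by position; the `to_unit_ranks` helper is hypothetical.

```python
def to_unit_ranks(values):
    """Map each value to its rank scaled into the [0, 1] interval.

    The smallest value maps to 0.0 and the largest to 1.0.
    Ties are broken by original position (an assumption of this sketch).
    """
    n = len(values)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for rank, index in enumerate(order):
        ranks[index] = rank / (n - 1)
    return ranks


# Hypothetical time series values.
series = [3.2, 10.0, -1.5, 7.0]
ranks = to_unit_ranks(series)
print(ranks)
```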
Abstract:
A dynamic rule classifier for mining a data stream includes at least one window for viewing data contained in the data stream and a set of rules for mining the data. Rules are added and the set of rules is updated by algorithms when a drift in a concept within the data occurs, causing unacceptable drops in classification accuracy. The dynamic rule classifier is also implemented as a method and a computer program product.
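One way to picture the trigger for a rule update is a sliding window that tracks recent classification outcomes. The sketch below only detects the accuracy drop; it does not implement the rule-update algorithms, which the abstract does not describe. The `AccuracyWindow` class, the window size, and the threshold are assumptions.

```python
from collections import deque


class AccuracyWindow:
    """Track accuracy over the most recent predictions in a stream."""

    def __init__(self, size, min_accuracy):
        self.outcomes = deque(maxlen=size)  # True for each correct prediction
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Record one prediction. Return True when the window is full and
        its accuracy has fallen below the acceptable minimum."""
        self.outcomes.append(predicted == actual)
        full = len(self.outcomes) == self.outcomes.maxlen
        accuracy = sum(self.outcomes) / len(self.outcomes)
        return full and accuracy < self.min_accuracy


# Hypothetical stream of (predicted, actual) labels; the last two are wrong.
stream = [(1, 1), (0, 0), (1, 1), (1, 1), (1, 0), (1, 0)]
window = AccuracyWindow(size=4, min_accuracy=0.75)
flags = [window.record(p, a) for p, a in stream]
print(flags)
```

A `True` flag marks the point at which a classifier of this kind would add or revise rules.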
Abstract:
Attribute association discovery techniques that support relational-based data mining are disclosed. In one aspect of the invention, a technique for mining attribute associations in a relational data set comprises the following steps/operations. Multiple items are obtained from the relational data set. Then, attribute associations are discovered using: (i) multi-attribute mining templates formed from at least a portion of the multiple items; and (ii) one or more mining preferences specified by a user. The invention provides a novel architecture for the mining search space so as to exploit the inter-relationships among patterns of different templates. The framework is relational-sensitive and supports interactive and online mining.
Abstract:
The present invention provides an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure in which each element in the sequence is associated with a weight. A series of network events, for instance, is a weighted-sequence because each event is associated with a timestamp. Querying a large sequence database by events' occurrence patterns is a first step towards understanding the temporal causal relationships among the events. The index structure proposed herein enables the efficient retrieval from the database of all subsequences (contiguous and non-contiguous) that match a given query sequence both by events and by weights. The index structure also takes into consideration the nonuniform frequency distribution of events in the sequence data.
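To make the query semantics concrete, the sketch below performs a brute-force scan for non-contiguous subsequences that match a query by events and by weights. It is not the index structure itself, which the abstract does not detail. It assumes that weights are integer timestamps and that "matching by weights" means the gaps between consecutive matched events equal the gaps in the query; the `find_matches` helper and the sample data are hypothetical.

```python
def find_matches(sequence, query):
    """Return index tuples of subsequences of `sequence` that match `query`.

    Both arguments are lists of (event, timestamp) pairs. A match has the
    same events in the same order and the same timestamp gaps between
    consecutive events as the query (an assumption of this sketch).
    """
    matches = []

    def extend(q_idx, chosen):
        if q_idx == len(query):
            matches.append(tuple(chosen))
            return
        start = chosen[-1] + 1 if chosen else 0
        for i in range(start, len(sequence)):
            event, timestamp = sequence[i]
            if event != query[q_idx][0]:
                continue
            if chosen:
                gap = timestamp - sequence[chosen[-1]][1]
                wanted = query[q_idx][1] - query[q_idx - 1][1]
                if gap != wanted:
                    continue
            extend(q_idx + 1, chosen + [i])

    extend(0, [])
    return matches


# Hypothetical event log and query: "a", then "b" 3 ticks later, then "c" 3 ticks later.
log = [("a", 0), ("b", 3), ("a", 5), ("c", 6), ("b", 8), ("c", 11)]
pattern = [("a", 0), ("b", 3), ("c", 6)]
found = find_matches(log, pattern)
print(found)
```

A scan like this revisits the sequence for every query, which is the cost an index over events and weights is meant to avoid.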