Abstract:
A graph taxonomy of information which is represented by a plurality of vectors is generated. The graph taxonomy includes a plurality of nodes and a plurality of edges. The plurality of nodes is generated, and each node of the plurality of nodes is associated with ones of the plurality of vectors. A tree hierarchy is established based on the plurality of nodes. A plurality of distances between ones of the plurality of nodes is calculated. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by ones of the plurality of edges based on the plurality of distances. The information represented by the plurality of vectors may be, for example, a plurality of documents such as Web Pages.
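As an illustration of the distance and edge steps described above, the following is a minimal sketch and not the claimed method. It assumes that each node is summarized by the centroid of its associated vectors, that distance is Euclidean, and that two nodes are joined when their centroids lie within a threshold. The names `build_graph`, `groups`, and `max_distance` are hypothetical, and the tree hierarchy step is omitted.

```python
import math
from itertools import combinations

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_graph(groups, max_distance):
    """Connect nodes whose centroids are at most max_distance apart.

    groups: dict mapping a node name to the vectors associated with it
            (an assumed input format, not taken from the abstract).
    Returns (centroids, edges), where edges is a list of (a, b, distance).
    """
    centroids = {name: centroid(vectors) for name, vectors in groups.items()}
    edges = []
    for a, b in combinations(sorted(centroids), 2):
        d = euclidean(centroids[a], centroids[b])
        if d <= max_distance:
            edges.append((a, b, d))
    return centroids, edges

# Toy two-dimensional "document" vectors, invented for this example.
groups = {
    "sports": [[1.0, 0.0], [0.8, 0.2]],
    "fitness": [[0.7, 0.3], [0.9, 0.3]],
    "cooking": [[0.0, 1.0], [0.1, 0.9]],
}
centroids, edges = build_graph(groups, max_distance=0.5)
for a, b, d in edges:
    print(a, b, round(d, 3))
```

With these toy vectors only the two nearby topics are joined by an edge; the third node remains unconnected at this threshold.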
Abstract:
A computerized method of online mining of inference rules in a large database. The method comprises two stages, a preprocessing stage followed by an online rule generation stage. The preprocessing stage is further defined to be a two-step process that involves the generation of large itemsets. The present method defines large itemsets by how the items in the itemsets relate to each other rather than by their level of presence. The measure by which itemsets are said to relate to each other is defined by a computed figure of merit, K1. The first substep of the preprocessing stage involves finding those itemsets that possess a minimum computed collective strength of K1. From those itemsets, a second, user-supplied input, K2, is used to prune the itemsets with inference strength below K2.
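The abstract does not spell out how collective strength is computed. The sketch below uses one definition from the association-rule literature as an assumption: a transaction violates an itemset when it contains some but not all of its items, and the strength compares the observed violation rate v with the rate E[v] expected if the items were independent, as (1 - v)/(1 - E[v]) multiplied by E[v]/v. The function name and the toy transactions are hypothetical.

```python
import math

def collective_strength(transactions, itemset):
    """Collective strength of an itemset under the assumed definition.

    transactions: list of sets of items.
    itemset: iterable of items.
    """
    items = set(itemset)
    n = len(transactions)
    # Observed violation rate: some, but not all, items are present.
    violations = sum(
        1 for t in transactions if 0 < len(items & set(t)) < len(items)
    )
    v = violations / n
    # Expected violation rate if the items occurred independently.
    p_all, p_none = 1.0, 1.0
    for item in items:
        p = sum(1 for t in transactions if item in t) / n
        p_all *= p
        p_none *= 1.0 - p
    expected_v = 1.0 - p_all - p_none
    if expected_v >= 1.0:
        raise ValueError("collective strength is undefined for this itemset")
    if v == 0:
        return math.inf  # the itemset is never violated
    return ((1.0 - v) / (1.0 - expected_v)) * (expected_v / v)

transactions = [{"a", "b"}, {"a", "b"}, {"c"}, {"a"}]
strength = collective_strength(transactions, {"a", "b"})
print(strength)  # 3.0

# Keeping only itemsets at or above a threshold K1 (value chosen arbitrarily).
K1 = 2.0
candidates = [{"a", "b"}, {"a", "c"}]
strong = [s for s in candidates if collective_strength(transactions, s) >= K1]
```

A value above 1 indicates that the items co-occur more consistently than independence would predict, which is the sense in which an itemset is "large" here rather than merely frequent.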
Abstract:
A computer method of online mining of quantitative association rules consisting of two stages, a preprocessing stage followed by an online rule generation stage. The required computational effort is reduced by the preprocessing stage, which organizes the relationships between antecedent attributes into a hierarchically arranged multidimensional indexing structure. The resulting structure facilitates the second stage, online processing, which generates the quantitative association rules. The second stage uses the multidimensional index structure created by the preprocessing stage, first finding the areas in the data that correspond to the rules and then using a merging step to create a merged tree that carefully combines interesting regions to give a hierarchical representation of the rule set. The merged tree is then used to generate the rules.
Abstract:
Information is analyzed in the form of a plurality of data values that represent a plurality of objects. A set of features that characterize each object of the plurality of objects is identified. The plurality of data values are stored in a database. Each data value corresponds to at least one of the plurality of objects based on the set of features. Ones of the plurality of data values stored in the database are partitioned into a plurality of clusters. Each cluster of the plurality of clusters is assigned to one respective node of a plurality of nodes arranged in a tree hierarchy. Ones of the plurality of nodes of the tree hierarchy are traversed. If desired, information may be analyzed for finding peer groups in e-commerce applications.
Abstract:
A method of analyzing information in the form of a plurality of data records. Each data record includes one or more data values. The data values are partitioned into a plurality of data signatures. Data values of data signatures are compared to data values of data records. Based on the result of the comparison, an index is associated with each data record. A bound corresponding to the index is calculated based on a user-defined target value and an objective function. If desired, information may be analyzed for finding peer groups in e-commerce applications.
Abstract:
Portions of a multimedia program (presentation) are repetitively broadcast to receiving stations, with subsequent portions being broadcast less frequently than preceding portions. Blocks of at least one of the portions are broadcast in varying permutations from one repetition to the next. Further, each portion is of a length which is proportional to a sum of the lengths of all preceding portions. A receiver is provided which selects blocks to be skipped (in a pyramid type broadcast) based on information indicative of the permutation selected by the server. The receiver determines the number of blocks to skip before buffering the next block for the video being viewed.
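The length rule stated above can be worked through numerically. The sketch below assumes a first portion of a given length and a single constant of proportionality for every later portion; the function name and both parameters are hypothetical, and the block permutation and skipping logic are not modeled.

```python
def portion_lengths(first_length, ratio, count):
    """Lengths of `count` portions, where each portion after the first is
    `ratio` times the sum of the lengths of all preceding portions."""
    if count < 1:
        raise ValueError("count must be at least 1")
    lengths = [first_length]
    for _ in range(count - 1):
        lengths.append(ratio * sum(lengths))
    return lengths

print(portion_lengths(1, 1, 5))  # [1, 1, 2, 4, 8]
print(portion_lengths(1, 2, 5))  # [1, 2, 6, 18, 54]
```

Under this rule the lengths grow geometrically after the first portion, which is consistent with later portions being broadcast less often than earlier ones.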
Abstract:
A system and method are provided to analyze information stored in a computer data base by detecting clusters of related or correlated data values. Data values stored in the data base represent a set of objects. A data value is stored in the data base as an instance of a set of features that characterize the objects. The features are the dimensions of the feature space of the data base. Each cluster includes not only a subset of related data values stored in the data base but also a subset of features. The data values in a cluster are data values that are a short distance apart, in the sense of a metric, when projected onto a subspace that corresponds to the subset of features of the cluster. A set of k clusters may be detected such that the average number of features of the subsets of features of the clusters is l.
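The notion of distance in a projected subspace can be illustrated as follows. This is a minimal sketch that assumes a Euclidean metric (the abstract only requires some metric) and represents a cluster's subset of features as a list of dimension indices; all names and values are hypothetical.

```python
import math

def projected_distance(a, b, dims):
    """Euclidean distance between a and b using only the dimensions in dims."""
    return math.sqrt(sum((a[d] - b[d]) ** 2 for d in dims))

def average_dimensionality(cluster_dims):
    """Average number of features per cluster (the quantity l in the text)."""
    return sum(len(dims) for dims in cluster_dims) / len(cluster_dims)

a = [1.0, 2.0, 50.0]
b = [1.5, 2.0, -40.0]

# Far apart in the full feature space, close once the third feature is dropped.
full = projected_distance(a, b, [0, 1, 2])
projected = projected_distance(a, b, [0, 1])
print(round(full, 2), projected)

# Two clusters (k = 2) with 2 and 3 features give l = 2.5.
print(average_dimensionality([[0, 1], [1, 2, 3]]))
```

The example shows why each cluster carries its own feature subset: two data values can be unrelated in the full space and still be near neighbors in the subspace that matters for their cluster.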
Abstract:
A rating of a plurality of ratings is predicted. The rating is associated with a user of a plurality of users and the rating corresponds to an item of a plurality of items. One of the plurality of ratings, corresponding to at least one of the plurality of items, is provided for each of the plurality of users. A predictability relation between ones of the plurality of users and other ones of the plurality of users is calculated based on ratings provided by users. One of a plurality of nodes is assigned to each of the plurality of users. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by a plurality of edges based on the predictability relation. A graph which includes the plurality of nodes and the plurality of edges is searched for a path from a node assigned to the user of the plurality of users to another node assigned to another user of the plurality of users. The rating of the plurality of ratings associated with the user of the plurality of users is calculated based on the path and the predictability relation. If desired, a predicted rating may be produced for identifying products and customers in e-commerce applications.
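The path search and rating calculation can be sketched as below. The abstract does not give the form of the predictability relation, so this example assumes, purely for illustration, that each edge carries a linear map (slope, intercept) from a neighbor's rating to the user's rating, and it uses breadth-first search to find a path to a user who has rated the item. The function name, data layout, and numbers are all hypothetical.

```python
from collections import deque

def predict_rating(target, item, ratings, predicts_from):
    """Predict target's rating of item along a path in the user graph.

    ratings: dict user -> dict item -> rating.
    predicts_from: dict user -> dict neighbor -> (slope, intercept), read as
        rating[user] ~= slope * rating[neighbor] + intercept
        (an assumed linear form of the predictability relation).
    Returns the predicted rating, or None if no path reaches a rater.
    """
    parent = {target: None}
    queue = deque([target])
    while queue:
        user = queue.popleft()
        if user != target and item in ratings.get(user, {}):
            # Walk back toward the target, applying each edge's linear map.
            value = ratings[user][item]
            while parent[user] is not None:
                previous = parent[user]
                slope, intercept = predicts_from[previous][user]
                value = slope * value + intercept
                user = previous
            return value
        for neighbor in predicts_from.get(user, {}):
            if neighbor not in parent:
                parent[neighbor] = user
                queue.append(neighbor)
    return None

ratings = {"alice": {}, "bob": {"song": 3}, "carol": {"film": 4}}
predicts_from = {
    "alice": {"bob": (1.0, -1.0)},
    "bob": {"carol": (0.5, 2.0)},
}
predicted = predict_rating("alice", "film", ratings, predicts_from)
print(predicted)  # 3.0
```

In this toy graph the first user has no direct link to anyone who rated the item, so the prediction is obtained by composing the two edge maps along the path through the intermediate user.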
Abstract:
A method of analyzing information in the form of a plurality of data values. The plurality of data values represent a plurality of objects. The plurality of data values are distributed in a data space. A set of features which characterize each of the plurality of objects is identified. The plurality of data values are stored in a database. Each of the plurality of data values corresponds to at least one of the plurality of objects based on the set of features. Ones of the plurality of data values stored in the database are partitioned into a plurality of clusters. A respective orientation associated with a position in data space of data values which are contained in each respective cluster of the plurality of clusters is calculated based on the set of features. If desired, information may be analyzed for finding peer groups in e-commerce applications.
Abstract:
A computer method of online mining of inference rules in a large database comprising a preprocessing stage and an online rule generation stage. The preprocessing stage includes first finding itemsets that possess a minimum computed collective strength K1, and second, pruning the itemsets with inference strength below a predetermined inference strength, K2. The online rule generation stage utilizes the itemsets organized into an adjacency lattice to generate inference rules with inference strength K2.