System and method for generating taxonomies with applications to content-based recommendations
    1.
    Granted invention patent (status: in force)

    Publication number: US06360227B1

    Publication date: 2002-03-19

    Application number: US09240231

    Filing date: 1999-01-29

    IPC classification: G06F17/00

    Abstract: A graph taxonomy of information which is represented by a plurality of vectors is generated. The graph taxonomy includes a plurality of nodes and a plurality of edges. The plurality of nodes is generated, and each node of the plurality of nodes is associated with ones of the plurality of vectors. A tree hierarchy is established based on the plurality of nodes. A plurality of distances between ones of the plurality of nodes is calculated. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by ones of the plurality of edges based on the plurality of distances. The information represented by the plurality of vectors may be, for example, a plurality of documents such as Web pages.
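
A minimal sketch of the node, distance, and edge steps described in this abstract, assuming the vectors have already been grouped into nodes. The function names, the use of centroids to represent nodes, the cosine distance, and the `threshold` parameter are all illustrative assumptions, not details taken from the patent; the tree hierarchy step is omitted.

```python
import math
from itertools import combinations

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine_distance(a, b):
    """1 - cosine similarity; an assumed choice of distance between nodes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def build_graph_taxonomy(groups, threshold):
    """groups maps a node name to the vectors associated with that node.

    Returns the node centroids and the edges (a, b, distance) joining every
    pair of nodes whose centroids lie within `threshold` of each other.
    """
    centroids = {name: centroid(vs) for name, vs in groups.items()}
    edges = []
    for a, b in combinations(sorted(centroids), 2):
        d = cosine_distance(centroids[a], centroids[b])
        if d <= threshold:
            edges.append((a, b, d))
    return centroids, edges

# Hypothetical document vectors grouped under three hypothetical node names.
groups = {
    "sports": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "games": [[0.8, 0.2, 0.0]],
    "finance": [[0.0, 0.0, 1.0]],
}
centroids, edges = build_graph_taxonomy(groups, threshold=0.1)
```

Because edges depend only on pairwise distances, a node may be linked to several others, which is what distinguishes a graph taxonomy from a strict tree.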

    Finding collective baskets and inference rules for internet mining
    2.
    Granted invention patent (status: lapsed)

    Publication number: US06263327B1

    Publication date: 2001-07-17

    Application number: US09522723

    Filing date: 2000-03-10

    IPC classification: G06F17/00

    Abstract: A computerized method of online mining of inference rules in a large database. The method comprises two stages, a preprocessing stage followed by an online rule generation stage. The preprocessing stage is further defined to be a two-step process that involves the generation of large itemsets. The present method defines large itemsets by how the items in the itemsets relate to each other rather than by their level of presence. The measure by which itemsets are said to relate to each other is defined by a computed figure of merit, K1. The first substep of the preprocessing stage involves finding those itemsets that possess a minimum computed collective strength of K1. From those found itemsets, a second user-supplied input, K2, is used to prune those itemsets with inference strength below K2.
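
The abstract does not say how collective strength is computed. The sketch below uses one definition from the data mining literature, which is an assumption here and may differ from the patent's own: an itemset is "violated" by a transaction that contains some but not all of its items, and the observed violation rate is compared with the rate expected if the items were independent. The function name and the sample transactions are hypothetical; the K2 pruning step is not shown because the abstract does not define inference strength.

```python
def collective_strength(itemset, transactions):
    """Collective strength of `itemset` over a list of transactions (sets).

    Assumed definition: C = ((1 - v) / (1 - e)) * (e / v), where v is the
    observed fraction of violating transactions and e is the violation rate
    expected under item independence.
    """
    items = set(itemset)
    n = len(transactions)
    # A transaction violates the itemset if it holds some, but not all, items.
    violations = sum(1 for t in transactions if 0 < len(items & set(t)) < len(items))
    v = violations / n
    p_all, p_none = 1.0, 1.0
    for item in items:
        p = sum(1 for t in transactions if item in t) / n
        p_all *= p
        p_none *= 1.0 - p
    e = 1.0 - p_all - p_none
    if v == 0:
        return float("inf")  # the items never occur apart from one another
    if e >= 1.0:
        return 0.0  # degenerate case: violation is certain under independence
    return ((1.0 - v) / (1.0 - e)) * (e / v)

# Hypothetical transactions; keep itemsets whose strength reaches K1.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"c"}, {"c"}, {"a"}]
K1 = 2.0
strength_ab = collective_strength({"a", "b"}, transactions)
keep_ab = strength_ab >= K1
```

Under this definition a value near 1 indicates roughly independent items, and larger values indicate items that occur together more often than independence would predict.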

    Methods for performing large scale auctions and online negotiations
    3.
    Granted invention patent (status: in force)

    Publication number: US6151589A

    Publication date: 2000-11-21

    Application number: US151200

    Filing date: 1998-09-10

    IPC classification: G06Q30/08 G06Q40/00 G06F17/60

    CPC classification: G06Q30/08 G06Q40/00 G06Q40/06

    Abstract: A method for performing continuous auctions over a computer network system consisting of a server/seller and multiple clients/buyers. The seller makes information about the type of sale items, the number of sale items, the minimum bid price, the time limits for bids to be submitted, and the estimated time interval to the next auction decision available to the buyers by displaying it on the buyers' computer terminals. Each buyer responds by entering a bid and the bid's duration, within the time limits set by the seller, into the auction system through the buyers' computer terminals. Additionally, each buyer's bid entry time is saved by the system. The response time for present buyers is determined in order to schedule the next auction. At least one auction winner, whose bid is within its bid duration, is selected through a dynamically adjusted customer selection method.

    On-line mining of quantitative association rules
    4.
    Granted invention patent (status: lapsed)

    Publication number: US6092064A

    Publication date: 2000-07-18

    Application number: US964064

    Filing date: 1997-11-04

    IPC classification: G06F19/00 G06F17/30

    Abstract: A computer method of online mining of quantitative association rules consisting of two stages, a preprocessing stage followed by an online rule generation stage. The required computational effort is reduced by the preprocessing stage, which preprocesses the data to organize the relationship between antecedent attributes and create a hierarchically arranged multidimensional indexing structure. The resulting structure facilitates the performance of the second stage, online processing, which involves the generation of quantitative association rules. The second stage, online rule generation, utilizes the multidimensional index structure created by the preprocessing stage by first finding the areas in the data which correspond to the rules and then using a merging step to create a merged tree, which carefully combines interesting regions to give a hierarchical representation of the rule set. The merged tree is then used to actually generate the rules.

    System and method for construction of a data structure for indexing multidimensional objects
    5.
    Granted invention patent (status: lapsed)

    Publication number: US5781906A

    Publication date: 1998-07-14

    Application number: US660047

    Filing date: 1996-06-06

    IPC classification: G06F17/30

    Abstract: An apparatus and a method for constructing a multidimensional index tree which minimizes the time to access data objects and is resilient to the skewness of the data. This is achieved through successive partitioning of all given data objects by considering one level at a time, starting with one partition and using a top-down approach until each final partition can fit within a leaf node. The data objects are subdivided via a global optimization approach to minimize the area, overlap and perimeter of the minimum bounding rectangles covered by each node. The current invention divides the index construction problem into two subproblems: the first one addresses the tightness of the packing (in terms of area, overlap and perimeter) using a small fan-out at each index node, and the other one handles the fan-out issue to improve index page utilization. These two stages are referred to as binarization and compression. The binarization stage constructs a binary tree such that the entries in the leaf nodes correspond to the spatial data objects. The compression stage converts the binary tree into a tree for which all but the leaf nodes and the parent nodes of all leaf nodes have branch factors of M. In the binarization stage, a weighting or skew factor is used to achieve flexibility in determining the number of data objects to be included in each of the partitions to obtain a tree structure with desirable query performance. Thus the index tree constructed is not required to be height balanced. This provides a means to trade off imbalance in the index tree in order to reduce the number of pages which need to be accessed in a query.
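
A rough sketch of the binarization stage only, under simplifying assumptions that are not from the patent: the data objects are 2-D points, the cost of a split is just the sum of the two bounding-rectangle perimeters (area and overlap are ignored), and `skew` is interpreted as the minimum fraction of objects allowed on either side of a split, in the range (0, 0.5]. The function names are hypothetical, and the compression stage is omitted.

```python
def mbr_perimeter(points):
    """Perimeter of the minimum bounding rectangle of a list of 2-D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return 2 * ((max(xs) - min(xs)) + (max(ys) - min(ys)))

def best_split(points, skew):
    """Try every allowed cut along each axis; keep the cheapest one."""
    n = len(points)
    smallest_side = max(1, int(skew * n))  # skew bounds how uneven a split may be
    best = None
    for axis in (0, 1):
        ordered = sorted(points, key=lambda p: p[axis])
        for k in range(smallest_side, n - smallest_side + 1):
            left, right = ordered[:k], ordered[k:]
            cost = mbr_perimeter(left) + mbr_perimeter(right)
            if best is None or cost < best[0]:
                best = (cost, left, right)
    return best[1], best[2]

def binarize(points, leaf_capacity, skew=0.3):
    """Top-down binary partitioning until each partition fits in a leaf.

    Leaves are lists of points; internal nodes are (left, right) tuples.
    The resulting tree need not be height balanced.
    """
    if len(points) <= leaf_capacity:
        return points
    left, right = best_split(points, skew)
    return (binarize(left, leaf_capacity, skew),
            binarize(right, leaf_capacity, skew))

# Two hypothetical, well-separated groups of points.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
tree = binarize(points, leaf_capacity=3)
```

A smaller `skew` lets the split follow the data more closely at the cost of a less balanced tree, which is the trade-off the abstract describes.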

    System and method for analyzing streams and counting stream items on multi-core processors
    6.
    Granted invention patent (status: lapsed)

    Publication number: US08321579B2

    Publication date: 2012-11-27

    Application number: US11828732

    Filing date: 2007-07-26

    IPC classification: G06F15/16

    CPC classification: G06F17/18

    Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions, where the number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
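
The partition, local count, and aggregate steps can be sketched as follows. This is an illustration under stated assumptions rather than the patented method: an exact `Counter` stands in for the sketch structure, the frequency-aware hashing and zero-frequency table are not modeled, the round-robin partitioning is an arbitrary choice, and a thread pool stands in for processing cores (CPython threads will not give true parallel speedup for this workload). All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition(items, num_parts):
    """Split the stream round-robin into num_parts portions."""
    return [items[i::num_parts] for i in range(num_parts)]

def local_count(portion):
    """Sequential kernel: count the items in one assigned portion."""
    return Counter(portion)

def parallel_count(stream, num_workers=4):
    """Count items per portion, then aggregate the local counts."""
    portions = partition(list(stream), num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        local_counts = list(pool.map(local_count, portions))
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # adds counts item by item
    return total

final_counts = parallel_count("abracadabra", num_workers=3)
```

Because addition of counts is associative and commutative, the aggregate does not depend on how the stream was partitioned.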

    System and method for similarity searching in high-dimensional data space
    8.
    Granted invention patent (status: lapsed)

    Publication number: US06289354B1

    Publication date: 2001-09-11

    Application number: US09167332

    Filing date: 1998-10-07

    IPC classification: G06F17/30

    Abstract: Information is analyzed in the form of a plurality of data values that represent a plurality of objects. A set of features that characterize each object of the plurality of objects is identified. The plurality of data values are stored in a database. Each data value corresponds to at least one of the plurality of objects based on the set of features. Ones of the plurality of data values stored in the database are partitioned into a plurality of clusters. Each cluster of the plurality of clusters is assigned to one respective node of a plurality of nodes arranged in a tree hierarchy. Ones of the plurality of nodes of the tree hierarchy are traversed. If desired, information may be analyzed for finding peer groups in e-commerce applications.

    System and method for searching databases with applications such as peer groups, collaborative filtering, and e-commerce
    9.
    Granted invention patent (status: in force)

    Publication number: US06236985B1

    Publication date: 2001-05-22

    Application number: US09168117

    Filing date: 1998-10-07

    IPC classification: G06F17/30

    Abstract: A method of analyzing information in the form of a plurality of data records. Each data record includes one or more data values. The data values are partitioned into a plurality of data signatures. Data values of data signatures are compared to data values of data records. Based on the result of the comparison, an index is associated with each data record. A bound corresponding to the index is calculated based on a user-defined target value and an objective function. If desired, information may be analyzed for finding peer groups in e-commerce applications.

    System and method for caching objects of non-uniform size using multiple LRU stacks partitions into a range of sizes
    10.
    Granted invention patent (status: lapsed)

    Publication number: US6012126A

    Publication date: 2000-01-04

    Application number: US741412

    Filing date: 1996-10-29

    IPC classification: G06F12/08 G06F12/12 G06F12/00

    Abstract: A system and method for caching objects of non-uniform size. A caching logic includes a selection logic and an admission control logic. The admission control logic determines whether an object that is accessed but not currently in the cache may be cached at all. The admission control logic uses an auxiliary LRU stack which contains the identities and time stamps of the objects which have been recently accessed. Thus, the memory required is relatively small. The auxiliary stack serves as a dynamic popularity list, and an object may be admitted to the cache if and only if it appears on the popularity list. The selection logic selects one or more of the objects in the cache which have to be purged when a new object enters the cache. The order of removal of the objects is prioritized based on both the size and the frequency of access of the object, and may be adjusted by a time-to-obsolescence factor (TTO). To reduce the time required to compare the space-time product of each object in the cache, the objects may be classified in ranges having geometrically increasing intervals. Specifically, multiple LRU stacks are maintained independently, wherein each LRU stack contains only objects in a predetermined range of sizes. In order to choose candidates for replacement, only the least recently used objects in each group need be considered.
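
A simplified sketch of the selection logic with size-classed LRU stacks. Several details are assumptions for illustration and are not stated in the abstract: sizes are positive integers, size classes are powers of two (`[1,2)`, `[2,4)`, `[4,8)`, ...), a logical clock stands in for time stamps, and the space-time product of an object is taken as its size multiplied by the time since its last access. The class and method names are hypothetical; admission control and the TTO adjustment are omitted.

```python
from collections import OrderedDict

class SizeClassedLRUCache:
    """Cache that keeps one LRU stack per geometric size range."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.clock = 0    # logical time, advanced on every access
        self.stacks = {}  # size class -> OrderedDict(key -> (size, last_access))

    def _find(self, key):
        for stack in self.stacks.values():
            if key in stack:
                return stack
        return None

    def get(self, key):
        """Record an access; return True on a cache hit."""
        self.clock += 1
        stack = self._find(key)
        if stack is None:
            return False
        size, _ = stack[key]
        stack[key] = (size, self.clock)
        stack.move_to_end(key)  # most recently used goes to the end
        return True

    def _evict_one(self):
        # Only the least recently used object of each stack is a candidate.
        best = None
        for size_class, stack in self.stacks.items():
            if not stack:
                continue
            key, (size, last_access) = next(iter(stack.items()))
            product = size * (self.clock - last_access)
            if best is None or product > best[0]:
                best = (product, size_class, key, size)
        _, size_class, key, size = best
        del self.stacks[size_class][key]
        self.used -= size
        return key

    def put(self, key, size):
        """Insert an object, evicting as needed; return the evicted keys."""
        if self.get(key) or size > self.capacity:
            return []
        evicted = []
        while self.used + size > self.capacity:
            evicted.append(self._evict_one())
        size_class = size.bit_length() - 1  # floor(log2(size))
        self.stacks.setdefault(size_class, OrderedDict())[key] = (size, self.clock)
        self.used += size
        return evicted

cache = SizeClassedLRUCache(capacity=10)
cache.put("a", 1)
cache.put("b", 8)
cache.get("a")
evicted = cache.put("c", 4)
```

Each eviction compares one candidate per size class rather than every cached object, so its cost grows with the number of classes, which is logarithmic in the largest object size.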