System and method for construction of a data structure for indexing
multidimensional objects
    1.
    Granted invention patent
    Status: lapsed

    Publication number: US5781906A

    Publication date: 1998-07-14

    Application number: US660047

    Filing date: 1996-06-06

    IPC classification: G06F17/30

    Abstract: An apparatus and a method for constructing a multidimensional index tree that minimizes the time to access data objects and is resilient to skew in the data. This is achieved by successively partitioning all given data objects, one level at a time, starting with a single partition and working top-down until each final partition fits within a leaf node. The data objects are subdivided using a global optimization approach that minimizes the area, overlap, and perimeter of the minimum bounding rectangles covered by each node. The invention divides the index construction problem into two subproblems: the first addresses the tightness of the packing (in terms of area, overlap, and perimeter) using a small fan-out at each index node, and the second handles the fan-out issue to improve index page utilization. These two stages are referred to as binarization and compression. The binarization stage constructs a binary tree such that the entries in the leaf nodes correspond to the spatial data objects. The compression stage converts the binary tree into a tree in which all nodes except the leaf nodes and the parents of leaf nodes have a branching factor of M. In the binarization stage, a weighting or skew factor provides flexibility in determining the number of data objects to include in each partition, so as to obtain a tree structure with desirable query performance. Thus the constructed index tree is not required to be height-balanced. This provides a means to trade off imbalance in the index tree in order to reduce the number of pages that need to be accessed in a query.
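The binarization stage can be illustrated with a minimal Python sketch. It is not the patented procedure: the abstract describes a global optimization, whereas this sketch picks the best split greedily at each level. The cost function (an unweighted sum of area, perimeter, and overlap), the parameter names `leaf_capacity` and `skew`, and the `Node` class are all assumptions made for illustration. Here `skew` is the minimum fraction of objects each side of a split must receive, so values below 0.5 permit unbalanced trees.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

def perimeter(r: Rect) -> float:
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))

def overlap(a: Rect, b: Rect) -> float:
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

@dataclass
class Node:
    box: Rect
    objects: Optional[List[Rect]] = None  # set only on leaf nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def binarize(rects: List[Rect], leaf_capacity: int, skew: float = 0.3) -> Node:
    """Top-down binary partitioning; assumes 0 < skew <= 0.5 and leaf_capacity >= 1."""
    if len(rects) <= leaf_capacity:
        return Node(mbr(rects), objects=list(rects))
    n = len(rects)
    lo = max(1, int(skew * n))   # smallest allowed size of the left part
    hi = n - lo                  # largest allowed size of the left part
    best = None
    for axis in (0, 1):
        # Order by rectangle centre along this axis (sum of min and max).
        ordered = sorted(rects, key=lambda r: r[axis] + r[axis + 2])
        for k in range(lo, hi + 1):
            a, b = mbr(ordered[:k]), mbr(ordered[k:])
            cost = area(a) + area(b) + perimeter(a) + perimeter(b) + overlap(a, b)
            if best is None or cost < best[0]:
                best = (cost, ordered[:k], ordered[k:])
    _, left, right = best
    return Node(mbr(rects),
                left=binarize(left, leaf_capacity, skew),
                right=binarize(right, leaf_capacity, skew))
```

The compression stage, which would merge binary nodes into nodes with branching factor M, is omitted from this sketch.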

    System and method for caching objects of non-uniform size using multiple
LRU stacks partitions into a range of sizes
    3.
    Granted invention patent
    Status: lapsed

    Publication number: US6012126A

    Publication date: 2000-01-04

    Application number: US741412

    Filing date: 1996-10-29

    IPC classification: G06F12/08 G06F12/12 G06F12/00

    Abstract: A system and method for caching objects of non-uniform size. A caching logic includes a selection logic and an admission control logic. The admission control logic determines whether an object that is accessed but not currently in the cache may be cached at all. The admission control logic uses an auxiliary LRU stack which contains the identities and time stamps of the objects which have been recently accessed. Because only identities and time stamps are stored, the memory required is relatively small. The auxiliary stack serves as a dynamic popularity list, and an object may be admitted to the cache if and only if it appears on the popularity list. The selection logic selects one or more of the objects in the cache to be purged when a new object enters the cache. The order of removal is prioritized based on both the size and the frequency of access of each object, and may be adjusted by a time-to-obsolescence (TTO) factor. To reduce the time required to compare the space-time product of each object in the cache, the objects may be classified into size ranges having geometrically increasing intervals. Specifically, multiple LRU stacks are maintained independently, wherein each LRU stack contains only objects in a predetermined range of sizes. To choose candidates for replacement, only the least recently used objects in each group need be considered.
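The size-partitioned LRU stacks can be sketched in Python as follows. This is an illustrative reading of the abstract, not the patented implementation. The class name `SizeClassLRUCache`, the power-of-two size classes, the logical clock, and the eviction score (size multiplied by time since last access, as a stand-in for the space-time product) are assumptions. Admission control and the TTO adjustment are left out, and each object's size is assumed to be a positive integer that never changes.

```python
from collections import OrderedDict

class SizeClassLRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.clock = 0      # logical time, incremented on every access
        self.stacks = {}    # size class -> OrderedDict[key] = (size, last_access)

    @staticmethod
    def size_class(size: int) -> int:
        # floor(log2(size)) for size >= 1: classes cover [1,2), [2,4), [4,8), ...
        return size.bit_length() - 1

    def access(self, key, size: int) -> bool:
        """Record an access; return True on a hit, False on a miss."""
        self.clock += 1
        stack = self.stacks.setdefault(self.size_class(size), OrderedDict())
        if key in stack:
            stack[key] = (size, self.clock)
            stack.move_to_end(key)   # most recently used goes to the end
            return True
        if size > self.capacity:
            return False             # object can never fit; do not cache it
        while self.used + size > self.capacity:
            self._evict_one()
        stack[key] = (size, self.clock)
        self.used += size
        return False

    def _evict_one(self) -> None:
        # Only the least recently used entry of each stack is a candidate.
        best = None
        for stack in self.stacks.values():
            if not stack:
                continue
            key, (size, last) = next(iter(stack.items()))
            score = size * (self.clock - last)
            if best is None or score > best[0]:
                best = (score, stack, key, size)
        _, stack, key, size = best
        del stack[key]
        self.used -= size
```

Because each eviction inspects one entry per size class, its cost grows with the number of classes, which is logarithmic in the largest object size, rather than with the number of cached objects.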

    Eliminating redundancy in generation of association rules for on-line
mining
    4.
    Granted invention patent
    Status: lapsed

    Publication number: US5943667A

    Publication date: 1999-08-24

    Application number: US868244

    Filing date: 1997-06-03

    IPC classification: G06F17/30

    Abstract: A computer method of removing simple and strict redundant association rules generated from large collections of data. A compact set of rules, devoid of many redundancies in the discovery of data patterns, is presented to an end user. The method is directed primarily to on-line applications such as the Internet and intranets. Given a number of large itemsets as input, simple redundancies are removed by generating all maximal ancestors, the frontier set, for each large itemset. The set of maximal ancestors shares a hierarchical relationship with the large itemset from which it was derived and further satisfies an inequality whereby the ratio of respective support values is less than the reciprocal of a user-defined confidence value. The resulting compact rule set is displayed to an end user at a specified level of support and confidence. The method is also able to generate the full set of rules from the compact set.
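One possible reading of the frontier-set construction is sketched below in Python. The interpretation is an assumption, not taken from the patent text: an "ancestor" of a large itemset X is treated as a non-empty proper subset Y, the inequality is taken to be support(Y) / support(X) < 1 / c for confidence threshold c, and "maximal" is taken to mean that no proper subset of Y also satisfies the inequality. The function name `frontier` and the dictionary-based `support` table are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List

def frontier(itemset: Iterable[str],
             support: Dict[FrozenSet[str], float],
             min_conf: float) -> List[FrozenSet[str]]:
    """Return antecedents Y of rules Y => X - Y that are not simply redundant.

    Assumes `support` has an entry for the itemset and for every non-empty
    proper subset of it, and that 0 < min_conf <= 1.
    """
    items = sorted(itemset)
    full = frozenset(items)
    qualifying = []
    for r in range(1, len(items)):          # proper, non-empty subsets only
        for combo in combinations(items, r):
            y = frozenset(combo)
            # Strict inequality as worded in the abstract.
            if support[y] / support[full] < 1.0 / min_conf:
                qualifying.append(y)
    # Keep only subsets that contain no smaller qualifying subset.
    return [y for y in qualifying if not any(z < y for z in qualifying)]
```

Under this reading, every rule whose antecedent is a superset of a frontier element (within X) also meets the confidence threshold, so the full rule set can be regenerated from the frontier.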

    Collaborative caching of a requested object by a lower level node as a
function of the caching status of the object at a higher level node
    5.
    Granted invention patent
    Status: lapsed

    Publication number: US5924116A

    Publication date: 1999-07-13

    Application number: US831237

    Filing date: 1997-04-02

    IPC classification: G06F17/30 G06F12/08

    CPC classification: G06F17/30902

    Abstract: A method and system for collaboratively caching information to allow improved caching decisions by a lower-level or sibling node. In a caching hierarchy, the clients and/or servers may factor in the caching status at the higher level when deciding whether to cache an object and which objects are to be replaced. The PICS protocol may be used to pass the caching information of some or all of the upper hierarchy down the hierarchy. Furthermore, the caching status information can also be used to direct the object request to the closest higher-level proxy which has potentially cached the object, instead of blindly requesting it from the next immediate higher-level proxy. A selection policy used to select objects for replacement in the cache may be prioritized not only on the size and the frequency of access of the object, but also on the access time required to get the object if it is not cached. The selection policy may also include a selection weight factor wherein each object is assigned a selection weight based on its replacement cost, the object size, and how frequently it is modified. Non-uniform size objects may be classified in ranges of selection weights having geometrically increasing intervals. Multiple LRU stacks may be independently maintained wherein each stack contains objects in a certain range of selection weights. In order to choose candidates for replacement, only the least recently used objects in each group need be considered.
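The request-routing idea can be sketched in Python under some stated assumptions. The function names (`closest_holder`, `route`, `refetch_cost`), the representation of caching status as a dictionary from proxy name to a set of object identifiers, and the fallback to the immediate parent when no proxy is known to hold the object are all hypothetical. How the status reaches the lower-level node (the abstract mentions the PICS protocol) is not modelled.

```python
from typing import Dict, List, Optional, Set

def closest_holder(obj_id: str,
                   hierarchy: List[str],
                   cached_at: Dict[str, Set[str]]) -> Optional[str]:
    """Return the nearest higher-level proxy believed to cache obj_id.

    `hierarchy` lists higher-level proxies from nearest to farthest.
    """
    for proxy in hierarchy:
        if obj_id in cached_at.get(proxy, set()):
            return proxy
    return None

def route(obj_id: str,
          hierarchy: List[str],
          cached_at: Dict[str, Set[str]]) -> Optional[str]:
    """Pick the proxy to ask; fall back to the immediate parent (assumption)."""
    holder = closest_holder(obj_id, hierarchy, cached_at)
    if holder is not None:
        return holder
    return hierarchy[0] if hierarchy else None

def refetch_cost(obj_id: str,
                 hierarchy: List[str],
                 cached_at: Dict[str, Set[str]],
                 latency: Dict[str, float],
                 origin_latency: float) -> float:
    """Estimated access time if the object is dropped from the local cache."""
    holder = closest_holder(obj_id, hierarchy, cached_at)
    return latency[holder] if holder is not None else origin_latency
```

A lower-level node could feed `refetch_cost` into its replacement policy so that objects cheaply available from a nearby proxy are evicted before objects that must come from the origin server.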

    System and method for similarity indexing and searching in high dimensional space
    6.
    Granted invention patent
    Status: in force

    Publication number: US06922700B1

    Publication date: 2005-07-26

    Application number: US09571471

    Filing date: 2000-05-16

    IPC classification: G06F7/00 G06F17/30

    Abstract: A system and method for providing similarity indexing and searching in multi-dimensional databases. In one aspect, given a set of data points in a multidimensional space, the values of the data points on each dimension are partitioned into a plurality of grids, wherein each grid is assigned a grid value. Given a target data point, similarity candidates (i.e., data points that are similar to the target data point) are identified based on matching grid values. An inverted grid index comprising an index on the data points falling into each grid of each dimension is utilized to identify similarity candidates. A similarity selection process is employed to select the closest identified similarity candidates for output, which utilizes a similarity function to measure the closeness of each identified similarity candidate to the target data point. A preferred similarity function is one that considers a subset of the dimensions in which a point falls within a similar grid of the target point. In addition, a correlation effect among the grids in different dimensions may be a factor captured in the similarity function.
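An inverted grid index of the kind described can be sketched in a few lines of Python. Several simplifications are assumptions of this sketch: all coordinates lie in [0, 1), each dimension is cut into `grids` equal-width cells, and the similarity function simply counts the dimensions in which a candidate shares the target's grid value. The names `grid_value`, `build_index`, and `search` are hypothetical, and the correlation effect mentioned in the abstract is not modelled.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

def grid_value(v: float, grids: int) -> int:
    """Map a coordinate in [0, 1) to one of `grids` equal-width cells."""
    return min(max(int(v * grids), 0), grids - 1)

def build_index(points: Sequence[Sequence[float]],
                grids: int) -> Dict[Tuple[int, int], Set[int]]:
    """Inverted index: (dimension, grid value) -> ids of points in that cell."""
    index: Dict[Tuple[int, int], Set[int]] = defaultdict(set)
    for pid, point in enumerate(points):
        for dim, v in enumerate(point):
            index[(dim, grid_value(v, grids))].add(pid)
    return index

def search(index: Dict[Tuple[int, int], Set[int]],
           target: Sequence[float],
           grids: int,
           k: int) -> List[Tuple[int, int]]:
    """Return up to k (point id, matching dimensions) pairs, best first."""
    matches: Dict[int, int] = defaultdict(int)
    for dim, v in enumerate(target):
        for pid in index.get((dim, grid_value(v, grids)), ()):
            matches[pid] += 1
    # More matching dimensions first; ties broken by point id.
    return sorted(matches.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

Only points that share at least one grid cell with the target are ever touched, so the search avoids computing distances to the whole data set.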

    Method, apparatus, and program for scheduling resources in a penalty-based environment
    7.
    Granted invention patent
    Status: lapsed

    Publication number: US07480913B2

    Publication date: 2009-01-20

    Application number: US10658726

    Filing date: 2003-09-09

    IPC classification: G06F9/46

    CPC classification: G06F9/4887 G06Q10/06

    Abstract: The present invention relates to the problem of scheduling work for employees and/or other resources in a help desk or similar environment. The employees have different levels of training and availabilities. The jobs, which occur as a result of dynamically occurring events, consist of multiple tasks ordered by chain precedence. Each job and/or task carries with it a penalty which is a step function of the time taken to complete it, the deadlines and penalties having been negotiated as part of one or more service level agreement contracts. The goal is to minimize the total amount of penalties paid. The invention consists of a pair of heuristic schemes for this difficult scheduling problem, one greedy and one randomized. The greedy scheme is used to provide a quick initial solution, while the greedy and randomized schemes are combined in order to think more deeply about particular problem instances. The invention also includes a scheme for determining how much time to allocate to thinking about each of several potential problem instance variants.
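A step-function penalty and a simple greedy ordering can be sketched in Python as follows. The greedy rule here is an illustrative stand-in, not the scheme claimed in the patent: at each step it runs the job whose penalty would grow the most if it were postponed to the very end. The sketch also assumes a single resource, single-task jobs, and penalty amounts that are cumulative and non-decreasing across deadlines. The `Job` class and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass(frozen=True)
class Job:
    name: str
    duration: float
    # (deadline, penalty owed once that deadline is exceeded), ascending by deadline
    steps: Tuple[Tuple[float, float], ...]

def penalty(job: Job, finish: float) -> float:
    """Step function: the amount attached to the latest deadline exceeded."""
    owed = 0.0
    for deadline, amount in job.steps:
        if finish > deadline:
            owed = amount
    return owed

def greedy_schedule(jobs: Sequence[Job]) -> Tuple[List[str], float]:
    """Return (execution order, total penalty) for one resource."""
    remaining = list(jobs)
    t = 0.0
    order: List[str] = []
    total = 0.0
    while remaining:
        horizon = t + sum(j.duration for j in remaining)

        def regret(j: Job) -> float:
            # Extra penalty incurred if j runs last instead of next.
            return penalty(j, horizon) - penalty(j, t + j.duration)

        # Highest regret first; ties go to the shorter job.
        nxt = max(remaining, key=lambda j: (regret(j), -j.duration))
        remaining.remove(nxt)
        t += nxt.duration
        total += penalty(nxt, t)
        order.append(nxt.name)
    return order, total
```

A randomized scheme could perturb this ordering and keep the best result found, but that part is left out of the sketch.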