ARCHITECTURE FOR AN INDEXER
    1.
    Patent application
    ARCHITECTURE FOR AN INDEXER (status: lapsed)

    Publication No.: US20070271268A1

    Publication date: 2007-11-22

    Application No.: US11834556

    Filing date: 2007-08-06

    IPC classification: G06F17/30

    Abstract: Disclosed is a technique for indexing data. For each token in a set of documents, a sort key is generated that includes a document identifier and that indicates whether the section of the document associated with the sort key is an anchor text section or a context section, wherein the anchor text section and the context section have the same document identifier; it is determined whether a data field associated with the token is a fixed width; when the data field is a fixed width, the token is designated as one for which a fixed width sort is to be performed; and, when the data field is a variable length, the token is designated as one for which a variable width sort is to be performed. The fixed width sort and the variable width sort are performed. For each document, the sort keys are used to bring together the anchor text section and the context section of that document.
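
The sort-key idea in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the `Entry` record, the `ANCHOR`/`CONTEXT` flags, and the 4-byte `FIXED_WIDTH` are hypothetical, and Python's built-in `sorted` stands in for the specialised fixed-width and variable-width sort routines.

```python
import heapq
from itertools import groupby
from typing import NamedTuple

ANCHOR, CONTEXT = 0, 1   # hypothetical section flags
FIXED_WIDTH = 4          # hypothetical width, in bytes, of a fixed-width data field


class Entry(NamedTuple):
    doc_id: int    # shared by a document's anchor text and context sections
    section: int   # ANCHOR or CONTEXT
    token: str
    data: bytes    # data field associated with the token


def sort_key(e: Entry) -> tuple:
    # Document identifier first, then the section flag.
    return (e.doc_id, e.section)


def sort_entries(entries):
    # Designate each entry for the fixed-width or the variable-width sort.
    fixed = [e for e in entries if len(e.data) == FIXED_WIDTH]
    variable = [e for e in entries if len(e.data) != FIXED_WIDTH]
    # Stand-ins: both runs use sorted() here (assumption of this sketch).
    fixed_sorted = sorted(fixed, key=sort_key)
    variable_sorted = sorted(variable, key=sort_key)
    # Merging the two sorted runs brings each document's sections together.
    merged = heapq.merge(fixed_sorted, variable_sorted, key=sort_key)
    return {doc_id: list(group)
            for doc_id, group in groupby(merged, key=lambda e: e.doc_id)}


entries = [
    Entry(2, CONTEXT, "indexer", b"\x00\x00\x00\x01"),
    Entry(1, ANCHOR, "home", b"link"),
    Entry(2, ANCHOR, "docs", b"variable-length payload"),
    Entry(1, CONTEXT, "welcome", b"\x00\x00\x00\x02"),
]
by_doc = sort_entries(entries)
```

Because both sections of a document carry the same identifier, grouping the merged stream on `doc_id` yields the anchor text and context entries side by side, whichever of the two sorts each entry went through.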

    Enhancing query performance of search engines using lexical affinities
    5.
    Patent application
    Enhancing query performance of search engines using lexical affinities (status: lapsed)

    Publication No.: US20060259482A1

    Publication date: 2006-11-16

    Application No.: US11335760

    Filing date: 2006-01-18

    IPC classification: G06F17/30

    CPC classification: G06F17/30622 G06F17/30864

    Abstract: Provided are techniques for computer-based electronic Information Retrieval (IR). An extended inverted index structure is built by generating one or more lexical affinities (LA), wherein each of the one or more lexical affinities comprises two or more search items found in proximity in one or more documents in a pool of documents, and generating a posting list for each of the one or more lexical affinities, wherein each posting list is associated with a specific lexical affinity and contains document identifying information for each of the one or more documents in the pool that contains the specific lexical affinity and a location within the document where the specific lexical affinity occurs.
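
A minimal sketch of such an extended index, under simplifying assumptions that are not taken from the patent: lexical affinities are limited to unordered pairs of distinct terms, "proximity" means within `window` token positions, and the recorded location is the position of the earlier term. The function name `build_la_index` is hypothetical.

```python
from collections import defaultdict


def build_la_index(docs: dict, window: int = 3) -> dict:
    """Map each lexical affinity (a term pair) to a posting list of
    (doc_id, position) tuples."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for i, first in enumerate(tokens):
            # Pair the term at i with every later term inside the window.
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                second = tokens[j]
                if first != second:
                    la = tuple(sorted((first, second)))  # unordered pair
                    index[la].append((doc_id, i))
    return dict(index)


docs = {1: "fast inverted index lookup", 2: "index of inverted lists"}
la_index = build_la_index(docs, window=2)
```

A query containing both "inverted" and "index" could then read the single posting list `la_index[("index", "inverted")]` instead of intersecting the two single-term lists.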

    Method, system, and program for searching documents for ranges of numeric values
    6.
    Patent application
    Method, system, and program for searching documents for ranges of numeric values (status: in force)

    Publication No.: US20060074962A1

    Publication date: 2006-04-06

    Application No.: US10949473

    Filing date: 2004-09-24

    IPC classification: G06F17/30

    Abstract: Provided are a method, system, and program for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents having values within the range of consecutive values associated with the posting list. Each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored.
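
The range-based posting lists might look like the sketch below. It assumes integer values and equal-width ranges of `bucket_size` consecutive values; the abstract does not say how the ranges are chosen, and the names `build_range_postings` and `search_range` are hypothetical.

```python
from collections import defaultdict


def build_range_postings(doc_values: dict, bucket_size: int) -> dict:
    """One posting list per range of consecutive values; each posting pairs
    a document identifier with the single value that placed it there."""
    postings = defaultdict(list)
    for doc_id in sorted(doc_values):
        for value in doc_values[doc_id]:
            low = (value // bucket_size) * bucket_size
            postings[(low, low + bucket_size - 1)].append((doc_id, value))
    return dict(postings)


def search_range(postings: dict, lo: int, hi: int) -> list:
    """Return the documents holding at least one value in [lo, hi]."""
    hits = set()
    for (b_lo, b_hi), plist in postings.items():
        if b_hi < lo or b_lo > hi:
            continue  # this range cannot overlap the query
        hits.update(doc_id for doc_id, value in plist if lo <= value <= hi)
    return sorted(hits)


doc_values = {1: [1995, 2004], 2: [2007], 3: [2011]}
postings = build_range_postings(doc_values, bucket_size=10)
```

A range query only touches the posting lists whose range overlaps the query, and the value stored with each document identifier lets the partially covered lists at either end be filtered exactly.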

    System and Method for Efficiently Evaluating Complex Boolean Expressions
    7.
    Patent application
    System and Method for Efficiently Evaluating Complex Boolean Expressions (status: under examination, published)

    Publication No.: US20110225038A1

    Publication date: 2011-09-15

    Application No.: US12724415

    Filing date: 2010-03-15

    IPC classification: G06F17/30 G06Q30/00

    Abstract: An improved system and method for efficiently evaluating complex Boolean expressions is provided. Leaf nodes of Boolean expression trees for objects represented by Boolean expressions of attribute-value pairs may be assigned a positional identifier that indicates the position of a node in the Boolean expression tree. The positional identifiers of each object may be indexed by attribute-value pairs of the leaf nodes of the Boolean expression trees in an inverted index. Given an input set of attribute-value pairs, a list of positional identifiers for leaf nodes of virtual Boolean expression trees may be found in the index matching the attribute-value pairs of the input set. The list of positional identifiers of leaf nodes may be sorted in order by positional identifier for each contract. An expression evaluator may then verify whether a virtual Boolean expression tree for each contract is satisfied by the list of positional identifiers.
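
The flow described here (number the leaves, index them by attribute-value pair, look up the input pairs, sort the hits, evaluate) can be sketched as below. The encoding is an assumption: leaves are simply numbered left to right, expressions are nested tuples using only AND and OR, and the names `index_contract`, `satisfied`, and `match` are hypothetical. The patent's actual positional identifier scheme may differ.

```python
from collections import defaultdict


def index_contract(contract_id, tree, index, next_id=0):
    """Number the leaves left to right and index them by attribute-value
    pair. Returns (tree with leaves replaced by ids, next free id)."""
    if tree[0] in ("AND", "OR"):
        children = []
        for child in tree[1:]:
            numbered, next_id = index_contract(contract_id, child, index, next_id)
            children.append(numbered)
        return (tree[0], *children), next_id
    index[tree].append((contract_id, next_id))  # leaf: (attribute, value)
    return next_id, next_id + 1


def satisfied(numbered, matched):
    """Evaluate a numbered tree against the matched positional ids."""
    if isinstance(numbered, int):
        return numbered in matched
    op, *children = numbered
    results = [satisfied(child, matched) for child in children]
    return all(results) if op == "AND" else any(results)


def match(index, trees, pairs):
    hits = defaultdict(list)
    for pair in pairs:
        for contract_id, pos in index.get(pair, []):
            hits[contract_id].append(pos)
    # Sort each contract's positional ids, then run the evaluator.
    return sorted(cid for cid, ids in hits.items()
                  if satisfied(trees[cid], sorted(ids)))


contracts = {
    "c1": ("AND", ("age", "20-30"), ("OR", ("state", "CA"), ("state", "NY"))),
    "c2": ("AND", ("age", "20-30"), ("gender", "F")),
}
index, trees = defaultdict(list), {}
for cid, tree in contracts.items():
    trees[cid], _ = index_contract(cid, tree, index)
```

Only contracts with at least one matching leaf are ever evaluated, which is the point of going through the inverted index rather than walking every expression tree.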

    System and method for automatic matching of contracts using a fixed-length predicate representation
    8.
    Granted patent
    System and method for automatic matching of contracts using a fixed-length predicate representation (status: in force)

    Publication No.: US08229933B2

    Publication date: 2012-07-24

    Application No.: US12714142

    Filing date: 2010-02-26

    IPC classification: G06F7/00

    CPC classification: G06Q30/08

    Abstract: An item of inventory is described as a Boolean expression, which is converted into a multi-level, alternating AND/OR impression tree representation with leaf nodes representing conjuncts. Processing the conjuncts of the tree through a contract index results in retrieving a set of candidate contracts that match at least some but not necessarily all impression tree leaf node predicates. Next, an AND/OR contract tree representation is constructed with each contract tree leaf node having a label representing a projection onto a discrete set of ordered symbols. Contracts with projections that cover the entire range of the discrete set of ordered symbols are deemed to satisfy the item of inventory. Implementation of the contract index includes retrieval techniques to support multi-valued predicates as well as confidence threshold functions using a multi-level tree representation of multi-valued predicates.
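
The abstract does not spell out the covering rule, so the following is only one possible reading. It assumes the ordered symbols are the integers 1..size, each matched leaf label is an inclusive interval `(begin, end)`, the children of an AND node split their parent's interval while the children of an OR node share it, and a contract is satisfied when matched intervals chain end to end from 1 to `size`. The function name `covers` is hypothetical.

```python
def covers(intervals, size):
    """Return True if the matched intervals chain from symbol 1 to `size`.

    reach[k] is True when symbols 1..k are covered by a chain of matched
    intervals; reach[0] is the empty chain.
    """
    reach = [True] + [False] * size
    # Sorting by begin ensures every interval ending at begin - 1 has
    # already been processed when an interval starting at begin is seen.
    for begin, end in sorted(intervals):
        if reach[begin - 1]:
            reach[end] = True
    return reach[size]


# Contract tree: A AND (B OR C), projected onto the symbols 1..2.
labels = {"A": (1, 1), "B": (2, 2), "C": (2, 2)}

a_and_c = covers([labels["A"], labels["C"]], size=2)   # A and C matched
b_and_c = covers([labels["B"], labels["C"]], size=2)   # A not matched
```

Under these assumptions the check is a single pass over the sorted labels, with no need to walk the contract tree itself.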

    System and Method for Automatic Matching of Contracts Using a Fixed-Length Predicate Representation
    9.
    Patent application
    System and Method for Automatic Matching of Contracts Using a Fixed-Length Predicate Representation (status: in force)

    Publication No.: US20110213767A1

    Publication date: 2011-09-01

    Application No.: US12714142

    Filing date: 2010-02-26

    IPC classification: G06F17/30

    CPC classification: G06Q30/08

    Abstract: A method for automatic matching of contracts to inventory using a fixed-length complex predicate representation. An item of inventory is described as a Boolean expression, which is converted into a multi-level, alternating AND/OR impression tree representation with leaf nodes representing conjuncts. Processing the conjuncts of the tree through a contract index results in retrieving a set of candidate contracts that match at least some but not necessarily all impression tree leaf node predicates. Next, an AND/OR contract tree representation is constructed with each contract tree leaf node having a label representing a projection onto a discrete set of ordered symbols. Contracts with projections that cover the entire range of the discrete set of ordered symbols are deemed to satisfy the item of inventory. Implementation of the contract index includes retrieval techniques to support multi-valued predicates as well as confidence threshold functions using a multi-level tree representation of multi-valued predicates.

    System and Method for Automatic Matching of Contracts in an Inverted Index to Impression Opportunities Using Complex Predicates with Multi-Valued Attributes
    10.
    Patent application
    System and Method for Automatic Matching of Contracts in an Inverted Index to Impression Opportunities Using Complex Predicates with Multi-Valued Attributes (status: under examination, published)

    Publication No.: US20110213660A1

    Publication date: 2011-09-01

    Application No.: US12714051

    Filing date: 2010-02-26

    IPC classification: G06Q30/00 G06N5/02

    CPC classification: G06Q30/02 G06Q30/0254

    Abstract: A method for automatic matching of contracts to inventory using a fixed-length complex predicate representation. An item of inventory is described as a Boolean expression, which is converted into a multi-level, alternating AND/OR impression tree representation with leaf nodes representing conjuncts. Processing the conjuncts of the tree through a contract index results in retrieving a set of candidate contracts that match at least some but not necessarily all impression tree leaf node predicates. Next, an AND/OR contract tree representation is constructed with each contract tree leaf node having a label representing a projection onto a discrete set of ordered symbols. Contracts with projections that cover the entire range of the discrete set of ordered symbols are deemed to satisfy the item of inventory. Implementation of the contract index includes retrieval techniques to support multi-valued predicates as well as confidence threshold functions using a multi-level tree representation of multi-valued predicates.
