ROBUST DISCOVERY OF ENTITY SYNONYMS USING QUERY LOGS
    111.
    Type: Invention application
    Legal status: In force

    Publication number: US20130232129A1

    Publication date: 2013-09-05

    Application number: US13487260

    Filing date: 2012-06-04

    CPC classification number: G06F17/30672

    Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string r_e. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string s_e and an entity reference string r_e even in the presence of sparse query log data, while another function takes into account the classes of s_e and r_e. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.
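
    The abstract does not spell out the similarity functions, so the following is only a minimal sketch under stated assumptions: the query log is modeled as a map from query string to the set of clicked document IDs, `click_similarity` is a hypothetical click-overlap measure (not the patent's function), and `shorter_strings` illustrates the reduction idea of keeping only shorter strings whose terms are a subset of the longer reference string.

```python
from itertools import combinations

def click_similarity(clicks, s_e, r_e):
    """Hypothetical measure: share of documents clicked for s_e
    that were also clicked for r_e (0.0 if s_e has no clicks)."""
    docs_s = clicks.get(s_e, set())
    docs_r = clicks.get(r_e, set())
    if not docs_s:
        return 0.0
    return len(docs_s & docs_r) / len(docs_s)

def shorter_strings(r_e, known_queries):
    """Yield strings made of a proper subset of r_e's terms (order kept)
    that actually occur in the query log, longest first."""
    terms = r_e.split()
    for k in range(len(terms) - 1, 0, -1):
        for combo in combinations(terms, k):
            candidate = " ".join(combo)
            if candidate in known_queries:
                yield candidate

# Toy query log (assumed data): query -> clicked document IDs
clicks = {
    "canon eos 350d digital camera": {"d1", "d2", "d3"},
    "canon 350d": {"d1", "d2"},
    "digital camera": {"d2", "d7", "d8", "d9"},
}
r_e = "canon eos 350d digital camera"
reduced = list(shorter_strings(r_e, clicks.keys()))
scores = {s: click_similarity(clicks, s, r_e) for s in reduced}
```

    In this toy log, "canon 350d" shares all of its clicks with the reference string, while the broader "digital camera" shares only a quarter of them, which is the kind of distinction a synonym filter would rely on.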


    Finding related entity results for search queries
    112.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US08195655B2

    Publication date: 2012-06-05

    Application number: US11758024

    Filing date: 2007-06-05

    CPC classification number: G06F17/278 G06F17/30864

    Abstract: Architecture for finding related entities for web search queries. An extraction component takes a document as input and outputs all the mentions (or occurrences) of named entities such as names of people, organizations, locations, and products in the document, as well as entity metadata. An indexing component takes a document identifier (docID) and the set of mentions of named entities, and stores and indexes the information for retrieval. A document-based search component takes a keyword query and returns the docIDs of the top documents matching the query. A retrieval component takes a docID as input, accesses the information stored by the indexing component, and returns the set of mentions of named entities in the document. This information is then passed to an entity scoring and thresholding component that computes an aggregate score for each entity and selects the entities to return to the user.
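
    The scoring and thresholding step at the end of this pipeline can be sketched roughly as follows. This is an illustration only: the index layout (docID mapped to entity and per-mention score pairs), the summation used as the aggregate, and the names `entity_index` and `related_entities` are all assumptions, not details from the patent.

```python
from collections import defaultdict

# Hypothetical index built by the indexing component:
# docID -> list of (entity, mention_score) pairs
entity_index = {
    "doc1": [("Seattle", 1.0), ("Microsoft", 0.5)],
    "doc2": [("Microsoft", 1.0), ("Redmond", 0.5)],
    "doc3": [("Seattle", 0.5)],
}

def related_entities(top_doc_ids, index, threshold):
    """Sum mention scores per entity over the top documents and
    keep entities whose aggregate score reaches the threshold."""
    totals = defaultdict(float)
    for doc_id in top_doc_ids:
        for entity, score in index.get(doc_id, []):
            totals[entity] += score
    # Highest aggregate first; ties broken by entity name
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(entity, total) for entity, total in ranked if total >= threshold]

# Suppose the document search component returned doc1 and doc2 for a query
result = related_entities(["doc1", "doc2"], entity_index, threshold=1.0)
```

    Here "Redmond" is dropped because its aggregate score of 0.5 falls below the threshold, while "Microsoft" ranks first because it is mentioned in both top documents.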


    Lightweight physical design alerter
    113.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US08150790B2

    Publication date: 2012-04-03

    Application number: US11669782

    Filing date: 2007-01-31

    CPC classification number: G06F17/30306

    Abstract: A lightweight physical design alerter can analyze a workload and determine whether a comprehensive tuning session would result in a configuration improvement over the current configuration. The alerter provides a low-overhead procedure that can run during normal operation of a database management system and produce a notification if a current configuration is less than optimal. The alerter can report lower and upper bounds on the improvements that could be obtained if a comprehensive tuning tool is launched. A lower bound can be justified by generating feasible configurations. The disclosed embodiments can be extended to query updates, materialized views, and other physical design features (e.g., partitioning).


    Pushing Search Query Constraints Into Information Retrieval Processing
    114.
    Type: Invention application
    Legal status: Under examination (published)

    Publication number: US20110320446A1

    Publication date: 2011-12-29

    Application number: US12823124

    Filing date: 2010-06-25

    CPC classification number: G06F16/90335

    Abstract: This patent application relates to interval-based information retrieval (IR) search techniques for efficiently and correctly answering keyword search queries. In some embodiments, a range of information-containing blocks for a search query can be identified. Each of these blocks, and thus the range, can include document identifiers that identify individual corresponding documents that contain a term found in the search query. From the range, one or more subranges having fewer blocks than the range can be selected. This can be accomplished without decompressing the blocks, by partitioning the range into intervals and evaluating the intervals. The smaller number of blocks in the subrange(s) can then be decompressed and processed to identify the doc ID(s), and thus the document(s), that satisfy the query.
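
    One way to picture selecting a subrange without decompressing blocks is sketched below. It assumes, as an illustration and not as the patent's actual layout, that each compressed block carries its lowest and highest doc ID as uncompressed metadata, so that for a two-term conjunctive query only blocks whose intervals overlap need to be decompressed. The helper names are hypothetical; `zlib` stands in for whatever compression a real index would use.

```python
import zlib

def make_block(doc_ids):
    """Compress a sorted list of doc IDs; keep [lo, hi] as plain metadata."""
    payload = zlib.compress(",".join(map(str, doc_ids)).encode())
    return {"lo": doc_ids[0], "hi": doc_ids[-1], "payload": payload}

def decompress(block):
    return [int(x) for x in zlib.decompress(block["payload"]).decode().split(",")]

def overlapping(blocks_a, blocks_b):
    """Keep blocks of A whose [lo, hi] interval overlaps some block of B.
    Only metadata is inspected; nothing is decompressed here."""
    return [a for a in blocks_a
            if any(a["lo"] <= b["hi"] and b["lo"] <= a["hi"] for b in blocks_b)]

def conjunctive_search(blocks_a, blocks_b):
    keep_a = overlapping(blocks_a, blocks_b)
    keep_b = overlapping(blocks_b, blocks_a)
    docs_a = {d for blk in keep_a for d in decompress(blk)}
    docs_b = {d for blk in keep_b for d in decompress(blk)}
    return sorted(docs_a & docs_b), len(keep_a) + len(keep_b)

# Posting lists for two query terms, split into compressed blocks (toy data)
term_a = [make_block([1, 4, 9]), make_block([20, 25, 31]), make_block([50, 57, 60])]
term_b = [make_block([22, 25, 30]), make_block([90, 95])]
matches, blocks_decompressed = conjunctive_search(term_a, term_b)
```

    With this toy data only two of the five blocks are decompressed, because the remaining intervals cannot contain a doc ID common to both terms.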


    Incremental repair of query plans
    115.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US07739269B2

    Publication date: 2010-06-15

    Application number: US11625153

    Filing date: 2007-01-19

    CPC classification number: G06F17/30463

    Abstract: Database systems use a plan cache to avoid the overheads (e.g., time, money) of query recompilation. Query plans can become invalidated by updates to the statistics on data or changes to the physical database design. Once a plan is invalidated, it can be repaired utilizing one or more of the disclosed embodiments. Incremental repair of query plans includes reusing parts of the current plan rather than discarding the plan entirely when it is invalidated. Repair to an existing query plan is attempted before resorting to full recompilation.


    STOP-AND-RESTART STYLE EXECUTION FOR LONG RUNNING DECISION SUPPORT QUERIES
    118.
    Type: Invention application
    Legal status: Under examination (published)

    Publication number: US20090083238A1

    Publication date: 2009-03-26

    Application number: US11859046

    Filing date: 2007-09-21

    CPC classification number: G06F16/24561

    Abstract: Stop-and-restart query execution partially leverages the work already performed during the initial execution of the query to reduce the execution time during a restart. The technique selectively saves information from a previous execution of the query so that the overhead associated with restarting the query execution can be bounded. Despite saving only limited information, the disclosed technique substantially reduces the running time of the restarted query. The stop-and-restart query execution technique is constrained to save and reuse only a bounded number of records (intermediate records or output records), thereby releasing all other resources, rather than some of the resources. The technique chooses a subset of the records found during normal execution to save, and then skips the corresponding records when performing a scan during restart to prevent the duplication of execution. A skip-scan operator is employed to facilitate the disclosed restart technique.
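
    The save-then-skip idea can be illustrated with a toy single-table scan. This is a loose sketch under assumptions: records are identified by their scan position, the "bounded number of records" is a simple `budget`, and `initial_run` and `restart` are hypothetical names; the patent's skip-scan operator works inside a query plan and is not reproduced here.

```python
def initial_run(rows, predicate, stop_after, budget):
    """Scan until stopped at position `stop_after`; save at most `budget`
    qualifying records, keyed by scan position. Everything else is dropped."""
    saved = {}
    for pos, row in enumerate(rows):
        if pos >= stop_after:
            break
        if len(saved) < budget and predicate(row):
            saved[pos] = row
    return saved

def restart(rows, predicate, saved):
    """Rescan, skipping positions whose result was saved, so the
    predicate is not re-evaluated for those records."""
    output, evaluated = [], 0
    for pos, row in enumerate(rows):
        if pos in saved:            # skip: reuse the saved record
            output.append(saved[pos])
            continue
        evaluated += 1
        if predicate(row):
            output.append(row)
    return output, evaluated

rows = list(range(10))
is_even = lambda r: r % 2 == 0

saved = initial_run(rows, is_even, stop_after=6, budget=2)
output, evaluated = restart(rows, is_even, saved)
```

    The restarted scan produces the same output as an uninterrupted run while evaluating the predicate on only the eight rows that were not saved, and the state carried across the stop never exceeds the budget of two records.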


    Finding Related Entities For Search Queries
    119.
    Type: Invention application
    Legal status: In force

    Publication number: US20080306908A1

    Publication date: 2008-12-11

    Application number: US11758024

    Filing date: 2007-06-05

    CPC classification number: G06F17/278 G06F17/30864

    Abstract: Architecture for finding related entities for web search queries. An extraction component takes a document as input and outputs all the mentions (or occurrences) of named entities such as names of people, organizations, locations, and products in the document, as well as entity metadata. An indexing component takes a document identifier (docID) and the set of mentions of named entities, and stores and indexes the information for retrieval. A document-based search component takes a keyword query and returns the docIDs of the top documents matching the query. A retrieval component takes a docID as input, accesses the information stored by the indexing component, and returns the set of mentions of named entities in the document. This information is then passed to an entity scoring and thresholding component that computes an aggregate score for each entity and selects the entities to return to the user.


    Efficient evaluation of queries with mining predicates
    120.
    Type: Granted invention patent
    Legal status: In force

    Publication number: US07346601B2

    Publication date: 2008-03-18

    Application number: US10161308

    Filing date: 2002-06-03

    Abstract: A method for evaluating a user query on a database having a mining model that classifies records contained in the database into classes, when the query comprises at least one mining predicate that refers to a class of database records. An upper envelope is derived for the class referred to by the mining predicate; the envelope corresponds to a query that returns a set of database records including all of the database records belonging to the class. The upper envelope is included in the user query for query evaluation. The method may be practiced during a preprocessing phase by evaluating the mining model to extract a set of classes of the database records and deriving an upper envelope for each class. These upper envelopes are stored for access during user query evaluation.
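
    A small sketch may help make the upper-envelope idea concrete. Everything here is assumed for illustration: the rule-based `predict` function stands in for a mining model, and the envelopes in `upper_envelopes` are written by hand rather than derived automatically. The property that matters is that every record of a class satisfies that class's envelope, so the envelope can serve as a cheap pre-filter before the model is applied.

```python
def predict(row):
    """Hypothetical mining model: classify a customer record."""
    if row["age"] < 30 and row["claims"] >= 2:
        return "high_risk"
    if row["age"] >= 60:
        return "senior"
    return "standard"

# Assumed to be derived in a preprocessing phase and stored per class.
# Each envelope is a simple predicate satisfied by every record of its class.
upper_envelopes = {
    "high_risk": lambda r: r["age"] < 30,
    "senior": lambda r: r["age"] >= 60,
    "standard": lambda r: r["age"] < 60,
}

def evaluate(records, wanted_class):
    """Answer 'records whose predicted class is wanted_class', applying the
    envelope first so the model only runs on the surviving candidates."""
    envelope = upper_envelopes[wanted_class]
    candidates = [r for r in records if envelope(r)]
    matches = [r for r in candidates if predict(r) == wanted_class]
    return matches, len(candidates)

records = [
    {"id": 1, "age": 25, "claims": 3},
    {"id": 2, "age": 25, "claims": 0},
    {"id": 3, "age": 45, "claims": 5},
    {"id": 4, "age": 70, "claims": 1},
]
matches, model_calls = evaluate(records, "high_risk")
```

    In a real database the envelope would typically be a predicate over ordinary columns that the optimizer can push to an index, which is where the savings would come from; here the saving shows up only as fewer calls to `predict`.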

