Scaling dynamic authority-based search using materialized subgraphs
    1.
    Granted patent
    Legal status: In force

    Publication No.: US09171077B2

    Publication date: 2015-10-27

    Application No.: US12394371

    Filing date: 2009-02-27

    IPC classification: G06F7/00 G06F17/30

    Abstract: According to one embodiment of the present invention, a method for processing a query is provided. The method includes generating a set of pre-computed materialized sub-graphs from a dataset and receiving a search query having one or more search query terms. A particular one of the pre-computed materialized sub-graphs is accessed and a dynamic authority-based keyword search is executed on the particular one of the pre-computed materialized sub-graphs. Nodes in the dataset are then retrieved based on the executing, and a response to the search query is provided which includes the retrieved nodes.

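The flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how sub-graphs are chosen or how authority is computed, so the selection rule (first sub-graph whose term set covers the query), the restart-based random walk, and every name below (`authority_scores`, `search`, `materialized`, `keyword_index`) are assumptions made for illustration.

```python
from collections import defaultdict

def authority_scores(edges, base_set, damping=0.85, iterations=50):
    """Power iteration for a random walk that restarts at the base set.

    Simplification: nodes without outgoing edges simply lose their mass.
    """
    nodes = {n for edge in edges for n in edge} | set(base_set)
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    restart = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for src, targets in out_links.items():
            share = damping * score[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        score = nxt
    return score

def search(materialized, keyword_index, query_terms, top_k=3):
    """Pick a pre-computed sub-graph covering the query, then rank its nodes."""
    terms = frozenset(query_terms)
    # Assumed selection rule: the first sub-graph whose term set covers the query.
    key = next((k for k in materialized if terms <= k), None)
    if key is None:
        return []
    edges = materialized[key]
    sub_nodes = {n for edge in edges for n in edge}
    # Base set: nodes of the sub-graph that contain a query term.
    base_set = {n for t in terms for n in keyword_index.get(t, ()) if n in sub_nodes}
    scores = authority_scores(edges, base_set)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The point of the sketch is only that ranking runs on the small pre-computed edge list rather than on the full dataset; a real system would need a principled way to build and select the sub-graphs.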

    SCALING DYNAMIC AUTHORITY-BASED SEARCH USING MATERIALIZED SUBGRAPHS
    2.
    Patent application
    Legal status: In force

    Publication No.: US20100223266A1

    Publication date: 2010-09-02

    Application No.: US12394371

    Filing date: 2009-02-27

    IPC classification: G06F17/30

    Abstract: According to one embodiment of the present invention, a method for processing a query is provided. The method includes generating a set of pre-computed materialized sub-graphs from a dataset and receiving a search query having one or more search query terms. A particular one of the pre-computed materialized sub-graphs is accessed and a dynamic authority-based keyword search is executed on the particular one of the pre-computed materialized sub-graphs. Nodes in the dataset are then retrieved based on the executing, and a response to the search query is provided which includes the retrieved nodes.


    GRAPH SEARCH SYSTEM AND METHOD FOR QUERYING LOOSELY INTEGRATED DATA
    3.
    Patent application
    Legal status: Lapsed

    Publication No.: US20090240682A1

    Publication date: 2009-09-24

    Application No.: US12053597

    Filing date: 2008-03-22

    IPC classification: G06F7/06

    CPC classification: G06F17/30395 G06F17/30554

    Abstract: A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.

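The two-stage evaluation described in this abstract can be sketched as follows. The abstract does not define how the summary graph is built, so this example assumes it is the set of first-term hits plus their direct neighbours; the function names and the substring matching are likewise hypothetical.

```python
def neighbors(edges, node):
    """Nodes directly linked to `node`, ignoring edge direction."""
    return {b for a, b in edges if a == node} | {a for a, b in edges if b == node}

def two_stage_query(labels, edges, first_term, second_term):
    """Run the first term on the instance graph, the second on a summary graph.

    `labels` maps every node of the instance graph to its text.
    """
    # Stage 1: execute the first term on the full instance graph.
    hits = {n for n, text in labels.items() if first_term in text.lower()}
    # Assumed summary graph: the hits plus their direct neighbours.
    summary_nodes = set(hits)
    for n in hits:
        summary_nodes |= neighbors(edges, n)
    summary_edges = [(a, b) for a, b in edges
                     if a in summary_nodes and b in summary_nodes]
    # Stage 2: execute the second term only on the summary graph.
    matches = {n for n in summary_nodes if second_term in labels[n].lower()}
    return sorted(matches), summary_edges
```

Under this assumption the second term is never matched against nodes unrelated to the first term's results, which is the practical benefit of narrowing the graph between stages.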

    Graph search system and method for querying loosely integrated data
    4.
    Granted patent
    Legal status: Lapsed

    Publication No.: US08326847B2

    Publication date: 2012-12-04

    Application No.: US12053597

    Filing date: 2008-03-22

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30395 G06F17/30554

    Abstract: A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.


    Methods for obtaining improved text similarity measures which replace similar characters with a string pattern representation by using a semantic data tree
    5.
    Granted patent
    Legal status: Lapsed

    Publication No.: US07945525B2

    Publication date: 2011-05-17

    Application No.: US11937550

    Filing date: 2007-11-09

    IPC classification: G06F17/00

    Abstract: The embodiments of the invention provide methods for obtaining improved text similarity measures. More specifically, a method of measuring similarity between at least two electronic documents begins by identifying similar terms between the electronic documents. This includes basing similarity between the similar terms on patterns, wherein the patterns can include word patterns, letter patterns, numeric patterns, and/or alphanumeric patterns. The identifying of the similar terms also includes identifying multiple pattern types between the electronic documents. Moreover, the basing of the similarity on patterns identifies terms within the electronic documents that are within a category of a hierarchy. Specifically, the identifying of the terms reviews a hierarchical data tree, wherein nodes of the tree represent terms within the electronic documents. Lower nodes of the tree have specific terms; and, wherein higher nodes of the tree have general terms.

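As a rough illustration of the pattern and hierarchy ideas in this abstract, the sketch below rewrites alphanumeric tokens as character-class patterns and replaces specific terms with a more general parent term before comparing two documents. The Jaccard measure, the one-level `parent` dictionary standing in for the hierarchical data tree, and all function names are assumptions for the example, not details taken from the patent.

```python
import re

def to_pattern(token):
    """Collapse a token into a character-class pattern, e.g. 'AB-1234' -> 'A-9'."""
    pattern = re.sub(r"[A-Za-z]+", "A", token)
    return re.sub(r"\d+", "9", pattern)

def normalize(token, parent):
    """Map a token to its pattern (if it contains digits) or to its parent term."""
    token = token.lower()
    if any(ch.isdigit() for ch in token):
        return to_pattern(token)
    # `parent` is a hypothetical one-level hierarchy: specific term -> general term.
    return parent.get(token, token)

def similarity(doc_a, doc_b, parent):
    """Jaccard similarity over normalized tokens (an assumed measure)."""
    a = {normalize(t, parent) for t in doc_a.split()}
    b = {normalize(t, parent) for t in doc_b.split()}
    return len(a & b) / len(a | b) if a | b else 1.0
```

With `parent = {"poodle": "dog", "beagle": "dog"}`, the documents "order AB-1234 poodle" and "order XY-9876 beagle" normalize to the same token set, even though they share only one literal word.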

    Method for estimating the number of distinct values in a partitioned dataset
    8.
    Granted patent
    Legal status: In force

    Publication No.: US07987177B2

    Publication date: 2011-07-26

    Application No.: US12022601

    Filing date: 2008-01-30

    IPC classification: G06F17/00 G06F17/30

    CPC classification: G06F17/30536 G06F17/30469

    Abstract: The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.

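The idea of combining per-partition synopses can be illustrated with a basic k-minimum-values (KMV) sketch. This is a well-known technique used here as a stand-in; the abstract does not specify the synopsis format, and this simplified version covers only the union operation and does not handle deletions. The function names and the choice of SHA-256 as the hash are assumptions.

```python
import hashlib

def hash01(value):
    """Map a value to a pseudo-random float in the unit interval."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def synopsis(partition, k):
    """Keep the k smallest distinct hash values seen in a partition."""
    return sorted({hash01(v) for v in partition})[:k]

def combine(syn_a, syn_b, k):
    """Synopsis of the union, built from the two input synopses only."""
    return sorted(set(syn_a) | set(syn_b))[:k]

def estimate(syn, k):
    """Estimate the number of distinct values from a KMV synopsis."""
    if len(syn) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(syn))
    # Standard KMV estimator: (k - 1) divided by the k-th smallest hash.
    return (k - 1) / syn[k - 1]
```

Because the k smallest hashes of a union are always among the k smallest hashes of each input, `combine` yields the same synopsis as scanning the merged data, and the per-partition synopses can be built independently and in parallel.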

    Entity-based business intelligence
    9.
    Granted patent
    Legal status: In force

    Publication No.: US07979436B2

    Publication date: 2011-07-12

    Application No.: US12133552

    Filing date: 2008-06-05

    IPC classification: G06F7/00

    CPC classification: G06F17/30592 G06F17/30489

    Abstract: A method is disclosed for conducting a query to transform data in a pre-existing database, the method comprising: collecting database information from the pre-existing database, the database information including inconsistent dimensional tables and fact tables; running an entity discovery process on the inconsistent dimensional tables and the fact tables to produce entity mapping tables; using the entity mapping tables to resolve the inconsistent dimensional tables into resolved dimensional tables; and running the query on a resolved database to obtain a query result, the resolved database including the resolved dimensional table.

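A toy version of the pipeline in this abstract might look like the sketch below. The entity discovery step here simply groups dimension rows whose names match after stripping punctuation and case, which is an assumption made for illustration; the abstract does not describe the actual discovery process, and the function names are hypothetical.

```python
from collections import defaultdict

def discover_entities(dimension_rows):
    """Build an entity mapping table: dimension key -> entity id.

    Toy rule (assumed): rows whose normalized names match are one entity.
    """
    entity_of_name = {}
    mapping = {}
    for key, name in dimension_rows:
        norm = "".join(ch for ch in name.lower() if ch.isalnum())
        entity_id = entity_of_name.setdefault(norm, len(entity_of_name) + 1)
        mapping[key] = entity_id
    return mapping

def total_by_entity(fact_rows, mapping):
    """Run an aggregate query on facts after resolving keys to entities."""
    totals = defaultdict(float)
    for key, amount in fact_rows:
        totals[mapping[key]] += amount
    return dict(totals)
```

For example, if the dimension table lists "I.B.M." under key `c1` and "IBM" under key `c2`, both keys map to the same entity, so the amounts recorded against them in the fact table are summed together.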

    Method for Estimating the Number of Distinct Values in a Partitioned Dataset
    10.
    Patent application
    Legal status: In force

    Publication No.: US20090192980A1

    Publication date: 2009-07-30

    Application No.: US12022601

    Filing date: 2008-01-30

    IPC classification: G06F7/00

    CPC classification: G06F17/30536 G06F17/30469

    Abstract: The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.
