Formal language and translator for parallel processing of data
    1.
    Granted invention patent
    Formal language and translator for parallel processing of data (status: in force)

    Publication number: US07921416B2

    Publication date: 2011-04-05

    Application number: US11551336

    Filing date: 2006-10-20

    IPC classification: G06F9/45

    CPC classification: G06F17/30427 G06F17/3041

    Abstract: The present invention, in an example embodiment, provides a special-purpose formal language and translator for the parallel processing of large databases in a distributed system. The special-purpose language has features of both a declarative programming language and a procedural programming language and supports the co-grouping of tables, each with an arbitrary alignment function, and the specification of procedural operations to be performed on the resulting co-groups. The language's translator translates a program in the language into optimized structured calls to an application programming interface for implementations of functionality related to the parallel processing of tasks over a distributed system. In an example embodiment, the application programming interface includes interfaces for MapReduce functionality, whose implementations are supplemented by the embodiment.
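
    The co-grouping described in the abstract above can be illustrated with a minimal single-process sketch. It is not the patented language or its translator: the function name cogroup, the sample tables, and the choice of a plain Python callable as the "alignment function" are all assumptions made for illustration, and no distribution or MapReduce translation is shown.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, Any]
Table = Tuple[Sequence[Row], Callable[[Row], Hashable]]  # (rows, alignment function)


def cogroup(tables: Sequence[Table]) -> Dict[Hashable, List[List[Row]]]:
    """Co-group several tables, each with its own alignment function.

    Returns a mapping from alignment key to one list of rows per input table.
    """
    groups: Dict[Hashable, List[List[Row]]] = defaultdict(
        lambda: [[] for _ in tables]
    )
    for index, (rows, align) in enumerate(tables):
        for row in rows:
            groups[align(row)][index].append(row)
    return dict(groups)


# Hypothetical example tables (not from the patent).
visits = [
    {"user": "amy", "url": "a.com"},
    {"user": "bob", "url": "b.com"},
    {"user": "amy", "url": "c.com"},
]
profiles = [{"name": "Amy", "age": 30}, {"name": "Cal", "age": 41}]

co_groups = cogroup([
    (visits, lambda r: r["user"]),             # align visits on the user field
    (profiles, lambda r: r["name"].lower()),   # align profiles on lower-cased name
])

# Procedural step applied to each co-group: count visits for every key
# that has at least one profile row.
visit_counts = {key: len(v) for key, (v, p) in co_groups.items() if p}
```

    Each table keeps its own slot inside a co-group, so a later procedural step can tell an empty side (a user with no profile) from a missing key.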

    Formal Language and Translator for Parallel Processing of Data
    2.
    Invention patent application
    Formal Language and Translator for Parallel Processing of Data (status: in force)

    Publication number: US20080098370A1

    Publication date: 2008-04-24

    Application number: US11551336

    Filing date: 2006-10-20

    IPC classification: G06F9/45

    CPC classification: G06F17/30427 G06F17/3041

    Abstract: The present invention, in an example embodiment, provides a special-purpose formal language and translator for the parallel processing of large databases in a distributed system. The special-purpose language has features of both a declarative programming language and a procedural programming language and supports the co-grouping of tables, each with an arbitrary alignment function, and the specification of procedural operations to be performed on the resulting co-groups. The language's translator translates a program in the language into optimized structured calls to an application programming interface for implementations of functionality related to the parallel processing of tasks over a distributed system. In an example embodiment, the application programming interface includes interfaces for MapReduce functionality, whose implementations are supplemented by the embodiment.

    System and method for generalization search in hierarchies
    3.
    Invention patent application
    System and method for generalization search in hierarchies (status: under examination, published)

    Publication number: US20080010250A1

    Publication date: 2008-01-10

    Application number: US11483047

    Filing date: 2006-07-07

    IPC classification: G06F17/30

    CPC classification: G06F16/3325 G06F16/951

    Abstract: An improved system and method is provided for searching a collection of objects that may be located in hierarchies of auxiliary information for retrieval of response objects. A framework to perform a generalization search in hierarchies may be used to generalize a search by moving up to a higher level in a hierarchy of taxonomies or to specialize a search by moving down to a lower level in the hierarchy of taxonomies. Once the system may decide to enumerate response objects at a particular level of generalization, a budgeted generalization search may be used for enumerating a set of response objects within a budgeted cost.
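
    The idea of generalizing a search by moving up a taxonomy and enumerating objects within a budget can be sketched as follows. This is only an illustration under stated assumptions: the taxonomy, the object names, the unit cost per enumerated object, and the rule "generalize until a wanted number of objects is found" are hypothetical and are not taken from the patent. The budget here applies to each enumeration separately, which is a simplification.

```python
from typing import Dict, List, Optional

# Hypothetical taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "places": None,
    "usa": "places",
    "california": "usa",
    "palo_alto": "california",
    "san_jose": "california",
}

# Hypothetical objects attached directly to taxonomy nodes.
OBJECTS: Dict[str, List[str]] = {
    "palo_alto": ["cafe_1"],
    "san_jose": ["cafe_2", "cafe_3"],
    "california": ["cafe_4"],
}


def children(node: str) -> List[str]:
    """Return the direct children of a taxonomy node."""
    return [child for child, parent in PARENT.items() if parent == node]


def enumerate_under(node: str, budget: int) -> List[str]:
    """Depth-first enumeration at or below node, one cost unit per object."""
    found: List[str] = []
    stack = [node]
    while stack and len(found) < budget:
        current = stack.pop()
        for obj in OBJECTS.get(current, []):
            if len(found) == budget:
                break
            found.append(obj)
        stack.extend(reversed(children(current)))
    return found


def generalization_search(start: str, wanted: int, budget: int) -> List[str]:
    """Move up the hierarchy until `wanted` objects are found or the root is passed."""
    node: Optional[str] = start
    results: List[str] = []
    while node is not None:
        results = enumerate_under(node, budget)
        if len(results) >= wanted:
            break
        node = PARENT[node]  # generalize: move one level up
    return results
```

    For example, generalization_search("palo_alto", wanted=3, budget=3) finds only one object at the starting node, so it moves up to "california" and enumerates there until the budget of three objects is spent.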

    System and method for budgeted generalization search in hierarchies
    4.
    Granted invention patent
    System and method for budgeted generalization search in hierarchies (status: in force)

    Publication number: US07991769B2

    Publication date: 2011-08-02

    Application number: US11483048

    Filing date: 2006-07-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30646 G06F17/30864

    Abstract: An improved system and method is provided for searching a collection of objects that may be located in hierarchies of auxiliary information for retrieval of response objects. A framework to perform a generalization search in hierarchies may be used to generalize a search by moving up to a higher level in a hierarchy of taxonomies or to specialize a search by moving down to a lower level in the hierarchy of taxonomies. Once the system may decide to enumerate response objects at a particular level of generalization, a budgeted generalization search may be used for enumerating a set of response objects within a budgeted cost.

    System and method for budgeted generalization search in hierarchies
    5.
    Invention patent application
    System and method for budgeted generalization search in hierarchies (status: in force)

    Publication number: US20080010251A1

    Publication date: 2008-01-10

    Application number: US11483048

    Filing date: 2006-07-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30646 G06F17/30864

    Abstract: An improved system and method is provided for searching a collection of objects that may be located in hierarchies of auxiliary information for retrieval of response objects. A framework to perform a generalization search in hierarchies may be used to generalize a search by moving up to a higher level in a hierarchy of taxonomies or to specialize a search by moving down to a lower level in the hierarchy of taxonomies. Once the system may decide to enumerate response objects at a particular level of generalization, a budgeted generalization search may be used for enumerating a set of response objects within a budgeted cost.

    System and method for automatic detection of needy queries
    6.
    Granted invention patent
    System and method for automatic detection of needy queries (status: in force)

    Publication number: US07970760B2

    Publication date: 2011-06-28

    Application number: US12046123

    Filing date: 2008-03-11

    IPC classification: G06F7/00 G06F17/00

    CPC classification: G06F17/30864

    Abstract: Methods, systems, and computer readable media comprising instructions for identifying needy queries for which additional responsive content is needed. A method comprises receiving a query comprising one or more terms and retrieving one or more content items identified as responsive to the query, the one or more content items ranked according to one or more ranking techniques. A score is generated for the one or more ranked content items identified as responsive to the query. A determination is thereafter made as to whether the query is needy based upon a comparison of the one or more scores associated with the one or more content items identified as responsive to the query and a needy query score threshold.
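
    The threshold comparison in the abstract above might look like the following sketch. The abstract does not say how the per-item scores are combined, so the use of the mean of the top-ranked scores, the top_k parameter, and the function name is_needy are assumptions made purely for illustration.

```python
from typing import List


def is_needy(scores: List[float], threshold: float, top_k: int = 3) -> bool:
    """Flag a query as needy when the mean of its top_k result scores is below threshold.

    A query with no results at all is treated as needy.
    """
    top = sorted(scores, reverse=True)[:top_k]
    if not top:
        return True
    return sum(top) / len(top) < threshold
```

    With a threshold of 0.5, a query whose best results score 0.9, 0.8 and 0.7 is not needy, while a query whose only results score 0.2 and 0.1 is.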

    System and method for crawl ordering by search impact
    7.
    Granted invention patent
    System and method for crawl ordering by search impact (status: in force)

    Publication number: US07899807B2

    Publication date: 2011-03-01

    Application number: US12004881

    Filing date: 2007-12-20

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30864

    Abstract: An improved system and method for crawl ordering of a web crawler by impact upon search results of a search engine is provided. Content-independent features of uncrawled web pages may be obtained, and the impact of uncrawled web pages may be estimated for queries of a workload using the content-independent features. The impact of uncrawled web pages may be estimated for queries by computing an expected impact score for uncrawled web pages that match needy queries. Query sketches may be created for a subset of the queries by computing an expected impact score for crawled web pages and uncrawled web pages matching the queries. Web pages may then be selected to fetch using a combined query-based estimate and query-independent estimate of the impact of fetching the web pages on search query results.
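
    The final selection step, choosing pages by a combined query-based and query-independent estimate, can be sketched as below. The abstract does not state how the two estimates are combined; the weighted sum, the weight parameter, and both function names are assumptions for illustration only, and the estimates themselves are taken as given inputs.

```python
import heapq
from typing import Dict, List, Tuple


def combined_impact(query_based: float, query_independent: float,
                    weight: float = 0.5) -> float:
    """Blend the two estimates; `weight` is the share given to the query-based one."""
    return weight * query_based + (1.0 - weight) * query_independent


def pages_to_fetch(estimates: Dict[str, Tuple[float, float]], count: int,
                   weight: float = 0.5) -> List[str]:
    """Return the `count` uncrawled URLs with the highest combined impact.

    `estimates` maps a URL to (query-based estimate, query-independent estimate).
    """
    return heapq.nlargest(
        count,
        estimates,
        key=lambda url: combined_impact(*estimates[url], weight=weight),
    )
```

    A crawler built this way would fetch the returned URLs first and re-estimate the remaining pages as new content arrives.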

    Virtual Environment Spanning Desktop and Cloud
    8.
    Invention patent application
    Virtual Environment Spanning Desktop and Cloud (status: in force)

    Publication number: US20100114867A1

    Publication date: 2010-05-06

    Application number: US12266364

    Filing date: 2008-11-06

    IPC classification: G06F17/30 G06F7/00

    Abstract: A method and system are given for providing a virtual environment spanning a desktop and a cloud. In one example, the method includes receiving a query template over a data set that resides in the cloud, optimizing the query template to segment the query template into an offline phase and an online phase, executing the offline phase on the cloud to build one or more indexes, and sending the one or more indexes to the desktop.

    Generating Example Data for Testing Database Queries
    9.
    Invention patent application
    Generating Example Data for Testing Database Queries (status: in force)

    Publication number: US20090182706A1

    Publication date: 2009-07-16

    Application number: US12015392

    Filing date: 2008-01-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30442

    Abstract: Computer-implemented methods, modules and clients relate to expanded, pruned sample table for testing database queries against a base table. The expanded, pruned sample table is formed from the base table by a process of initial sampling, synthesis, and pruning.

    System and method for adaptively refreshing a web page
    10.
    Granted invention patent
    System and method for adaptively refreshing a web page (status: in force)

    Publication number: US08745183B2

    Publication date: 2014-06-03

    Application number: US11588020

    Filing date: 2006-10-26

    CPC classification: G06F17/30899

    Abstract: An improved system and method is provided for adaptively refreshing a web page. A base version of the web page may be partitioned into a collection of fragments. Then the collection of fragments may be compared with the corresponding fragments of a recent version of the web page to determine a divergence measurement of the difference between the base version and the recent version of the web page. The divergence measurement may be recorded in a change profile representing a change history of the web page that includes a sequence of numeric pairs indicating a time offset and a divergence measurement of the difference between a version of the web page at the time offset and a base version of the web page. The refresh period for the web page may be adjusted by applying an adaptive refresh policy using the divergence measurements recorded in the change profile.
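
    The fragment comparison, change profile, and refresh adjustment described above can be sketched as follows. Everything concrete here is an assumption for illustration: splitting pages on blank lines, measuring divergence as the fraction of base fragments missing from the recent version, and the halve-or-double policy with its low and high thresholds are not taken from the patent.

```python
from typing import List, Tuple


def fragments(page: str) -> List[str]:
    """Partition a page into fragments; splitting on blank lines is an assumption."""
    return [part.strip() for part in page.split("\n\n") if part.strip()]


def divergence(base: str, recent: str) -> float:
    """Fraction of base fragments that no longer appear in the recent version."""
    base_parts = fragments(base)
    if not base_parts:
        return 0.0
    recent_parts = set(fragments(recent))
    changed = sum(1 for part in base_parts if part not in recent_parts)
    return changed / len(base_parts)


# Change profile: a sequence of (time offset in hours, divergence) pairs.
ChangeProfile = List[Tuple[float, float]]


def next_refresh_period(profile: ChangeProfile, period: float,
                        low: float = 0.1, high: float = 0.5) -> float:
    """Halve the period when the latest divergence is high, double it when low."""
    if not profile:
        return period
    _, latest = profile[-1]
    if latest >= high:
        return period / 2
    if latest <= low:
        return period * 2
    return period
```

    A crawler using this sketch would append a new (offset, divergence) pair to the profile after each visit and then call next_refresh_period to schedule the next one.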
