MapReduce for distributed database processing
    1.
    Granted invention patent
    Status: In force

    Publication number: US08190610B2

    Publication date: 2012-05-29

    Application number: US11539090

    Filing date: 2006-10-05

    IPC classification: G06F17/30

    CPC classification: G06F17/30584 Y10S707/968

    Abstract: An input data set is treated as a plurality of grouped sets of key/value pairs, which enhances the utility of the MapReduce programming methodology. By utilizing such a grouping, map processing can be carried out independently on two or more related but possibly heterogeneous datasets (e.g., related by being characterized by a common primary key). The intermediate results of the map processing (key/value pairs) for a particular key can be processed together in a single reduce function by applying a different iterator to intermediate values for each group. Different iterators can be arranged inside reduce functions in any way desired.
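
The grouping idea in this abstract can be sketched in a few lines of Python. This is a minimal single-process illustration, not the patented implementation: the names `grouped_map_reduce`, `map_employee`, `map_department`, and `join_reducer`, and the sample records, are all hypothetical. Each dataset (group) gets its own map function, and the reducer receives one iterator per group for every key.

```python
from collections import defaultdict


def grouped_map_reduce(datasets, mappers, reducer):
    """Run one map function per dataset group, then reduce each key once.

    datasets: {group_name: [record, ...]}
    mappers:  {group_name: map_fn(record) -> iterable of (key, value)}
    reducer:  reduce_fn(key, {group_name: iterator of values}) -> result
    """
    # intermediate[key][group] collects the values a group emitted for a key.
    intermediate = defaultdict(lambda: defaultdict(list))
    for group, records in datasets.items():
        for record in records:
            for key, value in mappers[group](record):
                intermediate[key][group].append(value)

    results = {}
    for key in sorted(intermediate):
        # One iterator per group, so the reducer can walk each group separately.
        iterators = {g: iter(intermediate[key].get(g, [])) for g in datasets}
        results[key] = reducer(key, iterators)
    return results


# Hypothetical sample data: two heterogeneous datasets sharing the key dept_id.
employees = [
    {"name": "Ana", "dept_id": "d1"},
    {"name": "Bo", "dept_id": "d2"},
    {"name": "Cy", "dept_id": "d1"},
]
departments = [
    {"dept_id": "d1", "title": "Search"},
    {"dept_id": "d2", "title": "Ads"},
]


def map_employee(record):
    yield record["dept_id"], record["name"]


def map_department(record):
    yield record["dept_id"], record["title"]


def join_reducer(key, iterators):
    # Materialize the small side, then stream the other group's iterator.
    titles = list(iterators["departments"])
    return [(name, title) for name in iterators["employees"] for title in titles]


joined = grouped_map_reduce(
    {"employees": employees, "departments": departments},
    {"employees": map_employee, "departments": map_department},
    join_reducer,
)
```

Here the reducer performs a join on the shared key, but because it holds a separate iterator for each group it could just as well aggregate one group and filter by the other.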

    MAP-REDUCE WITH MERGE TO PROCESS MULTIPLE RELATIONAL DATASETS
    2.
    Invention patent application
    Status: In force

    Publication number: US20080120314A1

    Publication date: 2008-05-22

    Application number: US11560523

    Filing date: 2006-11-16

    IPC classification: G06F7/00

    Abstract: A method of processing relationships of at least two datasets is provided. For each of the datasets, a map-reduce subsystem is provided such that the data of that dataset is mapped to corresponding intermediate data for that dataset. The intermediate data for that dataset is reduced to a set of reduced intermediate data for that dataset. Data corresponding to the sets of reduced intermediate data are merged, in accordance with a merge condition. In some examples, data being merged may include the output of one or more other mergers. That is, generally, merge functions may be flexibly placed among various map-reduce subsystems and, as such, the basic map-reduce architecture may be advantageously modified to process multiple relational datasets using, for example, clusters of computing devices.
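
The map, reduce, then merge flow described in this abstract can be sketched as follows. This is a simplified single-process illustration under stated assumptions: the helper names `map_reduce` and `merge`, the sample datasets, and the equality merge condition are hypothetical, and the nested-loop merge stands in for whatever distributed merge strategy a real system would use.

```python
from collections import defaultdict
from itertools import product


def map_reduce(records, map_fn, reduce_fn):
    """A tiny single-process map-reduce: returns {key: reduced_value}."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


def merge(left, right, condition, combine):
    """Pair up reduced outputs whose keys satisfy the merge condition."""
    return [
        combine(lk, lv, rk, rv)
        for (lk, lv), (rk, rv) in product(sorted(left.items()), sorted(right.items()))
        if condition(lk, rk)
    ]


# Hypothetical datasets: (customer_id, amount) and (customer_id, region).
orders = [("c1", 10), ("c2", 5), ("c1", 7)]
customers = [("c1", "EU"), ("c2", "US"), ("c3", "EU")]

# One map-reduce subsystem per dataset.
totals = map_reduce(orders, lambda r: [(r[0], r[1])], lambda k, vs: sum(vs))
regions = map_reduce(customers, lambda r: [(r[0], r[1])], lambda k, vs: vs[0])

# Merge the two reduced outputs where the keys are equal.
merged = merge(
    totals,
    regions,
    condition=lambda lk, rk: lk == rk,
    combine=lambda lk, lv, rk, rv: (lk, rv, lv),
)
# merged == [("c1", "EU", 17), ("c2", "US", 5)]
```

Because `merge` returns ordinary records, its output can be fed into another `map_reduce` or another `merge`, which mirrors the abstract's point that merged data may include the output of other mergers.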

    BIOINFORMATICS COMPUTATION USING A MAPREDUCE-CONFIGURED COMPUTING SYSTEM
    3.
    Invention patent application
    Status: Under examination (published)

    Publication number: US20080133474A1

    Publication date: 2008-06-05

    Application number: US11564983

    Filing date: 2006-11-30

    IPC classification: G06F17/30

    CPC classification: G16B30/00 G16B50/00

    Abstract: A MapReduce architecture may be utilized for sequence alignment algorithm processing (such as BLAST or BLAST-like algorithms). In addition, a MapReduce architecture may be extended such that memory of the computing devices of a MapReduce-configured system may be shared between different jobs of sequence alignment and/or other bioinformatics algorithm processing, thereby reducing overhead associated with executing such jobs using the MapReduce-configured system.

    Map-reduce with merge to process multiple relational datasets
    4.
    Granted invention patent
    Status: In force

    Publication number: US07523123B2

    Publication date: 2009-04-21

    Application number: US11560523

    Filing date: 2006-11-16

    IPC classification: G06F17/00

    Abstract: A method of processing relationships of at least two datasets is provided. For each of the datasets, a map-reduce subsystem is provided such that the data of that dataset is mapped to corresponding intermediate data for that dataset. The intermediate data for that dataset is reduced to a set of reduced intermediate data for that dataset. Data corresponding to the sets of reduced intermediate data are merged, in accordance with a merge condition. In some examples, data being merged may include the output of one or more other mergers. That is, generally, merge functions may be flexibly placed among various map-reduce subsystems and, as such, the basic map-reduce architecture may be advantageously modified to process multiple relational datasets using, for example, clusters of computing devices.

    MAPREDUCE FOR DISTRIBUTED DATABASE PROCESSING
    5.
    Invention patent application
    Status: In force

    Publication number: US20080086442A1

    Publication date: 2008-04-10

    Application number: US11539090

    Filing date: 2006-10-05

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F17/30584 Y10S707/968

    Abstract: An input data set is treated as a plurality of grouped sets of key/value pairs, which enhances the utility of the MapReduce programming methodology. By utilizing such a grouping, map processing can be carried out independently on two or more related but possibly heterogeneous datasets (e.g., related by being characterized by a common primary key). The intermediate results of the map processing (key/value pairs) for a particular key can be processed together in a single reduce function by applying a different iterator to intermediate values for each group. Different iterators can be arranged inside reduce functions in any way desired.

    Image quality assessment to merchandise an item
    6.
    Granted invention patent
    Status: In force

    Publication number: US08675957B2

    Publication date: 2014-03-18

    Application number: US13300305

    Filing date: 2011-11-18

    IPC classification: G06K9/00

    Abstract: Image-based features may be significantly correlated with click-through rates of images that depict a product, which may provide a more formal basis for the informal notion that good quality images will result in better click-through rates, as compared to poor quality images. Accordingly, an image assessment machine is configured to analyze image-based features to improve click-through rates for shopping search applications (e.g., a product search engine). Moreover, the image assessment machine may rank search results based on image quality factors and may notify sellers about low quality images. This may have the effect of improving the brand value for an online shopping website and accordingly have a positive long-term impact on the online shopping website.

    Methods and apparatus for computing graph similarity via sequence similarity
    7.
    Granted invention patent
    Status: In force

    Publication number: US07996349B2

    Publication date: 2011-08-09

    Application number: US11951146

    Filing date: 2007-12-05

    IPC classification: G06F17/00 G06N5/00

    CPC classification: G06F17/30882

    Abstract: This disclosure describes systems and methods for identifying and correcting anomalies in web graphs. A web graph is transformed into a sequence of tokens via a walk algorithm. The sequence is fingerprinted to form a set of shingles. The shingles are compared to shingles for other web graphs in order to determine similarity between web graphs. Actions are then carried out to remove anomalous web graphs and modify parameters governing web mapping in order to decrease the likelihood of future anomalous web graphs being built.
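
The walk, shingle, and compare pipeline in this abstract can be illustrated with a short sketch. The abstract does not specify the walk algorithm, the fingerprinting scheme, or the similarity measure, so everything here is an assumption: a breadth-first walk stands in for the walk algorithm, plain k-token windows stand in for fingerprinted shingles, and Jaccard similarity stands in for the comparison. The function names and sample graphs are hypothetical.

```python
from collections import deque


def walk_tokens(graph, start):
    """Breadth-first walk that emits each reachable node once, in visit order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


def shingles(tokens, k=2):
    """Set of all k-token windows of the sequence."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two hypothetical web graphs as adjacency lists.
graph_a = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
graph_b = {"a": ["b", "c"], "b": ["e"], "c": [], "e": []}

tokens_a = walk_tokens(graph_a, "a")  # ["a", "b", "c", "d"]
tokens_b = walk_tokens(graph_b, "a")  # ["a", "b", "c", "e"]
similarity = jaccard(shingles(tokens_a), shingles(tokens_b))  # 0.5
```

A graph whose similarity to its predecessors falls below some chosen threshold could then be flagged as anomalous; the threshold itself would have to be tuned and is not given in the abstract.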

    SEARCH ENGINE OUTPUT-ASSOCIATED BIDDING IN ONLINE ADVERTISING
    8.
    Invention patent application
    Status: Under examination (published)

    Publication number: US20110191171A1

    Publication date: 2011-08-04

    Application number: US12699226

    Filing date: 2010-02-03

    Applicant: Ali Dasdan

    Inventor: Ali Dasdan

    IPC classification: G06Q30/00 G06F17/30

    Abstract: Methods and systems are provided for search engine output-associated bidding in online advertising. Techniques are provided in which an advertiser may specify, as part of a bid, one or more requirements relating to search engine output. The one or more requirements may need to be met for an advertisement to be served in connection with the bid.

    SEARCH RESULTS WITH MOST CLICKED NEXT OBJECTS
    9.
    Invention patent application
    Status: Under examination (published)

    Publication number: US20090287645A1

    Publication date: 2009-11-19

    Application number: US12120993

    Filing date: 2008-05-15

    IPC classification: G06F7/06

    CPC classification: G06F16/957

    Abstract: Disclosed are apparatus and methods for providing next click information regarding search results. In certain embodiments, as objects (such as web pages, images, videos, audio files) are searched and clicked, click information is retained. Next click information with respect to specific objects can then be determined. This next click information can then be provided to an object search initiator so that such next click information is presented along with search result objects, for example, during a search query.
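
One simple way to derive next click information from retained click data is to count, for each object, which object was clicked immediately afterwards. The sketch below assumes click information is stored as per-session ordered lists of object identifiers; that storage format, the function names, and the sample sessions are hypothetical and not taken from the application.

```python
from collections import Counter, defaultdict


def next_click_counts(sessions):
    """Count, for each clicked object, the objects clicked right after it."""
    counts = defaultdict(Counter)
    for clicks in sessions:
        for current, following in zip(clicks, clicks[1:]):
            counts[current][following] += 1
    return counts


def most_clicked_next(counts, obj, n=1):
    """Return up to n objects most often clicked immediately after obj."""
    return [target for target, _ in counts.get(obj, Counter()).most_common(n)]


# Hypothetical retained click sessions, each an ordered list of clicked objects.
sessions = [
    ["home", "pricing", "signup"],
    ["home", "pricing", "docs"],
    ["home", "docs"],
    ["pricing", "signup"],
]

counts = next_click_counts(sessions)
top_after_home = most_clicked_next(counts, "home")        # ["pricing"]
top_after_pricing = most_clicked_next(counts, "pricing")  # ["signup"]
```

A search front end could attach the result of `most_clicked_next` to each search result object so that the most clicked next objects are shown alongside it.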

    Method and apparatus for determining the performance of an integrated circuit
    10.
    Invention patent application
    Status: In force

    Publication number: US20070156367A1

    Publication date: 2007-07-05

    Application number: US11644563

    Filing date: 2006-12-21

    IPC classification: G06F19/00

    CPC classification: G01R31/2894

    Abstract: A system is provided that determines the performance of an integrated circuit (IC). During operation, the system receives probability distributions for parameters for the IC. Next, the system generates samples of the IC, wherein generating a given sample involves using the probability distributions to assign values to the parameters for components within the IC. The system then calculates output performance metrics for the samples based on the assigned values of the parameters, and uses the calculated output performance metrics to generate a distribution of output performance metrics for the samples.
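
The sample, evaluate, and collect procedure in this abstract is essentially a Monte Carlo simulation, which can be sketched as follows. The specifics are assumptions made for illustration: parameters are drawn from independent Gaussian distributions, the circuit is reduced to a single RC stage, and the performance metric is the RC time constant. The names `sample_parameters`, `simulate`, and `rc_delay` are hypothetical.

```python
import random
import statistics


def sample_parameters(distributions, rng):
    """Draw one value per parameter from its (mean, stdev) Gaussian."""
    return {name: rng.gauss(mean, stdev) for name, (mean, stdev) in distributions.items()}


def simulate(distributions, metric, n_samples, seed=0):
    """Generate n_samples parameter sets and evaluate the metric on each."""
    rng = random.Random(seed)
    return [metric(sample_parameters(distributions, rng)) for _ in range(n_samples)]


# Assumed parameter distributions: (mean, standard deviation).
distributions = {
    "resistance_ohm": (1000.0, 50.0),
    "capacitance_f": (1e-12, 1e-13),
}


def rc_delay(params):
    """Hypothetical performance metric: RC time constant in seconds."""
    return params["resistance_ohm"] * params["capacitance_f"]


delays = simulate(distributions, rc_delay, n_samples=10_000)
summary = {
    "mean": statistics.mean(delays),
    "stdev": statistics.stdev(delays),
}
```

The list `delays` is the empirical distribution of the output metric; `summary` condenses it, and a histogram or percentile table could be built from the same list.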
