Amortizing costs of shared scans
    1.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US08484649B2

    Publication date: 2013-07-09

    Application number: US12984909

    Filing date: 2011-01-05

    IPC classification: G06F9/46

    CPC classification: G06F9/4843

    Abstract: Techniques for scheduling a plurality of jobs sharing input are provided. The techniques include partitioning one or more input datasets into multiple subcomponents, analyzing a plurality of jobs to determine which of the plurality of jobs require scanning of one or more common subcomponents of the one or more input datasets, and scheduling a plurality of jobs that require scanning of one or more common subcomponents of the one or more input datasets, facilitating a single scanning of the one or more common subcomponents to be used as input by each of the plurality of jobs.
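
The idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the names `plan_shared_scans`, `run_shared_scans`, `read_part`, and `handlers` are hypothetical, and the sketch assumes each job simply declares the subcomponents it needs. It groups jobs by subcomponent so that each subcomponent is read once and handed to every job that requires it.

```python
from collections import defaultdict


def plan_shared_scans(job_inputs):
    """Map each subcomponent to the jobs that need it.

    job_inputs: dict of job name -> list of subcomponent ids (hypothetical layout).
    """
    jobs_by_part = defaultdict(list)
    for job, parts in job_inputs.items():
        for part in parts:
            jobs_by_part[part].append(job)
    return dict(jobs_by_part)


def run_shared_scans(job_inputs, read_part, handlers):
    """Scan every needed subcomponent once and feed it to all interested jobs.

    read_part: callable returning the records of one subcomponent (assumed).
    handlers:  dict of job name -> callable(part, records) (assumed).
    Returns the number of scans performed.
    """
    scans = 0
    for part, jobs in sorted(plan_shared_scans(job_inputs).items()):
        records = read_part(part)  # the single scan of this subcomponent
        scans += 1
        for job in jobs:
            handlers[job](part, records)
    return scans
```

With two jobs that overlap on one subcomponent, the shared plan performs three scans where independent execution would perform four.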

    Distributed solutions for large-scale resource assignment tasks
    2.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US08346845B2

    Publication date: 2013-01-01

    Application number: US12760080

    Filing date: 2010-04-14

    IPC classification: G06F15/16

    CPC classification: G06F9/5027 G06F2209/5017

    Abstract: Distributed data processing of problems representing resource assignment tasks. The problems are modeled as programs, and the programs are partitioned into sub-instances. Those sub-instances are executed in a distributed computing environment. The partitioning reduces communication costs between sub-instances and convergence time for the optimization program.

    Scheduling Mapreduce Jobs in the Presence of Priority Classes
    3.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20120304186A1

    Publication date: 2012-11-29

    Application number: US13116378

    Filing date: 2011-05-26

    IPC classification: G06F9/46 G06F9/50

    CPC classification: G06F9/46 G06F9/4881 G06F9/50

    Abstract: Techniques for scheduling one or more MapReduce jobs in a presence of one or more priority classes are provided. The techniques include obtaining a preferred ordering for one or more MapReduce jobs, wherein the preferred ordering comprises one or more priority classes, prioritizing the one or more priority classes subject to one or more dynamic minimum slot guarantees for each priority class, and iteratively employing a MapReduce scheduler, once per priority class, in priority class order, to optimize performance of the one or more MapReduce jobs.
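
One way to picture "minimum slot guarantees" combined with a pass per priority class is the following Python sketch. It is an illustration under assumptions, not the scheduler claimed in the application: the function name `allocate_slots`, the tuple layout, and the two-pass policy (guarantees first, then leftover slots in priority order) are all hypothetical.

```python
def allocate_slots(total_slots, classes):
    """Split slots among priority classes.

    classes: list of (name, demand, min_guarantee), highest priority first
             (hypothetical layout, assumed for this sketch).
    Returns a dict of class name -> number of slots.
    """
    # Pass 1: every class receives its minimum guarantee, capped by its demand.
    alloc = {name: min(demand, guarantee) for name, demand, guarantee in classes}
    remaining = total_slots - sum(alloc.values())
    if remaining < 0:
        raise ValueError("minimum guarantees exceed the available slots")

    # Pass 2: visit classes once each, in priority order, and hand out leftovers.
    for name, demand, _ in classes:
        extra = min(demand - alloc[name], remaining)
        alloc[name] += extra
        remaining -= extra
    return alloc
```

For example, with 10 slots, a high-priority class demanding 8 (guarantee 2) and a low-priority class demanding 6 (guarantee 3), the low class keeps its 3 guaranteed slots and the high class receives the other 7.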

    Clustering streaming graphs
    5.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US08635224B2

    Publication date: 2014-01-21

    Application number: US13532823

    Filing date: 2012-06-26

    IPC classification: G06F17/30

    CPC classification: G06F17/30961

    Abstract: Embodiments of the invention include methods for identifying one or more clusters in a streaming graph. The method includes receiving a stream of edges and sampling the stream of edges to create a structural reservoir and support reservoir. The method also includes creating a sampled graph from the structural reservoir and identifying the one or more clusters in the sampled graph by grouping one or more connected vertices in the sampled graph.
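
The sample-then-group pattern in this abstract can be sketched in Python as follows. This is a simplification under stated assumptions, not the patented procedure: it uses plain uniform reservoir sampling for a single reservoir, omits the support reservoir entirely, and treats clusters as the connected components of the sampled graph. The function names are hypothetical.

```python
import random


def reservoir_sample_edges(edge_stream, capacity, rng):
    """Keep a uniform random sample of at most `capacity` edges from a stream."""
    reservoir = []
    for i, edge in enumerate(edge_stream):
        if i < capacity:
            reservoir.append(edge)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < capacity:
                reservoir[j] = edge  # replace a previously kept edge
    return reservoir


def connected_clusters(edges):
    """Group vertices of the sampled graph into connected components (union-find)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)

    clusters = {}
    for v in list(parent):
        clusters.setdefault(find(v), set()).add(v)
    return sorted(sorted(c) for c in clusters.values())
```

A caller would pass a seeded generator such as `random.Random(0)` as `rng` to make the sample reproducible, then run `connected_clusters` on the returned edges.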

    Partitioning operator flow graphs
    6.
    Type: Granted invention patent
    Status: In force

    Publication number: US08490072B2

    Publication date: 2013-07-16

    Application number: US12489805

    Filing date: 2009-06-23

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F8/443 G06F8/433 G06F8/44

    Abstract: Techniques for partitioning an operator flow graph are provided. The techniques include receiving source code for a stream processing application, wherein the source code comprises an operator flow graph, wherein the operator flow graph comprises a plurality of operators, receiving profiling data associated with the plurality of operators and one or more processing requirements of the operators, defining a candidate partition as a coalescing of one or more of the operators into one or more sets of processing elements (PEs), using the profiling data to create one or more candidate partitions of the processing elements, using the one or more candidate partitions to choose a desired partitioning of the operator flow graph, and compiling the source code into an executable code based on the desired partitioning.

    Vertex-Proximity Query Processing
    7.
    Type: Invention patent application
    Status: In force

    Publication number: US20130151536A1

    Publication date: 2013-06-13

    Application number: US13315415

    Filing date: 2011-12-09

    IPC classification: G06F17/30

    CPC classification: G06F17/30327

    Abstract: A method, an apparatus and an article of manufacture for processing a random-walk based vertex-proximity query on a graph. The method includes computing at least one vertex cluster and corresponding meta-information from a graph, dynamically updating the clustering and corresponding meta-information upon modification of the graph, and identifying a vertex cluster relevant to at least one query vertex and aggregating corresponding meta-information of the cluster to process the query.

    Vertex-proximity query processing
    10.
    Type: Granted invention patent
    Status: In force

    Publication number: US08903824B2

    Publication date: 2014-12-02

    Application number: US13315415

    Filing date: 2011-12-09

    IPC classification: G06F17/30

    CPC classification: G06F17/30327

    Abstract: A method, an apparatus and an article of manufacture for processing a random-walk based vertex-proximity query on a graph. The method includes computing at least one vertex cluster and corresponding meta-information from a graph, dynamically updating the clustering and corresponding meta-information upon modification of the graph, and identifying a vertex cluster relevant to at least one query vertex and aggregating corresponding meta-information of the cluster to process the query.