Persistent shuffle system
    1.
    Granted invention patent

    Publication No.: US09928263B2

    Publication date: 2018-03-27

    Application No.: US14045517

    Filing date: 2013-10-03

    Applicant: Google Inc.

    CPC classification number: G06F17/30345 G06F9/5066

    Abstract: A method includes receiving a request to perform a shuffle operation on a data stream; receiving at least a portion of the data stream including a plurality of records, each including a key; storing each of the plurality of records in a persistent storage location assigned to a key range corresponding to keys included in the plurality of records; receiving a request from a consumer for a subset of the plurality of records including a range of keys; and upon receiving the request from the consumer, providing the subset of the plurality of records including the range of keys from the one or more persistent storage locations.
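
    The flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `PersistentShuffle`, the split-point representation of key ranges, and the in-memory dictionary standing in for persistent storage locations are all assumptions made for illustration.

```python
import bisect
from collections import defaultdict


class PersistentShuffle:
    """Toy shuffle: records are stored per key range and later served by key range."""

    def __init__(self, boundaries):
        # Sorted split points (hypothetical representation). Range i holds keys
        # from boundaries[i-1] (inclusive) up to boundaries[i] (exclusive).
        self.boundaries = sorted(boundaries)
        # A dict of lists stands in for the persistent storage locations.
        self.storage = defaultdict(list)

    def _range_index(self, key):
        return bisect.bisect_right(self.boundaries, key)

    def write(self, records):
        """Store each (key, value) record in the location assigned to its key range."""
        for key, value in records:
            self.storage[self._range_index(key)].append((key, value))

    def read(self, lo, hi):
        """Return the records with lo <= key < hi, ordered by key."""
        out = []
        for idx in range(self._range_index(lo), self._range_index(hi) + 1):
            out.extend(r for r in self.storage.get(idx, []) if lo <= r[0] < hi)
        return sorted(out, key=lambda r: r[0])


shuffle = PersistentShuffle(["g", "n", "t"])
shuffle.write([("apple", 1), ("kiwi", 2), ("pear", 3), ("zebra", 4), ("grape", 5)])
subset = shuffle.read("g", "q")  # a consumer asks for one range of keys
```

    Because records stay in their storage locations after being written, a consumer can request a range at any time, independent of when the producer wrote it.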

    FILE OPERATION TASK OPTIMIZATION
    2.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20170004010A1

    Publication date: 2017-01-05

    Application No.: US15266177

    Filing date: 2016-09-15

    Applicant: Google Inc.

    CPC classification number: G06F9/4881 G06F9/4887 G06F16/16 G06F16/182

    Abstract: A method includes receiving, by a data processing apparatus, a plurality of file operation requests, each file operation request including a priority, a deadline, and an operation type and representing a request to perform an operation on at least one file maintained in a distributed file system; identifying, by the data processing apparatus, a group of file operation requests to be executed together from the plurality of file operation requests, the identification based at least in part on at least one of: the file operations in the group of file operations being directed to a same storage system, or file operations in the group of file operations sharing a common operation type; and sending a request to execute the group of file operation requests to a system configured to perform the group of file operation requests.
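
    A minimal Python sketch of the grouping step follows. It is an illustration under stated assumptions, not the claimed method: the `FileOpRequest` fields, the convention that a lower priority number is more urgent, and the choice to group on both storage system and operation type (the abstract requires only at least one of them) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileOpRequest:
    path: str
    op_type: str         # e.g. "delete" or "copy"
    storage_system: str  # hypothetical identifier of the backing store
    priority: int        # assumption: lower number means more urgent
    deadline: float      # assumption: seconds since the epoch


def group_requests(requests):
    """Batch requests that target the same storage system with the same operation type."""
    groups = defaultdict(list)
    for req in requests:
        groups[(req.storage_system, req.op_type)].append(req)
    # Inside a batch, order by priority and then by deadline.
    batches = [sorted(g, key=lambda r: (r.priority, r.deadline)) for g in groups.values()]
    # Send the batch containing the most urgent request first.
    batches.sort(key=lambda b: (b[0].priority, b[0].deadline))
    return batches


requests = [
    FileOpRequest("/a", "delete", "s1", 2, 100.0),
    FileOpRequest("/b", "delete", "s1", 1, 50.0),
    FileOpRequest("/c", "copy", "s2", 1, 10.0),
]
batches = group_requests(requests)
```

    Each returned batch would then be sent as a single request to the system that executes the operations.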

    File operation task optimization
    3.
    Granted invention patent
    Status: In force

    Publication No.: US09449018B1

    Publication date: 2016-09-20

    Application No.: US14089588

    Filing date: 2013-11-25

    Applicant: Google, Inc.

    CPC classification number: G06F9/4881 G06F9/4887 G06F17/30115 G06F17/30194

    Abstract: A method includes receiving, by a data processing apparatus, a plurality of file operation requests, each file operation request including a priority, a deadline, and an operation type and representing a request to perform an operation on at least one file maintained in a distributed file system; identifying, by the data processing apparatus, a group of file operation requests to be executed together from the plurality of file operation requests, the identification based at least in part on at least one of: the file operations in the group of file operations being directed to a same storage system, or file operations in the group of file operations sharing a common operation type; and sending a request to execute the group of file operation requests to a system configured to perform the group of file operation requests.

    Dynamic shuffle reconfiguration
    4.
    Granted invention patent

    Publication No.: US09934262B2

    Publication date: 2018-04-03

    Application No.: US15269276

    Filing date: 2016-09-19

    Applicant: Google Inc.

    CPC classification number: G06F17/30312 G06F9/5083 G06F17/30345 G06F17/30584

    Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.
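
    The reconfiguration loop described in this abstract can be sketched in Python as below. This is only an illustration: the round-robin initial assignment, the threshold rule that defines an "update event", and the policy of moving a single key range to the least loaded shuffler are assumptions, and the function names are hypothetical.

```python
def initial_config(key_ranges, shufflers):
    """Assign one shuffler to each initial key range, round-robin."""
    return {kr: shufflers[i % len(shufflers)] for i, kr in enumerate(key_ranges)}


def rebalance(config, range_load, shufflers, threshold=1.5):
    """Return an updated config when one shuffler is overloaded.

    range_load maps key range -> records handled so far (standing in for the
    metadata statistics). Assumption: an update event occurs when the busiest
    shuffler carries more than `threshold` times the mean load.
    """
    load = {s: 0 for s in shufflers}
    for kr, s in config.items():
        load[s] += range_load.get(kr, 0)
    mean = sum(load.values()) / len(shufflers)
    busiest = max(load, key=load.get)
    idlest = min(load, key=load.get)
    owned = [kr for kr, s in config.items() if s == busiest]
    if mean == 0 or load[busiest] <= threshold * mean or len(owned) < 2:
        return config  # no update event, or no range that can be moved

    def worst_after_move(kr):
        moved = range_load.get(kr, 0)
        return max(load[busiest] - moved, load[idlest] + moved)

    # Move the range that leaves the two affected shufflers most balanced.
    new_config = dict(config)
    new_config[min(owned, key=worst_after_move)] = idlest
    return new_config


ranges = [("a", "g"), ("g", "m"), ("m", "t"), ("t", "z")]
workers = ["s0", "s1"]
config = initial_config(ranges, workers)
stats = {("a", "g"): 90, ("g", "m"): 5, ("m", "t"): 50, ("t", "z"): 5}
updated = rebalance(config, stats, workers)
```

    In a running shuffle, `rebalance` would be called periodically while records are still flowing, so the assignment can change during the operation rather than only between runs.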

    Clustering for parallel processing
    6.
    Granted invention patent
    Status: In force

    Publication No.: US09535742B1

    Publication date: 2017-01-03

    Application No.: US15148661

    Filing date: 2016-05-06

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for clustering for parallel processing. One of the methods includes providing virtual machines with an interface to a shuffle service, the shuffle service executing external to the virtual machines. The method includes receiving data records through the interface, each data record having a key and a value. The method includes partitioning the data records, using the shuffle service, according to the respective keys. The method includes providing a part of the partitioned data records through the interface to the virtual machines, wherein data records having the same key are provided to the same virtual machine. Each of the virtual machines can execute on a host machine and each of the virtual machines is a hardware virtualization of a machine.
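
    The partitioning guarantee (records with the same key reach the same virtual machine) can be sketched in Python as follows. Hash partitioning with CRC32 is an assumption chosen for illustration; the abstract does not say how keys are mapped to virtual machines, and the function names are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_workers):
    """Map a string key to a worker index, identically on every call and process."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def shuffle(records, num_workers):
    """Partition (key, value) records so that equal keys share one worker."""
    parts = defaultdict(list)
    for key, value in records:
        parts[partition_for(key, num_workers)].append((key, value))
    return dict(parts)


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
parts = shuffle(records, num_workers=3)
```

    `zlib.crc32` is used instead of the built-in `hash` because Python randomizes string hashes per process, which would break the guarantee if several service instances computed partitions independently.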

    Dynamic Shuffle Reconfiguration
    7.
    Invention patent application
    Status: In force

    Publication No.: US20150095351A1

    Publication date: 2015-04-02

    Application No.: US14044529

    Filing date: 2013-10-02

    Applicant: Google Inc.

    CPC classification number: G06F17/30312 G06F9/5083 G06F17/30345 G06F17/30584

    Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.

    DYNAMIC SHUFFLE RECONFIGURATION
    8.
    Invention patent application
    Status: In force

    Publication No.: US20170003936A1

    Publication date: 2017-01-05

    Application No.: US15269276

    Filing date: 2016-09-19

    Applicant: Google Inc.

    CPC classification number: G06F17/30312 G06F9/5083 G06F17/30345 G06F17/30584

    Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.

    Parallel processing of data
    9.
    Granted invention patent
    Status: In force

    Publication No.: US09536014B1

    Publication date: 2017-01-03

    Application No.: US14922552

    Filing date: 2015-10-26

    Applicant: Google Inc.

    Abstract: Parallel processing of data may include a set of map processes and a set of reduce processes. Each map process may include at least one map thread. Map threads may access distinct input data blocks assigned to the map process, and may apply an application specific map operation to the input data blocks to produce key-value pairs. Each map process may include a multiblock combiner configured to apply a combining operation to values associated with common keys in the key-value pairs to produce combined values, and to output intermediate data including pairs of keys and combined values. Each reduce process may be configured to access the intermediate data output by the multiblock combiners. For each key, an application specific reduce operation may be applied to the combined values associated with the key to produce output data.
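
    A minimal word-count sketch in Python shows how a combiner that spans several input blocks shrinks the intermediate data before the reduce step. It runs sequentially, whereas the abstract describes map threads and separate processes; the function names and the word-count workload are assumptions made for illustration.

```python
import operator
from collections import defaultdict


def word_count_map(block):
    """Application-specific map operation: emit (word, 1) for each word in a block."""
    for word in block.split():
        yield word, 1


def map_process(blocks, map_fn, combine_fn):
    """Map every assigned block and combine values per key across all of them."""
    combined = {}
    for block in blocks:
        for key, value in map_fn(block):
            combined[key] = combine_fn(combined[key], value) if key in combined else value
    return list(combined.items())  # intermediate data: (key, combined value) pairs


def reduce_phase(intermediate_outputs, reduce_fn):
    """Group combined values by key across map processes, then reduce each key."""
    grouped = defaultdict(list)
    for output in intermediate_outputs:
        for key, value in output:
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


out1 = map_process(["a b a", "b c"], word_count_map, operator.add)
out2 = map_process(["a c"], word_count_map, operator.add)
counts = reduce_phase([out1, out2], lambda key, values: sum(values))
```

    Here the first map process emits three pairs for its two blocks instead of five, because values for `a` and `b` are combined across block boundaries before anything is handed to the reducers.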

    Dynamic shuffle reconfiguration
    10.
    Granted invention patent
    Status: In force

    Publication No.: US09483509B2

    Publication date: 2016-11-01

    Application No.: US14044529

    Filing date: 2013-10-02

    Applicant: Google Inc.

    CPC classification number: G06F17/30312 G06F9/5083 G06F17/30345 G06F17/30584

    Abstract: A method includes receiving a request to perform a shuffle operation on a data stream, the request including a set of initial key ranges; generating a shuffler configuration that assigns a shuffler from a set of shufflers to each of the initial key ranges; initiating the set of shufflers to perform the shuffle operation on the data stream; analyzing metadata statistics to determine whether a shuffler configuration update event occurs, the metadata statistics produced by the set of shufflers during the shuffle operation and indicating load statistics for each shuffler in the set of shufflers; and upon occurrence of the shuffler configuration update event and during the shuffle operation, altering the shuffler configuration based at least in part on the metadata statistics to produce an assignment of shufflers to key ranges that is different from the assignment of shufflers to the initial key ranges.
