Associating application-specific methods with tables used for data storage

    Publication No.: US11281631B2

    Publication date: 2022-03-22

    Application No.: US16927264

    Application date: 2020-07-13

    Applicant: Google LLC

    Abstract: A method of accessing data includes storing a table that includes a plurality of tablets corresponding to distinct non-overlapping table portions. Respective pluralities of tablet access objects and application objects are stored in a plurality of servers. A distinct application object and distinct tablet are associated with each tablet access object. Each application object corresponds to a distinct instantiation of an application associated with the table. The tablet access objects and associated application objects are redistributed among the servers in accordance with a first load-balancing criterion. A first request directed to a respective tablet is received from a client. In response, the tablet access object associated with the respective tablet is used to perform a data access operation on the respective tablet, and the application object associated with the respective tablet is used to perform an additional computational operation to produce a result to be returned to the client.
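The arrangement in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names (`Tablet`, `TabletAccess`, `WordCountApp`, `Cluster`), the word-counting computation, and the row-count load criterion are all assumptions chosen for illustration. Each tablet gets its own access object and application object, the pairs are placed on servers by a greedy load-balancing pass, and a request is routed to the tablet whose key range covers the requested key.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Tablet:
    start_key: str                      # inclusive lower bound of this tablet's key range
    rows: dict = field(default_factory=dict)


class TabletAccess:
    """Performs data access operations on exactly one tablet."""

    def __init__(self, tablet):
        self.tablet = tablet

    def read(self, key):
        return self.tablet.rows.get(key)


class WordCountApp:
    """Hypothetical application object: one instance per tablet access object."""

    def __init__(self, access):
        self.access = access

    def handle(self, key):
        # Data access first, then the additional computation on the result.
        value = self.access.read(key)
        return None if value is None else len(value.split())


class Cluster:
    def __init__(self, tablets, server_names):
        self.tablets = sorted(tablets, key=lambda t: t.start_key)
        self.starts = [t.start_key for t in self.tablets]
        self.apps = [WordCountApp(TabletAccess(t)) for t in self.tablets]
        self.server_names = list(server_names)
        self.placement = {}
        self.rebalance()

    def rebalance(self):
        """Assumed criterion: even out row counts, largest tablets first."""
        load = {name: 0 for name in self.server_names}
        order = sorted(range(len(self.tablets)),
                       key=lambda i: -len(self.tablets[i].rows))
        for i in order:
            target = min(self.server_names, key=lambda n: load[n])
            self.placement[i] = target   # access and application objects move together
            load[target] += len(self.tablets[i].rows)
        return load

    def request(self, key):
        i = bisect_right(self.starts, key) - 1
        if i < 0:
            raise KeyError(key)
        return self.placement[i], self.apps[i].handle(key)
```

Keeping the application object next to its tablet access object means the extra computation runs on the server that already holds the data, so only the result travels back to the client.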

    System and method for analyzing data records

    Publication No.: US11275743B2

    Publication date: 2022-03-15

    Application No.: US15799939

    Application date: 2017-10-31

    Applicant: Google LLC

    Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.
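The two-phase flow in this abstract can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the claimed system: records are assumed to be JSON strings with hypothetical `host` and `bytes` fields, the "procedural language query" is stood in for by plain Python inside `first_process`, and the two table classes (`SumTable`, `CountTable`) are illustrative examples of emit operators. A thread pool plays the role of the concurrent first processes.

```python
import json
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class SumTable:
    """Emit operator that keeps a running sum."""

    def __init__(self):
        self.total = 0

    def emit(self, value):
        self.total += value

    def merge(self, other):
        self.total += other.total


class CountTable:
    """Emit operator that counts occurrences of each emitted value."""

    def __init__(self):
        self.counts = Counter()

    def emit(self, value):
        self.counts[value] += 1

    def merge(self, other):
        self.counts.update(other.counts)


def first_process(records):
    # Each first process owns its own intermediate data structures.
    tables = {"bytes": SumTable(), "hosts": CountTable()}
    for raw in records:
        parsed = json.loads(raw)              # parsed representation of one record
        tables["bytes"].emit(parsed["bytes"])  # the "query" extracts values and emits them
        tables["hosts"].emit(parsed["host"])
    return tables


def second_process(name, partials, factory):
    # Aggregates one named table across all intermediate data structures.
    combined = factory()
    for tables in partials:
        combined.merge(tables[name])
    return combined


def analyze(records, shards=2):
    chunks = [records[i::shards] for i in range(shards)]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(first_process, chunks))
    return {
        "bytes": second_process("bytes", partials, SumTable).total,
        "hosts": dict(second_process("hosts", partials, CountTable).counts),
    }
```

Because each record is handled independently and every emit operator supports `merge`, the input can be split across any number of first processes without changing the final output.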

    System and method for large-scale data processing using an application-independent framework

    Publication No.: US11366797B2

    Publication date: 2022-06-21

    Application No.: US17134862

    Application date: 2020-12-28

    Applicant: Google LLC

    Abstract: A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.
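The split between user-specified operations and an application-independent framework can be shown with a single-process Python sketch. It is only an illustration: the function names (`run_mapreduce`, `partition`, `word_map`, `word_reduce`) and the word-count job are assumptions, the workers run sequentially in one process, and fault tolerance is not modeled at all.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    # Deterministic partitioning of intermediate keys across reduce workers.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_mapreduce(splits, map_fn, reduce_fn, num_reducers=2):
    """Application-independent driver: knows nothing about the job itself."""
    # Map phase: one "map worker" per input split writes intermediate values.
    intermediate = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        for key, value in map_fn(split):
            intermediate[partition(key, num_reducers)][key].append(value)

    # Reduce phase: one "reduce worker" per partition reads them back.
    output = {}
    for bucket in intermediate:
        for key in sorted(bucket):
            output[key] = reduce_fn(key, bucket[key])
    return output


# User-specified operations for a hypothetical word-count job.
def word_map(text):
    for word in text.split():
        yield word, 1


def word_reduce(word, counts):
    return sum(counts)
```

For example, `run_mapreduce(["a b a", "b c"], word_map, word_reduce)` returns a dictionary mapping `a` to 2, `b` to 2, and `c` to 1. Only `word_map` and `word_reduce` would change for a different job; the driver stays the same.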

    System And Method For Large-Scale Data Processing Using An Application-Independent Framework

    Publication No.: US20230385262A1

    Publication date: 2023-11-30

    Application No.: US18137695

    Application date: 2023-04-21

    Applicant: Google LLC

    Abstract: A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.

    Associating Application-Specific Methods With Tables Used For Data Storage

    Publication No.: US20200341950A1

    Publication date: 2020-10-29

    Application No.: US16927264

    Application date: 2020-07-13

    Applicant: Google LLC

    Abstract: A method of accessing data includes storing a table that includes a plurality of tablets corresponding to distinct non-overlapping table portions. Respective pluralities of tablet access objects and application objects are stored in a plurality of servers. A distinct application object and distinct tablet are associated with each tablet access object. Each application object corresponds to a distinct instantiation of an application associated with the table. The tablet access objects and associated application objects are redistributed among the servers in accordance with a first load-balancing criterion. A first request directed to a respective tablet is received from a client. In response, the tablet access object associated with the respective tablet is used to perform a data access operation on the respective tablet, and the application object associated with the respective tablet is used to perform an additional computational operation to produce a result to be returned to the client.

    System And Method For Large-scale Data Processing Using An Application-independent Framework

    Publication No.: US20190272264A1

    Publication date: 2019-09-05

    Application No.: US16417126

    Application date: 2019-05-20

    Applicant: Google LLC

    Abstract: A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.

    System And Method For Large-Scale Data Processing Using An Application-Independent Framework

    Publication No.: US20210117401A1

    Publication date: 2021-04-22

    Application No.: US17134862

    Application date: 2020-12-28

    Applicant: Google LLC

    Abstract: A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.

    System and method for large-scale data processing using an application-independent framework

    Publication No.: US10296500B2

    Publication date: 2019-05-21

    Application No.: US15479228

    Application date: 2017-04-04

    Applicant: Google LLC

    Abstract: A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.