-
Publication No.: US11275743B2
Publication date: 2022-03-15
Application No.: US15799939
Filing date: 2017-10-31
Applicant: Google LLC
Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
IPC classification: G06F16/18, G06F16/2455, G06F16/28, G06F16/2458, G06F11/14
Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.
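Illustrative note (not part of the patent record): the two-phase flow described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The record format (`country,latency_ms`), the table names, and the functions `parse`, `query`, `first_process`, `merge`, and `master` are all hypothetical. Threads stand in for the separate processes, and the only emit operator shown is a sum, used here as one example of a statistical aggregation function.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input records, one "country,latency_ms" string each.
RECORDS = ["us,120", "de,80", "us,100", "fr,95", "de,70", "us,110"]

# Hypothetical output tables; both use a sum-style emit operator.
TABLES = ("count_by_country", "latency_sum")


def parse(record):
    """Create a parsed representation of one raw record."""
    country, latency = record.split(",")
    return {"country": country, "latency_ms": int(latency)}


def query(parsed, emit):
    """Per-record query: extract values and pass them to the emit operator."""
    emit("count_by_country", parsed["country"], 1)
    emit("latency_sum", "all", parsed["latency_ms"])


def first_process(shard):
    """Phase 1: apply the query to each record independently,
    accumulating emitted values in an intermediate data structure."""
    intermediate = {name: Counter() for name in TABLES}

    def emit(table, key, value):
        intermediate[table][key] += value  # sum aggregator (assumption)

    for record in shard:
        query(parse(record), emit)
    return intermediate


def merge(structures):
    """Phase 2 (and the final combine): add up a group of structures."""
    merged = {name: Counter() for name in TABLES}
    for structure in structures:
        for name in TABLES:
            merged[name].update(structure[name])  # Counter.update adds counts
    return merged


def master(records, num_first=3, num_second=2):
    """Start the first workers, then the second workers, then combine."""
    shards = [records[i::num_first] for i in range(num_first)]
    with ThreadPoolExecutor() as pool:
        intermediates = list(pool.map(first_process, shards))
        subsets = [intermediates[i::num_second] for i in range(num_second)]
        aggregated = list(pool.map(merge, subsets))
    combined = merge(aggregated)
    return {name: dict(table) for name, table in combined.items()}


OUTPUT = master(RECORDS)
```

Because the sum is associative and commutative, the result does not depend on how records are split across first workers or how intermediate structures are grouped for the second workers; that property is what lets each record be processed independently.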
-
Publication No.: US20220171781A1
Publication date: 2022-06-02
Application No.: US17673049
Filing date: 2022-02-16
Applicant: Google LLC
Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
IPC classification: G06F16/2455, G06F16/28, G06F16/2458, G06F11/14, G06F16/18
Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.
-
Publication No.: US20180052890A1
Publication date: 2018-02-22
Application No.: US15799939
Filing date: 2017-10-31
Applicant: GOOGLE LLC
Inventors: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat
CPC classification: G06F16/24561, G06F11/1482, G06F16/2471, G06F16/285, Y10S707/99933, Y10S707/99937
Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.