Patent search ap:("GOOGLE LLC") AND inv:"Sanjay Ghemawat" Page 1

1.

Invention application
ASYNCHRONOUS DISTRIBUTED DATA FLOW FOR MACHINE LEARNING WORKLOADS

Legal status: In force

Publication (announcement) number: US20250053444A1

Publication (announcement) date: 2025-02-13

Application number: US18814371

Application date: 2024-08-23

Applicant: Google LLC

Inventor: Jeffrey Adgate Dean, Sudip Roy, Michael Acheson Isard, Aakanksha Chowdhery, Brennan Saeta, Chandramohan Amyangot Thekkath, Daniel William Hurt, Hyeontaek Lim, Laurent El Shafey, Parker Edward Schuh, Paul Ronald Barham, Ruoming Pang, Ryan Sepassi, Sanjay Ghemawat, Yonghui Wu

IPC: G06F9/48, G06N3/063, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing machine learning workloads, e.g., computations for training a neural network or computing an inference using a neural network, across multiple hardware accelerators. One of the systems comprises a plurality of accelerator islands, each hardware accelerator island comprising a respective plurality of hardware devices that include a plurality of hardware accelerators and a corresponding host for each of the plurality of hardware accelerators; and a respective scheduler for each of the accelerator islands that is configured to schedule workloads across the plurality of accelerators and corresponding hosts in the accelerator island, wherein the system is configured to: receive data representing a machine learning workload; and assign a respective portion of the machine learning workload to each of the plurality of accelerator islands for scheduling by the respective scheduler for the accelerator island.
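
The two-level arrangement in this abstract (a system that splits a workload across accelerator islands, and a separate scheduler inside each island) can be illustrated with a minimal Python sketch. The `Island` class, the contiguous split, and the round-robin policy are assumptions made for illustration; the abstract does not specify any of them, and the sketch runs synchronously.

```python
from dataclasses import dataclass, field


@dataclass
class Island:
    """One accelerator island: a group of accelerators with its own scheduler."""
    name: str
    num_accelerators: int
    queue: list = field(default_factory=list)  # (work item, accelerator index) pairs

    def schedule(self, portion):
        # Hypothetical per-island policy: round-robin items over local accelerators.
        start = len(self.queue)
        for offset, item in enumerate(portion):
            self.queue.append((item, (start + offset) % self.num_accelerators))


def assign_workload(workload, islands):
    """Split the workload into one contiguous portion per island and hand each
    portion to that island's own scheduler."""
    base, extra = divmod(len(workload), len(islands))
    pos = 0
    for i, island in enumerate(islands):
        size = base + (1 if i < extra else 0)
        island.schedule(workload[pos:pos + size])
        pos += size


islands = [Island("island-a", 2), Island("island-b", 4)]
assign_workload([f"shard-{i}" for i in range(5)], islands)
```

After the call, `islands[0].queue` holds three shards spread over two accelerators and `islands[1].queue` holds the remaining two.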

2.

Invention grant
Associating application-specific methods with tables used for data storage

Legal status: In force

Publication (announcement) number: US11281631B2

Publication (announcement) date: 2022-03-22

Application number: US16927264

Application date: 2020-07-13

Applicant: Google LLC

Inventor: Jeffrey Dean, Sanjay Ghemawat, Andrew Fikes, Yasushi Saito

IPC: G06F16/182, G06F16/22, G06F9/50, G06F16/13, H04L67/1001, H04L67/1004, H04L67/1029

Abstract: A method of accessing data includes storing a table that includes a plurality of tablets corresponding to distinct non-overlapping table portions. Respective pluralities of tablet access objects and application objects are stored in a plurality of servers. A distinct application object and distinct tablet are associated with each tablet access object. Each application object corresponds to a distinct instantiation of an application associated with the table. The tablet access objects and associated application objects are redistributed among the servers in accordance with a first load-balancing criterion. A first request directed to a respective tablet is received from a client. In response, the tablet access object associated with the respective tablet is used to perform a data access operation on the respective tablet, and the application object associated with the respective tablet is used to perform an additional computational operation to produce a result to be returned to the client.
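
As a rough illustration of the pairing described above, the following Python sketch binds each tablet to one access object and one application object, moves the pairs between servers together, and serves a request by reading the tablet and then running the application step. The class name `TabletAccess`, the callable used as the application object, and the "tablet counts differ by at most one" load-balancing criterion are all assumptions, not details from the patent.

```python
class TabletAccess:
    """Pairs one tablet (a disjoint portion of the table) with one application object."""

    def __init__(self, rows, app):
        self.rows = rows  # dict: row key -> value, this tablet's portion of the table
        self.app = app    # application object; here simply a callable

    def handle(self, key):
        value = self.rows[key]   # data access operation on the tablet
        return self.app(value)   # additional computation on the result


def rebalance(servers):
    """Move access objects (and their application objects with them) from the
    most loaded server to the least loaded until counts differ by at most 1."""
    while True:
        busiest = max(servers, key=lambda s: len(servers[s]))
        idlest = min(servers, key=lambda s: len(servers[s]))
        if len(servers[busiest]) - len(servers[idlest]) <= 1:
            return
        servers[idlest].append(servers[busiest].pop())


servers = {
    "s1": [TabletAccess({"a": 1}, lambda v: v * 10),
           TabletAccess({"m": 2}, lambda v: v * 10),
           TabletAccess({"t": 3}, lambda v: v * 10)],
    "s2": [],
}
rebalance(servers)
answer = servers["s2"][0].handle("t")
```

Because the application object travels inside the access object, a request for row `"t"` is answered entirely on whichever server currently hosts that tablet.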

3.

Invention grant
System and method for analyzing data records

Legal status: In force

Publication (announcement) number: US11275743B2

Publication (announcement) date: 2022-03-15

Application number: US15799939

Application date: 2017-10-31

Applicant: Google LLC

Inventor: Robert C. Pike, Sean Quinlan, Sean M. Dorward, Jeffrey Dean, Sanjay Ghemawat

IPC: G06F16/18, G06F16/2455, G06F16/28, G06F16/2458, G06F11/14

Abstract: Systems and methods for analyzing input data records are provided in which a master process initiates a plurality of concurrent first processes each of which comprises, for each data record in at least a subset of a plurality of input data records, creating a parsed representation of the data record and independently applying a procedural language query to the parsed representation to extract one or more values. A respective emit operator is applied to at least one of the extracted one or more values thereby adding corresponding information to a respective intermediate data structure. The respective emit operator implements one of a predefined set of statistical information processing functions. The master process also initiates a plurality of second processes each of which aggregates information from a corresponding subset of intermediate data structures to produce aggregated data that is, in turn, combined to produce output data.
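
The two-phase flow in this abstract (per-record query plus emit, followed by aggregation) can be sketched in Python as below. The CSV record layout, the table names, and the `sum` and `max` aggregators are hypothetical stand-ins for the "predefined set of statistical information processing functions"; the sketch also runs the phases sequentially, whereas the abstract describes concurrent processes started by a master.

```python
from collections import defaultdict

# Hypothetical emit targets: each kind names one predefined aggregation function.
AGGREGATORS = {"sum": lambda acc, v: acc + v, "max": max}


def first_phase(records, query):
    """Parse each record, apply the per-record query, and add every emitted
    value to an intermediate structure keyed by (table, aggregator kind)."""
    intermediate = defaultdict(list)
    for raw in records:
        parsed = dict(zip(("host", "bytes"), raw.split(",")))  # assumed CSV layout
        for table, kind, value in query(parsed):
            intermediate[(table, kind)].append(value)
    return intermediate


def second_phase(intermediates):
    """Aggregate the intermediate structures produced by all first-phase runs."""
    output = {}
    for inter in intermediates:
        for (table, kind), values in inter.items():
            for v in values:
                if table in output:
                    output[table] = AGGREGATORS[kind](output[table], v)
                else:
                    output[table] = v
    return output


def query(rec):
    # The query sees exactly one parsed record and emits zero or more values.
    size = int(rec["bytes"])
    yield ("total_bytes", "sum", size)
    yield ("largest", "max", size)


shards = [["a,10", "b,30"], ["a,5"]]
result = second_phase(first_phase(shard, query) for shard in shards)
```

The query never sees more than one record, so the per-shard runs are independent and could be distributed without changing `query`.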

4.

Invention grant
Systems and methods for replicating data

Legal status: In force

Publication (announcement) number: US11272002B1

Publication (announcement) date: 2022-03-08

Application number: US16801923

Application date: 2020-02-26

Applicant: Google LLC

Inventor: Sanjay Ghemawat, Howard B Gobioff, Shun-Tak Leung

IPC: G06F16/10, H04L67/1095

Abstract: A system facilitates the distribution and redistribution of chunks of data among multiple servers. The system may identify servers to store a replica of the data based on at least one of utilization of the servers, prior data distribution involving the servers, and failure correlation properties associated with the servers, and place the replicas of the data at the identified servers. The system may also monitor total numbers of replicas of the chunks available in the system, identify chunks that have a total number of replicas below one or more chunk thresholds, assign priorities to the identified chunks, and re-replicate the identified chunks based substantially on the assigned priorities. The system may further monitor utilization of the servers, determine whether to redistribute any of the replicas, select one or more of the replicas to redistribute based on the utilization of the servers, select one or more of the servers to which to move the one or more replicas, and move the one or more replicas to the selected one or more servers.
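
Two of the decisions named in this abstract, where to place replicas and which under-replicated chunks to repair first, can be sketched as follows. Treating the rack as the failure-correlation property, sorting by utilization alone, and prioritizing by fewest surviving replicas are assumptions of this sketch; the abstract leaves the exact criteria open.

```python
def place_replicas(servers, count):
    """Pick up to `count` servers, preferring low utilization and never
    choosing two servers from the same rack.

    `servers` maps name -> (utilization in [0, 1], rack id)."""
    chosen, used_racks = [], set()
    for name, (util, rack) in sorted(servers.items(), key=lambda kv: kv[1][0]):
        if rack not in used_racks:
            chosen.append(name)
            used_racks.add(rack)
        if len(chosen) == count:
            break
    return chosen


def rereplication_order(replica_counts, threshold):
    """Return chunks with fewer than `threshold` replicas, most urgent first."""
    short = [c for c, n in replica_counts.items() if n < threshold]
    return sorted(short, key=lambda c: replica_counts[c])


servers = {"s1": (0.9, "r1"), "s2": (0.2, "r1"), "s3": (0.5, "r2"), "s4": (0.3, "r3")}
targets = place_replicas(servers, 3)
order = rereplication_order({"c1": 3, "c2": 1, "c3": 2}, threshold=3)
```

Here `s1` is skipped both for its high utilization and because its rack is already covered by `s2`, and chunk `c2` is repaired before `c3` because it has only one replica left.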

5.

Invention application
Method And System For Deleting Obsolete Files From A File System

Legal status: In force

Publication (announcement) number: US20210311909A1

Publication (announcement) date: 2021-10-07

Application number: US17350804

Application date: 2021-06-17

Applicant: Google LLC

Inventor: Yasushi Saito, Sanjay Ghemawat, Jeffrey Adgate Dean

IPC: G06F16/16, G06F16/11, G06F16/182, G06F16/215, G06F16/174

Abstract: A method for deleting obsolete files from a file system is provided. The method includes receiving a request to delete a reference to a first target file of a plurality of target files stored in a file system, the first target file having a first target file name. A first reference file whose file name includes the first target file name is identified. The first reference file is deleted from the file system. The method further includes determining whether the file system includes at least one reference file, distinct from the first reference file, whose file name includes the first target file name. In accordance with a determination that the file system does not include the at least one reference file, the first target file is deleted from the file system.
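
The deletion rule in this abstract amounts to reference counting through file names: a target file survives as long as some reference file whose name contains the target's name still exists. A minimal Python sketch follows, using a set of names as a stand-in for the file system. The `<target>.ref.<owner>` naming scheme is hypothetical; the abstract only requires that the reference file's name include the target file's name.

```python
def delete_reference(files, reference_name, target_name):
    """Delete one reference file, then delete the target file if no other
    reference file names it. Returns True if the target is still referenced.

    `files` is a set of file names standing in for the file system."""
    files.discard(reference_name)
    prefix = target_name + ".ref."  # hypothetical naming convention
    still_referenced = any(name.startswith(prefix) for name in files)
    if not still_referenced:
        files.discard(target_name)
    return still_referenced


fs = {"data-001", "data-001.ref.tabletA", "data-001.ref.tabletB"}
first = delete_reference(fs, "data-001.ref.tabletA", "data-001")
second = delete_reference(fs, "data-001.ref.tabletB", "data-001")
```

The first call leaves `data-001` in place because `tabletB` still references it; the second call removes the last reference and the target along with it.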

6.

Invention grant
Asynchronous distributed data flow for machine learning workloads

Legal status: In force

Publication (announcement) number: US12112198B2

Publication (announcement) date: 2024-10-08

Application number: US18082415

Application date: 2022-12-15

Applicant: Google LLC

Inventor: Jeffrey Adgate Dean, Sudip Roy, Michael Acheson Isard, Aakanksha Chowdhery, Brennan Saeta, Chandramohan Amyangot Thekkath, Daniel William Hurt, Hyeontaek Lim, Laurent El Shafey, Parker Edward Schuh, Paul Ronald Barham, Ruoming Pang, Ryan Sepassi, Sanjay Ghemawat, Yonghui Wu

IPC: G06F17/10, G06F9/48, G06N3/063, G06N3/08

CPC classification number: G06F9/4881, G06N3/063, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing machine learning workloads, e.g., computations for training a neural network or computing an inference using a neural network, across multiple hardware accelerators. One of the systems comprises a plurality of accelerator islands, each hardware accelerator island comprising a respective plurality of hardware devices that include a plurality of hardware accelerators and a corresponding host for each of the plurality of hardware accelerators; and a respective scheduler for each of the accelerator islands that is configured to schedule workloads across the plurality of accelerators and corresponding hosts in the accelerator island, wherein the system is configured to: receive data representing a machine learning workload; and assign a respective portion of the machine learning workload to each of the plurality of accelerator islands for scheduling by the respective scheduler for the accelerator island.

7.

Invention publication
Method And System For Deleting Obsolete Files From A File System

Legal status: Under examination (published)

Publication (announcement) number: US20230409527A1

Publication (announcement) date: 2023-12-21

Application number: US18239475

Application date: 2023-08-29

Applicant: Google LLC

Inventor: Yasushi Saito, Sanjay Ghemawat, Jeffrey Adgate Dean

IPC: G06F16/16, G06F16/11, G06F16/182, G06F16/215, G06F16/174

CPC classification number: G06F16/162, G06F16/11, G06F16/182, G06F16/215, G06F16/1748

Abstract: A method for deleting obsolete files from a file system is provided. The method includes receiving a request to delete a reference to a first target file of a plurality of target files stored in a file system, the first target file having a first target file name. A first reference file whose file name includes the first target file name is identified. The first reference file is deleted from the file system. The method further includes determining whether the file system includes at least one reference file, distinct from the first reference file, whose file name includes the first target file name. In accordance with a determination that the file system does not include the at least one reference file, the first target file is deleted from the file system.

8.

Invention grant
Processing computational graphs

Legal status: In force

Publication (announcement) number: US11769061B2

Publication (announcement) date: 2023-09-26

Application number: US16898971

Application date: 2020-06-11

Applicant: Google LLC

Inventor: Paul A. Tucker, Jeffrey Adgate Dean, Sanjay Ghemawat, Yuan Yu

IPC: G06N3/08, G06N3/098, G06F9/50, G06N3/084, G06N3/063, G06N3/045, G06N20/00, G06N5/048

CPC classification number: G06N3/098, G06F9/5038, G06F9/5066, G06N3/045, G06N3/063, G06N3/08, G06N3/084, G06N20/00, G06N5/048

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a request from a client to process a computational graph; obtaining data representing the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node; identifying a plurality of available devices for performing the requested operation; partitioning the computational graph into a plurality of subgraphs, each subgraph comprising one or more nodes in the computational graph; and assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices for operation.
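
The partition-and-assign step in this abstract can be sketched in Python as below: nodes are operations, directed edges carry one operation's output to the next, and the graph is split into one subgraph per device. Cutting a topological order into contiguous slices is only a placeholder placement policy chosen for this sketch, and the operation and device names are made up; the abstract does not say how subgraphs are chosen.

```python
def partition(nodes, edges, devices):
    """Split the graph into one subgraph per device and report the edges that
    cross device boundaries. Assumes the graph is acyclic."""
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:  # Kahn's algorithm: emit a node once all its inputs are emitted
        node = ready.pop(0)
        order.append(node)
        for src, dst in edges:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    size = -(-len(order) // len(devices))  # ceiling division: nodes per device
    subgraphs = {d: [] for d in devices}
    for i, node in enumerate(order):
        subgraphs[devices[i // size]].append(node)
    placed = {n: d for d, ns in subgraphs.items() for n in ns}
    # Edges whose endpoints sit on different devices need a cross-device transfer.
    cut = [(s, t) for s, t in edges if placed[s] != placed[t]]
    return subgraphs, cut


nodes = ["load", "matmul", "bias", "relu"]
edges = [("load", "matmul"), ("matmul", "bias"), ("bias", "relu")]
subgraphs, cut = partition(nodes, edges, ["cpu:0", "gpu:0"])
```

With this policy only the `matmul` to `bias` edge crosses devices; a real placer would also weigh device capabilities and transfer cost.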

9.

Invention grant
System and method for large-scale data processing using an application-independent framework

Legal status: In force

Publication (announcement) number: US11650971B2

Publication (announcement) date: 2023-05-16

Application number: US17834316

Application date: 2022-06-07

Applicant: Google LLC

Inventor: Jeffrey Adgate Dean, Sanjay Ghemawat

IPC: G06F16/22, G06F16/23, G06F16/2453, G06F9/48, G06F9/54

CPC classification number: G06F16/2282, G06F9/4881, G06F9/54, G06F16/2379, G06F16/24532

Abstract: A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.
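
The division of labor described above, where the user supplies only the map and reduce operations and the framework handles partitioning and data movement, can be shown with a small single-process Python sketch. The function names, the word-count job, and the byte-sum partitioner are illustrative assumptions; the sketch has no parallelism or fault tolerance, which the abstract attributes to the framework's library functions.

```python
from collections import defaultdict


def run_mapreduce(inputs, map_fn, reduce_fn, num_reducers=2):
    """Run user-supplied map and reduce functions through a tiny stand-in
    for the application-independent framework."""
    # Map phase: each "worker" reads one input split and stores intermediate
    # key/value pairs in the partition selected for the key.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for split in inputs:
        for key, value in map_fn(split):
            partitions[sum(key.encode()) % num_reducers][key].append(value)
    # Reduce phase: each "worker" reads one partition and produces final output.
    output = {}
    for part in partitions:
        for key, values in part.items():
            output[key] = reduce_fn(key, values)
    return output


def word_map(text):
    # User-specified map operation: emit (word, 1) for every word in the split.
    for word in text.split():
        yield word, 1


def word_reduce(word, counts):
    # User-specified reduce operation: total the counts for one word.
    return sum(counts)


counts = run_mapreduce(["to be or", "not to be"], word_map, word_reduce)
```

Neither `word_map` nor `word_reduce` mentions partitions or workers, so the same two functions would run unchanged if `run_mapreduce` were replaced by a distributed implementation.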

10.

Invention application
System And Method For Large-Scale Data Processing Using An Application-Independent Framework

Legal status: In force

Publication (announcement) number: US20220405264A1

Publication (announcement) date: 2022-12-22

Application number: US17834316

Application date: 2022-06-07

Applicant: Google LLC

Inventor: Jeffrey Adgate Dean, Sanjay Ghemawat

IPC: G06F16/22, G06F16/23, G06F16/2453, G06F9/48, G06F9/54

Abstract: A method performs large-scale data processing in a distributed and parallel processing environment. The method defines application-independent map and reduce operations, each invoking one or more library functions that automatically handle data partitioning, parallelization of computations, and fault tolerance. A user specifies a map operation, which calls one or more of the application-independent map operators to perform data read and write operations. A user also specifies a reduce operation, which calls one or more of the application-independent reduce operators to perform data read and write operations. The method executes application-independent map worker processes. Each map worker process executes the user-specified map operation to read designated portions of input files and store intermediate data values in intermediate data structures. The method also executes application-independent reduce worker processes. Each reduce worker process executes the user-specified reduce operation to read intermediate data values from the intermediate data structures and produce final output data.
