Storing genetic data in a storage system

    Publication No.: US10720231B1

    Publication date: 2020-07-21

    Application No.: US15690844

    Filing date: 2017-08-30

    Applicant: Google Inc.

    Abstract: A method includes receiving, by a processing device, a plurality of genome files. Each genome file corresponds to a different sample and defines a genetic sequence. The method also includes generating, by the processing device, a two-dimensional alignment file based on the genome files and a reference sequence. A first dimension of the alignment file corresponds to individual genetic sequences and each of the genetic sequences is aligned with respect to the reference sequence along a second dimension of the alignment file. The method includes separating, by the processing device, the alignment file into a plurality of groups and storing the groups in a non-transitory genome data store. Each group contains segments of the genetic sequences of two or more of the genome files.
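The grouping step in this abstract can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length against the reference, and that each group is a fixed-width block of columns spanning every sample; the function name `split_alignment` and the block width are hypothetical, not taken from the patent.

```python
from typing import Dict, List


def split_alignment(alignment: List[str], block_width: int) -> Dict[int, List[str]]:
    """Split a 2D alignment (rows = samples, columns = reference positions)
    into column blocks keyed by their start position."""
    if block_width <= 0:
        raise ValueError("block_width must be positive")
    if not alignment:
        return {}
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    groups: Dict[int, List[str]] = {}
    for start in range(0, length, block_width):
        # Each group holds the same segment of every sample's sequence.
        groups[start] = [row[start:start + block_width] for row in alignment]
    return groups


aligned = [
    "ACGTACGT",  # sample 1
    "ACGAACGT",  # sample 2
    "AC-TACGA",  # sample 3, '-' marks a gap relative to the reference
]
groups = split_alignment(aligned, block_width=3)
```

Keying groups by reference position means a query for one region touches a single group that already contains that region for all samples.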

    Providing posts from an extended network

    Publication No.: US09747347B1

    Publication date: 2017-08-29

    Application No.: US14476133

    Filing date: 2014-09-03

    Applicant: Google Inc.

    Abstract: A system includes: an engaging post identifier for identifying and retrieving engaging posts; an extended network post identifier for identifying extended posts from an extended network; a combining module for creating a combined list of added posts from the engaging posts and the extended posts, the combining module generating one or more ranked posts by ranking the list of added posts by relevance to a user; and a user interface module for providing the one or more ranked posts. The disclosure also includes a method for finding and providing engaging posts that includes determining engaging posts; determining extended posts from an extended social network using a social graph of the user; adding the engaging posts and the extended posts to create a combined list of added posts; ranking the added posts by relevance to a user; and providing one or more of the ranked posts.
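The combine-then-rank flow can be sketched as follows. This is an illustration only: the `Post` type, the precomputed `relevance` score, and the de-duplication by post id are assumptions, since the abstract does not say how relevance is computed or how duplicates are handled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Post:
    post_id: str
    relevance: float  # hypothetical per-user relevance score


def rank_combined(engaging: Iterable[Post], extended: Iterable[Post], limit: int) -> List[Post]:
    """Merge both sources into one list, then rank by relevance to the user."""
    by_id: Dict[str, Post] = {}
    for post in list(engaging) + list(extended):
        # Assumption: if a post appears in both sources, keep the higher score.
        if post.post_id not in by_id or post.relevance > by_id[post.post_id].relevance:
            by_id[post.post_id] = post
    ranked = sorted(by_id.values(), key=lambda p: p.relevance, reverse=True)
    return ranked[:limit]


engaging_posts = [Post("a", 0.9), Post("b", 0.4)]
extended_posts = [Post("b", 0.6), Post("c", 0.7)]
top_two = rank_combined(engaging_posts, extended_posts, limit=2)
```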

    Redundant data requests with cancellation
    Type: Granted invention patent; legal status: in force

    Publication No.: US09197695B2

    Publication date: 2015-11-24

    Application No.: US14525072

    Filing date: 2014-10-27

    Applicant: Google Inc.

    Abstract: A method of processing a request, performed by a respective server, is provided in which a request is received from a client. After receiving the request, a determination is made as to whether at least a first predefined number of other servers have a task-processing status for the request indicating that the other servers have undertaken performance of a task-processing operation for the request. When fewer than the first number of other servers in the set of other servers have the task-processing status for the request, a processing-status message is sent to one or more of the servers in the set of other servers indicating that the respective server is performing the task-processing operation. Upon completion of the task-processing, a result of the processing is sent to the client contingent upon a status of the other servers in the set of other servers.
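A rough in-process sketch of this coordination follows. It assumes servers exchange status through direct method calls, and it interprets "contingent upon a status of the other servers" as "no peer has already answered"; the `Server` class and its method names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set


class Server:
    def __init__(self, name: str, threshold: int) -> None:
        self.name = name
        self.threshold = threshold  # the "first predefined number" of other servers
        self.peers: List["Server"] = []
        self.processing: Dict[str, Set[str]] = {}  # request id -> peers working on it
        self.answered: Set[str] = set()  # request ids a peer has already answered

    def note_processing(self, request_id: str, peer: str) -> None:
        self.processing.setdefault(request_id, set()).add(peer)

    def note_answered(self, request_id: str) -> None:
        self.answered.add(request_id)

    def handle(self, request_id: str, task: Callable[[], int]) -> Optional[int]:
        # Skip the work when enough other servers have already undertaken it.
        if len(self.processing.get(request_id, set())) >= self.threshold:
            return None
        # Otherwise send a processing-status message to the peers.
        for peer in self.peers:
            peer.note_processing(request_id, self.name)
        result = task()
        # Assumption: reply only if no peer has answered in the meantime.
        if request_id in self.answered:
            return None
        for peer in self.peers:
            peer.note_answered(request_id)
        return result


a, b = Server("a", threshold=1), Server("b", threshold=1)
a.peers, b.peers = [b], [a]
first = a.handle("req-1", lambda: 42)   # a does the work and replies
second = b.handle("req-1", lambda: 42)  # b sees a's status and stands down
```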

    Efficiently updating and deleting data in a data storage system
    Type: Granted invention patent; legal status: in force

    Publication No.: US09195611B2

    Publication date: 2015-11-24

    Application No.: US13910059

    Filing date: 2013-06-04

    Applicant: Google Inc.

    CPC classification number: G06F12/121 G06F17/30345 G06F17/30368

    Abstract: A method of storing data is disclosed. The method is performed on a data storage server having one or more processors and memory storing one or more programs for execution by the one or more processors. The data storage server receives a first and second data request, the requests including a first and second range of one or more keys and an associated first and second value respectively. The data storage server identifies one or more overlap points associated with the first range and the second range. For each of the overlap points, the data storage server then creates data items including ranges of keys, the ranges of each data item including one or more keys that are either: (a) the keys between a terminal key of the first or second range and the overlap point, or (b) the keys between two adjacent overlap points.
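The range-splitting idea can be sketched as below. The sketch assumes integer keys, half-open ranges `[start, end)` with `start < end`, and that the second (later) request's value wins where the ranges overlap; none of these choices is stated in the abstract, and `split_ranges` is a hypothetical name.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) over integer keys


def split_ranges(first: Range, first_value: str,
                 second: Range, second_value: str) -> List[Tuple[Range, str]]:
    """Cut two possibly overlapping key ranges into non-overlapping data items."""
    # The boundaries of one range that fall inside the other act as the
    # overlap points; the remaining boundaries are the terminal keys.
    cuts = sorted({first[0], first[1], second[0], second[1]})
    items: List[Tuple[Range, str]] = []
    for lo, hi in zip(cuts, cuts[1:]):
        in_first = first[0] <= lo and hi <= first[1]
        in_second = second[0] <= lo and hi <= second[1]
        if in_second:
            items.append(((lo, hi), second_value))  # assumption: later request wins
        elif in_first:
            items.append(((lo, hi), first_value))
        # Otherwise this is a gap between disjoint ranges and nothing is stored.
    return items


items = split_ranges((0, 10), "A", (5, 15), "B")
```

Because every resulting item covers exactly one value, a later update or delete over any sub-range only needs to touch whole items.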

    Storing and Moving Data in a Distributed Storage System
    Type: Invention patent application; legal status: in force

    Publication No.: US20130346540A1

    Publication date: 2013-12-26

    Application No.: US13899495

    Filing date: 2013-05-21

    Applicant: Google Inc.

    CPC classification number: H04L67/1097

    Abstract: A system, computer-readable storage medium storing at least one program, and a computer-implemented method for identifying a storage group in a distributed storage system into which data is to be stored are presented. A data structure including information relating to storage groups in a distributed storage system is maintained, where a respective entry in the data structure for a respective storage group includes placement metrics for the respective storage group. A request to identify a storage group into which data is to be stored is received from a computer system. The data structure is used to determine an identifier for a storage group whose placement metrics satisfy a selection criterion. The identifier for the storage group whose placement metrics satisfy the selection criterion is returned to the computer system.
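A minimal sketch of the lookup follows. The metric fields (`free_bytes`, `load`) and the selection criterion (enough free space, then lowest load) are assumptions chosen for illustration; the abstract names neither.

```python
from typing import Dict, Optional

# Hypothetical data structure: storage group id -> placement metrics.
PlacementTable = Dict[str, Dict[str, float]]


def pick_group(table: PlacementTable, required_bytes: float) -> Optional[str]:
    """Return the id of a group whose metrics satisfy the selection criterion."""
    candidates = [
        (metrics["load"], group_id)
        for group_id, metrics in table.items()
        if metrics["free_bytes"] >= required_bytes
    ]
    # Assumed criterion: among groups with enough space, prefer the lowest load.
    return min(candidates)[1] if candidates else None


table: PlacementTable = {
    "g1": {"free_bytes": 100, "load": 0.2},
    "g2": {"free_bytes": 500, "load": 0.7},
    "g3": {"free_bytes": 800, "load": 0.5},
}
chosen = pick_group(table, required_bytes=300)
```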

    Predicting likelihoods of conditions being satisfied using recurrent neural networks

    Publication No.: US09646244B2

    Publication date: 2017-05-09

    Application No.: US15150091

    Filing date: 2016-05-09

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicting likelihoods of conditions being satisfied using recurrent neural networks. One of the systems is configured to process a temporal sequence comprising a respective input at each of a plurality of time steps and comprises: one or more recurrent neural network layers; one or more logistic regression nodes, wherein each of the logistic regression nodes corresponds to a respective condition from a predetermined set of conditions, and wherein each of the logistic regression nodes is configured to, for each of the plurality of time steps: receive the network internal state for the time step; and process the network internal state for the time step in accordance with current values of a set of parameters of the logistic regression node to generate a future condition score for the corresponding condition for the time step.
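The architecture described here, a recurrent layer feeding one logistic regression node per condition at every time step, can be sketched in plain Python. The single tanh layer, the weight values, and all function names are illustrative assumptions; no training is shown.

```python
import math
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def rnn_step(x: Sequence[float], h: Sequence[float],
             w_in: List[List[float]], w_rec: List[List[float]]) -> List[float]:
    """One tanh recurrent layer: new internal state from input and previous state."""
    return [math.tanh(dot(w_in[i], x) + dot(w_rec[i], h)) for i in range(len(h))]


def condition_scores(inputs: List[List[float]],
                     w_in: List[List[float]], w_rec: List[List[float]],
                     heads: List[Tuple[List[float], float]]) -> List[List[float]]:
    """For each time step, score every condition from the network internal state."""
    h = [0.0] * len(w_rec)
    scores = []
    for x in inputs:
        h = rnn_step(x, h, w_in, w_rec)
        # Each head is one logistic regression node: (weights, bias).
        scores.append([sigmoid(dot(w, h) + b) for w, b in heads])
    return scores


inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]          # three time steps
w_in = [[0.5, -0.3], [0.1, 0.8]]                       # hypothetical weights
w_rec = [[0.2, 0.0], [0.0, 0.2]]
heads = [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.1)]       # two conditions
scores = condition_scores(inputs, w_in, w_rec, heads)
```

Each row of `scores` holds one future condition score per condition for the corresponding time step.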

    PROCESSING COMPUTATIONAL GRAPHS
    Type: Invention patent application

    Publication No.: US20170124452A1

    Publication date: 2017-05-04

    Application No.: US15337744

    Filing date: 2016-10-28

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a request from a client to process a computational graph; obtaining data representing the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node; identifying a plurality of available devices for performing the requested operation; partitioning the computational graph into a plurality of subgraphs, each subgraph comprising one or more nodes in the computational graph; and assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices for operation.
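One very simple way to partition and assign such a graph is sketched below. It is an assumption-laden illustration: the graph is cut into contiguous chunks of a topological order, one chunk per device, whereas the abstract does not prescribe any partitioning strategy. `partition_and_assign` and the device names are hypothetical; `graphlib` is in the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def partition_and_assign(preds: Dict[str, Set[str]], devices: List[str]) -> Dict[str, List[str]]:
    """Map each device to a subgraph (a list of operation nodes).

    preds maps each node to the set of nodes whose outputs it consumes.
    """
    if not devices:
        raise ValueError("at least one device is required")
    order = list(TopologicalSorter(preds).static_order())
    chunk = -(-len(order) // len(devices))  # ceiling division
    assignment: Dict[str, List[str]] = {}
    for i, device in enumerate(devices):
        # Contiguous slices of a topological order keep producers ahead of consumers.
        assignment[device] = order[i * chunk:(i + 1) * chunk]
    return assignment


graph = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
assignment = partition_and_assign(graph, ["cpu:0", "gpu:0"])
```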

    Representative Document Selection for a Set of Duplicate Documents
    Type: Invention patent application; legal status: under examination (published)

    Publication No.: US20150026170A1

    Publication date: 2015-01-22

    Application No.: US14510775

    Filing date: 2014-10-09

    Applicant: GOOGLE INC.

    Abstract: Systems and methods are provided for obtaining a plurality of documents. A respective document in the plurality of documents is associated with a score and each document in the plurality of documents is from a different data structure in a plurality of data structures. Each data structure in the plurality of data structures represents a different portion of a document address space. A first document in the plurality of documents is selected in accordance with the score associated with the first document. The first document has a fingerprint that indicates that the first document has substantially identical content to every other document in the plurality of documents. In accordance with the score, the first document is indexed thereby producing an indexed first document. With respect to the plurality of documents, the indexed first document is included in a document index as representative of each document in the plurality of documents.
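The selection of a representative among duplicates can be sketched as follows. The fingerprint used here (SHA-256 of whitespace- and case-normalised content) is an assumption, since the abstract does not define the fingerprint, and the example URLs are placeholders.

```python
import hashlib
from typing import Dict, List, Tuple

Doc = Tuple[str, str, float]  # (address, content, score)


def fingerprint(content: str) -> str:
    # Assumption: normalise whitespace and case, then hash the result.
    normalised = " ".join(content.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def representatives(docs: List[Doc]) -> Dict[str, str]:
    """Map each fingerprint to the address of its highest-scoring document."""
    best: Dict[str, Doc] = {}
    for doc in docs:
        key = fingerprint(doc[1])
        if key not in best or doc[2] > best[key][2]:
            best[key] = doc
    return {key: doc[0] for key, doc in best.items()}


docs = [
    ("http://a.example/x", "Hello  World", 0.3),
    ("http://b.example/x", "hello world", 0.9),
    ("http://c.example/y", "Other page", 0.5),
]
reps = representatives(docs)
```

Only the addresses in `reps` would then be indexed, each standing in for every document that shares its fingerprint.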

    Storing genetic data in a storage system

    Publication No.: US10354748B1

    Publication date: 2019-07-16

    Application No.: US14671167

    Filing date: 2015-03-27

    Applicant: Google Inc.

    Abstract: A method includes receiving, by a processing device, a plurality of genome files. Each genome file corresponds to a different sample and defines a genetic sequence. The method also includes generating, by the processing device, a two-dimensional alignment file based on the genome files and a reference sequence. A first dimension of the alignment file corresponds to individual genetic sequences and each of the genetic sequences is aligned with respect to the reference sequence along a second dimension of the alignment file. The method includes separating, by the processing device, the alignment file into a plurality of groups and storing the groups in a non-transitory genome data store. Each group contains segments of the genetic sequences of two or more of the genome files.
