Efficient snapshot read of a database in a distributed storage system

    Publication No.: US09659038B2

    Publication date: 2017-05-23

    Application No.: US13909029

    Filing date: 2013-06-03

    Applicant: Google Inc.

    CPC classification number: G06F17/30289 G06F17/30067 G06F17/30575

    Abstract: A computer system issues a batch read operation to a tablet in a first replication group in a distributed database and obtains a most recent version of data items in the tablet that have a timestamp no greater than a snapshot timestamp T. For each data item in the one tablet, the computer system determines whether the data item has a move-in timestamp less than or equal to the snapshot timestamp T, which is less than a move-out timestamp, and whether the data item has a creation timestamp less than the snapshot timestamp T, which is less than or equal to a deletion timestamp. If the determination is true, the computer system determines whether the move-out timestamp has an actual associated value and, if so, the computer system determines a second tablet in a second replication group in the database that includes the data item and issues the snapshot read operation to the second tablet in the second replication group to obtain a most recent version of the data item that has a timestamp no greater than the snapshot timestamp T; otherwise, the computer system issues the snapshot read to the one tablet to obtain a most recent version of the data item that has a timestamp no greater than the snapshot timestamp T.
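
    The per-item check in this abstract can be sketched as follows. This is only an illustration of the timestamp comparisons as the abstract words them, not the patented implementation. The names ItemState, route_snapshot_read, and moved_to are hypothetical, a missing move-out or deletion timestamp is modeled as None, and returning None when the check fails is an assumption, since the abstract does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Optional

INF = float("inf")  # assumption: an absent timestamp never bounds T from above


@dataclass
class ItemState:
    move_in: int                     # when the item entered this tablet
    creation: int                    # when the item was created
    move_out: Optional[int] = None   # None: no actual move-out value
    deletion: Optional[int] = None   # None: no actual deletion value
    moved_to: Optional[str] = None   # hypothetical id of the second tablet


def route_snapshot_read(item: ItemState, snapshot_ts: int,
                        this_tablet: str) -> Optional[str]:
    """Return the tablet that should serve the snapshot read, or None."""
    move_out = INF if item.move_out is None else item.move_out
    deletion = INF if item.deletion is None else item.deletion

    # move-in <= T < move-out, and creation < T <= deletion
    in_tablet = item.move_in <= snapshot_ts < move_out
    exists = item.creation < snapshot_ts <= deletion
    if not (in_tablet and exists):
        return None  # assumption: the abstract leaves this case open

    # An actual move-out value sends the read to the second tablet;
    # otherwise the read goes to the tablet that was batch-read.
    if item.move_out is not None:
        return item.moved_to
    return this_tablet
```

    A caller would invoke route_snapshot_read once per data item returned by the batch read and group the follow-up snapshot reads by the tablet it returns.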

    Storing and Moving Data in a Distributed Storage System
    Type: Patent application
    Legal status: In force

    Publication No.: US20130346540A1

    Publication date: 2013-12-26

    Application No.: US13899495

    Filing date: 2013-05-21

    Applicant: Google Inc.

    CPC classification number: H04L67/1097

    Abstract: A system, computer-readable storage medium storing at least one program, and a computer-implemented method for identifying a storage group in a distributed storage system into which data is to be stored are presented. A data structure including information relating to storage groups in a distributed storage system is maintained, where a respective entry in the data structure for a respective storage group includes placement metrics for the respective storage group. A request to identify a storage group into which data is to be stored is received from a computer system. The data structure is used to determine an identifier for a storage group whose placement metrics satisfy a selection criterion. The identifier for the storage group whose placement metrics satisfy the selection criterion is returned to the computer system.
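
    The lookup described in this abstract can be illustrated with a small sketch. The abstract does not name the placement metrics or the selection criterion, so everything concrete here is an assumption: the metrics (free_bytes, load), the criterion (enough free space, then lowest load), and the names PlacementMetrics and pick_storage_group are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PlacementMetrics:
    free_bytes: int  # hypothetical metric: remaining capacity
    load: float      # hypothetical metric: current utilization, 0.0 to 1.0


def pick_storage_group(groups: Dict[str, PlacementMetrics],
                       needed_bytes: int) -> Optional[str]:
    """Return the id of a group whose metrics satisfy the criterion.

    Assumed criterion: the group must have room for the data, and
    among such groups the least loaded one is chosen.
    """
    candidates = [
        (metrics.load, group_id)
        for group_id, metrics in groups.items()
        if metrics.free_bytes >= needed_bytes
    ]
    if not candidates:
        return None  # no group satisfies the criterion
    return min(candidates)[1]
```

    The dictionary plays the role of the maintained data structure, with one entry per storage group, and the returned string is the identifier handed back to the requesting computer system.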

    EFFICIENT SNAPSHOT READ OF A DATABASE IN A DISTRIBUTED STORAGE SYSTEM
    Type: Patent application
    Legal status: In force

    Publication No.: US20130339301A1

    Publication date: 2013-12-19

    Application No.: US13909029

    Filing date: 2013-06-03

    Applicant: Google Inc.

    CPC classification number: G06F17/30289 G06F17/30067 G06F17/30575

    Abstract: A computer system issues a batch read operation to a tablet in a first replication group in a distributed database and obtains a most recent version of data items in the tablet that have a timestamp no greater than a snapshot timestamp T. For each data item in the one tablet, the computer system determines whether the data item has a move-in timestamp less than or equal to the snapshot timestamp T, which is less than a move-out timestamp, and whether the data item has a creation timestamp less than the snapshot timestamp T, which is less than or equal to a deletion timestamp. If the determination is true, the computer system determines whether the move-out timestamp has an actual associated value and, if so, the computer system determines a second tablet in a second replication group in the database that includes the data item and issues the snapshot read operation to the second tablet in the second replication group to obtain a most recent version of the data item that has a timestamp no greater than the snapshot timestamp T; otherwise, the computer system issues the snapshot read to the one tablet to obtain a most recent version of the data item that has a timestamp no greater than the snapshot timestamp T.

    SPLITTING AND MOVING RANGES IN A DISTRIBUTED SYSTEM

    Publication No.: US20170316026A1

    Publication date: 2017-11-02

    Application No.: US15144353

    Filing date: 2016-05-02

    Applicant: Google Inc.

    Abstract: Methods and systems for a distributed transaction in a distributed database system are described. One example includes identifying a request to insert a split point in a source group comprising one or more tablet replicas, each tablet including at least a portion of data from a table in the distributed database system, and the split point splitting data in the source group into a first range and a second range different than the first range; in response to the request: sending a list of filenames in the first range of the source group to a first target group comprising one or more tablet replicas; and creating, at the first target group, a virtual copy of files represented by the list of filenames in the first range, the virtual copy making data of the files available, each using a new name, without duplicating the data of the files.
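
    The "virtual copy" idea in this abstract, new names that expose existing file data without duplicating it, can be illustrated with an in-memory sketch. The model is deliberately simplified and entirely hypothetical: a dictionary stands in for the file store, each file is assumed to fall wholly inside one range, and the functions filenames_in_range and virtual_copy are illustrative names, not part of any described system.

```python
from typing import Dict, List


def filenames_in_range(file_keys: Dict[str, str], split_point: str) -> List[str]:
    """List files in the first range, assumed here to be key < split_point."""
    return sorted(name for name, key in file_keys.items() if key < split_point)


def virtual_copy(store: Dict[str, bytearray], names: List[str],
                 prefix: str) -> Dict[str, str]:
    """Expose each named file under a new name without copying its bytes.

    The new name is bound to the very same bytearray object, which
    plays the role of a link to shared file data.
    """
    renamed = {}
    for name in names:
        new_name = prefix + name
        store[new_name] = store[name]  # same object, no data duplicated
        renamed[name] = new_name
    return renamed
```

    In this sketch the source group would call filenames_in_range and send the resulting list, and the target group would call virtual_copy on it with a prefix of its own choosing.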

    System and method for committing transactions on remote servers
    Type: Granted patent
    Legal status: In force

    Publication No.: US09596294B2

    Publication date: 2017-03-14

    Application No.: US13892169

    Filing date: 2013-05-10

    Applicant: Google Inc.

    CPC classification number: H04L67/10 G06F9/466

    Abstract: A system, computer-readable storage medium storing at least one program, and a computer-implemented method for committing transactions on remote servers are presented. Commit requests are issued to remote servers in a set of remote servers to request that the remote servers in the set of remote servers agree to commit a transaction at a first designated future time. When responses from the remote servers in the set of remote servers are received before a first abort time and indicate that all remote servers in the set of remote servers have agreed to commit the transaction at the first designated future time, commit commands are issued to the remote servers in the set of remote servers instructing the remote servers to perform the transaction at the first designated future time.
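
    The two-step exchange in this abstract can be sketched with simulated servers. This is a toy model under stated assumptions, not the patented protocol: RemoteServer and try_commit are hypothetical names, network delay is reduced to a single response_time number per server, and the abstract does not say what the coordinator does when agreement fails, so the sketch simply reports failure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RemoteServer:
    name: str
    will_agree: bool = True              # simulated vote
    response_time: float = 0.0           # simulated arrival time of the reply
    committed_at: Optional[float] = None

    def request_commit(self, commit_time: float) -> Tuple[bool, float]:
        """Reply with (agreed, time at which the reply arrives)."""
        return self.will_agree, self.response_time

    def commit(self, commit_time: float) -> None:
        """Record the instruction to perform the transaction at commit_time."""
        self.committed_at = commit_time


def try_commit(servers: List[RemoteServer], commit_time: float,
               abort_time: float) -> bool:
    """Issue commit commands only if every server agreed before abort_time."""
    replies = [server.request_commit(commit_time) for server in servers]
    if all(agreed and arrived < abort_time for agreed, arrived in replies):
        for server in servers:
            server.commit(commit_time)
        return True
    return False  # assumption: follow-up handling is left to the caller
```

    Here commit_time stands for the designated future time and abort_time for the deadline by which all agreeing responses must have arrived.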

    System and Method for Committing Transactions on Remote Servers
    Type: Patent application
    Legal status: In force

    Publication No.: US20130318146A1

    Publication date: 2013-11-28

    Application No.: US13892169

    Filing date: 2013-05-10

    Applicant: Google Inc.

    CPC classification number: H04L67/10 G06F9/466

    Abstract: A system, computer-readable storage medium storing at least one program, and a computer-implemented method for committing transactions on remote servers are presented. Commit requests are issued to remote servers in a set of remote servers to request that the remote servers in the set of remote servers agree to commit a transaction at a first designated future time. When responses from the remote servers in the set of remote servers are received before a first abort time and indicate that all remote servers in the set of remote servers have agreed to commit the transaction at the first designated future time, commit commands are issued to the remote servers in the set of remote servers instructing the remote servers to perform the transaction at the first designated future time.

    REDUCING COMMIT WAIT IN A DISTRIBUTED MULTIVERSION DATABASE BY READING THE CLOCK EARLIER

    Publication No.: US20180329739A1

    Publication date: 2018-11-15

    Application No.: US15649920

    Filing date: 2017-07-14

    Applicant: Google Inc.

    Abstract: In a distributed system where a client's call to commit a transaction occurs outside the transaction's lock-hold interval, computation of timestamp information for the transaction is moved to a client library, while ensuring that no conflicting reads or writes are performed between a time of the computation and acquiring all locks for the transaction. The transaction is committed in phases, with each phase being initiated by the client library. Timestamp information is added to the locks to ensure that timestamps are generated during lock-hold intervals. An increased number of network messages is thereby overlapped with a commit wait period in which a write in a distributed database is delayed in time to ensure concurrency in the database.
