-
Publication number: US10929041B1
Publication date: 2021-02-23
Application number: US16560860
Filing date: 2019-09-04
Applicant: Amazon Technologies, Inc.
Inventor: Fan Ping , Andrew Boyer , Oleksandr Chychykalo , James Pinkerton , Danny Wei , Norbert Paul Kusters , Divya Ashok Kumar Jain , Jianhua Fan , Thomas Tarak Mathew Veppumthara , Sebastiano Peluso
Abstract: A block-based storage system hosts logical volumes that are implemented via multiple replicas of volume data stored on multiple resource hosts in different failure domains. Also, the block-based storage service allows multiple client computing devices to attach to a same given logical volume at the same time. A membership group authority authorizes sequence numbers for a given logical volume and an associated membership group. The members of the membership group ensure that the members are in agreement on the latest sequence number for the given logical volume before responding to read or write requests directed to the given logical volume.
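The agreement check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names `MembershipGroupAuthority` and `ReplicaNode`, the in-memory dictionaries, and the choice to raise an error on disagreement are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


class MembershipGroupAuthority:
    """Hypothetical authority that issues increasing sequence numbers per volume."""

    def __init__(self):
        # volume id -> (latest sequence number, members of the group)
        self._latest = {}

    def authorize(self, volume_id, members):
        # Each membership change for a volume gets the next sequence number.
        seq = self._latest.get(volume_id, (0, ()))[0] + 1
        self._latest[volume_id] = (seq, tuple(members))
        return seq


@dataclass
class ReplicaNode:
    """Hypothetical group member holding one replica of the volume."""

    name: str
    sequence: int = 0
    blocks: dict = field(default_factory=dict)

    def serve_read(self, peers, offset):
        # Assumption: refuse I/O unless every peer reports the same sequence number.
        if any(peer.sequence != self.sequence for peer in peers):
            raise RuntimeError("membership group disagrees on sequence number")
        return self.blocks.get(offset)
```

In this sketch a node that missed a membership change holds an older sequence number than its peers, so reads through it fail until the group agrees again.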
-
Publication number: US10852996B2
Publication date: 2020-12-01
Application number: US15673271
Filing date: 2017-08-09
Applicant: Amazon Technologies, Inc.
Inventor: Jianhua Fan , Benjamin Arthur Hawks , Norbert Paul Kusters , Nachiappan Arumugam , Danny Wei , John Luther Guthrie, II
Abstract: A slave storage is provisioned using metadata of a master B-tree and updates to references (e.g., offsets) pertaining to data operations of the master B-tree. Master-slave pairs can be used to provide data redundancy, and a master copy can include the master B-tree with references to corresponding data. When provisioning a slave copy, the master sends a B-tree copy to the slave, which stores the slave B-tree copy, allocates the necessary space on local storage, and updates respective offsets of the slave B-tree copy to point to the local storage. Data from the master can then be transferred to the slave and stored according to a note and commit process that ensures operational sequence of the data. Operations received to the master during the process can be committed to the slave copy until the slave is consistent with the master and able to take over as master in the event of a failure.
-
Publication number: US10705956B1
Publication date: 2020-07-07
Application number: US15969604
Filing date: 2018-05-02
Applicant: Amazon Technologies, Inc.
Inventor: Kristina Kraemer Brenneman , Norbert Paul Kusters , Jianhua Fan , Danny Wei
IPC: G06F12/08 , G06F12/0804 , G06F9/52 , G06F16/23
Abstract: A data storage system stores information indicating a determined sequence for performing operations on a data store. A lock is acquired on a portion of the data store. It is determined that performing the operations comprises performing at least one additional operation on the data store. Uncommitted changes implied by the operations are stored in a transaction buffer according to the determined sequence. Changes implied by the additional operation are determined based on a reentrant call to a data store interface. The logged sequence of changes is applied to the data store and the lock is released.
-
Publication number: US20190324666A1
Publication date: 2019-10-24
Application number: US16457095
Filing date: 2019-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Norbert Paul Kusters , Jianhua Fan , Shuvabrata Ganguly , Danny Wei , Avram Israel Blaszka
IPC: G06F3/06
Abstract: A data storage system includes multiple head nodes and data storage sleds. A control plane of the data storage system designates, for a volume partition, one of the head nodes to function as a primary head node storing a primary replica of the volume partition and designates two or more other head nodes to function as reserve head nodes storing reserve replicas of the volume partition. Additionally, the primary head node causes volume data for the volume partition to be erasure encoded and stored on multiple mass storage devices in different ones of the data storage sleds.
-
Publication number: US09753813B1
Publication date: 2017-09-05
Application number: US14866655
Filing date: 2015-09-25
Applicant: Amazon Technologies, Inc.
Inventor: Jianhua Fan , Benjamin Arthur Hawks , Norbert Paul Kusters , Nachiappan Arumugam , Danny Wei , John Luther Guthrie, II
CPC classification number: G06F11/1448 , G06F3/0605 , G06F3/0619 , G06F3/065 , G06F3/067 , G06F3/0689 , G06F11/1464 , G06F11/1471 , G06F2201/84
Abstract: Persistent storage for a master copy is provided using operation numbers. A master copy can include a persistent key-value store such as a B-tree with references to corresponding data. When provisioning a slave copy, the master copy sends a point-in-time copy of the B-tree to the slave copy, which stores a copy of the B-tree, allocates the necessary space, and updates the references of the B-tree to point to a local storage before the data is transferred. When writing the data to persistent storage, a snapshot created on the master copy is an operation that is replicated to the slave copy. The snapshot is generated using a volume view that includes changes to chunks of data of the master copy since a previous snapshot, as determined using the operation number for the previous snapshot. Data (and metadata) for the snapshot is written to persistent storage while new I/O operations are processed.
-
Publication number: US12265443B2
Publication date: 2025-04-01
Application number: US17937389
Filing date: 2022-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Fan Ping , Andrew Boyer , Oleksandr Chychykalo , James Pinkerton , Danny Wei , Norbert Paul Kusters , Divya Ashok Kumar Jain , Jianhua Fan , Thomas Tarak Mathew Veppumthara , Sebastiano Peluso
Abstract: A block-based storage system hosts logical volumes that are implemented via multiple replicas of volume data stored on multiple resource hosts in different failure domains. Also, the block-based storage service allows multiple client computing devices to attach to a same given logical volume at the same time. In order to prevent unnecessary failovers, a primary node storing a primary replica is configured with a health check application programmatic interface (API) and a secondary node storing a secondary replica determines whether or not to initiate a failover based on the health of the primary replica.
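The failover decision in this abstract can be sketched as a small function. This is an illustration only: the name `should_initiate_failover`, the callable `check_primary_health`, and the choice to treat an unreachable primary as unhealthy are assumptions, not details from the patent.

```python
def should_initiate_failover(check_primary_health):
    """Decide, from the secondary's side, whether to start a failover.

    check_primary_health is a hypothetical callable standing in for the
    primary's health check API. It returns True when the primary reports
    itself healthy and raises ConnectionError when it cannot be reached.
    """
    try:
        healthy = check_primary_health()
    except ConnectionError:
        # Assumption: an unreachable primary is treated as unhealthy.
        return True
    # A healthy primary means the failover would be unnecessary.
    return not healthy
```

The point of the check is that a secondary which merely suspects a problem does not fail over while the primary still reports itself healthy.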
-
Publication number: US20230022729A1
Publication date: 2023-01-26
Application number: US17937389
Filing date: 2022-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Fan Ping , Andrew Boyer , Oleksandr Chychykalo , James Pinkerton , Danny Wei , Norbert Paul Kusters , Divya Ashok Kumar Jain , Jianhua Fan , Thomas Tarak Mathew Veppumthara , Sebastiano Peluso
Abstract: A block-based storage system hosts logical volumes that are implemented via multiple replicas of volume data stored on multiple resource hosts in different failure domains. Also, the block-based storage service allows multiple client computing devices to attach to a same given logical volume at the same time. In order to prevent unnecessary failovers, a primary node storing a primary replica is configured with a health check application programmatic interface (API) and a secondary node storing a secondary replica determines whether or not to initiate a failover based on the health of the primary replica.
-
Publication number: US10452680B1
Publication date: 2019-10-22
Application number: US14866659
Filing date: 2015-09-25
Applicant: Amazon Technologies, Inc.
Inventor: Jianhua Fan , Benjamin Arthur Hawks , Norbert Paul Kusters , Nachiappan Arumugam , Danny Wei , John Luther Guthrie, II
IPC: G06F16/27 , G06F16/955 , G06F16/22
Abstract: Master-slave pairs can be used to provide data redundancy in an electronic data environment. A master peer can include a B-tree with references to the corresponding data. When provisioning a slave, the master can send a point-in-time copy of the B-tree to the slave, which can allocate the necessary space on local storage and update the references of the B-tree to point to the local storage for the slave. If the master and slave become disconnected, one of the peers can function as a solo master until the peers are again connected, at which point the old peer can be brought current or a new slave provisioned. A log peer can also be provisioned by a solo master, which can store data for operations received during the disconnect for use in catching up a slave peer, which could be the old slave, the log peer, or a new peer.
-
Publication number: US20170364411A1
Publication date: 2017-12-21
Application number: US15694684
Filing date: 2017-09-01
Applicant: Amazon Technologies, Inc.
Inventor: Jianhua Fan , Benjamin Arthur Hawks , Norbert Paul Kusters , Nachiappan Arumugam , Danny Wei , John Luther Guthrie, II
CPC classification number: G06F11/1448 , G06F3/0605 , G06F3/0619 , G06F3/065 , G06F3/067 , G06F3/0689 , G06F11/1464 , G06F11/1471 , G06F2201/84
Abstract: The present disclosure provides persistent storage for a master copy using operation numbers. A master copy can include a B-tree with references to corresponding data. When provisioning a slave copy, the master copy sends a point-in-time copy of the B-tree to the slave copy, which stores a copy of the B-tree, allocates the necessary space, and updates the references of the B-tree to point to a local storage before the data is transferred. When writing the data to persistent storage, a snapshot created on the master copy is an operation that is replicated to the slave copy. The snapshot is generated using a volume view that includes changes to chunks of data of the master copy since a previous snapshot, as determined using the operation number for the previous snapshot. Data (and metadata) for the snapshot is written to persistent storage while new I/O operations are processed.
-
Publication number: US09720620B1
Publication date: 2017-08-01
Application number: US14204992
Filing date: 2014-03-11
Applicant: Amazon Technologies, Inc.
Inventor: Danny Wei , Kerry Quintin Lee , John Luther Guthrie, II , Jianhua Fan , James Michael Thompson , Nandakumar Gopalakrishnan
IPC: G06F3/06
CPC classification number: G06F3/065 , G06F3/0614 , G06F3/0617 , G06F3/067 , G06F3/0683
Abstract: A block-based storage system may implement efficient replication for restoring a data volume from a reduced durability state. A storage node that is not replicating write requests for a data volume may determine that replication for the data volume is to be enabled. A peer storage node may be identified that maintains a stale replica of the data volume. One or more replication operations may be performed to update stale data chunks in the stale replica of the data volume with current data chunks without updating data chunks in the stale replica of the data volume that are current. Stale replicas that are no longer needed may be deleted according to timeouts or the amount of stale data in the replica.
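The selective update described in this abstract can be sketched as follows. The per-chunk version numbers, the dictionary layout, and the function name `resync_stale_replica` are assumptions for illustration; the abstract does not say how stale chunks are identified.

```python
def resync_stale_replica(current, stale):
    """Bring a stale replica up to date by copying only the chunks that differ.

    Both arguments map chunk id -> (version, data). The version tag is a
    hypothetical way of telling stale chunks from current ones.
    Mutates `stale` and returns the ids of the chunks that were copied.
    """
    copied = []
    for chunk_id, (version, data) in current.items():
        stale_entry = stale.get(chunk_id)
        # Skip chunks whose version already matches; they are current.
        if stale_entry is None or stale_entry[0] != version:
            stale[chunk_id] = (version, data)
            copied.append(chunk_id)
    return copied
```

Under these assumptions, the transfer cost scales with the number of chunks written while replication was disabled rather than with the size of the volume.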