-
Publication number: US10402377B1
Publication date: 2019-09-03
Application number: US15224444
Filing date: 2016-07-29
Applicant: Amazon Technologies, Inc.
Inventor: Danny Wei , Lakshmi N. Pallikila , James Andrew Trenton Lipscomb , Yan V. Leshinsky , Tarun Goyal , Kerry Q. Lee
IPC: G06F16/182 , G06F16/27 , G06F16/11 , G06F11/14
Abstract: A computing system recovers volumes in a distributed computing environment while reducing downtime of storage servers. In an embodiment, a storage server contacts a control plane after a storage failure has occurred. If the storage server hosts an authoritative copy of an offline volume, the storage server is requested to restore the volume. Non-authoritative volumes are removed from the storage server and the storage server provides read access to the restored volume while resuming storage services.
-
Publication number: US10268593B1
Publication date: 2019-04-23
Application number: US15385800
Filing date: 2016-12-20
Applicant: Amazon Technologies, Inc.
Inventor: Marc Stephen Olson , Christopher Magee Greenwood , Anthony Nicholas Liguori , James Michael Thompson , Surya Prakash Dhoolam , Marc John Brooker , Danny Wei
Abstract: A request to create a volume to store data is received. A block within a storage node is selected, dependent at least in part on metadata indicating regions of available storage space in the storage node, to associate with the volume. Information is generated that includes an address to the block. A second computer system is determined to lack the address to the block. The second computer system is enabled, by providing at least a portion of the information to the second computer system, to perform an operation on the block.
-
Publication number: US20180357173A1
Publication date: 2018-12-13
Application number: US16105481
Filing date: 2018-08-20
Applicant: Amazon Technologies, Inc.
Inventor: Danny Wei , John Luther Guthrie, II , James Michael Thompson , Benjamin Arthur Hawks , Norbert P. Kusters
IPC: G06F12/0866 , G06F11/14 , G06F12/0868
CPC classification number: G06F12/0866 , G06F11/14 , G06F11/1471 , G06F11/3409 , G06F11/3485 , G06F12/0804 , G06F12/0868 , G06F2201/885 , G06F2212/1016 , G06F2212/1032 , G06F2212/313 , G06F2212/461
Abstract: A block-based storage system may implement page cache write logging. Write requests for a data volume maintained at a storage node may be received at the storage node. A page cache may be updated in accordance with the request. A log record describing the page cache update may be stored in a page cache write log maintained in a persistent storage device. Once the write request is performed in the page cache and recorded in a log record in the page cache write log, the write request may be acknowledged. Upon recovery from a system failure where data in the page cache is lost, log records in the page cache write log may be replayed to restore the page cache to its state prior to the system failure.
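The write path and recovery path in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `LoggedPageCache`, the JSON-lines log format, and the use of a local file as the "persistent storage device" are all assumptions made for illustration.

```python
import json
import os
import tempfile


class LoggedPageCache:
    """Toy page cache (hypothetical) whose updates are logged before acknowledgement."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.pages = {}  # volatile state: page number -> bytes

    def write(self, page_no, data):
        # 1. Apply the update to the in-memory page cache.
        self.pages[page_no] = data
        # 2. Append a log record and force it to persistent storage.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"page": page_no, "data": data.hex()}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 3. Only now is the write acknowledged to the caller.
        return True

    @classmethod
    def recover(cls, log_path):
        # Replay log records in order to rebuild the lost page cache state.
        cache = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    cache.pages[record["page"]] = bytes.fromhex(record["data"])
        return cache


# Usage: three acknowledged writes, then a simulated crash and replay.
log_path = os.path.join(tempfile.mkdtemp(), "page_cache.log")
cache = LoggedPageCache(log_path)
cache.write(7, b"hello")
cache.write(7, b"world")
cache.write(9, b"!")
restored = LoggedPageCache.recover(log_path)
```

Replaying the records in log order means the later write to page 7 overwrites the earlier one, so the restored cache matches the state at the time of the last acknowledged write.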
-
Publication number: US20170364411A1
Publication date: 2017-12-21
Application number: US15694684
Filing date: 2017-09-01
Applicant: Amazon Technologies, Inc.
Inventor: Jianhua Fan , Benjamin Arthur Hawks , Norbert Paul Kusters , Nachiappan Arumugam , Danny Wei , John Luther Guthrie, II
CPC classification number: G06F11/1448 , G06F3/0605 , G06F3/0619 , G06F3/065 , G06F3/067 , G06F3/0689 , G06F11/1464 , G06F11/1471 , G06F2201/84
Abstract: The present disclosure provides persistent storage for a master copy using operation numbers. A master copy can include a B-tree with references to corresponding data. When provisioning a slave copy, the master copy sends a point-in-time copy of the B-tree to the slave copy, which stores a copy of the B-tree, allocates the necessary space, and updates the references of the B-tree to point to a local storage before the data is transferred. When writing the data to persistent storage, a snapshot created on the master copy is an operation that is replicated to the slave copy. The snapshot is generated using a volume view that includes changes to chunks of data of the master copy since a previous snapshot, as determined using the operation number for the previous snapshot. Data (and metadata) for the snapshot is written to persistent storage while new I/O operations are processed.
-
Publication number: US09792231B1
Publication date: 2017-10-17
Application number: US14571183
Filing date: 2014-12-15
Applicant: Amazon Technologies, Inc.
Inventor: James Michael Thompson , Marc Stephen Olson , Jeevan Shankar , Danny Wei , John Robert Smiley , John Luther Guthrie, II , Nachiappan Arumugam , Benjamin Arthur Hawks
CPC classification number: G06F13/1642 , G06F9/5061 , H04L43/028 , H04L67/10
Abstract: Systems and methods are described for dynamically detecting outliers in a set of input/output (I/O) metrics collected and aggregated by a storage volume network. An I/O request is received by a storage volume network, and an agent of the storage volume network associates primary and secondary identifiers with that I/O request. For example, a trace may be associated with a request to write data to a storage volume network, and spans may be associated with the individual operations required to fulfill that request. Once gathered, I/O metrics may be aggregated based on the associated identifiers. I/O metric information regarding outliers may be received from the storage volume network, processed, and published by an I/O metrics service to identify the outliers among the primary and secondary identifiers. These outliers may then be stored for further analysis, and may be utilized to determine improvements to the performance of a storage volume network.
-
Publication number: US09720620B1
Publication date: 2017-08-01
Application number: US14204992
Filing date: 2014-03-11
Applicant: Amazon Technologies, Inc.
Inventor: Danny Wei , Kerry Quintin Lee , John Luther Guthrie, II , Jianhua Fan , James Michael Thompson , Nandakumar Gopalakrishnan
IPC: G06F3/06
CPC classification number: G06F3/065 , G06F3/0614 , G06F3/0617 , G06F3/067 , G06F3/0683
Abstract: A block-based storage system may implement efficient replication for restoring a data volume from a reduced durability state. A storage node that is not replicating write requests for a data volume may determine that replication for the data volume is to be enabled. A peer storage node may be identified that maintains a stale replica of the data volume. One or more replication operations may be performed to update stale data chunks in the stale replica of the data volume with current data chunks without updating data chunks in the stale replica of the data volume that are current. Stale replicas that are no longer needed may be deleted according to timeouts or the amount of stale data in the replica.
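The selective update described in this abstract can be sketched as follows. The abstract does not say how stale chunks are identified, so the per-chunk version numbers and the function name `resync_stale_replica` are assumptions chosen only to make the idea concrete.

```python
def resync_stale_replica(current, stale):
    """Copy only out-of-date chunks from `current` into `stale`.

    Both arguments map chunk id -> (version, data). Versions are a
    hypothetical staleness marker, not something the abstract specifies.
    Returns the list of chunk ids that were transferred.
    """
    copied = []
    for chunk_id, (version, data) in current.items():
        stale_entry = stale.get(chunk_id)
        # Transfer the chunk only if the replica lacks it or holds an older version.
        if stale_entry is None or stale_entry[0] < version:
            stale[chunk_id] = (version, data)
            copied.append(chunk_id)
    return copied


# Usage: chunk 0 is already current, chunk 1 is stale, chunk 2 is missing.
current = {0: (3, b"a"), 1: (5, b"b"), 2: (1, b"c")}
stale = {0: (3, b"a"), 1: (4, b"old")}
copied = resync_stale_replica(current, stale)
```

Only chunks 1 and 2 cross the network in this example; chunk 0 is left untouched because the replica's copy is already current.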
-
Publication number: US09600203B2
Publication date: 2017-03-21
Application number: US14204943
Filing date: 2014-03-11
Applicant: Amazon Technologies, Inc.
Inventor: Danny Wei , Kerry Quintin Lee , James Michael Thompson , John Luther Guthrie, II , Jianhua Fan , Nandakumar Gopalakrishnan
IPC: G06F3/06
CPC classification number: G06F17/30575 , G06F3/0604 , G06F3/0611 , G06F3/0623 , G06F3/064 , G06F3/065 , G06F3/0665 , G06F3/067 , G06F3/0683 , G06F11/2069 , H04L67/1095 , H04L67/1097
Abstract: A block-based storage system may implement a reduced durability state for a data volume. A determination may be made that a storage node replicating write requests for a data volume is unavailable. In response, subsequent write requests may be processed according to a reduced durability state for the data volume such that replication for the data volume may be disabled for the storage node. Write requests may then be completed at fewer storage nodes prior to acknowledging the write request as complete. Durability state for the data volume may be increased in various embodiments. A storage node may be identified and replication operations may be performed to synchronize the current data volume at the storage node with a replica of the data volume maintained at the identified storage node.
-
Publication number: US09436407B1
Publication date: 2016-09-06
Application number: US13860343
Filing date: 2013-04-10
Applicant: Amazon Technologies, Inc.
Inventor: Jianhua Fan , Kerry Quintin Lee , Danny Wei , Tate Andrew Certain
IPC: G06F3/06
CPC classification number: G06F3/065 , G06F3/0614 , G06F3/067
Abstract: Methods and systems for cursor remirroring are disclosed. A mirroring process is initiated for a plurality of chunks stored by a master node. The mirroring process comprises visiting a sequence of one or more of the chunks and, for at least some of the chunks, copying chunk data or metadata to a slave node. During the initiated mirroring process, a request is received for a write operation on one of the chunks stored by the master node. If the chunk in the request has been visited in the mirroring process, the write operation is performed on the master node and on the slave node. If the chunk in the request has not been visited, the write operation is performed on the master node and postponed on the slave node until the chunk in the request has been visited in the mirroring process.
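The cursor rule in this abstract (mirror a write immediately only if the cursor has already passed the chunk) can be sketched as below. The class `CursorMirror` and its in-memory lists are hypothetical. The sketch also assumes that a postponed slave write is satisfied when the cursor later copies the already-updated master chunk, which is one possible reading of "postponed".

```python
class CursorMirror:
    """Toy master/slave mirroring with a cursor (hypothetical sketch)."""

    def __init__(self, master_chunks):
        self.master = list(master_chunks)
        self.slave = [None] * len(self.master)
        self.cursor = 0  # chunks with index < cursor have been visited

    def step(self):
        """Visit the next chunk and copy it to the slave; False when finished."""
        if self.cursor >= len(self.master):
            return False
        self.slave[self.cursor] = self.master[self.cursor]
        self.cursor += 1
        return True

    def write(self, index, data):
        """Apply a write during mirroring; True if the slave was updated now."""
        self.master[index] = data
        if index < self.cursor:
            # Chunk already visited: write to both nodes.
            self.slave[index] = data
            return True
        # Chunk not yet visited: the slave picks up the new data when
        # the cursor reaches this chunk.
        return False


# Usage: one chunk mirrored, then writes behind and ahead of the cursor.
mirror = CursorMirror([b"a", b"b", b"c"])
mirror.step()
behind = mirror.write(0, b"A")  # visited chunk, mirrored immediately
ahead = mirror.write(2, b"C")   # unvisited chunk, postponed on the slave
mid_state = list(mirror.slave)
while mirror.step():
    pass
```

Once the cursor has visited every chunk, the slave holds the same data as the master, including the write that was postponed.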
-
Publication number: US09430320B1
Publication date: 2016-08-30
Application number: US14703593
Filing date: 2015-05-04
Applicant: Amazon Technologies, Inc.
Inventor: Yi Li , Danny Wei , Kerry Quintin Lee , Mahmood Miah , Nandakumar Gopalakrishnan
CPC classification number: G06F11/08 , G06F11/1004 , G06F11/1016 , G06F11/1032 , G06F11/106 , G11C29/38
Abstract: Methods and systems for detecting error in data storage entities based at least in part on importance of data stored in the data storage entities. In an embodiment, multiple verification passes may be performed on a data storage entity comprising one or more data blocks. Each data block may be associated with a probability indicating the likelihood that the data block is to be selected for verification. During each verification pass, a subset of the data blocks may be selected based at least in part on the probabilities associated with the data blocks. The probabilities may be adjusted, for example, at the end of a verification pass, based on importance factors such as usage and verification information associated with the data blocks. The probabilities may be updated to facilitate timely detection of important data blocks. Additionally, error mitigation and/or correction routines may be performed in light of detected errors.
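A verification pass driven by per-block probabilities might look like the sketch below. The adjustment rule is an assumption: the abstract names usage and verification information as importance factors but gives no formula, so the fixed `step`, the `floor`, and the function names here are purely illustrative.

```python
import random


def verification_pass(probabilities, rng):
    """Select blocks for this pass; each block is chosen with its own probability."""
    return [block for block, p in probabilities.items() if rng.random() < p]


def adjust_probabilities(probabilities, verified, hot_blocks, step=0.2, floor=0.05):
    """Hypothetical update rule applied at the end of a pass."""
    updated = {}
    for block, p in probabilities.items():
        if block in verified:
            p -= step  # just verified: less urgent next pass
        elif block in hot_blocks:
            p += step  # heavily used but unverified: more urgent
        # Keep every block selectable, and never exceed certainty.
        updated[block] = min(1.0, max(floor, p))
    return updated


# Usage: probability 1.0 always selects a block, 0.0 never does.
selected = verification_pass({"a": 1.0, "b": 0.0}, random.Random(0))
updated = adjust_probabilities(
    {"a": 0.5, "b": 0.5, "c": 0.5}, verified={"a"}, hot_blocks={"b"}
)
```

The floor keeps rarely used blocks from being excluded from verification indefinitely, which is a design choice of this sketch rather than something stated in the abstract.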
-
Publication number: US20150234716A1
Publication date: 2015-08-20
Application number: US14705892
Filing date: 2015-05-06
Applicant: Amazon Technologies, Inc.
Inventor: Marc J. Brooker , Tobias L. Holgers , Madhuvanesh Parthasarathy , Danny Wei
CPC classification number: G06F11/1458 , G06F3/0619 , G06F3/0653 , G06F3/0683 , G06F11/008 , G06F11/30 , G06F11/3409 , G06F11/3419 , G06F11/3485 , G06F12/02 , G06F2201/815 , G06F2201/86
Abstract: The relative health of data storage drives may be determined based, at least in some aspects, on data access information and/or other drive operation information. In some examples, upon receiving the operation information from a computing device, a health level of a drive may be determined. The health level determination may be based at least in part on operating information received from a client entity. Additionally, a storage space allocation instruction or operation may be determined for execution. The allocation instruction or operation determined to be performed may be based at least in part on the determined health level.