摘要:
Techniques for optimizing data storage are disclosed herein. In particular, methods and systems for implementing redundancy encoding schemes with data storage systems are described. The redundancy encoding schemes may be scheduled according to system and data characteristics. The schemes may span multiple tiers or layers of a storage system. The schemes may be generated, for example, in accordance with a transaction rate requirement, a data durability requirement or in the context of the age of the stored data. The schemes may be designed to rectify entropy-related effects upon data storage. The schemes may include one or more erasure codes or erasure coding schemes. Additionally, methods and systems for improving and/or accounting for failure correlation of various components of the storage system, including that of storage devices such as hard disk drives, are described.
摘要:
In a system in which documents are generated dynamically in response to user requests, historical data is collected regarding data retrieval subtasks, such as service requests, that are performed to generate such documents. This data is used to predict the specific subtasks that will be performed to respond to specific document requests, such that these subtasks may be initiated preemptively at or near the outset of the associated document generation task. A subtask that would ordinarily be postponed pending the outcome of a prior subtask can thereby be performed in parallel with the prior subtask, reducing document generation times. In one embodiment, the historical data is included within, or is used to generate, a mapping table that maps document generation tasks (which may correspond to specific URLs) to the data retrieval subtasks that are frequently performed within such tasks.
摘要:
Techniques for optimizing data storage are disclosed herein. In particular, methods and systems for implementing redundancy encoding schemes with data storage systems are described. The redundancy encoding schemes may be scheduled according to system and data characteristics. The schemes may span multiple tiers or layers of a storage system. The schemes may be generated, for example, in accordance with a transaction rate requirement, a data durability requirement or in the context of the age of the stored data. The schemes may be designed to rectify entropy-related effects upon data storage. The schemes may include one or more erasure codes or erasure coding schemes. Additionally, methods and systems for improving and/or accounting for failure correlation of various components of the storage system, including that of storage devices such as hard disk drives, are described.
摘要:
Systems and methods are provided herein for storing data to enable inexpensive and/or guaranteed deletion of data. In various embodiments, a customer specifies a data deletion indication associated with a data object to be stored, specifying when and/or how to delete the data object. Such a data deletion indication may be based, for example, on a regulatory compliance requirement. Based at least in part on the data deletion indication, the storage system may select, from a plurality of storage devices, a storage device to store the data object. Data objects with similar data deletion indications may be stored in the same storage device. In some embodiments, a data object stored in a storage device using the methods described herein may be deleted as part of the deletion of all or a portion of the storage device near a time specified by the data deletion indication of the data object.
摘要:
In response to receiving a request from a client to store an object, a key-durable storage system may assign the object to a volume in its data store, generate a key for the object (e.g., an opaque identifier that encodes information for locating the object in the data store), store the object on one disk in the assigned volume, store the key redundantly in the assigned volume (e.g., using a replication or erasure coding technique), and may return the key to the client. To retrieve the object, the client may send a request including the key, and the system may return the object to the client. If a disk fails, the system may determine which objects were lost, and may return the corresponding keys to the appropriate clients in a notification. The system may be used to back up a more expensive object-redundant storage system.
摘要:
Techniques for optimizing data storage are disclosed herein. In particular, methods and systems for implementing redundancy encoding schemes with data storage systems are described. The redundancy encoding schemes may be scheduled according to system and data characteristics. The schemes may span multiple tiers or layers of a storage system. The schemes may be generated, for example, in accordance with a transaction rate requirement, a data durability requirement or in the context of the age of the stored data. The schemes may be designed to rectify entropy-related effects upon data storage. The schemes may include one or more erasure codes or erasure coding schemes. Additionally, methods and systems for improving and/or accounting for failure correlation of various components of the storage system, including that of storage devices such as hard disk drives, are described.
摘要:
In a system in which documents are generated dynamically in response to user requests, historical data is collected regarding data retrieval subtasks, such as service requests, that are performed to generate such documents. This data is used to predict the specific subtasks that will be performed to respond to specific document requests, such that these subtasks may be initiated preemptively at or near the outset of the associated document generation task. A subtask that would ordinarily be postponed pending the outcome of a prior subtask can thereby be performed in parallel with the prior subtask, reducing document generation times. In one embodiment, the historical data is included within, or is used to generate, a mapping table that maps document generation tasks (which may correspond to specific URLs) to the data retrieval subtasks that are frequently performed within such tasks.
摘要:
In a system in which documents are generated dynamically in response to user requests, historical data is collected regarding data retrieval subtasks, such as service requests, that are performed to generate such documents. This data is used to predict the specific subtasks that will be performed to respond to specific document requests, such that these subtasks may be initiated preemptively at or near the outset of the associated document generation task. A subtask that would ordinarily be postponed pending the outcome of a prior subtask can thereby be performed in parallel with the prior subtask, reducing document generation times. In one embodiment, the historical data is included within, or is used to generate, a mapping table that maps document generation tasks (which may correspond to specific URLs) to the data retrieval subtasks that are frequently performed within such tasks.
摘要:
In a system in which documents are generated dynamically in response to user requests, historical data is collected regarding data retrieval subtasks, such as service requests, that are performed to generate such documents. This data is used to predict the specific subtasks that will be performed to respond to specific document requests, such that these subtasks may be initiated preemptively at or near the outset of the associated page generation task. A subtask that would ordinarily be postponed pending the outcome of a prior subtask can therefore be performed in parallel with the prior subtask, reducing document generation times. In one embodiment, the historical data is included within, or is used to generate, a mapping table that maps document generation tasks (which may correspond to specific URLs) to the data retrieval subtasks that are frequently performed within such tasks.