Data storage migration in replicated environment

    Publication No.: US11507279B2

    Publication date: 2022-11-22

    Application No.: US16781296

    Application date: 2020-02-04

    Abstract: The described technology is generally directed towards replicating metadata representing a virtual data structure corresponding to replicated legacy data instead of the actual data for the data structure. Once virtual chunks are replicated to a remote, newer storage system, the corresponding legacy data is locally read into the virtual chunks to transform the virtual chunks into real data chunks of the remote newer storage system. A checksum can be replicated for the remote newer storage system to evaluate the consistency of the data. Efficient data storage migration is thus accomplished in a replicated environment based on relatively negligible replication traffic between two remote locations, while still assuring the consistency of migrated data.
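
The flow in this abstract (replicate metadata only, read legacy data locally, check it against a replicated checksum) might be sketched as follows. This is a minimal illustration, not the patented implementation: `VirtualChunk`, `materialize`, the dictionary-based legacy store, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VirtualChunk:
    """Metadata-only stand-in for a chunk (hypothetical structure)."""
    legacy_key: str                 # where the legacy data lives at this site
    checksum: str                   # replicated checksum of the expected data
    data: Optional[bytes] = None    # None while the chunk is still virtual


def materialize(chunk: VirtualChunk, legacy_store: Dict[str, bytes]) -> VirtualChunk:
    """Read legacy data locally and turn a virtual chunk into a real one."""
    payload = legacy_store[chunk.legacy_key]
    # Assumption: SHA-256 stands in for whatever checksum the system replicates.
    if hashlib.sha256(payload).hexdigest() != chunk.checksum:
        raise ValueError(f"checksum mismatch for {chunk.legacy_key}")
    chunk.data = payload
    return chunk


legacy_store = {"obj-1": b"legacy bytes"}
chunk = VirtualChunk("obj-1", hashlib.sha256(b"legacy bytes").hexdigest())
materialize(chunk, legacy_store)
```

Only the small `VirtualChunk` record would cross the wire in this sketch; the payload is read from the local legacy copy.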

    Mapped redundant array of independent data storage regions

    Publication No.: US11449248B2

    Publication date: 2022-09-20

    Application No.: US16584800

    Application date: 2019-09-26

    IPC classification: G06F3/06 G06F9/50

    Abstract: A mapped redundant array of independent regions (mapped RAIR) for data storage is disclosed. A mapped RAIR can be allocated on top of one or more regions of a cluster storage construct or system. The cluster storage construct can be N nodes wide by M disks deep. A mapped RAIR cluster can comprise sites from real or mapped regions. A mapped region can comprise sites from two different real regions. Selection of sites comprised in a mapped region of a mapped RAIR can be based on geographic proximity, network proximity, a constraint, a best practice, a rule, customer preferences, etc. A mapped RAIR can provide data protection for data at a regional level.

    Framed Event Access in an Ordered Event Stream Storage System

    Publication No.: US20220229581A1

    Publication date: 2022-07-21

    Application No.: US17152544

    Application date: 2021-01-19

    IPC classification: G06F3/06

    Abstract: Framed event access in an ordered event stream (OES) storage system is disclosed. Events can be written to one or more segments of an OES and can have an inherent write sequence. Segments can be parallel segments. Reading events from parallel segments can result in a read sequence that does not match the write sequence. This mismatch can be more severe as segment length increases, as event density disparities increase, as access times diverge for different segments, or for numerous other reasons. Event framing can compartmentalize divergence between the write and read sequences. In an aspect, readers in the several segments of the OES can be constrained to read within a frame defined by frame boundaries until all readers have reached the frame boundary, and can then advance to a next frame. The restriction can act as a pseudo-synchronization of readers that can mitigate differences between write and read sequences.
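
The framing idea can be illustrated with a small generator. This is a sketch under stated assumptions: events carry a global write-sequence number, frames are fixed-width ranges of that number, and `framed_read` and `frame_size` are hypothetical names, not part of any OES API.

```python
from typing import Iterator, List, Tuple

Event = Tuple[int, str]  # (write sequence number, payload)


def framed_read(segments: List[List[Event]], frame_size: int) -> Iterator[Event]:
    """Yield events from parallel segments, one frame at a time.

    Each segment is assumed to be sorted by write sequence number.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    positions = [0] * len(segments)
    frame_end = frame_size
    while any(pos < len(seg) for pos, seg in zip(positions, segments)):
        for i, seg in enumerate(segments):
            # Each reader advances only up to the current frame boundary.
            while positions[i] < len(seg) and seg[positions[i]][0] < frame_end:
                yield seg[positions[i]]
                positions[i] += 1
        # Every reader has reached the boundary, so move to the next frame.
        frame_end += frame_size


segments = [[(0, "a"), (2, "c"), (4, "e")],
            [(1, "b"), (3, "d"), (5, "f")]]
framed_order = [seq for seq, _ in framed_read(segments, frame_size=4)]
unframed_order = [seq for seq, _ in framed_read(segments, frame_size=100)]
```

With a frame width of 4, events can only be reordered within the same frame; with an effectively unbounded frame, the second segment is not read until the first is exhausted.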

    Erasure coding in a large geographically diverse data storage system

    Publication No.: US11354191B1

    Publication date: 2022-06-07

    Application No.: US17333793

    Application date: 2021-05-28

    IPC classification: G06F11/00 G06F11/10 H03M13/15

    Abstract: Selectively distributing fragments of a data protection set in a geographically diverse data storage system is disclosed. The data protection set can comprise fewer fragments than there are zones comprising the geographically diverse data storage system, which can result in some zones not storing a fragment of the data protection set. Control over distribution of fragments of different data protection sets in the geographically diverse data storage system can mitigate or avoid unbalanced storage of the protection sets. The distribution can be controlled in accordance with a protection set distribution scheme (PSDS). A first PSDS can generate coding fragments from randomly selected data fragments of all zones. A second PSDS can generate coding fragments from determined unique zone combinations. A third PSDS can generate coding fragments based on affinity values from an affinity matrix. In embodiments, threshold values or rules can be employed to force generation of a protection set regardless of an applied PSDS where the PSDS excessively retards generation of sufficient protection sets.

    Multipart upload for large data events in streaming data storage

    Publication No.: US11349906B1

    Publication date: 2022-05-31

    Application No.: US17205133

    Application date: 2021-03-18

    Abstract: A streaming data storage system facilitates appending of large events (e.g., up to one gigabyte) to a data segment of a streaming data storage system in a multipart upload operation. A micro-transaction data structure is created for a multipart upload of a large event, to which subparts of the multipart upload are appended during write operations. Order of the subparts is preserved, including when not appended in order, to provide for reading of the subparts in order. An event reference to the micro-transaction data structure is maintained in a data segment corresponding to the large event; when the event reference is encountered during reading and the multipart upload is complete, the read is served from the micro-transaction data structure. The reading from the micro-transaction data structure maintains the order of the large event's subparts, such that raw data is returned to an upstream reader application as the large event.
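
A rough sketch of the ordering behaviour described above: subparts arrive in any order, and a reader that meets the event reference gets the reassembled event once the upload is complete. The `MicroTransaction` class and `read_segment` function are hypothetical, and stopping at an incomplete upload is an assumption, since the abstract does not say what a reader does in that case.

```python
from typing import Dict, Iterator, List, Union


class MicroTransaction:
    """Hypothetical holder for the subparts of one large event."""

    def __init__(self, total_parts: int) -> None:
        self.total_parts = total_parts
        self.parts: Dict[int, bytes] = {}  # part index -> data

    def append(self, index: int, data: bytes) -> None:
        if not 0 <= index < self.total_parts:
            raise IndexError(f"part index {index} out of range")
        self.parts[index] = data  # arrival order does not matter

    @property
    def complete(self) -> bool:
        return len(self.parts) == self.total_parts

    def read(self) -> bytes:
        # Reassemble by part index, not by arrival order.
        return b"".join(self.parts[i] for i in range(self.total_parts))


def read_segment(segment: List[Union[bytes, MicroTransaction]]) -> Iterator[bytes]:
    """Yield events in segment order, resolving event references."""
    for entry in segment:
        if isinstance(entry, MicroTransaction):
            if not entry.complete:
                break  # assumption: the sketch simply stops at an unfinished upload
            yield entry.read()
        else:
            yield entry


tx = MicroTransaction(total_parts=3)
tx.append(2, b"C")
tx.append(0, b"A")
segment = [b"e1", tx, b"e2"]
before = list(read_segment(segment))
tx.append(1, b"B")
after = list(read_segment(segment))
```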

    Ordered event stream merging
    Granted invention patent

    Publication No.: US11340792B2

    Publication date: 2022-05-24

    Application No.: US16944089

    Application date: 2020-07-30

    IPC classification: G06F3/06

    Abstract: Merging of portions of ordered event streams is disclosed. The disclosed merging of events can limit loss of order of events from streams in exchange for reduced computational load by grouping events according to a pseudo-epoch, wherein events are stored according to a scheme, even though the grouping can result in reading events in a different order than that in which the events were written. However, by grouping the events, there can be fewer transitions between storage schemes when reading events than if they were read in the same order in which they were written, thereby reducing computational load. Moreover, restraints on the loss of order can be imposed by selecting a maximum progress window and generally restricting groups from comprising events of two different storage schemes. Where events can be moved to archival storage, reducing storage scheme transitions can be of further benefit and speed access times of archived events.

    Access Control for an Ordered Event Stream Storage System

    Publication No.: US20220100876A1

    Publication date: 2022-03-31

    Application No.: US17038079

    Application date: 2020-09-30

    IPC classification: G06F21/62

    Abstract: Access control for an ordered event stream (OES) storage system is disclosed. Access to a portion of an OES can be controlled at a key-level in relation to a key space of the OES. An application instance can be identified to enable determining a correspondence to one or more keys. The correspondence can be embodied in stored data, for example, via an advanced access control list (AACL) that can be in the form of a list, a table, etc. Application instance access to the portion of the OES can be controlled by determining if an access rule is satisfied, e.g., determining if the key space the application instance wants to access comports with the one or more keys corresponding to the application instance identity. In an aspect, screening data corresponding to the AACL can enable preliminary access screening external to the OES storage system.
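
One simple reading of the key-level access rule is a subset check against a table keyed by application instance. The sketch below assumes the AACL is a plain dictionary and that "comports with" means every requested key is listed for the caller; `is_access_allowed` and the sample identifiers are invented for illustration.

```python
from typing import Dict, Iterable, Set

# Hypothetical AACL: application instance id -> keys it may access.
aacl: Dict[str, Set[str]] = {
    "app-1": {"sensor-a", "sensor-b"},
    "app-2": {"sensor-c"},
}


def is_access_allowed(acl: Dict[str, Set[str]], app_id: str,
                      requested_keys: Iterable[str]) -> bool:
    """Return True only if every requested key is listed for this instance."""
    requested = set(requested_keys)
    allowed = acl.get(app_id, set())  # unknown instances get no keys
    # Assumption: an empty request is rejected rather than trivially allowed.
    return bool(requested) and requested <= allowed
```

The same table, or a reduced copy of it, could back the preliminary screening step the abstract mentions, though how that screening data is derived is not stated in the source.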

    Employing Triggered Retention in an Ordered Event Stream Storage System

    Publication No.: US20220100588A1

    Publication date: 2022-03-31

    Application No.: US17038102

    Application date: 2020-09-30

    IPC classification: G06F9/54 G06F9/451 G06F9/4401

    Abstract: Retention of events of an ordered event stream according to at least one triggered retention policy is disclosed. Expiration of events stored in a segment of an ordered event stream (OES) can be desirable. New events can be added to a head of an OES segment, and pruning events from a tail of the OES segment can be desirable. Processing applications can predicate event retention, e.g., restricting expiration of an event, on at least one triggered retention policy. In some embodiments, an additional fixed retention policy can be combined with the triggered retention. The disclosed retention can be performed at the event-level or at less granular levels, e.g., segment-level or OES-level, for example via batching of events. Triggers can be affirmative or negative triggers, triggered/fixed retention windows can be combined, etc., facilitating retention policies that can enable event pruning that can be more nuanced than conventional retention techniques.
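
Combining a fixed retention window with a trigger might look like the following. It is a sketch only: the segment is modelled as a `deque` with the oldest event on the left, the trigger is modelled as a set of event ids that processing applications have released, and `prune_tail`, `min_age`, and `released` are hypothetical names.

```python
from collections import deque
from typing import Deque, List, Set, Tuple


def prune_tail(segment: Deque[Tuple[int, float]], now: float,
               min_age: float, released: Set[int]) -> List[int]:
    """Prune events from the tail (left end) of a segment.

    An event expires only if it is older than the fixed window (min_age)
    AND its trigger has fired (its id is in `released`).
    """
    pruned: List[int] = []
    while segment:
        event_id, written_at = segment[0]
        if now - written_at < min_age or event_id not in released:
            break  # pruning is contiguous from the tail, so stop here
        segment.popleft()
        pruned.append(event_id)
    return pruned


segment = deque([(1, 0.0), (2, 5.0), (3, 10.0)])
pruned = prune_tail(segment, now=20.0, min_age=10.0, released={1, 3})
```

Event 3 has been released but stays in place because event 2, which sits closer to the tail, has not.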

    Verifiable intra-cluster migration for a chunk storage system

    Publication No.: US11288229B2

    Publication date: 2022-03-29

    Application No.: US16888144

    Application date: 2020-05-29

    Abstract: Verifiable intra-cluster migration (VICM) for a chunk storage system is disclosed. VICM can migrate data from a first portion of a cluster to a second portion of a cluster. VICM can comprise locking a first portion of a cluster and locking a corresponding first cluster table during a preparation phase. Chunks of the first portion can then be migrated, during a migration phase, to the second portion and a second cluster table, corresponding to the second portion, can be updated accordingly. Garbage management operations, including recovery operations, can be performed via the second cluster table and the second portion during the migration phase. Upon completion of the migration phase, a reconciliation phase can comprise verifying chunk relationships of the second cluster table and the second portion based on the first cluster table. Exceptions to the verification can be reported via an exception report.
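
The reconciliation phase can be pictured as a comparison of the locked first table against the second table. The sketch assumes each cluster table is a dictionary from chunk id to a content checksum, which the abstract does not specify, and it ignores chunks legitimately removed by garbage management during migration; `reconcile` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def reconcile(first_table: Dict[str, str],
              second_table: Dict[str, str]) -> List[Tuple[str, str]]:
    """Build an exception report of (chunk id, reason) pairs."""
    exceptions: List[Tuple[str, str]] = []
    for chunk_id, expected in first_table.items():
        actual = second_table.get(chunk_id)
        if actual is None:
            exceptions.append((chunk_id, "missing in second table"))
        elif actual != expected:
            exceptions.append((chunk_id, "checksum mismatch"))
    return exceptions


first = {"c1": "aa", "c2": "bb", "c3": "cc"}
second = {"c1": "aa", "c2": "xx"}
report = reconcile(first, second)
```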

    Ordered Event Stream Merging
    Invention patent application

    Publication No.: US20220035533A1

    Publication date: 2022-02-03

    Application No.: US16944089

    Application date: 2020-07-30

    IPC classification: G06F3/06

    Abstract: Merging of portions of ordered event streams is disclosed. The disclosed merging of events can limit loss of order of events from streams in exchange for reduced computational load by grouping events according to a pseudo-epoch, wherein events are stored according to a scheme, even though the grouping can result in reading events in a different order than that in which the events were written. However, by grouping the events, there can be fewer transitions between storage schemes when reading events than if they were read in the same order in which they were written, thereby reducing computational load. Moreover, restraints on the loss of order can be imposed by selecting a maximum progress window and generally restricting groups from comprising events of two different storage schemes. Where events can be moved to archival storage, reducing storage scheme transitions can be of further benefit and speed access times of archived events.