DECOMMISSIONING, RE-COMMISSIONING, AND COMMISSIONING NEW METADATA NODES IN A WORKING DISTRIBUTED DATA STORAGE SYSTEM

    公开(公告)号:US20220100710A1

    公开(公告)日:2022-03-31

    申请号:US17465691

    申请日:2021-09-02

    Abstract: In a running distributed data storage system that actively processes I/Os, metadata nodes are commissioned and decommissioned without taking down the storage system and without introducing interruptions to metadata or payload data I/O. The inflow of reads and writes continues without interruption even while new metadata nodes are in the process of being added and/or removed and the strong consistency of the system is guaranteed. Commissioning and decommissioning nodes within the running system enables streamlined replacement of permanently failed nodes and advantageously enables the system to adapt elastically to workload changes. An illustrative distributed barrier logic (the “view change barrier”) controls a multi-state process that controls a coordinated step-wise progression of the metadata nodes from an old view to a new normal. Rules for I/O handling govern each state until the state machine loop has been traversed and the system reaches its new normal.

    HEALING FAILED ERASURE-CODED WRITE ATTEMPTS IN A DISTRIBUTED DATA STORAGE SYSTEM CONFIGURED WITH FEWER STORAGE NODES THAN DATA PLUS PARITY FRAGMENTS

    公开(公告)号:US20220019355A1

    公开(公告)日:2022-01-20

    申请号:US17336081

    申请日:2021-06-01

    Abstract: A distributed data storage system using erasure coding (EC) provides advantages of EC data storage while retaining high resiliency for EC data storage architectures having fewer data storage nodes than the number of EC data-plus-parity fragments. To ameliorate the effects of certain storage node outages or fatal disk failures, incoming data is temporarily replicated so that read and write operations can continue from/to the storage system. The system automatically heals failed EC write attempts in a manner transparent to users and/or applications: when all storage nodes are operational, the distributed data storage system automatically converts the temporarily replicated data to EC storage and reclaims storage space previously used by the temporarily replicated data. Individual hardware failures are healed through migration techniques that reconstruct and re-fragment data blocks according to the governing EC scheme. An illustrative embodiment is a three-node data storage system using EC 4+2.

    GLOBAL DE-DUPLICATION OF VIRTUAL DISKS IN A STORAGE PLATFORM

    公开(公告)号:US20240411487A1

    公开(公告)日:2024-12-12

    申请号:US18812256

    申请日:2024-08-22

    Abstract: In order to avoid writing duplicates of blocks of data into a storage platform, any virtual disk within the storage platform may have a de-duplication feature enabled. Or, all virtual disks have this feature enabled. For virtual disks with de-duplication enabled, a unique message digest is calculated for every block of data written to that virtual disk. Upon a write, these message digests are consulted in order to determine if a particular block of data has already been written, if so, it is not written again, and if not, it is written. All de-duplication virtual disks are written to a single system virtual disk within the storage platform. De-duplication occurs over the entire storage platform and over all its virtual disks because all message digests are consulted before a write is performed for any virtual disk. A read for a de-duplication virtual desk reads from the system virtual disk.

    CLOUD-BASED DISTRIBUTED DATA STORAGE SYSTEM USING BLOCK-LEVEL DEDUPLICATION BASED ON BACKUP FREQUENCIES OF INCOMING BACKUP COPIES

    公开(公告)号:US20220066670A1

    公开(公告)日:2022-03-03

    申请号:US17153674

    申请日:2021-01-20

    Abstract: Disclosed deduplication techniques at a distributed data storage system guarantee that space reclamation will not affect deduplicated data integrity even without perfect synchronization between components. By understanding certain “behavioral” characteristics and schedule cadences of backup operations that generate backup copies received at the distributed data storage system, data blocks that are not re-written by subsequent backup copies are pro-actively aged, while promoting continued retention of data blocks that are re-written. An expiry scheme operates with block-level granularity. Each unique deduplicated data block is given an expiry timeframe based on the block's arrival time at the distributed data storage system (i.e., when a backup copy supplies the block) and further based on backup frequencies of the various virtual disks referencing a unique system-wide identifier of the block, which is based on the block's hash value. Communications between components are kept to an as-needed basis. Cloud-based and multi-cloud configurations are disclosed.

    PERSISTENT RESERVATIONS FOR VIRTUAL DISK USING MULTIPLE TARGETS

    公开(公告)号:US20200241613A1

    公开(公告)日:2020-07-30

    申请号:US16848799

    申请日:2020-04-14

    Abstract: An application within a virtual machine is an iSCSI Initiator and is allowed to use as an iSCSI Target another virtual machine within the same hypervisor in order to make a persistent reservation for a virtual disk within a remotely-located storage platform. Any number of virtual machines within different hypervisors, and perhaps on different computers, use a local controller virtual machine to make a persistent reservation for the same virtual disk. The registration list and the current reservation holder data for an iSCSI persistent reservation for a particular virtual disk are held on a storage node of the storage platform rather than within a single virtual machine of a remote computer. A metadata module on the storage platform handles the incoming requests. A coordinator module within the storage platform uses a lock mechanism to guarantee that the reserve, release, preempt and clear commands are handled properly.

    ANTI-ENTROPY-BASED METADATA RECOVERY IN A STRONGLY CONSISTENT DISTRIBUTED DATA STORAGE SYSTEM

    公开(公告)号:US20230418716A1

    公开(公告)日:2023-12-28

    申请号:US18458377

    申请日:2023-08-30

    Abstract: A strongly consistent distributed data storage system comprises an enhanced metadata service that is capable of fully recovering all metadata that goes missing when a metadata-carrying disk, disks, and/or partition fail. An illustrative recovery service runs automatically or on demand to bring the metadata node back into full service. Advantages of the recovery service include guaranteed full recovery of all missing metadata, including metadata still residing in commit logs, without impacting strong consistency guarantees of the metadata. The recovery service is network-traffic efficient. In preferred embodiments, the recovery service avoids metadata service downtime at the metadata node, thereby reducing the impact of metadata disk failure on the availability of the system. The disclosed metadata recovery techniques are said to be “self-healing” as they do not need manual intervention and instead automatically detect failures and automatically recover from the failures in a non-disruptive manner.

Patent Agency Ranking