Hybrid garbage collection in a distributed storage system

    Publication No.: US10789223B2

    Publication date: 2020-09-29

    Application No.: US15080474

    Filing date: 2016-03-24

    Abstract: In various embodiments, methods and systems for implementing garbage collection in distributed storage systems are provided. The distributed storage system operates based on independent management of metadata of extent and stream data storage resources. A hybrid garbage collection system based on reference counting garbage collection operations and mark-and-sweep garbage collection operations is implemented. An extent lifetime table that tracks reference weights and mark sequences for extents is initialized and updated based on indications from extent managers and stream managers, respectively. Upon determining that an extent is to be handed off from weighted reference counting garbage collection operations to mark-and-sweep garbage collection operations, the reference weight field for the extent is voided and the mark sequence field of the extent is updated with the latest global sequence number. The mark-and-sweep garbage collection operations are then used to reclaim the extent when it is no longer referenced.
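The hand-off described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `ExtentEntry`, `ExtentLifetimeTable`, `release_weight`, `hand_off`, `begin_round`, `mark`, and `sweep` are hypothetical, and the rule that a sweep reclaims entries whose mark sequence is older than the current round is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ExtentEntry:
    # None means the weight field has been voided (extent handed off).
    reference_weight: Optional[int]
    # Set only once the extent is managed by mark-and-sweep.
    mark_sequence: Optional[int] = None


class ExtentLifetimeTable:
    """Hypothetical table tracking reference weights and mark sequences."""

    def __init__(self) -> None:
        self.entries: Dict[str, ExtentEntry] = {}
        self.global_sequence = 0

    def add_extent(self, extent_id: str, weight: int) -> None:
        self.entries[extent_id] = ExtentEntry(reference_weight=weight)

    def release_weight(self, extent_id: str, weight: int) -> bool:
        """Weighted reference counting: reclaim once all weight is returned."""
        entry = self.entries[extent_id]
        if entry.reference_weight is None:
            raise ValueError("extent is managed by mark-and-sweep")
        entry.reference_weight -= weight
        if entry.reference_weight <= 0:
            del self.entries[extent_id]
            return True
        return False

    def hand_off(self, extent_id: str) -> None:
        """Void the weight and stamp the latest global sequence number."""
        entry = self.entries[extent_id]
        entry.reference_weight = None
        entry.mark_sequence = self.global_sequence

    def begin_round(self) -> int:
        """Start a mark-and-sweep round (assumed: one sequence per round)."""
        self.global_sequence += 1
        return self.global_sequence

    def mark(self, referenced: Iterable[str]) -> None:
        """Stamp handed-off extents that are still referenced."""
        for extent_id in referenced:
            entry = self.entries.get(extent_id)
            if entry is not None and entry.reference_weight is None:
                entry.mark_sequence = self.global_sequence

    def sweep(self) -> List[str]:
        """Reclaim handed-off extents not stamped in the current round."""
        stale = sorted(
            extent_id
            for extent_id, entry in self.entries.items()
            if entry.reference_weight is None
            and entry.mark_sequence < self.global_sequence
        )
        for extent_id in stale:
            del self.entries[extent_id]
        return stale
```

Under these assumptions, an extent handed off in the middle of a round carries the current sequence number and therefore survives that round's sweep even if the marker has not seen it yet; it becomes eligible for reclamation only in a later round in which no reference marks it.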

Garbage collection implementing erasure coding

    Publication No.: US10558565B2

    Publication date: 2020-02-11

    Application No.: US15990969

    Filing date: 2018-05-29

    Abstract: Provided is a system and method for converting active data identified by a garbage collection operation into erasure coded fragments. In one example, the method may include: identifying, based on a garbage collection operation, data blocks in use that are interspersed among garbage data blocks not in use in cloud storage; extracting object data from the identified in-use data blocks into a data container while leaving behind the object data of the garbage data blocks; fragmenting a predetermined amount of the extracted object data stored within the data container, which converts that data into a plurality of fragments, including data fragments storing portions of the data and parity fragments for reconstructing the data; and writing the plurality of fragments in a distributed manner among a plurality of storage nodes.
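The steps in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: the function names `collect_live_data`, `encode`, `place`, and `recover` are hypothetical, blocks are modeled as `(payload, in_use)` pairs, the "predetermined amount" is simply the chunk passed to `encode`, and a single XOR parity fragment stands in for whatever erasure code the patent actually uses (a production system would more likely use a code such as Reed-Solomon that tolerates several losses).

```python
from typing import Dict, List, Optional, Tuple


def collect_live_data(blocks: List[Tuple[bytes, bool]]) -> bytes:
    """Copy payloads of in-use blocks into one container, skipping garbage."""
    return b"".join(payload for payload, in_use in blocks if in_use)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(chunk: bytes, k: int) -> List[bytes]:
    """Split chunk into k equal data fragments plus one XOR parity fragment."""
    size = -(-len(chunk) // k)  # ceiling division
    padded = chunk.ljust(size * k, b"\0")  # zero-pad the last fragment
    data = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)
    for fragment in data:
        parity = xor_bytes(parity, fragment)
    return data + [parity]


def place(fragments: List[bytes], nodes: List[str]) -> Dict[str, List[bytes]]:
    """Assign fragments to storage nodes round-robin."""
    placement: Dict[str, List[bytes]] = {node: [] for node in nodes}
    for index, fragment in enumerate(fragments):
        placement[nodes[index % len(nodes)]].append(fragment)
    return placement


def recover(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment by XORing the survivors."""
    missing = [i for i, fragment in enumerate(fragments) if fragment is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can rebuild only one fragment")
    survivors = [fragment for fragment in fragments if fragment is not None]
    restored = list(survivors) if not missing else list(fragments)
    if missing:
        rebuilt = bytes(len(survivors[0]))
        for fragment in survivors:
            rebuilt = xor_bytes(rebuilt, fragment)
        restored[missing[0]] = rebuilt
    return restored
```

With one fragment per node, losing any single node leaves enough fragments to rebuild the missing one, which is the property the parity fragments in the abstract are there to provide.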
