CLOUD CAPACITY SCALING IN METADATA SPACE CONSTRAINED DEDUPLICATION SYSTEMS

    Publication No.: US20250061091A1

    Publication date: 2025-02-20

    Application No.: US18903826

    Filing date: 2024-10-01

    Abstract: Data migrated from a deduplicated storage appliance are stored as cloud units at cloud storage. Each cloud unit includes containers including data containers storing segments of files, segment tree containers storing upper-level segments of segment trees representing the files, and cloud containers storing headers from the data and segment tree containers. A header for a data container includes fingerprints identifying the segments of files. A header for a segment tree container includes fingerprints identifying the upper-level segments. The cloud units are maintained in different states of accessibility including read-write, read-only, and offline. Upon detecting that local storage of the appliance does not have space to support a cloud unit in a read-write or read-only state, another cloud unit that is in a first state is selected, the first state being the read-write or read-only state. The selected cloud unit is placed in a second state, different from the first state.
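
    The state change described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the `CloudUnit` fields, the least-recently-accessed selection policy, and the choice of offline as the second state are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"
    OFFLINE = "offline"


@dataclass
class CloudUnit:
    name: str
    state: State
    local_metadata_bytes: int  # assumed local footprint while the unit is online
    last_access: int           # assumed logical timestamp used for selection


def make_room(units, free_bytes, needed_bytes):
    """Take online units offline, oldest access first, until needed_bytes fit.

    Returns the names of the units whose state was changed.
    """
    changed = []
    online = sorted(
        (u for u in units if u.state is not State.OFFLINE),
        key=lambda u: u.last_access,
    )
    for unit in online:
        if free_bytes >= needed_bytes:
            break
        unit.state = State.OFFLINE          # first state -> second state
        free_bytes += unit.local_metadata_bytes
        changed.append(unit.name)
    if free_bytes < needed_bytes:
        raise RuntimeError("not enough local space even with all units offline")
    return changed


units = [
    CloudUnit("cu1", State.READ_WRITE, 40, last_access=5),
    CloudUnit("cu2", State.READ_ONLY, 30, last_access=1),
    CloudUnit("cu3", State.OFFLINE, 50, last_access=0),
]
changed = make_room(units, free_bytes=10, needed_bytes=35)
```

    A real appliance could equally move a read-write unit to read-only; the abstract only requires that the second state differ from the first.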

    Hyperparameter optimization in file compression using sequence alignment

    Publication No.: US12216621B2

    Publication date: 2025-02-04

    Application No.: US17658930

    Filing date: 2022-04-12

    Abstract: Compressing files is disclosed. An input file to be compressed is first aligned. During or prior to aligning the input file, hyperparameters are set, determined, or configured. The hyperparameters may be set, determined, or configured to achieve a particular performance characteristic. Aligning the file includes splitting the file into sequences that can be aligned. The result is a compression matrix, where each row of the matrix corresponds to part of the file. A consensus sequence is determined from the compression matrix. Using the consensus sequence, pointer pairs are generated. Each pointer pair identifies a subsequence of the consensus matrix. The compressed file includes the pointer pairs and the consensus sequence.
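
    To make the output format concrete (a consensus sequence plus pointer pairs), here is a toy sketch. It does not perform sequence alignment: the "consensus" is built by appending any chunk not already present, which is a much simpler stand-in. The function names and the `chunk_size` hyperparameter are hypothetical.

```python
def compress(data: bytes, chunk_size: int):
    """Return (consensus, pointers); each pointer is a (start, length) pair."""
    consensus = bytearray()
    pointers = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        start = consensus.find(chunk)
        if start < 0:
            # Chunk not yet representable: extend the consensus with it.
            start = len(consensus)
            consensus.extend(chunk)
        pointers.append((start, len(chunk)))
    return bytes(consensus), pointers


def decompress(consensus: bytes, pointers) -> bytes:
    """Rebuild the file by following each pointer pair into the consensus."""
    return b"".join(consensus[start:start + length] for start, length in pointers)


original = b"abcdabcdabXY"
consensus, pointers = compress(original, chunk_size=4)
restored = decompress(consensus, pointers)
```

    Varying `chunk_size` changes the trade-off between consensus length and pointer count, which is the kind of performance characteristic a hyperparameter would be tuned for.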

    Data restore system
    Granted invention patent

    Publication No.: US12216614B2

    Publication date: 2025-02-04

    Application No.: US18096065

    Filing date: 2023-01-12

    Applicant: Druva Inc.

    Abstract: A data restore system is provided. The data restore system includes a backup data storage configured to store data for a client and a data restore module configured to receive a restore trigger from the client and to initiate a restore operation for selected data from the backup data storage in response to the received trigger. The data restore module is further configured to receive information regarding the selected data to be restored, to access a metadata store to receive metadata information for the selected data, to provide the metadata information and the downloaded data blocks to a controller to facilitate sorting of the downloaded data blocks based on the files they belong to, and to store the downloaded restored data to a target data storage. The data restore module is further configured to interact with a checkpointing module to track the progress of the restore operation in persistent storage and to minimize rework when the restore operation is restarted after an interruption.
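
    The checkpointing idea can be sketched as below: progress is persisted after every block, so a restarted restore skips blocks that were already written. The JSON checkpoint file, the function names, and the callback signatures are assumptions for illustration only, not details from the patent.

```python
import json
import os


def load_checkpoint(path):
    """Return the set of block IDs already restored (empty on first run)."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def save_checkpoint(path, done):
    """Persist progress atomically: write a temp file, then rename over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(done), f)
    os.replace(tmp, path)


def restore(block_ids, fetch_block, write_block, checkpoint_path):
    """Restore blocks, skipping any recorded in the checkpoint.

    Returns the number of blocks fetched during this run.
    """
    done = load_checkpoint(checkpoint_path)
    fetched = 0
    for block_id in block_ids:
        if block_id in done:
            continue  # restored before the interruption; no rework needed
        write_block(block_id, fetch_block(block_id))
        done.add(block_id)
        save_checkpoint(checkpoint_path, done)
        fetched += 1
    return fetched
```

    Checkpointing after every block minimizes rework at the cost of more writes; a real system might checkpoint in batches instead.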

    Log file management
    Granted invention patent

    Publication No.: US12210479B2

    Publication date: 2025-01-28

    Application No.: US17352983

    Filing date: 2021-06-21

    Applicant: Open Text Inc.

    Inventor: Mark Rees

    Abstract: Methods, devices and computer program products facilitate the storage, access and management of log files that are associated with particular client devices. The log files provide a record of user or client device activities that are periodically sent to a data backup center. A dedicated log file server facilitates the processing and storage of an increasingly large number of log files that are generated by new and existing client devices. A storage server pre-processes the received log files to facilitate the processing and storage of the log files by the log file server. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.

MESSAGING DEDUPLICATION IN PUBLISH / SUBSCRIBE SYSTEM

    Publication No.: US20250028686A1

    Publication date: 2025-01-23

    Application No.: US18224981

    Filing date: 2023-07-21

    Abstract: A device for using message identifiers for publish/subscribe messaging deduplication is described. The system may fetch one or more sets of data records from a data source, and each data record is associated with a message identifier. The system may store the one or more sets of data records in a data file, which is associated with metadata comprising the message identifier, a file path and a row number for each data record. The system may determine whether one or more of the data records are duplicated based on the associated message identifiers. In response to determining that the one or more data records are duplicated, the system may generate second metadata comprising the file paths and row numbers associated with the duplicated data records.
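
    A minimal sketch of the duplicate check follows, assuming the metadata is available as `(message_id, file_path, row_number)` tuples and that the first occurrence of each identifier is the one to keep. The tuple layout, the function name, and the sample paths are hypothetical.

```python
def find_duplicates(metadata):
    """Return (file_path, row_number) for every record whose message ID was already seen."""
    seen = set()
    duplicates = []
    for message_id, file_path, row_number in metadata:
        if message_id in seen:
            duplicates.append((file_path, row_number))
        else:
            seen.add(message_id)
    return duplicates


# Illustrative metadata: message "m1" was delivered twice.
metadata = [
    ("m1", "part-0.dat", 0),
    ("m2", "part-0.dat", 1),
    ("m1", "part-1.dat", 0),
]
second_metadata = find_duplicates(metadata)
```

    The returned list plays the role of the second metadata: it locates duplicated rows without rewriting the data files.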

    Techniques for generating a consistent view of an eventually consistent database

    Publication No.: US12204521B2

    Publication date: 2025-01-21

    Application No.: US16905813

    Filing date: 2020-06-18

    Applicant: NETFLIX, INC.

    Abstract: In various embodiments, a consistency application constructs a consistent view of an eventually consistent database. The consistency application determines multiple backup files that are associated with at least one datacenter included in the eventually consistent database and extracts aggregated data from the backup files. The consistency application performs compaction operation(s) on the aggregated data to generate compacted data. Notably, the aggregated data includes at least two replicas for each data item stored in the eventually consistent database, whereas the compacted data includes a different consistent data item for each data item stored in that eventually consistent database. The consistency application generates the consistent view of the eventually consistent database based on the compacted data. Because the consistency application generates the consistent view based on backup files and does not access the eventually consistent database, generating the consistent view does not adversely impact the performance of the eventually consistent database.
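
    The compaction step can be illustrated with a small sketch. The abstract does not say how replicas are reconciled; the last-write-wins rule by timestamp used here is an assumption, as are the function name and the tuple layout.

```python
def compact(replicas):
    """Collapse (key, timestamp, value) replicas to one value per key.

    Assumed rule: the replica with the highest timestamp wins.
    """
    latest = {}
    for key, timestamp, value in replicas:
        if key not in latest or timestamp > latest[key][0]:
            latest[key] = (timestamp, value)
    return {key: value for key, (_, value) in latest.items()}


# Aggregated data extracted from backup files: two replicas per item.
aggregated = [
    ("user:1", 10, "alice"),
    ("user:1", 12, "alice-renamed"),
    ("user:2", 7, "bob"),
    ("user:2", 7, "bob"),
]
view = compact(aggregated)
```

    Because the input comes from backup files, this runs without issuing any reads against the live database.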

    Extending retention lock protection from on-premises to the cloud

    Publication No.: US12197392B2

    Publication date: 2025-01-14

    Application No.: US18307575

    Filing date: 2023-04-26

    Abstract: Embodiments are described for retention locking a deduplicated file stored in cloud storage by defining object metadata for each object of the file, the object metadata comprising a lock count and a retention time based on an expiry date of the lock. Each object has segments, and the object metadata further has a respective expiry date and lock count for each segment, where at least some segments are shared among two or more files. The lock count and retention time are updated for all segments of the file being locked. If the object is not already locked, the object is locked using a retention lock defining a retention time, and the object metadata is updated with a new lock count and the retention time; otherwise the lock count is incremented and the retention time is updated for the expiry date if the expiry date of a previous lock is older than a current expiry date.
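
    The lock-or-increment rule at the object level can be sketched as below. The `ObjectMeta` class, its field names, and the use of integer epoch seconds are assumptions for the example; per-segment metadata and the actual cloud retention-lock call are omitted.

```python
from dataclasses import dataclass


@dataclass
class ObjectMeta:
    lock_count: int = 0
    retain_until: int = 0  # assumed: expiry as epoch seconds


def lock_object(meta: ObjectMeta, expiry: int) -> None:
    """Apply one file's retention lock to an object that may be shared."""
    if meta.lock_count == 0:
        # Not yet locked: set a fresh lock with this expiry.
        meta.lock_count = 1
        meta.retain_until = expiry
    else:
        # Already locked by another file: count the new lock and
        # extend retention only if the new expiry is later.
        meta.lock_count += 1
        if meta.retain_until < expiry:
            meta.retain_until = expiry


shared = ObjectMeta()
lock_object(shared, expiry=1_800_000_000)  # first file locks the object
lock_object(shared, expiry=1_700_000_000)  # second file, earlier expiry
lock_object(shared, expiry=1_900_000_000)  # third file, later expiry
```

    Keeping the latest expiry ensures an object shared by several deduplicated files stays locked until the longest-lived lock expires.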
