OPTIMIZED RESTORATION OF DEDUPLICATED DATA STORED IN CLOUD-BASED STORAGE RESOURCES

    Publication No.: US20230315681A1

    Publication date: 2023-10-05

    Application No.: US18330564

    Filing date: 2023-06-07

    Abstract: Techniques disclosed herein are well suited to restoring deduplicated backup data from cloud-based storage and from multi-node replicated file systems, and they also improve performance in more traditional data storage technologies. Pre-restore steps include analysis of deduplication indexes to identify data segments that are stored consecutively on storage media. Reading data in aggregate runs of consecutively stored data segments reduces interactions with the storage media that host the deduplicated data and speeds up retrieval. Parallel reads from multiple storage devices in multi-node replicated file systems also speed up retrieval. An illustrative enhanced media agent pre-fetches data (stored in deduplicated form) in anticipation of read requests that are expected in the restore operation. The pre-fetched data is temporarily stored locally at the media agent, which is responsible for interfacing with storage media and is further responsible for orchestrating the disclosed techniques within an illustrative data storage management system.
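
The idea of reading "aggregate runs of consecutively stored data segments" can be illustrated with a minimal sketch. The `Segment` record, its fields, and the `coalesce_runs` helper are hypothetical names chosen for this example and are not taken from the patent; the sketch only shows how adjacent segments listed in a deduplication index might be merged so that each run needs a single read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    file_id: str  # hypothetical container file on the storage media
    offset: int   # byte offset of the segment within that file
    length: int   # segment length in bytes


def coalesce_runs(segments):
    """Merge segments stored back to back into (file_id, offset, length) runs."""
    runs = []
    for seg in sorted(segments, key=lambda s: (s.file_id, s.offset)):
        if runs and runs[-1][0] == seg.file_id and runs[-1][1] + runs[-1][2] == seg.offset:
            # The segment starts exactly where the previous run ends: extend the run.
            file_id, start, length = runs[-1]
            runs[-1] = (file_id, start, length + seg.length)
        else:
            runs.append((seg.file_id, seg.offset, seg.length))
    return runs


# Four index entries collapse into three reads.
index = [Segment("c1", 0, 4), Segment("c1", 4, 4), Segment("c1", 16, 4), Segment("c2", 0, 8)]
runs = coalesce_runs(index)
```

Each resulting run could then be fetched with one request, and runs that live on different devices could be fetched in parallel.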

    PRUNING DATA SEGMENTS STORED IN CLOUD STORAGE TO RECLAIM CLOUD STORAGE SPACE

    Publication No.: US20230153010A1

    Publication date: 2023-05-18

    Application No.: US17526927

    Filing date: 2021-11-15

    CPC classification number: G06F3/0652 G06F3/0604 G06F3/0644 G06F3/067

    Abstract: An information management system uses cloud storage resources to store secondary copies of primary data created by client computing devices managed by a storage manager. Deduplication operations are performed on the secondary copies, which results in chunk metadata indices that allow for tracking and faster retrieval of the deduplicated secondary copies. The chunk metadata indices may reference data segments of the deduplicated secondary copies. The data segments may be stored in, and across, one or more sub-files. As the secondary copies are aged out from the cloud storage resources, data segments are identified as being orphaned or non-orphaned. Data segments that are orphaned are pruned to remove their corresponding sub-files from the cloud storage resources, where the sub-files are replaced with new sub-files that do not contain the orphaned data segments.
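
A rough sketch of the pruning step follows. It assumes a sub-file can be modeled as a mapping from segment identifier to bytes and that the set of still-referenced segment identifiers has already been derived from the chunk metadata indices; both the layout and the `prune_sub_file` name are illustrative assumptions, not the patent's data structures.

```python
def prune_sub_file(sub_file, referenced):
    """Return a replacement sub-file without orphaned segments.

    sub_file:   dict of segment id -> bytes (hypothetical in-memory layout)
    referenced: set of segment ids still referenced by some chunk metadata index
    Returns the original object when nothing is orphaned, and None when every
    segment is orphaned (the sub-file can be deleted outright).
    """
    kept = {sid: data for sid, data in sub_file.items() if sid in referenced}
    if len(kept) == len(sub_file):
        return sub_file  # no orphans, so no rewrite is needed
    return kept or None


old = {"s1": b"aaaa", "s2": b"bbbb", "s3": b"cccc"}
new = prune_sub_file(old, referenced={"s1", "s3"})
reclaimed = sum(len(v) for v in old.values()) - sum(len(v) for v in new.values())
```

Returning the original object when no segment is orphaned avoids rewriting sub-files unnecessarily, which matters when each rewrite is a cloud upload.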

    BLOCK-LEVEL SINGLE INSTANCING
    Invention application

    Publication No.: US20220382643A1

    Publication date: 2022-12-01

    Application No.: US17884482

    Filing date: 2022-08-09

    Abstract: Described in detail herein are systems and methods for single instancing blocks of data in a data storage system. For example, the data storage system may include multiple computing devices (e.g., client computing devices) that store primary data. The data storage system may also include a secondary storage computing device, a single instance database, and one or more storage devices that store copies of the primary data (e.g., secondary copies, tertiary copies, etc.). The secondary storage computing device receives blocks of data from the computing devices and accesses the single instance database to determine whether the blocks of data are unique (meaning that no instances of the blocks of data are stored on the storage devices). If a block of data is unique, the single instance database stores it on a storage device. If not, the secondary storage computing device can avoid storing the block of data on the storage devices.
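
The uniqueness check described above can be sketched as follows. The abstract does not say how blocks are identified, so this example assumes a SHA-256 content digest as the lookup key; the `SingleInstanceStore` class, with a dict standing in for the single instance database and a list standing in for the storage device, is purely illustrative.

```python
import hashlib


class SingleInstanceStore:
    def __init__(self):
        self.index = {}   # digest -> location; stands in for the single instance database
        self.device = []  # stands in for a storage device

    def put(self, block):
        """Store the block only if unseen; return (location, was_stored)."""
        digest = hashlib.sha256(block).hexdigest()  # assumed identification scheme
        if digest in self.index:
            # An instance already exists, so the block is not stored again.
            return self.index[digest], False
        self.device.append(block)
        self.index[digest] = len(self.device) - 1
        return self.index[digest], True


store = SingleInstanceStore()
first = store.put(b"block-A")
second = store.put(b"block-A")  # duplicate of the first block
third = store.put(b"block-B")
```

After these three calls the device holds two blocks, and the duplicate resolves to the location of the first instance.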

    DATA STORAGE SYSTEM WITH RAPID RESTORE CAPABILITY

    Publication No.: US20220210243A1

    Publication date: 2022-06-30

    Application No.: US17498212

    Filing date: 2021-10-11

    Abstract: An improved information management system that implements a staging area or cache to temporarily store primary data in a native format before the primary data is converted into secondary copies in a secondary format is described herein. For example, the improved information management system can include various media agents that each include one or more high speed drives. When a client computing device provides primary data for conversion into secondary copies, the primary data can initially be stored in the native format in the high speed drive(s). If the client computing device then submits a request for the primary data, the media agent can simply retrieve the primary data from the high speed drive(s) and transmit the primary data to the client computing device. Because the primary data is already in the native format, no conversion operations are performed by the media agent, thereby reducing the restore delay.
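
The restore path described above can be sketched under simplifying assumptions: a dict stands in for the high speed drive, `zlib` compression stands in for conversion to the secondary format, and conversion happens immediately at backup time rather than later. The `MediaAgent` class and its method names are hypothetical.

```python
import zlib


class MediaAgent:
    def __init__(self):
        self.staging = {}    # item id -> primary data in native format (high speed drive stand-in)
        self.secondary = {}  # item id -> data in secondary format

    def backup(self, item_id, data):
        self.staging[item_id] = data                   # keep the native copy for quick restores
        self.secondary[item_id] = zlib.compress(data)  # assumed stand-in for format conversion

    def evict(self, item_id):
        self.staging.pop(item_id, None)                # e.g. when the staging area fills up

    def restore(self, item_id):
        """Return (data, source); staged data needs no conversion."""
        if item_id in self.staging:
            return self.staging[item_id], "staging"
        return zlib.decompress(self.secondary[item_id]), "secondary"


agent = MediaAgent()
agent.backup("doc1", b"hello world")
fast = agent.restore("doc1")
agent.evict("doc1")
slow = agent.restore("doc1")
```

Both restores return the same bytes; only the second one pays for the conversion back to the native format.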

    PARTIAL FILE RESTORE IN A DATA STORAGE SYSTEM

    Publication No.: US20210326214A1

    Publication date: 2021-10-21

    Application No.: US17177018

    Filing date: 2021-02-16

    Abstract: The data storage system according to certain aspects can implement partial file restore, where only a portion of the secondary copy of a file is restored. Such portion may be designated by one or more application offsets for the file. The system may provide an in-chunk index that includes mapping information between the application offsets and the secondary copy offsets. Chunks may refer to logical data units in which secondary copies are stored, and the in-chunk index for a chunk may be stored in secondary storage with the chunk. Because the mapping information may not be provided at a fixed interval, the system can search through application offsets in the in-chunk index to locate the secondary copy offset corresponding to the application offset(s) of the designated portion. In this manner, the system may restore the designated portion of the secondary copy in a fast and efficient manner by using the in-chunk index.
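
Because the mapping entries are not spaced at a fixed interval, the lookup cannot be computed arithmetically and has to be a search. A minimal sketch, assuming the in-chunk index is a list of `(application_offset, secondary_copy_offset)` pairs sorted by application offset (the layout, the sample values, and the `locate` helper are illustrative, not from the patent):

```python
import bisect

# Hypothetical in-chunk index: (application offset, secondary copy offset),
# sorted by application offset, with irregular spacing between entries.
in_chunk_index = [(0, 0), (4096, 1800), (10000, 5200), (65536, 21000)]


def locate(index, app_offset):
    """Return the entry with the largest application offset <= app_offset."""
    keys = [app for app, _ in index]
    pos = bisect.bisect_right(keys, app_offset) - 1
    if pos < 0:
        raise ValueError("application offset precedes the first index entry")
    return index[pos]


entry = locate(in_chunk_index, 12000)
exact = locate(in_chunk_index, 4096)
```

A restore would start reading the secondary copy at the returned secondary copy offset and skip forward to the requested application offset.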

    BLOCK-LEVEL SINGLE INSTANCING
    Invention application

    Publication No.: US20210263803A1

    Publication date: 2021-08-26

    Application No.: US17169257

    Filing date: 2021-02-05

    Abstract: Described in detail herein are systems and methods for single instancing blocks of data in a data storage system. For example, the data storage system may include multiple computing devices (e.g., client computing devices) that store primary data. The data storage system may also include a secondary storage computing device, a single instance database, and one or more storage devices that store copies of the primary data (e.g., secondary copies, tertiary copies, etc.). The secondary storage computing device receives blocks of data from the computing devices and accesses the single instance database to determine whether the blocks of data are unique (meaning that no instances of the blocks of data are stored on the storage devices). If a block of data is unique, the single instance database stores it on a storage device. If not, the secondary storage computing device can avoid storing the block of data on the storage devices.

    OPERATION READINESS CHECKING AND REPORTING
    Invention application

    Publication No.: US20200257655A1

    Publication date: 2020-08-13

    Application No.: US16733134

    Filing date: 2020-01-02

    Abstract: An information management system according to certain aspects may determine whether storage operations will work prior to executing them. The system may check various factors or parameters relating to a storage policy to verify whether the storage policy will work at runtime without actually executing the policy. Some examples of factors can include: availability of primary storage devices, availability of secondary storage devices, license availability for performing that operation, user credentials for connecting to primary and/or secondary storage devices, available storage capacity, connectivity to storage devices, etc. The system may also check whether a particular system configuration is supported in connection with storage operations. The result of the determination can be provided in the form of a report summarizing any problems found with the storage policy. The report can include recommended courses of action or solutions for resolving any identified issues.
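
A readiness check of this kind can be sketched as a list of predicates evaluated against a policy description, with no storage operation actually executed. The policy fields, check names, and recommendations below are hypothetical examples and do not reflect any real product's configuration.

```python
def check_readiness(policy, checks):
    """Evaluate each check against the policy without running it; return found problems."""
    report = []
    for name, predicate, recommendation in checks:
        if not predicate(policy):
            report.append({"check": name, "recommendation": recommendation})
    return report


# Hypothetical policy description and checks, for illustration only.
policy = {"secondary_online": True, "license_valid": False,
          "free_bytes": 50, "required_bytes": 200}
checks = [
    ("secondary storage availability", lambda p: p["secondary_online"],
     "bring the secondary storage device online"),
    ("license availability", lambda p: p["license_valid"],
     "renew or assign a license for this operation"),
    ("storage capacity", lambda p: p["free_bytes"] >= p["required_bytes"],
     "free up space or add capacity"),
]
report = check_readiness(policy, checks)
```

An empty report would mean every check passed; here the report lists the license and capacity problems together with a suggested remedy for each.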
