Abstract:
Systems and methods integrate disparate backup devices with a unified interface. In certain examples, a management console manages data from various backup devices, while retaining such data in its native format. The management console can display a hierarchical view of the client devices and/or their data and can further provide utilities for processing the various data formats. A data structure including fields for storing both metadata common to the client device data and value-added metadata can be used to mine or process the data of the disparate client devices. The unified platform and interface reduce the need for multiple data management products and/or customized data utilities for each individual client device and provide a single-pane-of-glass view into data management operations. Integrating the various types of storage formats and media allows a user to retain existing storage infrastructures and further facilitates scaling to meet long-term management needs.
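The data structure mentioned above, with fields for common metadata alongside value-added metadata, might be sketched as follows. This is only an illustration: the class name BackupRecord, its field names, and the find_by_tag helper are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class BackupRecord:
    # Metadata assumed to be common to data from every client device.
    device_id: str
    native_format: str
    path: str
    size_bytes: int
    # Value-added metadata (hypothetical examples: tags, classifications).
    value_added: dict = field(default_factory=dict)


def find_by_tag(records, key, value):
    """Return the records whose value-added metadata has key == value."""
    return [r for r in records if r.value_added.get(key) == value]
```

Because the native data is only described, not converted, records from different device formats can be queried together through the shared fields.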
Abstract:
Methods and systems are described for performing storage operations on electronic data in a network. In response to the initiation of a storage operation and according to a first set of selection logic, a media management component is selected to manage the storage operation. In response to the initiation of the storage operation and according to a second set of selection logic, a network storage device is selected to associate with the storage operation. The selected media management component and the selected network storage device perform the storage operation on the electronic data.
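A minimal sketch of the two independent selection steps is given below. The abstract does not say what either set of selection logic contains, so the policies here (least-loaded media management component, and the storage device with the most free space that can hold the data) are assumptions, as are all class and function names.

```python
from dataclasses import dataclass


@dataclass
class MediaManagementComponent:
    name: str
    active_jobs: int


@dataclass
class NetworkStorageDevice:
    name: str
    free_gb: float


def select_media_component(components):
    # Assumed first selection logic: pick the least-loaded component.
    return min(components, key=lambda c: c.active_jobs)


def select_storage_device(devices, required_gb):
    # Assumed second selection logic: most free space among devices that fit.
    candidates = [d for d in devices if d.free_gb >= required_gb]
    if not candidates:
        raise ValueError("no storage device has enough free space")
    return max(candidates, key=lambda d: d.free_gb)


def start_storage_operation(components, devices, required_gb):
    """Run both selections when a storage operation is initiated."""
    component = select_media_component(components)
    device = select_storage_device(devices, required_gb)
    return component.name, device.name
```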
Abstract:
Data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, are performed within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
Abstract:
Virtual machine (VM) proliferation may be reduced through the use of Virtual Server Agents (VSAs) assigned to a group of VM hosts that may determine the availability of a VM to perform a task. Tasks may be assigned to existing VMs instead of creating a new VM to perform the task. Furthermore, a VSA coordinator may determine a grouping of VMs or VM hosts based on one or more factors associated with the VMs or the VM hosts, such as VM type or geographical location of the VM hosts. The VSA coordinator may also assign one or more VSAs to facilitate managing the group of VM hosts. In some embodiments, the VSA coordinators may facilitate load balancing of VSAs during operation, such as during a backup operation, a restore operation, or any other operation between a primary storage system and a secondary storage system.
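The grouping step could look roughly like the sketch below, which groups VM hosts by VM type and location and then assigns a VSA to each group. The dictionary keys, function names, and the round-robin assignment policy are hypothetical; the abstract names the grouping factors but not a specific algorithm.

```python
from collections import defaultdict


def group_hosts(hosts, keys=("vm_type", "location")):
    """Group host names by the chosen factors (assumed to be dict keys)."""
    groups = defaultdict(list)
    for host in hosts:
        groups[tuple(host[k] for k in keys)].append(host["name"])
    return dict(groups)


def assign_vsas(groups, vsas):
    """Assign one VSA per group, round-robin (an assumed policy)."""
    assignment = {}
    for i, group_key in enumerate(sorted(groups)):
        assignment[group_key] = vsas[i % len(vsas)]
    return assignment
```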
Abstract:
An information management system according to certain aspects can implement application archiving. The system may archive one or more applications on computing devices to make more storage space available on these devices. The system can determine which applications on a client computing device to archive based on various factors. Some examples of factors can include frequency of use, application type, amount of application data and/or storage, user and/or device location, etc. The data to be archived can include one or more executable file(s), metadata, actual data, etc. After an application is archived, the system can generate a placeholder for the application; a placeholder can include information for restoring the archived application.
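One way to picture the archiving decision and the placeholder is sketched below. The thresholds, the AppInfo fields, and the placeholder layout are assumptions chosen for illustration; the abstract lists the factors only in general terms.

```python
from dataclasses import dataclass


@dataclass
class AppInfo:
    name: str
    app_type: str
    days_since_last_use: int
    size_mb: float


def should_archive(app, min_idle_days=90, min_size_mb=50.0,
                   protected_types=("system",)):
    """Assumed rule: archive large, rarely used, non-protected applications."""
    if app.app_type in protected_types:
        return False
    return app.days_since_last_use >= min_idle_days and app.size_mb >= min_size_mb


def make_placeholder(app, archive_location):
    """Build a placeholder holding what a restore would need (hypothetical)."""
    return {"app": app.name, "archive_location": archive_location}
```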
Abstract:
A replication feature for providing faster granular file-level replication between distinct data storage devices is managed and orchestrated by components of an illustrative data storage management system. Information and data objects extracted from snapshots or from primary storage at a source file system are replicated to a destination file system by way of a special-purpose restore operation. The file-level granular replication approach selectively transmits only net changed data from source to destination without passing through a backup copy phase. The illustrative replication operation causes source data to be snapshotted; identifies net changed data in the file system since a preceding replication, e.g., add, change, delete, move, etc.; selectively extracts new/changed data objects from the snapshot along with additional information on moves and deletions; and restores the extracted net changed data to the destination. The illustrative replication feature does not rely on making backup copies.
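To make the notion of net changed data concrete, the sketch below compares two snapshot manifests and classifies paths as added, changed, deleted, or moved. It assumes each manifest is a mapping from path to a content digest and treats a deleted path whose digest reappears at a new path as a move; the abstract does not describe how changes are actually identified.

```python
def net_changes(previous, current):
    """Classify differences between two {path: digest} manifests."""
    added = {p for p in current if p not in previous}
    deleted = {p for p in previous if p not in current}
    changed = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )

    # Index deleted paths by digest so they can be matched against new paths.
    deleted_by_digest = {}
    for path in sorted(deleted):
        deleted_by_digest.setdefault(previous[path], []).append(path)

    moves = []
    for new_path in sorted(added):
        old_paths = deleted_by_digest.get(current[new_path])
        if old_paths:
            moves.append((old_paths.pop(0), new_path))

    moved_from = {old for old, _ in moves}
    moved_to = {new for _, new in moves}
    return {
        "add": sorted(added - moved_to),
        "change": changed,
        "delete": sorted(deleted - moved_from),
        "move": moves,
    }
```

Only the entries under "add" and "change" would need data extracted from the snapshot; moves and deletions can be replayed at the destination from the path information alone.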
Abstract:
Described in detail herein are systems and methods for managing single instancing data. Using a single instance database and other constructs (e.g. sparse files), data density on archival media (e.g. magnetic tape) is improved, and the number of files per storage operation is reduced. According to one aspect of a method for managing single instancing data, for each storage operation, a chunk folder is created on a storage device that stores single instancing data. The chunk folder contains three files: 1) a file that contains data objects that have been single instanced; 2) a file that contains data objects that have not been eligible for single instancing; and 3) a metadata file used to track the location of data objects within the other files. A second storage operation subsequent to a first storage operation contains references to data objects in the chunk folder created by the first storage operation instead of the data objects themselves.
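The chunk folder layout can be sketched as follows. The file names (si_data, non_si_data, metadata.json), the size-based eligibility rule, and the use of SHA-256 digests to recognize identical data objects are all assumptions made for the example, not details taken from the abstract.

```python
import hashlib
import json
from pathlib import Path

MIN_SI_SIZE = 8  # hypothetical eligibility threshold, in bytes


def store_operation(root, op_name, objects, si_index):
    """Write one chunk folder; si_index maps digest -> [folder, offset, length]."""
    folder = Path(root) / op_name
    folder.mkdir(parents=True)
    metadata = {}
    with open(folder / "si_data", "wb") as si_file, \
            open(folder / "non_si_data", "wb") as non_si_file:
        for name, data in objects.items():
            if len(data) < MIN_SI_SIZE:
                # Not eligible: always stored in this operation's folder.
                offset = non_si_file.tell()
                non_si_file.write(data)
                metadata[name] = {"folder": op_name, "file": "non_si_data",
                                  "offset": offset, "length": len(data)}
                continue
            digest = hashlib.sha256(data).hexdigest()
            if digest not in si_index:
                # First instance: store the data and remember where it lives.
                offset = si_file.tell()
                si_file.write(data)
                si_index[digest] = [op_name, offset, len(data)]
            ref_folder, offset, length = si_index[digest]
            # Later instances only reference the earlier chunk folder.
            metadata[name] = {"folder": ref_folder, "file": "si_data",
                              "offset": offset, "length": length}
    (folder / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return metadata
```

In this sketch a second operation that sees the same object writes nothing to its own si_data file and records a reference to the first operation's folder instead.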
Abstract:
A method and system for reducing storage requirements and speeding up storage operations by reducing the storage of redundant data includes receiving a request that identifies one or more data objects to which to apply a storage operation. For each data object, the storage system determines if the data object contains data that matches another data object to which the storage operation was previously applied. If the data objects do not match, then the storage system performs the storage operation in a usual manner. However, if the data objects do match, then the storage system may avoid performing the storage operation.
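The match-then-skip behavior might be sketched as below. Matching by SHA-256 digest is an assumption, since the abstract does not say how two data objects are compared, and the function and parameter names are hypothetical.

```python
import hashlib


def apply_storage_operation(objects, seen_digests, store):
    """Store each object unless matching data was already stored.

    objects: dict of name -> bytes
    seen_digests: set of digests from earlier storage operations (mutated)
    store: callable(name, data) that performs the usual storage operation
    """
    performed, skipped = [], []
    for name, data in objects.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen_digests:
            # Matching data was stored before, so avoid repeating the work.
            skipped.append(name)
            continue
        store(name, data)
        seen_digests.add(digest)
        performed.append(name)
    return performed, skipped
```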
Abstract:
Virtual machine (VM) proliferation may be reduced by determining the availability of existing VMs to perform a task. Tasks may be assigned to existing VMs instead of creating a new VM to perform the task. Furthermore, a coordinator may determine a grouping of VMs or VM hosts based on one or more factors associated with the VMs or the VM hosts, such as VM type or geographical location of the VM hosts. The coordinator may also assign one or more Virtual Server Agents (VSAs) to facilitate managing the group of VM hosts. In some embodiments, the coordinators may facilitate load balancing of VSAs during operation, such as during a backup operation, a restore operation, or any other operation between a primary storage system and a secondary storage system.
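Reusing an existing VM before creating a new one could be sketched as follows. The dictionary fields (name, vm_type, busy) and the create_vm callback are hypothetical, and availability is reduced here to a single busy flag for simplicity.

```python
def assign_task(task, vms, create_vm):
    """Assign a task to an available existing VM, creating one only if needed."""
    for vm in vms:
        if not vm["busy"] and vm["vm_type"] == task["vm_type"]:
            vm["busy"] = True
            return vm["name"]
    # No suitable existing VM is available, so fall back to creating one.
    new_vm = create_vm(task["vm_type"])
    new_vm["busy"] = True
    vms.append(new_vm)
    return new_vm["name"]
```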
Abstract:
Described herein are techniques for better understanding problems arising in an illustrative information management system, such as a data storage management system, and for issuing appropriate alerts and reporting to data management professionals. The illustrative embodiments include a number of features that detect and raise awareness of anomalies in system operations. Categories of interest include events and job anomalies, such as long-running jobs and job success/failure rates. Anomalies are characterized by frequency and/or by occurrence counts. Utilization is also of interest for certain key system resources, such as deduplication databases, CPU and memory at the storage manager, etc., without limitation. Predicting low utilization periods for these and other key resources is useful for scheduling maintenance activities without interfering with ordinary data protection jobs.
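As a rough illustration of two of these ideas, the sketch below flags a long-running job and picks a likely low-utilization hour. The statistical rule (mean plus k standard deviations of past durations) and the hourly averaging are assumptions; the abstract does not specify the detection or prediction methods.

```python
from statistics import mean, stdev


def is_long_running(history_minutes, current_minutes, k=3.0):
    """Flag a job whose duration exceeds mean + k * stdev of past runs."""
    if len(history_minutes) < 2:
        return False  # too little history to estimate spread
    threshold = mean(history_minutes) + k * stdev(history_minutes)
    return current_minutes > threshold


def quietest_hour(samples):
    """Return the hour of day with the lowest average utilization.

    samples: iterable of (hour_of_day, utilization_percent) pairs.
    """
    by_hour = {}
    for hour, utilization in samples:
        by_hour.setdefault(hour, []).append(utilization)
    return min(by_hour, key=lambda h: mean(by_hour[h]))
```

The hour returned by quietest_hour is a candidate window for maintenance on the resource that was sampled.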