Abstract:
A data management system (DMS) includes a continuous real-time object store that captures all real-time activities, with associated object metadata information. The DMS is capable of reintroducing any point-in-time view of data ranging from a granular object to an entire file system. A set of algorithms (for creation of a file or directory, modification of a file or directory, deletion of a file or directory, and relocation/renaming of a file or directory) are used to generate and maintain a file system history in the DMS and to ensure that a latest version of a directory always refers to a latest version of its children until the directory changed. Any point-in-time recovery is implemented using the file system history in one of various ways to provide strong individual file integrity, exact point-in-time crash consistency, and/or recovery of last version of all files in the file system.
Abstract:
A “forward” delta data management technique uses a “sparse” index associated with a delta file to achieve both delta management efficiency and to eliminate read latency while accessing history data. The invention may be implemented advantageously in a data management system that provides real-time data services to data sources associated with a set of application host servers. A host driver embedded in an application server connects an application and its data to a cluster. The host driver captures real-time data transactions, preferably in the form of an event journal that is provided to the data management system. In particular, the driver functions to translate traditional file/database/block I/O into a continuous, application-aware, output data stream. A given application-aware data stream is processed through a multi-stage data reduction process to produce a compact data representation from which an “any point-in-time” reconstruction of the original data can be made.
Abstract:
An efficient method to apply an erasure encoding and decoding scheme across dispersed data stores that receive constant updates. A data store is a persistent memory for storing a data block. Such data stores include, without limitation, a group of disks, a group of disk arrays, or the like. An encoding process applies a sequencing method to assign a sequence number to each data and checksum block as they are modified and updated onto their data stores. The method preferably uses the sequence number to identify data set consistency. The sequencing method allows for self-healing of each individual data store, and it maintains data consistency and correctness within a data block and among a group of data blocks. The inventive technique can be applied on many forms of distributed persistent data stores to provide failure resiliency and to maintain data consistency and correctness.
Abstract:
A data management system that protects data into a continuous object store includes a management interface having a time control. The time control allows an administrator to specify a “past” time, such as a single point or range. When the time control is set to a single point, a hierarchical display of data appears on a display exactly as the data existed in the system at that moment in the past. The time control enables the management interface to operate within a history mode in which the display provides a visual representation of a “virtual” point in time in the past during which the data management system has been operative to provide the data protection service.
Abstract:
A data management system or “DMS” provides data services to data sources associated with a set of application host servers. The DMS typically comprises one or more regions, with each region having one or more clusters. A given cluster has one or more nodes that share storage. When providing continuous data protection and data distribution, the DMS nodes create distributed active object storage to provide the necessary real-time data management services. The distributed object store can be built above raw storage devices, a traditional file system, a special purpose file system, a clustered file system, and a database. The DMS active object store provides an indexing service to the active objects. In an illustrative embodiment, any object property that has a given attribute is indexed and, as a result, the attribute becomes searchable. The DMS provides hierarchical distributed indexing using index trees to facilitate searching.
Abstract:
A data management system (“DMS”) provides an automated, continuous, real-time, substantially no downtime data protection service to one or more data sources. A host driver embedded in an application server captures real-time data transactions, preferably in the form of an event journal. The driver functions to translate traditional file/database/block I/O and the like into a continuous, application-aware, output data stream. The host driver includes an event processor that can perform a recovery operation to an entire data source or a subset of the data source using former point-in-time data in the DMS. The recovery operation may have two phases. First, the structure of the host data in primary storage is recovered to the intended recovering point-in-time. Thereafter, the actual data itself is recovered. The event processor enables such data recovery in an on-demand manner, by allowing recovery to happen simultaneously while an application accesses and updates the recovering data.
Abstract:
A data management system (DMS) includes a continuous real-time object store that captures all real-time activities, with associated object metadata information. The DMS is capable of reintroducing any point-in-time view of data ranging from a granular object to an entire file system. A set of algorithms (for creation of a file or directory, modification of a file or directory, deletion of a file or directory, and relocation/renaming of a file or directory) are used to generate and maintain a file system history in the DMS and to ensure that a latest version of a directory always refers to a latest version of its children until the directory changed. Any point-in-time recovery is implemented using the file system history in one of various ways to provide strong individual file integrity, exact point-in-time crash consistency, and/or recovery of last version of all files in the file system.
Abstract:
A “forward” delta data management technique uses a “sparse” index associated with a delta file to achieve both delta management efficiency and to eliminate read latency while accessing history data. The invention may be implemented advantageously in a data management system that provides real-time data services to data sources associated with a set of application host servers. A host driver embedded in an application server connects an application and its data to a cluster. The host driver captures real-time data transactions, preferably in the form of an event journal that is provided to the data management system. In particular, the driver functions to translate traditional file/database/block I/O into a continuous, application-aware, output data stream. A given application-aware data stream is processed through a multi-stage data reduction process to produce a compact data representation from which an “any point-in-time” reconstruction of the original data can be made.
Abstract:
A data management system or “DMS” provides data services to data sources associated with a set of application host servers. The data management system typically comprises one or more regions, with each region having one or more clusters. A given cluster has one or more nodes that share storage. When providing continuous data protection and data distribution, the DMS nodes create distributed object storage to provide the necessary real-time data management services. The objects created by the DMS nodes are so-called active objects. The distributed object store can be built above raw storage devices, a traditional file system, a special purpose file system, a clustered file system, a database, and so on. According to the present invention, the DMS active object store provides an indexing service to the active objects. In an illustrative embodiment, any object property that has a given attribute is indexed and, as a result, the attribute becomes searchable. The DMS provides hierarchical distributed indexing using index trees to facilitate searching in a highly efficient manner.
Abstract:
The present invention provides a distributed clustering method to allow multiple active instances of consistency management processes that apply the same encoding scheme to be cooperative and function collectively. The techniques described herein facilitate an efficient method to apply an erasure encoding and decoding scheme across dispersed data stores that receive constant updates. The technique can be applied on many forms of distributed persistent data stores to provide failure resiliency and to maintain data consistency and correctness.