Abstract:
A versioned file system comprises a set of structured data representations. At a first time, an interface creates and exports to a data store a first structured data representation corresponding to a first version of a local file system. The first structured data representation is an XML tree having a root element, one or more directory elements associated with the root element, and one or more file elements associated with a given directory element. Upon a change within the file system (e.g., file creation, file deletion, file modification, directory creation, directory deletion and directory modification), the interface creates and exports a second structured data representation corresponding to a second version of the file system. The second structured data representation differs from the first structured data representation up to and including the root element of the second structured data representation. The data store may comprise a cloud storage service provider.
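The way a change propagates up to the root can be illustrated with a small path-copying sketch. It uses plain Python objects rather than XML, and the `Node` class and `write_file` function are hypothetical names chosen for illustration. Sharing the untouched subtrees between versions is an assumption of this sketch, not something the abstract states.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Node:
    """One element of a version tree (hypothetical stand-in for an XML element)."""
    name: str
    content: Optional[str] = None  # None for directories
    children: Dict[str, "Node"] = field(default_factory=dict)


def write_file(node: Node, path: Sequence[str], content: str) -> Node:
    """Return a new tree in which the file at `path` holds `content`.

    Every node on the way from the changed file up to the root is
    re-created; subtrees off that path are reused unchanged (assumption).
    """
    if not path:
        return Node(node.name, content)
    head, rest = path[0], path[1:]
    child = node.children.get(head, Node(head))
    new_children = dict(node.children)
    new_children[head] = write_file(child, rest, content)
    return Node(node.name, node.content, new_children)


# First version: a root with two directories.
v1 = Node("/", children={
    "docs": Node("docs", children={"a.txt": Node("a.txt", "one")}),
    "src": Node("src", children={"m.py": Node("m.py", "pass")}),
})
# Second version: modifying one file yields a new root and a new "docs".
v2 = write_file(v1, ["docs", "a.txt"], "two")
```

The first version stays readable after the change, because `write_file` never mutates an existing node.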
Abstract:
A mechanism for adjusting seek activity in a data storage system of physical devices having mirrored logical volumes is presented. Statistics describing at least reading data from the mirrored volumes during successive time periods are collected. From the collected statistics an activity level associated with each of the mirrored logical volumes is determined. Seek activity values for the physical devices are computed based on the activity levels associated with the logical volumes stored on each of the physical devices. The computed seek activity values relate a physical device seek activity to the activity level associated with, and distance between, the mirrored logical volumes residing on the physical devices. The computed seek values are used to minimize seek activity for non-mirrored ones of the physical devices.
Abstract:
An archival storage cluster of preferably symmetric nodes includes a data protection management system that periodically organizes the then-available nodes into one or more protection sets, with each set comprising a set of n nodes, where “n” refers to a configurable “data protection level” (DPL). At the time of its creation, a given protection set is closed in the sense that each then-available node is a member of one, and only one, protection set. When an object is to be stored within the archive, the data protection management system stores the object in a given node of a given protection set and then constrains the distribution of copies of that object to other nodes within the given protection set. As a consequence, all DPL copies of an object are stored within the same protection set, and only that protection set. This scheme significantly improves the mean time to data loss (MTDL) for the cluster as a whole, as the data can only be lost if multiple failures occur within nodes of a given protection set. This is far less likely than failures occurring across any random distribution of nodes within the cluster.
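The grouping and placement rule can be sketched as follows. The function names, the requirement that the node count be a multiple of the DPL, and the use of a CRC32 hash to pick a protection set are all assumptions made for illustration; the abstract does not say how a set is chosen or how leftover nodes are handled.

```python
import zlib
from typing import List


def build_protection_sets(nodes: List[str], dpl: int) -> List[List[str]]:
    """Partition the available nodes into disjoint sets of exactly `dpl` nodes."""
    if dpl < 1 or len(nodes) % dpl != 0:
        # Simplification: this sketch only handles an exact partition.
        raise ValueError("node count must be a positive multiple of the DPL")
    return [nodes[i:i + dpl] for i in range(0, len(nodes), dpl)]


def place_object(object_id: str, sets: List[List[str]]) -> List[str]:
    """Pick one protection set (by hash, an assumption) and return its nodes.

    All DPL copies of the object go to the nodes of that single set.
    """
    index = zlib.crc32(object_id.encode("utf-8")) % len(sets)
    return list(sets[index])


nodes = [f"node{i}" for i in range(8)]
sets = build_protection_sets(nodes, dpl=2)
holders = place_object("invoice-0001", sets)  # two nodes from the same set
```

With 8 nodes and a DPL of 2, only the 4 pairs that form a protection set can lose data when both members fail, whereas placing copies on arbitrary pairs exposes up to 28 distinct pairs.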
Abstract:
An automated system randomly generates test cases for hardware or software quality assurance testing. A test case comprises a sequence of discrete, atomic steps (or “building blocks”). A particular test case has a variable number of building blocks. The system takes a set of test actions and links them together to create a much larger library of test cases or “chains.” The chains comprise a large number of random sequence tests that facilitate “chaos-like” or exploratory testing of the overall system under test. Upon execution in the system under test, the test case is considered successful if each building block in the chain executes successfully; if any building block fails, the test case, in its entirety, is considered a failure.
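A minimal sketch of chain generation and the all-or-nothing pass rule is shown below. The function names, the length bounds, and the convention that a building block is a callable returning `True` on success are assumptions for illustration. Stopping at the first failing block is also a choice of this sketch.

```python
import random
from typing import Callable, Dict, List


def make_chain(action_names: List[str], rng: random.Random,
               min_len: int, max_len: int) -> List[str]:
    """Build one test case: a random-length sequence of building blocks."""
    length = rng.randint(min_len, max_len)
    return [rng.choice(action_names) for _ in range(length)]


def run_chain(chain: List[str], actions: Dict[str, Callable[[], bool]]) -> bool:
    """A chain passes only if every building block in it succeeds."""
    for name in chain:
        try:
            ok = actions[name]()
        except Exception:
            ok = False  # an exception counts as a failed block
        if not ok:
            return False
    return True


# Hypothetical building blocks for a storage system under test.
actions = {
    "write_object": lambda: True,
    "read_object": lambda: True,
    "delete_object": lambda: True,
}
rng = random.Random(42)  # seeded so a failing chain can be reproduced
chains = [make_chain(sorted(actions), rng, 2, 6) for _ in range(100)]
results = [run_chain(chain, actions) for chain in chains]
```

Seeding the generator is a design choice that keeps the random chains reproducible when a failure needs to be investigated.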
Abstract:
A memory storage device has a file storage operating system which uses an inode to record and find segments of each data file. The inode includes a plurality of rows. A portion of the rows are written with direct extents pointing to data blocks storing portions of file segments. At least two of the extents point to data blocks having addresses in different logical volumes.
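One way to picture an inode whose direct extents span logical volumes is the sketch below. The field names, the block-based addressing, and the linear search are assumptions made for illustration rather than the layout described in the underlying disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Extent:
    """One inode row: a run of consecutive data blocks on one logical volume."""
    file_offset: int   # first file block covered by this extent
    volume_id: int     # logical volume holding the data blocks
    start_block: int   # address of the first data block on that volume
    length: int        # number of blocks in the run


@dataclass
class Inode:
    rows: List[Extent]

    def locate(self, file_block: int) -> Tuple[int, int]:
        """Map a file block number to (volume_id, block address)."""
        for ext in self.rows:
            if ext.file_offset <= file_block < ext.file_offset + ext.length:
                return ext.volume_id, ext.start_block + (file_block - ext.file_offset)
        raise KeyError(f"file block {file_block} is not mapped")


# A file whose first four blocks live on volume 1 and next four on volume 2.
inode = Inode(rows=[Extent(0, 1, 100, 4), Extent(4, 2, 500, 4)])
volume, block = inode.locate(5)
```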
Abstract:
A system and method for increasing efficiency in a mass storage system such as a RAID (redundant array of inexpensive disks) array with a cache memory. Multi-host mass storage systems employ a data structure called a write tree. The write tree is stored in cache memory, and is used to mark addressable data elements stored in the cache memory which must be written back to disk (referred to as “destaging”, or “write-backs”). Disks and disk controllers must scan and traverse the write tree to search for pending write operations. By storing a write tree cache apart from the write tree, the system efficiency is greatly increased. The write tree cache consists of a cylinder address as found in the write tree, and a bit mask indicating pending write operations at a specific level of the write tree. The specific disks and disk controllers can then avoid accessing the write tree in cache memory when searching for pending write operations.
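The idea of caching a cylinder address together with a bit mask can be sketched roughly as follows. The `WriteTree` class is a simplified stand-in for the shared structure (a mapping from cylinder to pending tracks), and all names are hypothetical. The sketch also ignores cache invalidation and does not clear bits in the tree itself after a destage, both of which a real system would need.

```python
from typing import Dict, Optional, Set


class WriteTree:
    """Simplified stand-in for the write tree held in cache memory."""

    def __init__(self) -> None:
        self.pending: Dict[int, Set[int]] = {}
        self.lookups = 0  # counts how often the shared tree is accessed

    def mark(self, cylinder: int, track: int) -> None:
        self.pending.setdefault(cylinder, set()).add(track)

    def pending_mask(self, cylinder: int) -> int:
        """Return a bit mask with bit t set if track t has a pending write."""
        self.lookups += 1
        mask = 0
        for track in self.pending.get(cylinder, ()):
            mask |= 1 << track
        return mask


class WriteTreeCache:
    """Local copy of one cylinder address plus its pending-write bit mask."""

    def __init__(self, tree: WriteTree) -> None:
        self.tree = tree
        self.cylinder: Optional[int] = None
        self.mask = 0

    def next_pending(self, cylinder: int) -> Optional[int]:
        """Return the lowest pending track, consulting the tree only on a miss."""
        if self.cylinder != cylinder:
            self.cylinder = cylinder
            self.mask = self.tree.pending_mask(cylinder)
        if self.mask == 0:
            return None
        track = (self.mask & -self.mask).bit_length() - 1
        self.mask &= self.mask - 1  # clear the lowest set bit
        return track
```

In this sketch, repeated searches within the same cylinder are answered from the local mask, so the shared tree is read once per cylinder rather than once per pending write.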
Abstract:
A versioned file system comprises a set of structured data representations, such as XML. Each structured data representation corresponds to a “version,” and each version comprises a tree of write-once objects rooted at a root directory manifest. Each version in the versioned file system has associated therewith a “borrow window.” When it is desired to reconstruct the file system to a point in time (or, more generally, a given state), i.e., to perform a “restore,” it is only required to walk (use) a single structured data representation (a tree). During a restore, metadata is pulled back from the cloud first, so users can see the existence of needed files immediately. The remainder of the data is then pulled back from the cloud if/when the user goes to open the file. As a result, the entire file system (or any portion thereof) can be restored to a previous time nearly instantaneously. A “fast” restore is performed if an object being restored exists within a “borrow window” of the version from which the system is restoring.
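The metadata-first restore can be sketched as a set of placeholder entries whose contents are fetched on first open. The `RestoredFile` class, the `restore` function, the manifest format (name mapped to size), and the `fetch` callback are all hypothetical; the borrow-window logic is not modeled here.

```python
from typing import Callable, Dict, Optional


class RestoredFile:
    """A file that is visible immediately but whose data is fetched lazily."""

    def __init__(self, name: str, size: int,
                 fetch: Callable[[str], bytes]) -> None:
        self.name = name
        self.size = size            # metadata, available right after restore
        self._fetch = fetch
        self._data: Optional[bytes] = None

    def open(self) -> bytes:
        """Pull the contents from the cloud on first access, then reuse them."""
        if self._data is None:
            self._data = self._fetch(self.name)
        return self._data


def restore(manifest: Dict[str, int],
            fetch: Callable[[str], bytes]) -> Dict[str, RestoredFile]:
    """Walk one version's manifest and create entries without fetching data."""
    return {name: RestoredFile(name, size, fetch)
            for name, size in manifest.items()}


# Hypothetical cloud store and the manifest of the version being restored.
cloud = {"report.doc": b"quarterly numbers", "notes.txt": b"todo"}
manifest = {name: len(data) for name, data in cloud.items()}
files = restore(manifest, fetch=lambda name: cloud[name])
listing = sorted(files)               # names are visible at once
contents = files["notes.txt"].open()  # data is pulled only now
```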
Abstract:
A cluster recovery process is implemented across a set of distributed archives, where each individual archive is a storage cluster of preferably symmetric nodes. Each node of a cluster typically executes an instance of an application that provides object-based storage of fixed content data and associated metadata. According to the storage method, an association or “link” between a first cluster and a second cluster is first established to facilitate replication. The first cluster is sometimes referred to as a “primary” whereas the second cluster is sometimes referred to as a “replica.” Once the link is made, the first cluster's fixed content data and metadata are then replicated from the first cluster to the second cluster, preferably in a continuous manner. Upon a failure of the first cluster, however, a failover operation occurs, and clients of the first cluster are redirected to the second cluster. Upon repair or replacement of the first cluster (a “restore”), the repaired or replaced first cluster resumes authority for servicing the clients of the first cluster. This restore operation preferably occurs in two stages: a “fast recovery” stage that involves preferably “bulk” transfer of the first cluster metadata, followed by a “fail back” stage that involves the transfer of the fixed content data. Upon receipt of the metadata from the second cluster, the repaired or replaced first cluster resumes authority for the clients irrespective of whether the fail back stage has completed or even begun.
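The ordering of failover, fast recovery, and fail back can be modeled as a small state sketch. The class, method, and attribute names are hypothetical, and the model tracks only which cluster holds authority and which transfers have completed.

```python
from enum import Enum


class Authority(Enum):
    PRIMARY = "primary"
    REPLICA = "replica"


class ClusterPair:
    """Tracks which cluster serves clients during failure and restore."""

    def __init__(self) -> None:
        self.authority = Authority.PRIMARY
        self.metadata_restored = True
        self.content_restored = True

    def fail_primary(self) -> None:
        """Failover: clients are redirected to the replica."""
        self.authority = Authority.REPLICA
        self.metadata_restored = False
        self.content_restored = False

    def fast_recovery(self) -> None:
        """Stage 1: bulk metadata transfer; the primary resumes authority."""
        self.metadata_restored = True
        self.authority = Authority.PRIMARY

    def fail_back(self) -> None:
        """Stage 2: transfer of fixed content data, after the metadata stage."""
        if not self.metadata_restored:
            raise RuntimeError("fail back requires fast recovery first")
        self.content_restored = True


pair = ClusterPair()
pair.fail_primary()
pair.fast_recovery()
# The primary serves clients again although content is still in transit.
serving_before_fail_back = (pair.authority is Authority.PRIMARY
                            and not pair.content_restored)
pair.fail_back()
```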
Abstract:
A generic testing framework to automatically allocate, install and verify a given version of a system under test, to exercise the system against a series of tests in a “hands-off” objective manner, and then to export information about the tests to one or more developer repositories (such as a query-able database, an email list, a developer web server, a source code version control system, a defect tracking system, or the like). The framework does not “care” or concern itself with the particular implementation language of the test as long as the test can issue directives via a command line or configuration file. During the automated testing of a given test suite having multiple tests, and after a particular test is run, the framework preferably generates an “image” of the system under test and makes that information available to developers, even while additional tests in the suite are being carried out. In this manner, the framework preserves the system “state” to facilitate concurrent or after-the-fact debugging. The framework also will re-install and verify a given version of the system between tests, which may be necessary in the event a given test is destructive or otherwise places the system in an unacceptable condition.