摘要:
I/O bandwidth reduction using storage-level common page information is implemented by a storage server. In response to receiving a request from a client for a page stored at a first virtual address, the storage server determines that the first virtual address maps to a page that is a duplicate of a page stored at a second virtual address. Or the storage server determines that the first and second virtual addresses map to a deduplicated page within a storage system. The storage server then transmits metadata to the client. The metadata maps the first virtual address to a second virtual address that also maps to the deduplicated page.
摘要:
I/O bandwidth reduction using storage-level common page information is implemented by a storage server, in response to receiving a request from a client for a page stored at a first virtual address, determining that the first virtual address maps to a page that is a duplicate of a page stored at a second virtual address or that the first and second virtual addresses map to a deduplicated page within a storage system, and transmitting metadata to the client mapping the first virtual address to a second virtual address that also maps to the deduplicated page. For one embodiment, the metadata is transmitted in anticipation of a request for the redundant/deduplicated page via the second virtual address. For an alternate embodiment, the metadata is sent in response to a determination that a page that maps to the second virtual address was previously sent to the client.
摘要:
Methods and system for securely capturing workloads at a live network for replaying at a test network. The disclosed system captures file system states and workloads of a live server at the live network. In one embodiment the captured data is anonymized to protect confidentiality of the data. A file system of a test server at the test network is mirrored from a captured state of the live server. An anonymized version of the captured workloads is replayed as a request to the test server. A lost or incomplete command is recreated from the states of the live server. An order of the commands during replay can be based on an order in the captured workload, or based on a causal relationship. Performance characteristics of the live network are determined based on the response to the replayed command.
摘要:
An alignment data structure is used to map a logical data block start address to a physical data block start address dynamically, to service a client data access request. A separate alignment data structure can be provided for each data object managed by the storage system. Each such alignment data structure can be stored in, or referenced by a pointer in, the inode of the corresponding data object. A consequence of the mapping is that certain physical storage medium regions are not mapped to any logical data blocks. These unmapped regions may be visible only to the file system layer and layers that reside between the file system layer and the mass storage subsystem. They can be used, if desired, to store system information, i.e., information that is not visible to any storage client.
摘要:
One or more techniques and/or systems are provided for describing virtual machine dependencies. In particular, data objects, such as virtual hard drives, associated with virtual machines may be identified and/or examined to identify data structures, such as configuration files, comprising configuration data. The configuration data may be analyzed to determine dependency relationships between virtual machines to describe virtual machine dependencies. Identifying virtual machine dependencies, among other things, allows virtual machines that are no longer used to be repurposed, deleted, reset, etc. with little to no adverse effect on other virtual machines.
摘要:
Methods and systems for efficiently determining a similarity between two or more datasets. In one embodiment, the similarity is determined based on comparing a subset of sorted frequency-weighted blocks from one dataset to a subset of sorted frequency-weighed blocks from another dataset. Data blocks of a dataset are converted into hash values that are frequency-weighted. These frequency-weighted hash values can be compared to frequency-weighted hash values of another dataset to determine a similarity of the two datasets. In another embodiment, upon a change of a block in a subset of the dataset, the similarity value is re-determined without resorting or hashing the blocks of a dataset other than the blocks of the subset, resulting in an increased performance of a similarity comparison. In another embodiment, blocks of a dataset are excluded based on a block-filtering rule to increase the accuracy of the similarity comparison.
摘要:
Techniques for deduplication of virtual machine files in a virtualized desktop environment are described, including receiving data into a page cache, the data being received from a virtual machine and indicating a write operation, and deduplicating the data in the page cache prior to committing the data to storage, the data being deduplicated in-band and in substantially real-time.
摘要:
One or more techniques and/or systems are provided for hosting a virtual machine from a snapshot. In particular, a snapshot of a virtual machine hosted on a primary computing device may be created. The virtual machine may be hosted on a secondary computing device using the snapshot, for example, when a failure of the virtual machine on the primary computing device occurs. If a virtual machine type (format) of the snapshot is not supported by the secondary computing device, then the virtual machine within the snapshot may be converted to a virtual machine type supported by the secondary computing device. In this way, the virtual machine may be operable and/or accessible on the secondary computing device despite the failure. Hosting the virtual machine on the secondary computing device provides, among other things, fault tolerance for the virtual machine and/or applications comprised therein.
摘要:
Methods and systems for efficiently determining a similarity between two or more datasets. In one embodiment, the similarity is determined based on comparing a subset of sorted frequency-weighted blocks from one dataset to a subset of sorted frequency-weighed blocks from another dataset. Data blocks of a dataset are converted into hash values that are frequency-weighted. These frequency-weighted hash values can be compared to frequency-weighted hash values of another dataset to determine a similarity of the two datasets. In another embodiment, upon a change of a block in a subset of the dataset, the similarity value is re-determined without resorting or hashing the blocks of a dataset other than the blocks of the subset, resulting in an increased performance of a similarity comparison. In another embodiment, blocks of a dataset are excluded based on a block-filtering rule to increase the accuracy of the similarity comparison.
摘要:
Techniques for deduplication of virtual machine files in a virtualized desktop environment are described, including receiving data into a page cache, the data being received from a virtual machine and indicating a write operation, and deduplicating the data in the page cache prior to committing the data to storage, the data being deduplicated in-band and in substantially real-time.