Abstract:
A technique for “zero copy” transitive communication of data between virtual address domains maintains a translation table hierarchy for each domain. The hierarchy of each domain includes a portion corresponding to every other domain in the system, where the portion for any particular domain begins at the same offset in the virtual address space of every domain. For each domain, there is a source hierarchy used only by the domain itself, which provides read/write access to the addresses in that domain; and a target hierarchy which provides read-only access to that domain, for use only when another domain is the target of IDC from that domain. Only one instance of the target hierarchy of each domain is provided, for all other domains as targets of IDC from that domain. For further space savings the source and target translation table hierarchies can be combined at all but the top hierarchy level.
Abstract:
Methods and system for securely capturing workloads at a live network for replaying at a test network. The disclosed system captures file system states and workloads of a live server at the live network. In one embodiment the captured data is anonymized to protect confidentiality of the data. A file system of a test server at the test network is mirrored from a captured state of the live server. An anonymized version of the captured workloads is replayed as a request to the test server. A lost or incomplete command is recreated from the states of the live server. An order of the commands during replay can be based on an order in the captured workload, or based on a causal relationship. Performance characteristics of the live network are determined based on the response to the replayed command.
Abstract:
A novel technique for improving throughput in a multi-core system in which data is processed according to a producer-consumer relationship by eliminating latencies caused by compulsory cache misses. The producer and consumer entities run as multiple slices of execution. Each such slice has an associated execution context that comprises of the code and data that particular slice would access. The execution contexts of the producer and consumer slices are small enough to fit in the processor caches simultaneously. When a producer entity scheduled on a first core completed production of data elements as constrained by the size of cache memories, a consumer entity is scheduled on that same core to consume the produced data elements. Meanwhile, a second slice of the producer entity is moved to another core and a second slice of a consumer entity is scheduled to consume elements produced by the second slice of the producer.
Abstract:
The techniques introduced here provide for efficient management of storage resources in a modern, dynamic data center through the use of virtual storage appliances. Virtual storage appliances perform storage operations and execute in or as a virtual machine on a hypervisor. A storage management system monitors a storage system to determine whether the storage system is satisfying a service level objective for an application. The storage management system then manages (e.g., instantiates, shuts down, or reconfigures) a virtual storage appliance on a physical server. The virtual storage appliance uses resources of the physical server to meet the storage related needs of the application that the storage system cannot provide. This automatic and dynamic management of virtual storage appliances by the storage management system allows storage systems to quickly react to changing storage needs of applications without requiring expensive excess storage capacity.
Abstract:
Deduplication of data on disk devices based on a threshold number (THN) of sequential blocks is described herein, the threshold number being two or greater. Deduplication may be performed when a series of THN or more received blocks (THN series) match a sequence of THN or more stored blocks (THN sequence), whereby a sequence comprises blocks stored on the same track of a disk device. Deduplication may be performed using a block-comparison mechanism comprising metadata entries of stored blocks and a mapping mechanism containing mappings of deduplicated blocks to their matching blocks. The mapping mechanism may be used to perform later read requests received for the deduplicated blocks. The deduplication described herein may reduce the read latency as the number of seeks between tracks may be reduced. Also, when a seek to a different track is performed, the seek time cost is spread over THN or more blocks.
Abstract:
The disclosed technique uses virtual machines in solving a problem of persistent state for storage protocols. The technique provides for seamless, persistent, storage protocol session state management on a server, for higher availability. A first virtual server is operated in an active role in a host system to serve a client, by using a stateful protocol between the first virtual server and the client. A second, substantially identical virtual server is maintained in a passive role. In response to a predetermined event, the second virtual server takes over for the first virtual server, while preserving state for a pending client request sent to the first virtual server in the stateful protocol. The method can further include causing the second virtual server to respond to the request before a timeout which is specific to the stateful protocol can occur.
Abstract:
Fault isolation capabilities made available by user space can be provided for a embedded network storage system without sacrificing efficiency. By giving user space processes direct access to specific devices (e.g., network interface cards and storage adapters), processes in a user space can initiate Input/Output requests without issuing system calls (and entering kernel mode). The multiple user spaces processes can initiate requests serviced by a user space device driver by sharing a read-only address space that maps the entire physical memory one-to-one. In addition, a user space process can initiate communication with another user space process by use of transmit and receive queues similar to transmit and receiver queues used by hardware devices. And, a mechanism of ensuring that virtual addresses that work in one address space reference the same physical page in another address space is used.
Abstract:
Deduplication of data using a low-latency random read memory (LLRRM) is described herein. Upon receiving a block, if a matching block stored on a disk device is found, the received block is deduplicated by producing an index to the address location of the matching block. In some embodiments, a matching block having a predetermined threshold number of associated indexes that reference the matching block is transferred to LLRRM, the threshold number being one or greater. Associated indexes may be modified to reflect the new address location in LLRRM. Deduplication may be performed using a mapping mechanism containing mappings of deduplicated blocks to matching blocks, the mappings being used for performing read requests. Deduplication described herein may reduce read latency as LLRRM has relatively low latency in performing random read requests relative to disk devices.
Abstract:
A system and method provides for failover of guest operating systems in a virtual machine environment. During initialization of a computer executing a virtual machine operating system, a first guest operating system allocates a first memory region within a first domain and notifies a second guest operating system operating in a second domain of the allocated first memory region. Similarly, the second guest operating system allocates a second region of memory within the second domain and notifies the first operating system of the allocated second memory region. In the event of a software failure affecting one of the guest operating systems, the surviving guest operating system assumes the identity of the failed operating system and utilizes data stored within the shared memory region to replay to storage devices to render them consistent.
Abstract:
Fault isolation capabilities made available by user space can be provided for a embedded network storage system without sacrificing efficiency. By giving user space processes direct access to specific devices (e.g., network interface cards and storage adapters), processes in a user space can initiate Input/Output requests without issuing system calls (and entering kernel mode). The multiple user spaces processes can initiate requests serviced by a user space device driver by sharing a read-only address space that maps the entire physical memory one-to-one. In addition, a user space process can initiate communication with another user space process by use of transmit and receive queues similar to transmit and receiver queues used by hardware devices. And, a mechanism of ensuring that virtual addresses that work in one address space reference the same physical page in another address space is used.