Abstract:
Global memory of a storage system may be used to provide NVRAM capabilities to guest operating systems accessing the storage system. Because NVRAM is non-volatile (i.e., it retains its information when power is turned off), an NVRAM device provided by global memory may be used as a journaling device to track storage operations and facilitate recovery and/or failover processing in a storage system without adding hardware and/or other installed devices. Using the global memory according to the system described herein to provide an NVRAM device that may function as a journaling device speeds up transactions, thereby improving the performance of metadata-intensive operations and reducing the recovery time and/or failover time of a storage system without additional hardware support.
Abstract:
A method for safeguarding data stored in a memory of a data storage system includes monitoring values of a subset of environmental variables associated with the data storage system and updating a portion of a table containing values of environmental variables associated with the data storage system. The table includes values for environmental variables that are not in the monitored subset. The values of the environmental variables are then inspected. On the basis of the inspection, a condition in which there exists a high risk of data loss is determined.
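The update-then-inspect flow in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the variable names (`temp_c`, `fan_rpm`, `battery_pct`), the safe ranges, and the function `update_and_inspect` are hypothetical and do not come from the abstract.

```python
def update_and_inspect(table, monitored_readings, safe_ranges):
    """Update the monitored portion of the table, then inspect every value.

    table              -- all environmental variables known to the system
    monitored_readings -- fresh values for the subset this monitor watches
    safe_ranges        -- hypothetical (low, high) limits per variable
    """
    # Only the monitored subset is overwritten; other entries stay as they were.
    table.update(monitored_readings)

    # Inspect the whole table, including variables this monitor does not watch.
    out_of_range = [
        name
        for name, value in table.items()
        if name in safe_ranges
        and not (safe_ranges[name][0] <= value <= safe_ranges[name][1])
    ]
    # A non-empty list is treated here as the high-risk condition.
    return out_of_range
```

In this sketch a monitor that only reads temperature can still flag a low battery value written to the table by some other monitor, which is the point of inspecting the full table.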
Abstract:
Described is a distributed lock processing technique that may be used to coordinate access to a globally accessed resource between endpoints using the connecting message fabric. Processors in a data storage system communicate using the message switch of the message fabric. Each processor is an endpoint within a data storage system. Each endpoint, prior to requesting a lock, dynamically determines a current lock owner of the lock to be requested in accordance with a determination of which endpoints are available as lock owners at the current time. The lock request is issued to the current lock owner with a requested time period used by the lock owner to determine an expiration time. The lock expires automatically at the expiration time even if the lock holder becomes unavailable. If the current lock owner becomes unavailable, a new lock owner is determined prior to the next request for that lock.
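A minimal sketch of the two ideas in this abstract, dynamic owner selection and time-limited locks, might look like the following. The modulo-based selection rule, the class `LockOwner`, and the function `current_owner` are assumptions made for illustration; the abstract does not say how the owner is chosen or how time is tracked.

```python
from dataclasses import dataclass, field


@dataclass
class LockOwner:
    """Endpoint that grants time-limited locks for the lock ids it owns."""
    leases: dict = field(default_factory=dict)  # lock_id -> (holder, expiration)

    def request(self, lock_id, requester, period, now):
        lease = self.leases.get(lock_id)
        if lease is not None and lease[1] > now and lease[0] != requester:
            return False  # held by another endpoint and not yet expired
        # The owner, not the requester, turns the period into an expiration time.
        self.leases[lock_id] = (requester, now + period)
        return True


def current_owner(lock_id, available):
    """Pick the owner among endpoints available right now (hypothetical rule)."""
    candidates = sorted(available)
    return candidates[lock_id % len(candidates)]
```

Because the expiration time lives with the owner, a lock held by an endpoint that disappears simply lapses; and because `current_owner` is evaluated before each request, an unavailable owner is replaced on the next request.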
Abstract:
Described is a technique for maintaining local cache coherency between endpoints using the connecting message fabric. Processors in a data storage system communicate using the message fabric. Each processor is an endpoint having its own local cache storage in which portions of global memory may be locally cached. A write-through caching technique is described. Each local cache line of data of each processor is either in an invalid or a shared state. When a write to global memory is performed by a processor (write miss or a write hit), the following are performed atomically: the global memory is updated, other processors' local cache lines of the data are invalidated, verification of invalidation is received by the processor, and the processor's local copy is updated. Other processors' cache lines are invalidated by transmission of an invalidate command by the processor. A processor updates its local cache lines upon the next read miss or write miss of the updated cacheable global memory.
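The write sequence described here can be illustrated with a small in-process model. This is a sketch under assumptions: the `Fabric` and `Processor` classes are hypothetical, a single `threading.Lock` stands in for whatever makes the four steps atomic, and a direct method call stands in for the invalidate command and its acknowledgement.

```python
import threading


class Fabric:
    """Stand-in for the message fabric plus global memory (hypothetical)."""
    def __init__(self):
        self.global_memory = {}
        self.processors = []
        self.lock = threading.Lock()  # models atomicity of the write sequence


class Processor:
    def __init__(self, name, fabric):
        self.name = name
        self.fabric = fabric
        self.cache = {}  # address present => SHARED, absent => INVALID
        fabric.processors.append(self)

    def invalidate(self, addr):
        self.cache.pop(addr, None)
        return True  # acknowledgement returned to the writer

    def read(self, addr):
        if addr not in self.cache:  # read miss: refill from global memory
            self.cache[addr] = self.fabric.global_memory[addr]
        return self.cache[addr]

    def write(self, addr, value):
        with self.fabric.lock:
            self.fabric.global_memory[addr] = value          # 1. update global memory
            acks = [p.invalidate(addr)                       # 2. invalidate other copies
                    for p in self.fabric.processors if p is not self]
            if not all(acks):                                # 3. verify invalidation
                raise RuntimeError("invalidate not acknowledged")
            self.cache[addr] = value                         # 4. update local copy
```

After a write, every other processor's line is in the invalid state, so its next read of that address misses and picks up the new value from global memory.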
Abstract:
A memory storage device has a file storage operating system that uses inodes to access file segments. Each inode has a plurality of rows. A portion of the rows can store extents pointing, directly or indirectly, to data blocks. Each extent has a field to indicate whether the extent is an indirect extent or a direct extent.
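One way to picture an extent carrying a direct/indirect flag is the sketch below. The field names `start` and `length`, and the callback `read_extent_block` that returns the extents stored in an indirect block, are assumptions for illustration; the abstract only states that each extent has a field marking it as direct or indirect.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    start: int      # first block number covered by the extent (assumed field)
    length: int     # number of consecutive blocks (assumed field)
    indirect: bool  # True: blocks hold more extents; False: blocks hold data


def data_blocks(extents, read_extent_block):
    """Yield data block numbers reachable from a list of extents."""
    for ext in extents:
        blocks = range(ext.start, ext.start + ext.length)
        if ext.indirect:
            # Each block of an indirect extent contains further extents.
            for blk in blocks:
                yield from data_blocks(read_extent_block(blk), read_extent_block)
        else:
            yield from blocks
```

For example, an inode whose rows hold one direct extent and one indirect extent resolves to the direct blocks followed by whatever the indirect block points at.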
Abstract:
Operating at least one hypervisor includes running a first hypervisor as a first thread of an underlying operating system, running a second hypervisor as a second thread of the underlying operating system, loading a first guest operating system using the first hypervisor based on the first thread of the underlying operating system, loading a second guest operating system using the second hypervisor based on the second thread of the underlying operating system, and scheduling sharing of resources of the underlying system between the first hypervisor and the second hypervisor according to a scheduler of the underlying operating system, where the first hypervisor and the second hypervisor run independently of each other. The scheduler of the underlying operating system may schedule fractional time shares for the first hypervisor and the second hypervisor to access the same resource.
Abstract:
Maintaining failure survivability in a storage system includes determining a save time corresponding to an amount of time needed to transfer system data from volatile memory to non-volatile memory, determining a threshold corresponding to time for batteries to run while transferring data from volatile memory to non-volatile memory after a power loss, and providing an indication in response to the save time being greater than the threshold. The system may include a plurality of directors and the save time and the threshold may be determined for each of the directors. Determining a threshold may include determining an amount of battery time provided by battery power following power loss and multiplying the amount of battery time by a factor less than one, such as 0.8.
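The comparison in this abstract is simple arithmetic, sketched below. Estimating the save time as data size divided by write rate is an assumption made for the example; the abstract only says that a save time is determined. The factor 0.8 is the example value given in the abstract.

```python
def save_time_exceeds_threshold(data_bytes, write_rate_bytes_per_s,
                                battery_time_s, factor=0.8):
    """Return True when an indication should be raised for one director."""
    # Assumed estimate: time to move system data to non-volatile memory.
    save_time_s = data_bytes / write_rate_bytes_per_s
    # Threshold: battery run time derated by a factor less than one.
    threshold_s = battery_time_s * factor
    return save_time_s > threshold_s


def directors_at_risk(directors):
    """directors: name -> (data_bytes, write_rate_bytes_per_s, battery_time_s)."""
    return [name for name, args in directors.items()
            if save_time_exceeds_threshold(*args)]
```

With 64 GiB to save at 512 MiB/s, the save time is 128 s. A battery good for 150 s gives a threshold of 120 s, so an indication is raised; a battery good for 200 s gives 160 s, so it is not.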
Abstract:
When a guest OS loads within the context of a container provided by the host OS, the guest OS uses PCI or another protocol to specify a virtual hardware device. The guest OS enumerates the virtual hardware device to establish the size of the BARs and establish its view of physical addresses for the memory locations. A server running in the context of the container receives read/write requests from the guest OS, maps the read/write requests to host OS physical address space, and posts responses to the virtual hardware device. Since the guest OS executes memory-related operations using its own memory space, exits to the container code are not required to implement storage-related actions by the guest OS. This allows performance of an application executing in the context of the guest OS to approximate performance of an application executing in the context of the host OS.
Abstract:
A data storage system has a protocol controller for converting packets between the PCIE format used by a storage processor and the Rapid IO format used by a packet switching network. The controller includes a PCIE end point for transferring atomic operation (DSA) requests, a data pipe section having a plurality of data pipes for passing user data, and a message engine section for passing messages among a plurality of storage processors. An acceleration path controller bypasses a DSA buffer in the absence of congestion on the network. Packets fed to the PCIE end point include an address portion having a code indicating an atomic operation. An encoder converts the code from a PCIE format into the same atomic operation in SRIO format. Each one of a plurality of CPUs is adapted to perform a second DSA request during execution of a first DSA request.
Abstract:
Described is an end-to-end broadcast-based messaging technique used in controlling message flow in a data storage system. Each node stores flow control state information about all the nodes which is used in determining whether to send a data transmission to a receiving node. The flow control state information includes an indicator as to whether each node is receiving incoming data transmissions. If a node is not receiving incoming data transmissions, the flow control state information also includes an associated expiration time. Data transmissions are resumed to a receiving node based on the earlier of a sending node determining that the expiration time has lapsed, or receiving a control message from the receiving node explicitly turning on data transmissions. Each node maintains and updates its local copy of the flow control state information in accordance with control messages sent by each node to turn on and off data transmissions. Each node sends out control messages in accordance with predetermined threshold levels taking into account hardware and/or software resources for message buffering.
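A sender's local copy of the flow control state, and the rule for resuming transmissions, could be sketched as follows. The class `FlowControlState`, its method names, and the `off_period` parameter are hypothetical; the abstract does not specify how the expiration time is carried or computed.

```python
class FlowControlState:
    """One node's local copy of flow control state for all nodes (sketch)."""

    def __init__(self, nodes):
        # None means the node is receiving; otherwise the value is the
        # expiration time after which transmissions may resume.
        self.off_until = {node: None for node in nodes}

    def on_control_message(self, node, receiving, now, off_period=0.0):
        """Apply a broadcast control message turning a node on or off."""
        self.off_until[node] = None if receiving else now + off_period

    def may_send(self, node, now):
        """Resume on the earlier of expiration or an explicit 'on' message."""
        expiry = self.off_until[node]
        if expiry is not None and now >= expiry:
            self.off_until[node] = None  # expiration time has lapsed
            expiry = None
        return expiry is None
```

The expiration time acts as a safety net: if the "on" control message from a receiving node is never seen, senders still resume once the expiration time has passed.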