-
1.
Publication No.: US5727206A
Publication date: 1998-03-10
Application No.: US690704
Filing date: 1996-07-31
CPC classification: G06F11/1435 , Y10S707/99953
Abstract: A method for identifying and repairing file system damage following the failure of a processing node within a clustered UNIX file system including a plurality of processing nodes, an interconnection network connecting the processing nodes, and a data storage device connected via a shared interconnect with each one of the plurality of processing nodes. The method includes the step of maintaining a journal for each processing node, each journal containing a bit map identifying inodes to which its associated processing node has acquired and retains an exclusive right. Each bit map journal is updated whenever its associated processing node acquires an exclusive right to an inode. Following a failure of a processing node, a non-failed processing node is designated to audit the inodes associated with the failed node. Auditing is accomplished by reading the bit map journal associated with the failed processing node and obtaining the exclusive right to every inode found within the journal. The inodes within the bit map journal, referred to as suspect inodes, are then compared with a global bit map which identifies each and every unit of space within the file system that is assignable. A suspect inode is identified as having a transient state when the unit of space assigned to the suspect inode is also found to be assignable. The assignment of a unit of file system space to any suspect inode identified as having a transient state is thereafter discarded.
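The audit step described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Inode`, `audit_failed_node`, `journal_bitmap`, and `free_block_bitmap` are hypothetical, bit maps are modelled as lists of booleans, and acquiring the exclusive right to each suspect inode is only noted in a comment rather than performed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inode:
    number: int
    # Units of file system space currently assigned to this inode.
    blocks: List[int] = field(default_factory=list)


def audit_failed_node(journal_bitmap: List[bool],
                      free_block_bitmap: List[bool],
                      inodes: Dict[int, Inode]) -> Dict[int, List[int]]:
    """Discard transient block assignments held by a failed node.

    journal_bitmap[i] is True when the failed node held the exclusive
    right to inode i. free_block_bitmap[b] is True when block b is still
    marked assignable in the global bit map.
    Returns {inode number: blocks whose assignment was discarded}.
    """
    discarded: Dict[int, List[int]] = {}
    for ino, held in enumerate(journal_bitmap):
        if not held:
            continue  # not a suspect inode
        inode = inodes[ino]
        # Assumption: a real auditor would take the exclusive right to
        # this inode (for example via a lock manager) before inspecting it.
        transient = [b for b in inode.blocks if free_block_bitmap[b]]
        if transient:
            # The block is assigned to the inode yet still assignable:
            # a transient state, so the assignment is dropped.
            inode.blocks = [b for b in inode.blocks if not free_block_bitmap[b]]
            discarded[ino] = transient
    return discarded
```

Only inodes flagged in the failed node's journal are examined, which is the point of keeping one journal per node: the audit does not have to scan the whole file system.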
-
2.
Publication No.: US5828876A
Publication date: 1998-10-27
Application No.: US690703
Filing date: 1996-07-31
CPC classification: G06F17/30224 , Y10S707/99931 , Y10S707/99939
Abstract: An improved file system for managing data storage and retrieval in a clustered UNIX computer system including a plurality of processing nodes and an interconnection network connecting the processing nodes. The improved file system includes a data storage device, such as a disk storage unit, connected via a shared SCSI interconnect with each one of the processing nodes, rather than connected directly with a single processing node. The structure layout for the file system, which is maintained on the data storage device, includes sufficient information to enable all of the processing nodes to access said file system. The layout comprises: superblocks containing offsets to all other file system structures within the file system; a free inode bit map containing a plurality of bits, each bit representing an inode within the file system; a modified inode journal containing a separate inode bit map for each superblock and identifying particular inodes which have been modified by the file system prior to a system failure; a plurality of inodes, each inode being a data structure which contains a definition for each particular file and directory in the file system; a free block bit map containing a bit map wherein each distinct bit represents a logical disk block in the file system; and data blocks containing data representing file contents. The file system interfaces with the computer system's distributed lock manager (DLM) to coordinate file system usage.
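To make the layout in the abstract above concrete, the following Python sketch computes byte offsets for each structure, as a superblock might record them. The ordering of regions, the sizes (`superblock_size`, `inode_size`), and the names `Layout` and `plan_layout` are all assumptions chosen for illustration; the abstract only lists which structures exist.

```python
from dataclasses import dataclass


def bitmap_bytes(nbits: int) -> int:
    """Bytes needed to hold nbits bits, rounded up."""
    return (nbits + 7) // 8


@dataclass(frozen=True)
class Layout:
    # Byte offsets that a superblock would store for the other structures.
    free_inode_bitmap: int
    modified_inode_journal: int
    inodes: int
    free_block_bitmap: int
    data_blocks: int


def plan_layout(n_superblocks: int, n_inodes: int, n_blocks: int,
                superblock_size: int = 512, inode_size: int = 128) -> Layout:
    """Lay the structures out back to back (assumed order, assumed sizes)."""
    offset = n_superblocks * superblock_size      # superblocks come first
    free_inode_bitmap = offset
    offset += bitmap_bytes(n_inodes)              # one bit per inode
    journal = offset
    # The journal holds one separate inode bit map per superblock.
    offset += n_superblocks * bitmap_bytes(n_inodes)
    inodes = offset
    offset += n_inodes * inode_size
    free_block_bitmap = offset
    offset += bitmap_bytes(n_blocks)              # one bit per logical block
    return Layout(free_inode_bitmap, journal, inodes, free_block_bitmap, offset)
```

Because the offsets live in the superblocks on the shared device, any node that can read a superblock can locate every other structure without asking another node.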
-
3.
Publication No.: US5870540A
Publication date: 1999-02-09
Application No.: US559865
Filing date: 1995-11-20
IPC classification: G06F11/06
CPC classification: G06F11/076 , G06F11/0709 , H04L43/0811 , H04L43/103
Abstract: A method for detecting communication failures on a network comprising a server computer connected to multiple client computers, said server and client computers communicating through the transmission of data packets via the network. The method includes the step of periodically sending an echo request message from the server computer to the client computers. Each client computer, following receipt of the echo request message, sends an echo reply message back to the server computer. The method further includes the steps of monitoring the rate at which data packets are received by the server computer; and reducing the frequency at which echo request messages are sent from the server computer to the client computers during periods when the rate at which data packets are received by the server computer exceeds a re-configurable rate value. The frequency at which the echo request messages are sent is dependent upon the rate at which data packets are received by said server computer.
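The rate-dependent echo schedule in the abstract above can be sketched in a few lines of Python. The abstract states only that echo requests become less frequent once the received packet rate exceeds a configurable value; the proportional scaling, the cap, and the names `echo_interval`, `base_interval`, and `max_interval` are assumptions made for this example.

```python
def echo_interval(packet_rate: float, rate_threshold: float,
                  base_interval: float = 5.0,
                  max_interval: float = 60.0) -> float:
    """Seconds to wait before sending the next echo request.

    packet_rate is the observed rate of data packets received by the
    server (packets per second); rate_threshold is the configurable
    rate value above which probing is slowed down.
    """
    if rate_threshold <= 0:
        raise ValueError("rate_threshold must be positive")
    if packet_rate <= rate_threshold:
        return base_interval  # quiet network: probe at the normal rate
    # Assumed rule: stretch the interval in proportion to how far the
    # traffic exceeds the threshold, up to a fixed cap.
    return min(base_interval * (packet_rate / rate_threshold), max_interval)
```

The intuition is that heavy inbound traffic is itself evidence that clients are reachable, so fewer explicit probes are needed while the network is busy.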
-
-