-
Publication No.: US20120030511A1
Publication date: 2012-02-02
Application No.: US12847203
Filing date: 2010-07-30
Applicant(s): John J. Wylie, Joseph A. Tucek, Eric A. Anderson, Xiaozhou Li, Mustafa Uysal
Inventor(s): John J. Wylie, Joseph A. Tucek, Eric A. Anderson, Xiaozhou Li, Mustafa Uysal
IPC classification: G06F11/14
CPC classification: G06F11/2094, G06F11/1076
Abstract: A method is provided for efficiently recovering information in a distributed storage system where a list of values that should be stored on a storage device is maintained. A first convergence round is scheduled to be performed on the list of values to bring each value to an At Maximum Redundancy (AMR) state. A second convergence round is scheduled to be performed on the list by selecting a wait time interval from a predefined range of wait time intervals between starts of convergence rounds.
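The round scheduling described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract only says the wait is selected from a predefined range, so drawing it uniformly at random is an assumption, and the names `next_round_delay` and `schedule_round_starts` are hypothetical.

```python
import random


def next_round_delay(min_wait_s, max_wait_s, rng):
    """Pick the wait between the starts of two convergence rounds.

    Assumption: the wait is drawn uniformly from [min_wait_s, max_wait_s].
    """
    if not 0 <= min_wait_s <= max_wait_s:
        raise ValueError("need 0 <= min_wait_s <= max_wait_s")
    return rng.uniform(min_wait_s, max_wait_s)


def schedule_round_starts(first_start_s, rounds, min_wait_s, max_wait_s, seed=None):
    """Return the start times of `rounds` consecutive convergence rounds."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    rng = random.Random(seed)
    starts = [first_start_s]
    for _ in range(rounds - 1):
        # Each later round starts a freshly selected interval after the previous start.
        starts.append(starts[-1] + next_round_delay(min_wait_s, max_wait_s, rng))
    return starts
```

Calling `schedule_round_starts(0.0, 5, 30.0, 90.0)` would yield five start times whose consecutive gaps each fall between 30 and 90 seconds.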
-
Publication No.: US09411682B2
Publication date: 2016-08-09
Application No.: US12687361
Filing date: 2010-01-14
Applicant(s): Eric A. Anderson, Xiaozhou Li, Mehul A. Shah, John J. Wylie
Inventor(s): Eric A. Anderson, Xiaozhou Li, Mehul A. Shah, John J. Wylie
CPC classification: G06F11/1076, G06F2211/104, G06F2211/1088
Abstract: A method is provided for scrubbing information stored in a data storage system where the information is stored as a plurality of encoded fragments across multiple storage devices. The method includes maintaining on a first storage device a list of metadata entries corresponding to values that are stored in the data storage system at an At Maximum Redundancy (AMR) state, verifying that encoded fragments associated with each of the metadata entries are stored on a second storage device, verifying that a corresponding metadata entry is stored on the first storage device for each encoded fragment that is stored on the second storage device, and scheduling for recovery any missing encoded fragments and/or any missing metadata entry.
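The two-way check in the abstract above (metadata against fragments, then fragments against metadata) can be sketched as a pair of set differences. This is a simplified illustration that assumes each value is identified by a single key present on both devices; the function name `scrub` and the result keys are hypothetical.

```python
def scrub(metadata_keys, fragment_keys):
    """Cross-check a metadata list against the fragments actually stored.

    metadata_keys: keys with a metadata entry on the first storage device.
    fragment_keys: keys with an encoded fragment on the second storage device.
    Returns the items that would be scheduled for recovery.
    """
    metadata_keys = set(metadata_keys)
    fragment_keys = set(fragment_keys)
    return {
        # Metadata entry exists but its fragment is gone.
        "missing_fragments": sorted(metadata_keys - fragment_keys),
        # Fragment exists but no metadata entry refers to it.
        "missing_metadata": sorted(fragment_keys - metadata_keys),
    }
```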
-
Publication No.: US09298760B1
Publication date: 2016-03-29
Application No.: US13566793
Filing date: 2012-08-03
Applicant(s): Xiaozhou Li, Yonggang Zhao, Marian Dvorsky, Ovidiu Gheorghioiu
Inventor(s): Xiaozhou Li, Yonggang Zhao, Marian Dvorsky, Ovidiu Gheorghioiu
IPC classification: G06F17/30
CPC classification: G06F17/30321
Abstract: A method for shard assignment in a large-scale data processing job is provided. Datasets are divided into a plurality of shards and the shards are indexed and aggregated into one or more groups. A worker process is initially assigned an indexed shard from a group. The initial assignment can be assigned based on a simple algorithm. The worker's subsequent shard assignment is based on the index of the initially assigned shard.
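One way to read the assignment scheme in the abstract above is sketched below. The abstract does not specify the "simple algorithm" or how later assignments use the initial index, so both rules here are assumptions: the initial shard is the worker ID modulo the group size, and each later shard is the next unfinished index in cyclic order after the initial one. The names `initial_shard` and `next_shard` are hypothetical.

```python
def initial_shard(worker_id, group_size):
    """Assumed 'simple algorithm': spread workers over the group by modulo."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return worker_id % group_size


def next_shard(initial_index, done, group_size):
    """Return the next unfinished shard index, scanning cyclically from the
    worker's initial index, or None when the whole group is finished."""
    for offset in range(1, group_size + 1):
        candidate = (initial_index + offset) % group_size
        if candidate not in done:
            return candidate
    return None
```

Under these assumptions, workers that start at different indices tend to walk different parts of the group first, which keeps them from contending for the same shards.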
-
Publication No.: US09292620B1
Publication date: 2016-03-22
Application No.: US11855700
Filing date: 2007-09-14
Applicant(s): Christopher Edward Hoover, Eric A. Anderson, Charles E. Christian, Jr., Tim Reddin, Robert J. Souza, Xiaozhou Li
Inventor(s): Christopher Edward Hoover, Eric A. Anderson, Charles E. Christian, Jr., Tim Reddin, Robert J. Souza, Xiaozhou Li
IPC classification: G06F17/30
CPC classification: G06F17/30949, G06F17/3048, G06F17/30545, G06F17/30864
Abstract: Embodiments include methods, apparatus, and systems for retrieving data from multiple locations in storage systems. One embodiment includes a method that determines that data is stored in multiple locations remote to a computer, estimates a latency to retrieve the data from the multiple locations, and requests the data from the plural locations.
-
Publication No.: US20120131583A1
Publication date: 2012-05-24
Application No.: US12950887
Filing date: 2010-11-19
Applicant(s): Ludmila Cherkasova, Xin Zhang, Xiaozhou Li
Inventor(s): Ludmila Cherkasova, Xin Zhang, Xiaozhou Li
IPC classification: G06F9/46
CPC classification: G06F11/1448, G06F9/4881, G06F11/1461
Abstract: Systems and methods of enhanced backup job scheduling are disclosed. An example method may include determining a number of jobs (n) in a backup set, determining a number of tape drives (m) in the backup device, and determining a number of concurrent disk agents (maxDA) configured for each tape drive. The method may also include defining a scheduling problem based on n, m, and maxDA. The method may also include solving the scheduling problem using an integer programming (IP) formulation to derive a bin-packing schedule that minimizes makespan (S) for the backup set.
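To make the scheduling problem in the abstract above concrete, the sketch below uses a greedy longest-processing-time (LPT) heuristic rather than the integer programming formulation the abstract names, so it only approximates the minimum makespan. It also assumes a simplified model in which each of the m drives offers maxDA independent slots and each job occupies one slot for a known duration. The function name `lpt_schedule` is hypothetical.

```python
import heapq


def lpt_schedule(durations, m, max_da):
    """Greedy LPT stand-in for the IP formulation (an approximation).

    durations: job durations, one per job (n = len(durations)).
    m: number of tape drives; max_da: concurrent disk agents per drive.
    Returns (assignment, makespan) where assignment maps
    job index -> (drive, agent slot).
    """
    if m < 1 or max_da < 1:
        raise ValueError("m and max_da must be at least 1")
    # Min-heap of (current load, slot id) over all m * max_da slots.
    slots = [(0.0, s) for s in range(m * max_da)]
    heapq.heapify(slots)
    assignment = {}
    # Place the longest jobs first, always on the least-loaded slot.
    for job in sorted(range(len(durations)), key=lambda j: -durations[j]):
        load, slot = heapq.heappop(slots)
        assignment[job] = (slot // max_da, slot % max_da)
        heapq.heappush(slots, (load + durations[job], slot))
    makespan = max(load for load, _ in slots)
    return assignment, makespan
```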
-
Publication No.: US08108620B2
Publication date: 2012-01-31
Application No.: US12400991
Filing date: 2009-03-10
IPC classification: G06F13/00
CPC classification: G06F12/0813
Abstract: A method of caching data in a global cache distributed amongst a plurality of computing devices, comprising providing a global cache for caching data accessible to interconnected client devices, where each client contributes a portion of its main memory to the global cache. Each client also maintains an ordering of data that it has in its cache portion. When a remote reference for a cached datum is made, both the supplying client and the requesting client adjust their orderings to reflect the fact that multiple copies of the requested datum now likely exist in the global cache.
-
Publication No.: US20100235581A1
Publication date: 2010-09-16
Application No.: US12400991
Filing date: 2009-03-10
IPC classification: G06F12/08
CPC classification: G06F12/0813
Abstract: A method of caching data in a global cache distributed amongst a plurality of computing devices, comprising providing a global cache for caching data accessible to interconnected client devices, where each client contributes a portion of its main memory to the global cache. Each client also maintains an ordering of data that it has in its cache portion. When a remote reference for a cached datum is made, both the supplying client and the requesting client adjust their orderings to reflect the fact that multiple copies of the requested datum now likely exist in the global cache.
-
Publication No.: US08997056B2
Publication date: 2015-03-31
Application No.: US12968910
Filing date: 2010-12-15
Applicant(s): Xiaozhou Li, Mehul A. Shah
Inventor(s): Xiaozhou Li, Mehul A. Shah
CPC classification: G06F11/3612, G06F11/3636
Abstract: A system comprises a processor and storage containing software executable by the processor. The storage also contains a trace log that contains information pertaining to read and write operations and, for each read and write operation, the information is indicative of a start time, a completion time, and a value targeted by the read or write operation. Based on the trace log, the software causes the processor to construct a directed graph comprising nodes as well as edges interconnecting at least some of the nodes, each node representing a read or write operation, and to determine whether the constructed directed graph has a cycle. At least one edge is at least one of a data edge representing a data precedence between operations and a time edge representing a time precedence between operations, and at least one edge is a hybrid edge representing both time and data precedence between operations.
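The graph construction and cycle check in the abstract above can be sketched as follows. The edge rules are assumptions, since the abstract does not define them: a time edge runs from an operation that finishes before another starts, a data edge runs from a write to each read that returns its value (written values are assumed unique), and a hybrid edge runs from any other write that finished before such a read started to the write that the read returned. The names `Op`, `build_graph`, and `has_cycle` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str      # "read" or "write"
    start: float   # start time from the trace log
    finish: float  # completion time from the trace log
    value: str     # value written or returned


def build_graph(ops):
    """Build adjacency sets over operation indices (assumed edge rules)."""
    edges = {i: set() for i in range(len(ops))}
    writer = {op.value: i for i, op in enumerate(ops) if op.kind == "write"}
    for i, a in enumerate(ops):
        for j, b in enumerate(ops):
            if a.finish < b.start:
                edges[i].add(j)  # time edge: a completed before b began
    for j, r in enumerate(ops):
        if r.kind != "read" or r.value not in writer:
            continue
        w = writer[r.value]
        edges[w].add(j)  # data edge: the write precedes the read of its value
        for k, other in enumerate(ops):
            if other.kind == "write" and k != w and other.finish < r.start:
                edges[k].add(w)  # hybrid edge: the older write must precede w
    return edges


def has_cycle(edges):
    """Depth-first search with three colours; a grey-to-grey edge is a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in edges}

    def visit(n):
        colour[n] = GREY
        for m in edges[n]:
            if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in edges)
```

Under these rules, a trace in which a read returns a value that was already overwritten before the read began produces a cycle, while a trace where every read returns the latest completed write does not.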
-
Publication No.: US20130151478A1
Publication date: 2013-06-13
Application No.: US13323577
Filing date: 2011-12-12
Applicant(s): Xiaozhou Li, Wojciech Golab, Mehul A. Shah
Inventor(s): Xiaozhou Li, Wojciech Golab, Mehul A. Shah
IPC classification: G06F17/30
CPC classification: G06F17/30371
Abstract: A method for verifying a consistency level in a key-value store, in which a value is stored in a cloud-based storage system comprising a read/write register identified by a key. At a centralized monitor node, a history of operations including writes and reads performed at the key is created, and a distance between a read of a value at the key and a latest write to the key is determined. It can then be ascertained whether the distance satisfies a relaxed atomicity property.
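The distance check in the abstract above can be illustrated with a simplified sketch. It assumes a history for a single key whose operations do not overlap and are listed in time order, that every written value is unique, and that "distance" means the number of writes completed between the write a read returned and the read itself. The abstract defines neither the distance measure nor the relaxed atomicity property, so the bound test here is also an assumption, and the names `read_distances` and `satisfies_bound` are hypothetical.

```python
def read_distances(history):
    """history: list of ("write" | "read", value) tuples in time order.

    Returns, for each read, how many writes separate the value it returned
    from the latest write at that point (0 means it read the latest value).
    A read of a value that was never written raises KeyError.
    """
    write_position = {}
    writes_seen = 0
    distances = []
    for kind, value in history:
        if kind == "write":
            writes_seen += 1
            write_position[value] = writes_seen
        elif kind == "read":
            distances.append(writes_seen - write_position[value])
        else:
            raise ValueError(f"unknown operation kind: {kind!r}")
    return distances


def satisfies_bound(history, max_distance):
    """Assumed relaxed property: no read is staler than max_distance writes."""
    return all(d <= max_distance for d in read_distances(history))
```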
-
Publication No.: US08326807B2
Publication date: 2012-12-04
Application No.: US12359190
Filing date: 2009-01-23
IPC classification: G06F17/30
CPC classification: G06F11/008, G06F11/28
Abstract: A method for measuring consistability of a distributed storage system is disclosed. The method includes determining at least one consistency level that the distributed storage system can provide. A plurality of failure classes can be determined for the distributed storage system. A probability of the distributed storage system being in each of the plurality of failure classes can be measured. Each failure class can be mapped to the at least one consistency level. The probability of each failure class for each consistency level can be summed to determine an expected portion of time that the distributed storage system provides each consistency level.
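The summation described in the abstract above can be sketched in a few lines. The sketch assumes the failure classes are mutually exclusive, that their probabilities have already been measured, and that each class maps to the set of consistency levels the system can still provide while in that class. The function name `expected_time_per_level` is hypothetical.

```python
def expected_time_per_level(class_probability, class_levels):
    """Sum failure-class probabilities per consistency level.

    class_probability: failure class -> measured probability of being in it.
    class_levels: failure class -> consistency levels available in that class.
    Returns consistency level -> expected portion of time it is provided.
    """
    totals = {}
    for failure_class, probability in class_probability.items():
        for level in class_levels.get(failure_class, ()):
            totals[level] = totals.get(level, 0.0) + probability
    return totals
```

For example, with purely illustrative probabilities of 0.9 for no failure, 0.08 for one node down, and 0.02 for a partition, where only the first two classes still allow a strong level, the strong level would be credited with the sum of the first two probabilities and a weaker level available in every class with the sum of all three.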
-