-
公开(公告)号:US10824508B2
公开(公告)日:2020-11-03
申请号:US16386577
申请日:2019-04-17
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Patrick J. Meaney , Christian Jacobi , Barry M. Trager
Abstract: A memory system includes memory modules having a number of sets of memory devices including data memory devices for data and error correction code (ECC). The ECC memory devices carry ECC symbols for the memory modules. A host receives and decodes the ECC symbols and executes error correction operations. The host and the memory modules are coupled by a number of channels.
-
公开(公告)号:US20190317856A1
公开(公告)日:2019-10-17
申请号:US15953805
申请日:2018-04-16
Applicant: International Business Machines Corporation
Inventor: James A. O'Connor, JR. , Barry M. Trager , Warren E. Maule , Marc A. Gollub , Brad W. Michael , Patrick J. Meaney
Abstract: Embodiments of the present invention include a memory module that includes a plurality of memory devices and a memory buffer device. Each of the memory devices are characterized as one of a high random bit error rate (RBER) and a low RBER memory device. The memory buffer device includes a read data interface to receive data read from a memory address on one of the memory devices. The memory buffer device also includes common error correction logic to detect and correct error conditions in data read from both high RBER and low RBER memory devices. The common error correction logic includes a plurality of error correction units which provide different complexity levels of error correction and have different latencies. The error correction units include a first fast path error correction unit for isolating and correcting random symbol errors.
-
公开(公告)号:US20160011996A1
公开(公告)日:2016-01-14
申请号:US14701371
申请日:2015-04-30
Applicant: International Business Machines Corporation
Inventor: Sameh Asaad , Ralph E. Bellofatto , Michael A. Blocksome , Matthias A. Blumrich , Peter Boyle , Jose R. Brunheroto , Dong Chen , Chen-Yong Cher , George L. Chiu , Norman Christ , Paul W. Coteus , Kristan D. Davis , Gabor J. Dozsa , Alexandre E. Eichenberger , Noel A. Eisley , Matthew R. Ellavsky , Kahn C. Evans , Bruce M. Fleischer , Thomas W. Fox , Alan Gara , Mark E. Giampapa , Thomas M. Gooding , Michael K. Gschwind , John A. Gunnels , Shawn A. Hall , Rudolf A. Haring , Philip Heidelberger , Todd A. Inglett , Brant L. Knudson , Gerard V. Kopcsay , Sameer Kumar , Amith R. Mamidala , James A. Marcella , Mark G. Megerian , Douglas R. Miller , Samuel J. Miller , Adam J. Muff , Michael B. Mundy , John K. O'Brien , Kathryn M. O'Brien , Martin Ohmacht , Jeffrey J. Parker , Ruth J. Poole , Joseph D. Ratterman , Valentina Salapura , David L. Satterfield , Robert M. Senger , Burkhard Steinmacher-Burow , William M. Stockdell , Craig B. Stunkel , Krishnan Sugavanam , Yutaka Sugawara , Todd E. Takken , Barry M. Trager , James L. Van Oosten , Charles D. Wait , Robert E. Walkup , Alfred T. Watson , Robert W. Wisniewski , Peng Wu
CPC classification number: G06F13/287 , G06F9/06 , G06F9/3004 , G06F9/30047 , G06F9/3885 , G06F12/0811 , G06F12/0831 , G06F12/0862 , G06F12/0864 , G06F12/1027 , G06F15/17381 , G06F15/17387 , G06F15/76 , G06F15/8069 , G06F2212/1016 , G06F2212/602 , G06F2212/6022 , G06F2212/6024 , G06F2212/6032 , Y02D10/13 , Y02D10/14
Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
Abstract translation: 100 petaflop规模的多千兆高效并行超级计算机包括基于片上系统技术的节点架构,其中每个处理节点包括单个专用集成电路(ASIC)。 ASIC节点通过五维环面网络互连,最优化节点之间的分组通信的吞吐量并最小化等待时间。 网络实现集体网络和提供全局障碍和通知功能的全球异步网络。 集成在节点设计中包括一个基于列表的预取器。 存储系统实现事务存储器,线程级别推测和多重切换缓存,同时提高软错误率,并支持DMA功能,允许并行处理消息传递。
-
公开(公告)号:US10601448B2
公开(公告)日:2020-03-24
申请号:US15830526
申请日:2017-12-04
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Glenn Gilda , Patrick J. Meaney , Arthur O'Neill , Barry M. Trager
Abstract: Systems, methods, and computer-readable media are disclosed for performing reduced latency error decoding using a reduced latency symbol error correction decoder that utilizes enumerated parallel multiplication in lieu of division and replaces general multiplication with constant multiplication. The use of parallel multiplication in lieu of division can provide reduced latency and replacement of general multiplication with constant multiplication allows for logic reduction. In addition, the reduced symbol error correction decoder can utilize decode term sharing which can yield a further reduction in decoder logic and a further latency improvement.
-
公开(公告)号:US10824504B2
公开(公告)日:2020-11-03
申请号:US15953805
申请日:2018-04-16
Applicant: International Business Machines Corporation
Inventor: James A. O'Connor, Jr. , Barry M. Trager , Warren E. Maule , Marc A. Gollub , Brad W. Michael , Patrick J. Meaney
Abstract: Embodiments of the present invention include a memory module that includes a plurality of memory devices and a memory buffer device. Each of the memory devices are characterized as one of a high random bit error rate (RBER) and a low RBER memory device. The memory buffer device includes a read data interface to receive data read from a memory address on one of the memory devices. The memory buffer device also includes common error correction logic to detect and correct error conditions in data read from both high RBER and low RBER memory devices. The common error correction logic includes a plurality of error correction units which provide different complexity levels of error correction and have different latencies. The error correction units include a first fast path error correction unit for isolating and correcting random symbol errors.
-
6.
公开(公告)号:US20240103967A1
公开(公告)日:2024-03-28
申请号:US17954464
申请日:2022-09-28
Applicant: International Business Machines Corporation
Inventor: Barry M. Trager , Patrick James Meaney , Glenn David Gilda , Lawrence Jones
CPC classification number: G06F11/1068 , G06F11/1004 , G06F11/2273
Abstract: A memory controller stores each of a plurality of data blocks encoded by error correction code (ECC) across multiple channels of a redundant memory system. Based on receiving, from the memory system, channel data of a fetch operation requesting a data block, the memory controller decodes the channel data and concurrently generates a predicted channel mark based on tests of channel-induced syndromes generated from the channel data. The predicted channel mark identifies a marked channel among the multiple channels as a likely source of data errors. The memory controller determines whether the decoding detects an uncorrectable error in the channel data and, based on determining the decoding detects an uncorrectable error in the channel data, re-reads channel data corresponding to the data block and corrects the re-read channel data by excluding, from decoding, channel data received from the marked channel.
-
公开(公告)号:US10901839B2
公开(公告)日:2021-01-26
申请号:US16142440
申请日:2018-09-26
Applicant: International Business Machines Corporation
Inventor: James A. O'Connor , Barry M. Trager , Warren E. Maule , Brad W. Michael , Marc A. Gollub , Patrick J. Meaney
Abstract: Embodiments of the present invention include a memory module that includes a plurality of memory devices and a memory buffer device. The memory devices are characterized as one of a high or low random bit error rate (RBER) memory device. The memory buffer device includes a read data interface to receive data read from a memory address on one of the memory devices, and common error correction logic to detect and correct error conditions in data read from both high RBER and low RBER memory devices. The memory buffer device also includes refresh rate logic configured to adjust a refresh rate based on the detected error conditions.
-
公开(公告)号:US10606692B2
公开(公告)日:2020-03-31
申请号:US15849396
申请日:2017-12-20
Applicant: International Business Machines Corporation
Inventor: Paul W. Coteus , Kyu-hyoun Kim , Luis A. Lastras-Montano , Warren E. Maule , Patrick J. Meaney , James A. O'Connor , Barry M. Trager
Abstract: An embodiment includes a method for use in operating a memory chip, the method comprising: operating the memory chip with an increased burst length relative to a standard burst length of the memory chip; and using the increased burst length to access metadata during a given operation of the memory chip. Another embodiment includes a memory module, comprising a plurality of memory chips, each memory chip being operable with an increased burst length relative to a standard burst length of the memory chip, the increased burst length being used to access metadata during a given operation of the memory module.
-
公开(公告)号:US20200097359A1
公开(公告)日:2020-03-26
申请号:US16142440
申请日:2018-09-26
Applicant: International Business Machines Corporation
Inventor: James A. O'Connor , Barry M. Trager , Warren E. Maule , Brad W. Michael , Marc A. Gollub , Patrick J. Meaney
Abstract: Embodiments of the present invention include a memory module that includes a plurality of memory devices and a memory buffer device. The memory devices are characterized as one of a high or low random bit error rate (RBER) memory device. The memory buffer device includes a read data interface to receive data read from a memory address on one of the memory devices, and common error correction logic to detect and correct error conditions in data read from both high RBER and low RBER memory devices. The memory buffer device also includes refresh rate logic configured to adjust a refresh rate based on the detected error conditions.
-
公开(公告)号:US20190188074A1
公开(公告)日:2019-06-20
申请号:US15849396
申请日:2017-12-20
Applicant: International Business Machines Corporation
Inventor: Paul W. Coteus , Kyu-hyoun Kim , Luis A. Lastras-Montano , Warren E. Maule , Patrick J. Meaney , James A. O'Connor , Barry M. Trager
CPC classification number: G06F11/1044 , G06F11/1012 , G06F11/1064 , G11C29/52
Abstract: An embodiment includes a method for use in operating a memory chip, the method comprising: operating the memory chip with an increased burst length relative to a standard burst length of the memory chip; and using the increased burst length to access metadata during a given operation of the memory chip. Another embodiment includes a memory module, comprising a plurality of memory chips, each memory chip being operable with an increased burst length relative to a standard burst length of the memory chip, the increased burst length being used to access metadata during a given operation of the memory module.
-
-
-
-
-
-
-
-
-