Patent search: ap:("Pranay Koka" OR "Michael O. McCracken" OR "Herbert D. Schwetman, Jr." OR "David A. Munday") AND inv:"David A. Munday" (page 1)

1.

Granted patent
Using broadcast-based TLB sharing to reduce address-translation latency in a shared-memory system with electrical interconnect (status: in force)

Publication No.: US09009446B2

Publication date: 2015-04-14

Application No.: US13565460

Filing date: 2012-08-02

Applicants: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

Inventors: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

IPC classification: G06F12/00, G06F12/10

CPC classification: G06F12/10, G06F12/0833, G06F12/1027, G06F12/1072, G06F2212/1024, G06F2212/682, Y02D10/13

Abstract: The disclosed embodiments provide a system that uses broadcast-based TLB-sharing techniques to reduce address-translation latency in a shared-memory multiprocessor system with two or more nodes that are connected by an electrical interconnect. During operation, a first node receives a memory operation that includes a virtual address. Upon determining that one or more TLB levels of the first node will miss for the virtual address, the first node uses the electrical interconnect to broadcast a TLB request to one or more additional nodes of the shared-memory multiprocessor in parallel with scheduling a speculative page-table walk for the virtual address. If the first node receives a TLB entry from another node of the shared-memory multiprocessor via the electrical interconnect in response to the TLB request, the first node cancels the speculative page-table walk. Otherwise, if no response is received, the first node instead waits for the completion of the page-table walk.
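
The flow this abstract describes (broadcast a TLB request while a speculative page-table walk is already scheduled, then cancel the walk if a peer replies) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the `Node` class, its method names, the 4 KiB page size, and the latencies are all hypothetical, and `asyncio` tasks stand in for hardware events.

```python
import asyncio

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class Node:
    """Toy model of one node; every name here is hypothetical."""

    def __init__(self, name, tlb=None):
        self.name = name
        self.tlb = dict(tlb or {})  # virtual page number -> physical frame number
        self.peers = []
        self.walks_completed = 0
        self.walks_cancelled = 0

    async def handle_tlb_request(self, vpn, latency=0.01):
        # Remote side of the broadcast: reply with the entry, or None on a miss.
        await asyncio.sleep(latency)  # stand-in for interconnect latency
        return self.tlb.get(vpn)

    async def page_table_walk(self, vpn, page_table, latency=0.05):
        # Stand-in for the slow walk; an unmapped page raises KeyError.
        await asyncio.sleep(latency)
        self.walks_completed += 1
        return page_table[vpn]

    async def translate(self, vaddr, page_table):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn not in self.tlb:
            # Local miss: start the speculative walk and the broadcast together.
            walk = asyncio.create_task(self.page_table_walk(vpn, page_table))
            pending = {walk}
            pending.update(
                asyncio.create_task(p.handle_tlb_request(vpn)) for p in self.peers
            )
            pfn = None
            while pending and pfn is None:
                done, pending = await asyncio.wait(
                    pending, return_when=asyncio.FIRST_COMPLETED
                )
                for task in done:
                    if task.result() is not None:
                        pfn = task.result()
            if walk in pending:
                self.walks_cancelled += 1  # a peer answered before the walk ended
            for task in pending:
                task.cancel()
            await asyncio.gather(*pending, return_exceptions=True)
            self.tlb[vpn] = pfn  # fill the local TLB with the winning answer
        return (self.tlb[vpn] << PAGE_SHIFT) | offset


async def demo():
    page_table = {0x12: 0x99, 0x13: 0x55}
    a, b = Node("A"), Node("B", tlb={0x12: 0x99})
    a.peers, b.peers = [b], [a]
    shared = await a.translate(0x12345, page_table)  # B replies, walk is cancelled
    walked = await a.translate(0x13008, page_table)  # no peer has it, walk completes
    return shared, walked, a.walks_cancelled, a.walks_completed
```

Calling `asyncio.run(demo())` exercises both outcomes: the first translation is served by node B and the walk is cancelled, while the second falls back to the completed walk.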

2.

Patent application
USING BROADCAST-BASED TLB SHARING TO REDUCE ADDRESS-TRANSLATION LATENCY IN A SHARED-MEMORY SYSTEM WITH ELECTRICAL INTERCONNECT (status: in force)

Publication No.: US20140040562A1

Publication date: 2014-02-06

Application No.: US13565460

Filing date: 2012-08-02

Applicants: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

Inventors: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

IPC classification: G06F12/08, G06F12/10

CPC classification: G06F12/10, G06F12/0833, G06F12/1027, G06F12/1072, G06F2212/1024, G06F2212/682, Y02D10/13

Abstract: The disclosed embodiments provide a system that uses broadcast-based TLB-sharing techniques to reduce address-translation latency in a shared-memory multiprocessor system with two or more nodes that are connected by an electrical interconnect. During operation, a first node receives a memory operation that includes a virtual address. Upon determining that one or more TLB levels of the first node will miss for the virtual address, the first node uses the electrical interconnect to broadcast a TLB request to one or more additional nodes of the shared-memory multiprocessor in parallel with scheduling a speculative page-table walk for the virtual address. If the first node receives a TLB entry from another node of the shared-memory multiprocessor via the electrical interconnect in response to the TLB request, the first node cancels the speculative page-table walk. Otherwise, if no response is received, the first node instead waits for the completion of the page-table walk.

3.

Granted patent
Using a shared last-level TLB to reduce address-translation latency (status: in force)

Publication No.: US09081706B2

Publication date: 2015-07-14

Application No.: US13468904

Filing date: 2012-05-10

Applicants: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday

Inventors: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday

IPC classification: G06F3/03, G06F12/10, G06F12/08

CPC classification: G06F12/1027, G06F12/0811, G06F2212/656, G06F2212/681

Abstract: The disclosed embodiments provide techniques for reducing address-translation latency and the serialization latency of combined TLB and data cache misses in a coherent shared-memory system. For instance, the last-level TLB structures of two or more multiprocessor nodes can be configured to act together as either a distributed shared last-level TLB or a directory-based shared last-level TLB. Such TLB-sharing techniques increase the total amount of useful translations that are cached by the system, thereby reducing the number of page-table walks and improving performance. Furthermore, a coherent shared-memory system with a shared last-level TLB can be further configured to fuse TLB and cache misses such that some of the latency of data coherence operations is overlapped with address translation and data cache access latencies, thereby further improving the performance of memory operations.
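
One way to picture the "distributed shared last-level TLB" mentioned in this abstract is to give each virtual page a single home node whose TLB slice caches its translation, so a translation is walked at most once system-wide. The sketch below is an illustration under assumptions, not the patented design: the abstract does not say how a home node is chosen, so the modulo mapping, the `SharedLastLevelTLB` class, and its method names are all hypothetical.

```python
class SharedLastLevelTLB:
    """Toy distributed last-level TLB: one slice per node (hypothetical model)."""

    def __init__(self, num_nodes, page_table):
        self.slices = [dict() for _ in range(num_nodes)]  # per-node TLB slices
        self.page_table = page_table  # virtual page number -> physical frame number
        self.walks = 0                # page-table walks performed system-wide

    def home_node(self, vpn):
        # Assumption: the home slice is picked by virtual page number modulo
        # the node count. The abstract does not specify this mapping.
        return vpn % len(self.slices)

    def lookup(self, requester, vpn):
        """Return (pfn, was_remote) for a lookup issued by node `requester`."""
        home = self.home_node(vpn)
        home_slice = self.slices[home]
        if vpn not in home_slice:
            # Only a miss in the home slice triggers a page-table walk.
            self.walks += 1
            home_slice[vpn] = self.page_table[vpn]
        return home_slice[vpn], home != requester


tlb = SharedLastLevelTLB(num_nodes=4, page_table={8: 100, 9: 101})
first = tlb.lookup(requester=0, vpn=9)   # misses in node 1's slice, walks once
second = tlb.lookup(requester=2, vpn=9)  # served from node 1's slice, no walk
local = tlb.lookup(requester=0, vpn=8)   # node 0 is the home node for page 8
```

Because the second lookup reuses the entry that the first one installed, the walk counter rises only when a page is seen for the first time anywhere in the system, which is the effect the abstract attributes to TLB sharing.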

4.

Granted patent
Using broadcast-based TLB sharing to reduce address-translation latency in a shared-memory system with optical interconnect (status: in force)

Publication No.: US09235529B2

Publication date: 2016-01-12

Application No.: US13565476

Filing date: 2012-08-02

Applicants: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

Inventors: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

IPC classification: G06F12/00, G06F12/10, H04Q11/00

CPC classification: G06F12/1027, G06F12/0815, G06F12/084, G06F12/10, G06F2212/1024, G06F2212/681, G06F2212/682, H04Q2011/0052

Abstract: The disclosed embodiments provide a system that uses broadcast-based TLB sharing to reduce address-translation latency in a shared-memory multiprocessor system with two or more nodes that are connected by an optical interconnect. During operation, a first node receives a memory operation that includes a virtual address. Upon determining that one or more TLB levels of the first node will miss for the virtual address, the first node uses the optical interconnect to broadcast a TLB request to one or more additional nodes of the shared-memory multiprocessor in parallel with scheduling a speculative page-table walk for the virtual address. If the first node receives a TLB entry from another node of the shared-memory multiprocessor via the optical interconnect in response to the TLB request, the first node cancels the speculative page-table walk. Otherwise, if no response is received, the first node instead waits for the completion of the page-table walk.

5.

Patent application
USING BROADCAST-BASED TLB SHARING TO REDUCE ADDRESS-TRANSLATION LATENCY IN A SHARED-MEMORY SYSTEM WITH OPTICAL INTERCONNECT (status: in force)

Publication No.: US20150301949A1

Publication date: 2015-10-22

Application No.: US13565476

Filing date: 2012-08-02

Applicants: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

Inventors: Pranay Koka; David A. Munday; Michael O. McCracken; Herbert D. Schwetman, Jr.

IPC classification: G06F12/10

CPC classification: G06F12/1027, G06F12/0815, G06F12/084, G06F12/10, G06F2212/1024, G06F2212/681, G06F2212/682, H04Q2011/0052

Abstract: The disclosed embodiments provide a system that uses broadcast-based TLB sharing to reduce address-translation latency in a shared-memory multiprocessor system with two or more nodes that are connected by an optical interconnect. During operation, a first node receives a memory operation that includes a virtual address. Upon determining that one or more TLB levels of the first node will miss for the virtual address, the first node uses the optical interconnect to broadcast a TLB request to one or more additional nodes of the shared-memory multiprocessor in parallel with scheduling a speculative page-table walk for the virtual address. If the first node receives a TLB entry from another node of the shared-memory multiprocessor via the optical interconnect in response to the TLB request, the first node cancels the speculative page-table walk. Otherwise, if no response is received, the first node instead waits for the completion of the page-table walk.

6.

Patent application
COMBINING A REMOTE TLB LOOKUP AND A SUBSEQUENT CACHE MISS INTO A SINGLE COHERENCE OPERATION (status: in force)

Publication No.: US20140013074A1

Publication date: 2014-01-09

Application No.: US13494843

Filing date: 2012-06-12

Applicants: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday; Jose Renau Ardevol

Inventors: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday; Jose Renau Ardevol

IPC classification: G06F12/10

CPC classification: G06F12/1045, G06F12/0817, G06F2212/1024, G06F2212/682

Abstract: The disclosed embodiments provide techniques for reducing address-translation latency and the serialization latency of combined TLB and data cache misses in a coherent shared-memory system. For instance, the last-level TLB structures of two or more multiprocessor nodes can be configured to act together as either a distributed shared last-level TLB or a directory-based shared last-level TLB. Such TLB-sharing techniques increase the total amount of useful translations that are cached by the system, thereby reducing the number of page-table walks and improving performance. Furthermore, a coherent shared-memory system with a shared last-level TLB can be further configured to fuse TLB and cache misses such that some of the latency of data coherence operations is overlapped with address translation and data cache access latencies, thereby further improving the performance of memory operations.

7.

Granted patent
Combining a remote TLB lookup and a subsequent cache miss into a single coherence operation (status: in force)

Publication No.: US09003163B2

Publication date: 2015-04-07

Application No.: US13494843

Filing date: 2012-06-12

Applicants: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday; Jose Renau Ardevol

Inventors: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday; Jose Renau Ardevol

IPC classification: G06F12/10, G06F12/08

CPC classification: G06F12/1045, G06F12/0817, G06F2212/1024, G06F2212/682

Abstract: The disclosed embodiments provide techniques for reducing address-translation latency and the serialization latency of combined TLB and data cache misses in a coherent shared-memory system. For instance, the last-level TLB structures of two or more multiprocessor nodes can be configured to act together as either a distributed shared last-level TLB or a directory-based shared last-level TLB. Such TLB-sharing techniques increase the total amount of useful translations that are cached by the system, thereby reducing the number of page-table walks and improving performance. Furthermore, a coherent shared-memory system with a shared last-level TLB can be further configured to fuse TLB and cache misses such that some of the latency of data coherence operations is overlapped with address translation and data cache access latencies, thereby further improving the performance of memory operations.

8.

Patent application
USING A SHARED LAST-LEVEL TLB TO REDUCE ADDRESS-TRANSLATION LATENCY (status: in force)

Publication No.: US20140052917A1

Publication date: 2014-02-20

Application No.: US13468904

Filing date: 2012-05-10

Applicants: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday

Inventors: Pranay Koka; Michael O. McCracken; Herbert D. Schwetman, Jr.; David A. Munday

IPC classification: G06F12/10, G06F12/08

CPC classification: G06F12/1027, G06F12/0811, G06F2212/656, G06F2212/681

Abstract: The disclosed embodiments provide techniques for reducing address-translation latency and the serialization latency of combined TLB and data cache misses in a coherent shared-memory system. For instance, the last-level TLB structures of two or more multiprocessor nodes can be configured to act together as either a distributed shared last-level TLB or a directory-based shared last-level TLB. Such TLB-sharing techniques increase the total amount of useful translations that are cached by the system, thereby reducing the number of page-table walks and improving performance. Furthermore, a coherent shared-memory system with a shared last-level TLB can be further configured to fuse TLB and cache misses such that some of the latency of data coherence operations is overlapped with address translation and data cache access latencies, thereby further improving the performance of memory operations.
