Patent search: ap:("Hazim Shafi" OR "Khaled S. Sedky") AND inv:"Hazim Shafi" (page 2)

11.

Patent application
Methods and Arrangements to Manage On-Chip Memory to Reduce Memory Latency (status: active)

Publication number: US20080263284A1

Publication date: 2008-10-23

Application number: US12145034

Filing date: 2008-06-24

Applicants: Dilma Menezes da Silva, Elmootazbellah Nabil Elnozahy, Orran Yaakov Krieger, Hazim Shafi, Xiaowei Shen, Balaram Sinharoy, Robert Brett Tremaine

Inventors: Dilma Menezes da Silva, Elmootazbellah Nabil Elnozahy, Orran Yaakov Krieger, Hazim Shafi, Xiaowei Shen, Balaram Sinharoy, Robert Brett Tremaine

IPC classification: G06F12/08

CPC classification: G06F12/08, G06F12/0893, G06F2212/251, G06F2212/253

Abstract: Methods, systems, and media for reducing memory latency seen by processors by providing a measure of control over on-chip memory (OCM) management to software applications, implicitly and/or explicitly, via an operating system are contemplated. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), and part managed by hardware. Thus, the software applications can provide guidance regarding address ranges to maintain close to the processor to reduce unnecessary latencies typically encountered when dependent upon cache controller policies. Several embodiments utilize a memory internal to the processor or on a processor node so the memory block used for this technique is referred to as OCM.

12.

Granted patent
Methods and arrangements to manage on-chip memory to reduce memory latency (status: active)

Publication number: US07437517B2

Publication date: 2008-10-14

Application number: US11032876

Filing date: 2005-01-11

Applicants: Dilma Menezes da Silva, Elmootazbellah Nabil Elnozahy, Orran Yaakov Krieger, Hazim Shafi, Xiaowei Shen, Balaram Sinharoy, Robert Brett Tremaine

Inventors: Dilma Menezes da Silva, Elmootazbellah Nabil Elnozahy, Orran Yaakov Krieger, Hazim Shafi, Xiaowei Shen, Balaram Sinharoy, Robert Brett Tremaine

IPC classification: G06F12/00, G06F13/00, G06F3/00

CPC classification: G06F12/08, G06F12/0893, G06F2212/251, G06F2212/253

Abstract: Methods, systems, and media for reducing memory latency seen by processors by providing a measure of control over on-chip memory (OCM) management to software applications, implicitly and/or explicitly, via an operating system are contemplated. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), and part managed by hardware. Thus, the software applications can provide guidance regarding address ranges to maintain close to the processor to reduce unnecessary latencies typically encountered when dependent upon cache controller policies. Several embodiments utilize a memory internal to the processor or on a processor node so the memory block used for this technique is referred to as OCM.

13.

Granted patent
Chained cache coherency states for sequential non-homogeneous access to a cache line with outstanding data response (status: active)

Publication number: US07409504B2

Publication date: 2008-08-05

Application number: US11245312

Filing date: 2005-10-06

Applicants: Ramakrishnan Rajamony, Hazim Shafi, Derek Edward Williams, Kenneth Lee Wright

Inventors: Ramakrishnan Rajamony, Hazim Shafi, Derek Edward Williams, Kenneth Lee Wright

IPC classification: G06F12/00

CPC classification: G06F12/0831

Abstract: A method for sequentially coupling successive processor requests for a cache line before the data is received in the cache of a first coupled processor. Both homogenous and non-homogenous operations are chained to each other, and the coherency protocol includes several new intermediate coherency responses associated with the chained states. Chained coherency states are assigned to track the chain of processor requests and the grant of access permission prior to receipt of the data at the first processor. The chained coherency states also identify the address of the receiving processor. When data is received at the cache of the first processor within the chain, the processor completes its operation on (or with) the data and then forwards the data to the next processor in the chain. The chained coherency protocol frees up address bus bandwidth by reducing the number of retries.

14.

Granted patent
Mechanisms and methods for using data access patterns (status: active)

Publication number: US07395407B2

Publication date: 2008-07-01

Application number: US11250288

Filing date: 2005-10-14

Applicants: Xiaowei Shen, Hazim Shafi

Inventors: Xiaowei Shen, Hazim Shafi

IPC classification: G06F12/00

CPC classification: G06F12/0862, G06F12/0815, G06F2212/6026

Abstract: The present invention comprises a data access pattern interface that allows software to specify one or more data access patterns such as stream access patterns, pointer-chasing patterns and producer-consumer patterns. Software detects a data access pattern for a memory region and passes the data access pattern information to hardware via proper data access pattern instructions defined in the data access pattern interface. Hardware maintains the data access pattern information properly when the data access pattern instructions are executed. Hardware can then use the data access pattern information to dynamically detect data access patterns for a memory region throughout the program execution, and voluntarily invoke appropriate memory and cache operations such as pre-fetch, pre-send, acquire-ownership and release-ownership. Further, hardware can provide runtime monitoring information for memory accesses to the memory region, wherein the runtime monitoring information indicates whether the software-provided data access pattern information is accurate.

15.

Patent application
System and Method for Reducing Unnecessary Cache Operations (status: lapsed)

Publication number: US20070136535A1

Publication date: 2007-06-14

Application number: US11674960

Filing date: 2007-02-14

Applicants: Ramakrishnan Rajamony, Hazim Shafi, William Speight, Lixin Zhang

Inventors: Ramakrishnan Rajamony, Hazim Shafi, William Speight, Lixin Zhang

IPC classification: G06F12/00

CPC classification: G06F12/0897, G06F12/0804, G06F12/0817

Abstract: A system and method for cache management in a data processing system. The data processing system includes a processor and a memory hierarchy. The memory hierarchy includes at least an upper memory cache, at least a lower memory cache, and a write-back data structure. In response to replacing data from the upper memory cache, the upper memory cache examines the write-back data structure to determine whether or not the data is present in the lower memory cache. If the data is present in the lower memory cache, the data is replaced in the upper memory cache without casting out the data to the lower memory cache.
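
Illustrative sketch (not taken from the patent): the replacement rule in this abstract can be modeled in a few lines of Python. Everything named here is an assumption made for illustration: the classes `LowerCache` and `UpperCache`, the set `known_below` standing in for the "write-back data structure", the LRU replacement order, and the simplification that all lines are clean and the lower cache never evicts.

```python
from collections import OrderedDict


class LowerCache:
    """Stand-in for the lower-level cache: a plain address -> data map."""

    def __init__(self):
        self.lines = {}

    def install(self, addr, data):
        self.lines[addr] = data


class UpperCache:
    """Toy upper-level cache that skips cast-outs the lower cache does not need."""

    def __init__(self, capacity, lower):
        self.capacity = capacity
        self.lower = lower
        self.lines = OrderedDict()   # LRU order: oldest entry first
        self.known_below = set()     # hypothetical "write-back data structure"
        self.castouts = 0
        self.skipped_castouts = 0

    def load(self, addr, data):
        """Bring a line into the upper cache, replacing the LRU line if full."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return
        if len(self.lines) >= self.capacity:
            self._replace_lru()
        self.lines[addr] = data
        # Remember whether the lower cache already holds this line.
        if addr in self.lower.lines:
            self.known_below.add(addr)

    def _replace_lru(self):
        victim, data = self.lines.popitem(last=False)
        if victim in self.known_below:
            # The lower cache already has the line: drop it without a cast-out.
            self.known_below.discard(victim)
            self.skipped_castouts += 1
        else:
            self.lower.install(victim, data)
            self.castouts += 1


lower = LowerCache()
upper = UpperCache(capacity=2, lower=lower)
for addr in ["A", "B", "C", "A", "B", "C"]:
    upper.load(addr, data=f"data-{addr}")
print(upper.castouts, upper.skipped_castouts)
```

In this toy run the first pass over A, B, C casts lines out to the lower cache; once a line is reloaded and recorded in `known_below`, replacing it again costs no cast-out.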

16.

Patent application
System and method of managing cache hierarchies with adaptive mechanisms (status: lapsed)

Publication number: US20060277366A1

Publication date: 2006-12-07

Application number: US11143328

Filing date: 2005-06-02

Applicants: Ramakrishnan Rajamony, Hazim Shafi, William Speight, Lixin Zhang

Inventors: Ramakrishnan Rajamony, Hazim Shafi, William Speight, Lixin Zhang

IPC classification: G06F12/00

CPC classification: G06F12/0897, G06F12/0817, G06F12/0822

Abstract: A system and method of managing cache hierarchies with adaptive mechanisms. A preferred embodiment of the present invention includes, in response to selecting a data block for eviction from a memory cache (the source cache) out of a collection of memory caches, examining a data structure to determine whether an entry exists that indicates that the data block has been evicted from the source memory cache, or another peer cache, to a slower cache or memory and subsequently retrieved from the slower cache or memory into the source memory cache or other peer cache. Also, a preferred embodiment of the present invention includes, in response to determining the entry exists in the data structure, selecting a peer memory cache out of the collection of memory caches at the same level in the hierarchy to receive the data block from the source memory cache upon eviction.
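
Illustrative sketch (not taken from the patent): the eviction decision described in this abstract can be expressed as a small Python policy object. The class name `AdaptiveEvictionPolicy`, its method names, the two sets used as the "data structure", and the rule of choosing the peer with the most free slots are all assumptions for illustration; the abstract does not say how a peer is selected.

```python
class AdaptiveEvictionPolicy:
    """Toy model: blocks that were evicted and later re-fetched go to a peer cache."""

    def __init__(self):
        self.evicted_to_slower = set()  # blocks currently pushed to a slower level
        self.reuse_table = set()        # blocks evicted once and then fetched back

    def record_eviction_to_slower(self, block):
        self.evicted_to_slower.add(block)

    def record_fetch_from_slower(self, block):
        # A fetch that follows an eviction shows the block is reused.
        if block in self.evicted_to_slower:
            self.evicted_to_slower.discard(block)
            self.reuse_table.add(block)

    def choose_destination(self, block, peer_free_slots):
        """Return the name of a peer cache, or 'slower' for the next level down.

        peer_free_slots maps peer cache name -> number of free slots.
        """
        if block in self.reuse_table and peer_free_slots:
            # Assumed selection rule: the peer with the most free slots.
            return max(peer_free_slots, key=peer_free_slots.get)
        return "slower"


policy = AdaptiveEvictionPolicy()
peers = {"L2_core1": 3, "L2_core2": 5}

first = policy.choose_destination("block_X", peers)   # no history yet
policy.record_eviction_to_slower("block_X")
policy.record_fetch_from_slower("block_X")            # block came back: reuse seen
second = policy.choose_destination("block_X", peers)
print(first, second)
```

The point of the sketch is the two-step history: a block only qualifies for a peer cache after it has been evicted to a slower level and then retrieved again.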

17.

Granted patent
Indicating parallel operations with user-visible events (status: active)

Publication number: US09846628B2

Publication date: 2017-12-19

Application number: US12816165

Filing date: 2010-06-15

Applicants: Edward G. Essey, Igor Ostrovsky, Pooja Nagpal, Huseyin S. Yildiz, Hazim Shafi, William T. Colburn

Inventors: Edward G. Essey, Igor Ostrovsky, Pooja Nagpal, Huseyin S. Yildiz, Hazim Shafi, William T. Colburn

IPC classification: G06F9/44, G06F3/048, G06F11/32, G06F11/34

CPC classification: G06F11/323, G06F11/3409, G06F11/3476, G06F2201/86

Abstract: The present invention extends to methods, systems, and computer program products for indicating parallel operations with user-visible events. Event markers can be used to indicate an abstracted outer layer of execution as well as expose internal specifics of parallel processing systems, including systems that provide data parallelism. Event markers can be used to show a variety of execution characteristics including higher-level markers to indicate the beginning and end of an execution program (e.g., a query). Inside the execution program (query) individual fork/join operations can be indicated with sub-levels of markers to expose their operations. Additional decisions made by an execution engine, such as, for example, when elements initially yield, when queries overlap or nest, when the query is cancelled, when the query bails to sequential operation, when premature merging or re-partitioning are needed can also be exposed.

18.

Granted patent
Analysis and visualization of cluster resource utilization (status: active)

Publication number: US08990551B2

Publication date: 2015-03-24

Application number: US12883859

Filing date: 2010-09-16

Applicant: Hazim Shafi

Inventor: Hazim Shafi

IPC classification: G06F1/24, G06F9/00, G06F9/50, G06F11/32, G06F11/36, G06F11/34

CPC classification: G06F9/5011, G06F11/323, G06F11/3404, G06F11/3612, G06F2209/508

Abstract: An analysis and visualization depicts how an application is leveraging processor cores of a distributed computing system, such as a computer cluster, in time. The analysis and visualization enables a developer to readily identify the degree of concurrency exploited by an application at runtime and the amount of overhead used by libraries or middleware. Information regarding processes or threads running on the nodes over time is received, analyzed, and presented to indicate portions of computer cluster that are used by the application, idle, other processes, and libraries in the system. The analysis and visualization can help a developer understand or confirm contention for or under-utilization of system resources for the application and libraries.
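
Illustrative sketch (not taken from the patent): the analysis step in this abstract amounts to bucketing per-core samples by time and reporting what share of the cluster each category used. The function name `utilization_by_interval`, the sample tuple layout, and the category labels below are assumptions for illustration only.

```python
from collections import Counter


def utilization_by_interval(samples, interval):
    """Group core samples into time buckets and return per-category shares.

    samples: iterable of (timestamp_seconds, node, core, category), where
    category is a label such as "application", "library", "other" or "idle".
    Returns {bucket_index: {category: fraction_of_samples}}.
    """
    buckets = {}
    for timestamp, _node, _core, category in samples:
        bucket = int(timestamp // interval)
        buckets.setdefault(bucket, Counter())[category] += 1
    return {
        bucket: {cat: n / sum(counts.values()) for cat, n in counts.items()}
        for bucket, counts in sorted(buckets.items())
    }


# Hypothetical samples from a two-node cluster with two cores per node.
samples = [
    (0.1, "node0", 0, "application"),
    (0.1, "node0", 1, "application"),
    (0.1, "node1", 0, "library"),
    (0.1, "node1", 1, "idle"),
    (1.2, "node0", 0, "application"),
    (1.2, "node0", 1, "idle"),
    (1.2, "node1", 0, "idle"),
    (1.2, "node1", 1, "idle"),
]
report = utilization_by_interval(samples, interval=1.0)
for bucket, shares in report.items():
    print(bucket, shares)
```

A falling "application" share alongside a rising "idle" share, as in the second bucket here, is the kind of under-utilization the abstract says the visualization should make easy to spot.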

19.

Granted patent
Assist thread for injecting cache memory in a microprocessor (status: active)

Publication number: US08949837B2

Publication date: 2015-02-03

Application number: US13434423

Filing date: 2012-03-29

Applicants: Patrick Joseph Bohrer, Orran Yaakov Krieger, Ramakrishnan Rajamony, Michael Rosenfield, Hazim Shafi, Balaram Sinharoy, Robert Brett Tremaine

Inventors: Patrick Joseph Bohrer, Orran Yaakov Krieger, Ramakrishnan Rajamony, Michael Rosenfield, Hazim Shafi, Balaram Sinharoy, Robert Brett Tremaine

IPC classification: G06F9/46, G06F12/00, G06F9/38, G06F12/08, G06F9/48

CPC classification: G06F9/383, G06F9/3851, G06F9/4881, G06F12/0862

Abstract: A data processing system includes a microprocessor having access to multiple levels of cache memories. The microprocessor executes a main thread compiled from a source code object. The system includes a processor for executing an assist thread also derived from the source code object. The assist thread includes memory reference instructions of the main thread and only those arithmetic instructions required to resolve the memory reference instructions. A scheduler configured to schedule the assist thread in conjunction with the corresponding execution thread is configured to execute the assist thread ahead of the execution thread by a determinable threshold such as the number of main processor cycles or the number of code instructions. The assist thread may execute in the main processor or in a dedicated assist processor that makes direct memory accesses to one of the lower level cache memory elements.

20.

Patent application
MARKER CORRELATION OF APPLICATION CONSTRUCTS WITH VISUALIZATIONS (status: active)

Publication number: US20110078661A1

Publication date: 2011-03-31

Application number: US12571075

Filing date: 2009-09-30

Applicant: Hazim Shafi

Inventor: Hazim Shafi

IPC classification: G06F9/44

CPC classification: G06F11/3664, G06F11/323, G06F11/3476, G06F11/3624, G06F11/3632, G06F2201/865

Abstract: The use of marker(s) in the source code of a program under evaluation. A representation of the marker(s) remains in the binary version of the program under evaluation. During execution, upon executing the marker, data is gathered regarding the timeline of the execution of the marker in the context of overall timeline of execution. A visualization of the marker is then displayed that illustrates the execution of the marker in the context of a larger timeframe of execution. Optionally, the marker may be associated with text, or other data, at least some of which being rendered with the visualization. Accordingly, an application developer, or indeed anyone evaluating the program, may place markers within source code and/or evaluate the timeline of execution of those markers.
