专利检索 ap:("Stephen R. Van Doren" OR "Gregory Edward Tierney" OR "Simon C. Steely, Jr.") AND inv:"Simon C. Steely, Jr." 第 5 页

41.

发明授权
Set prediction cache memory system using bits of the main memory address 失效
标题翻译：使用主存储器地址的位设置预测高速缓存存储器系统

公开(公告)号：US5235697A

公开(公告)日：1993-08-10

申请号：US956827

申请日：1992-10-05

申请人： Simon C. Steely, Jr. , John H. Zurawski

发明人： Simon C. Steely, Jr. , John H. Zurawski

IPC分类号： G06F12/08

CPC分类号： G06F12/0864 , G06F2212/6082

摘要： The set-prediction cache memory system comprises an extension of a set-associative cache memory system which operates in parallel to the set-associative structure to increase the overall speed of the cache memory while maintaining its performance. The set prediction cache memory system includes a plurality of data RAMs and a plurality of tag RAMs to store data and data tags, respectively. Also included in the system are tag store comparators to compare the tag data contained in a specific tag RAM location with a second index comprising a predetermined second portion of a main memory address. The elements of the set prediction cache memory system which operate in parallel to the set-associative cache memory include: a set-prediction RAM which receives at least one third index comprising a predetermined third portion of the main memory address, and stores such third index to essentially predict the data cache RAM holding the data indexed by the third index; a data-select multiplexer which receives the prediction index and selects a data output from the data cache RAM indexed by the prediction index; and a mispredict logic device to determine if the set prediction RAM predicted the correct data cache RAM and if not, issue a mispredict signal which may comprise a write data signal, the write data signal containing information intended to correct the prediction index contained in the set prediction RAM.

摘要翻译： 设置预测高速缓冲存储器系统包括与集合关联结构并行操作的集合关联高速缓冲存储器系统的扩展，以在保持其性能的同时增加高速缓冲存储器的总体速度。集合预测高速缓冲存储器系统包括分别存储数据和数据标签的多个数据RAM和多个标签RAM。还包括在系统中的标签存储比较器，用于将包含在特定标签RAM位置中的标签数据与包含主存储器地址的预定第二部分的第二索引进行比较。与设置关联高速缓存存储器并行操作的集合预测高速缓冲存储器系统的元件包括：设置预测RAM，其接收包含主存储器地址的预定第三部分的至少一个第三索引，并存储这样的第三索引以基本预测由第三指标索引的数据的数据缓存RAM; 数据选择多路复用器，其接收预测索引并选择从由预测索引索引的数据高速缓存RAM输出的数据; 以及用于确定所设置的预测RAM是否预测正确的数据高速缓存RAM的错误预测逻辑设备，如果不是，则发出可能包括写入数据信号的错误预测信号，所述写入数据信号包含旨在校正包含在该组中的预测索引的信息预测RAM。

42.

发明授权
Signature based hit-predicting cache 有权
标题翻译：基于签名的命中预测缓存

公开(公告)号：US09262327B2

公开(公告)日：2016-02-16

申请号：US13538390

申请日：2012-06-29

申请人： Simon C. Steely, Jr. , William C. Hasenplaugh , Aamer Jaleel , Joel S. Emer , Carole-Jean Wu

发明人： Simon C. Steely, Jr. , William C. Hasenplaugh , Aamer Jaleel , Joel S. Emer , Carole-Jean Wu

IPC分类号： G06F12/08

CPC分类号： G06F12/0862

摘要： An apparatus may comprise a cache file having a plurality of cache lines and a hit predictor. The hit predictor may contain a table of counter values indexed with signatures that are associated with the plurality of cache lines. The apparatus may fill cache lines into the cache file with either low or high priority. Low priority lines may be chosen to be replaced by a replacement algorithm before high priority lines. In this way, the cache naturally may contain more high priority lines than low priority ones. This priority filling process may improve the performance of most replacement schemes including the best known schemes which are already doing better than LRU.

摘要翻译： 装置可以包括具有多个高速缓存行和命中预测器的高速缓存文件。命中预测器可以包含用与多个高速缓存行相关联的签名索引的计数器值的表。该装置可以以低优先级或高优先级将高速缓存行填充到高速缓存文件中。低优先级行可以被选择为在高优先级行之前由替换算法代替。以这种方式，高速缓存当然可以包含比优先级更高的优先级更高的行。该优先填充过程可以改善大多数替换方案的性能，包括已经比LRU更好的已知方案。

43.

发明授权
Method and apparatus for adaptively bypassing one or more levels of a cache hierarchy 有权
标题翻译：用于自适应地绕过高速缓存层级的一个或多个级别的方法和装置

公开(公告)号：US06647466B2

公开(公告)日：2003-11-11

申请号：US09769552

申请日：2001-01-25

申请人： Simon C. Steely, Jr.

发明人： Simon C. Steely, Jr.

IPC分类号： G06F1200

CPC分类号： G06F12/0897 , G06F12/0811 , G06F12/0888 , G06F12/0891

摘要： A system for adaptively bypassing one or more higher cache levels following a miss in a lower level of a cache hierarchy is described. Each cache level preferably includes a tag store containing address and state information for each cache line resident in the respective cache. When an invalidate request is received at a given cache hierarchy, each cache level is searched for the address specified by the invalidate request. When an address match is detected, the state of the respective cache line is changed to the invalid state, although the address of the cache line is left in the tag store. Thereafter, if the processor or entity associated with this cache hierarchy issues its own request for this same cache line, the cache hierarchy begins searching the tag store of each level starting with the lowest cache level. Since the address of the invalidated cache line was left in the respective tag store, a match will be detected at one of the cache levels, although the corresponding state of this cache line is invalid. This condition is specifically detected and is considered to be an “inval_miss” occurrence. In response, to an inval_miss, the cache hierarchy calls off searching any higher levels, and instead, issues a memory reference request for the desired cache line. In a further embodiment, the entity that sourced an invalidate request is stored, and a subsequent memory reference request for the same cache line is sent directly to the source entity.

摘要翻译： 描述了用于在高速缓存层级的较低级别中错过之后自适应地绕过一个或多个更高的高速缓存级别的系统。每个高速缓存级别优选地包括标签存储，其包含驻留在相应高速缓存中的每个高速缓存行的地址和状态信息。当在给定的缓存层次结构中接收到无效请求时，将搜索每个高速缓存级别以查找由无效请求指定的地址。当检测到地址匹配时，尽管高速缓存行的地址被留在标签存储器中，但各个高速缓存行的状态被改变为无效状态。此后，如果与该高速缓存层级相关联的处理器或实体发出其对该相同高速缓存行的自身请求，则高速缓存层级开始以最低高速缓存级别开始搜索每个级别的标签存储。由于无效高速缓存行的地址被留在相应的标签存储中，所以在高速缓存级别之一处将检测到匹配，尽管该高速缓存行的相应状态是无效的。该条件被特别检测并被认为是“inval_miss”事件。作为响应，对于inval_miss，缓存层次结构调用搜索任何更高级别，而是发出所需高速缓存行的内存引用请求。在另一个实施例中，存储了源自无效请求的实体，并且将相同高速缓存行的后续存储器引用请求直接发送到源实体。

44.

发明授权
System for passing an index value with each prediction in forward direction to enable truth predictor to associate truth value with particular branch instruction 失效
标题翻译：用于向前传递每个预测的索引值的系统，以使真实预测器能够将真值与特定分支指令相关联

公开(公告)号：US6081887A

公开(公告)日：2000-06-27

申请号：US191869

申请日：1998-11-12

申请人： Simon C. Steely, Jr. , Edward J. McLellan , Joel S. Emer

发明人： Simon C. Steely, Jr. , Edward J. McLellan , Joel S. Emer

IPC分类号： G06F9/38 , G06F9/32

CPC分类号： G06F9/3844

摘要： A technique for predicting the result of a conditional branch instruction for use with a processor having instruction pipeline. A stored predictor is connected to the front end of the pipeline and is trained from a truth based predictor connected to the back end of the pipeline. The stored predictor is accessible in one instruction cycle, and therefore provides minimum predictor latency. Update latency is minimized by storing multiple predictions in the front end stored predictor which are indexed by an index counter. The multiple predictions, as provided by the back end, are indexed by the index counter to select a particular one as current prediction on a given instruction pipeline cycle. The front end stored predictor also passes along to the back end predictor, such as through the instruction pipeline, a position value used to generate the predictions. This further structure accommodates ghost branch instructions that turn out to be flushed out of the pipeline when it must be backed up. As a result, the front end always provides an accurate prediction with minimum update latency.

摘要翻译： 一种用于预测与具有指令流水线的处理器一起使用的条件转移指令的结果的技术。存储的预测器连接到管道的前端，并且从连接到管道后端的基于真实的预测器训练。存储的预测器可以在一个指令周期中访问，因此提供最小预测器延迟。通过将多个预测存储在由索引计数器索引的前端存储的预测器中来最小化更新延迟。由后端提供的多个预测由索引计数器索引，以选择特定的预测作为给定指令流水线周期上的当前预测。前端存储的预测器还将传递到后端预测器，例如通过指令流水线，用于产生预测的位置值。这种进一步的结构可以容纳重影分支指令，当它必须被备份时，这些指令将被清除流出管道。因此，前端总是以最小的更新延迟提供准确的预测。

45.

发明授权
Next line prediction apparatus for a pipelined computed system 失效
标题翻译：用于流水线计算系统的下一行预测装置

公开(公告)号：US5283873A

公开(公告)日：1994-02-01

申请号：US546364

申请日：1990-06-29

申请人： Simon C. Steely, Jr. , David J. Sager

发明人： Simon C. Steely, Jr. , David J. Sager

IPC分类号： G06F9/38 , G06F9/34 , G06F9/40

CPC分类号： G06F9/3806

摘要： A next line prediction mechanism for predicting a next instruction index to an instruction cache of a computer pipeline, has a latency equal to the cycle time of the instruction cache to maximize the instruction bandwidth out of the instruction cache. The instruction cache outputs a block of instructions with each fetch initiated by a next instruction index provided by the line prediction mechanism. The instructions of the block are processed in parallel for instruction decode and branch prediction to maintain a high rate of instruction flow through the pipeline.

摘要翻译： 用于预测对计算机流水线的指令高速缓存的下一个指令索引的下一行预测机制具有等于指令高速缓冲存储器的循环时间的等待时间，以使指令高速缓存中的指令带宽最大化。指令高速缓存输出由行预测机制提供的下一指令索引发起的每次提取的指令块。块的指令被并行处理，用于指令解码和分支预测，以保持高流量的指令流经管线。

46.

发明授权
Register mapping system having a log containing sequential listing of registers that were changed in preceding cycles for precise post-branch recovery 失效
标题翻译：具有包含顺序列表的寄存器映射系统的寄存器映射系统，用于在精确的分支后恢复中预测循环中的寄存器

公开(公告)号：US5197132A

公开(公告)日：1993-03-23

申请号：US546411

申请日：1990-06-29

申请人： Simon C. Steely, Jr. , David J. Sager

发明人： Simon C. Steely, Jr. , David J. Sager

IPC分类号： G06F9/38

CPC分类号： G06F9/384 , G06F9/3863

摘要： A register map having a free list of available physical locations in a register file, a log containing a sequential listing of logical registers changed during a predetermined number of cycles, a back-up map associating the logical registers with corresponding physical homes at a back-up point in a computer pipeline operation and a predicted map associating the logical registers with corresponding physical homes at a current point in the computer pipeline operation. A set of valid bits is associated with the maps to indicate whether a particular logical register is to be taken from the back-up map or the predicted map indication of a corresponding physical home. The valid bits can be "flash cleared" in a single cycle to back-up the computer pipeline to the back-up point during a trap event.

47.

发明授权
Subroutine return prediction mechanism using ring buffer and comparing predicated address with actual address to validate or flush the pipeline 失效
标题翻译：使用环形缓冲器的SUBROUTINE返回预测机制，并与实际地址进行比较以预测地址以验证或冲洗管道

公开(公告)号：US5179673A

公开(公告)日：1993-01-12

申请号：US451943

申请日：1989-12-18

申请人： Simon C. Steely, Jr. , David J. Sager

发明人： Simon C. Steely, Jr. , David J. Sager

IPC分类号： G06F9/32 , G06F9/38 , G06F9/42

CPC分类号： G06F9/3806 , G06F9/30054 , G06F9/322 , G06F9/4426

摘要： A method and arrangement for producing a predicted subroutine return address in response to entry of a subroutine return instruction in a computer pipeline that has a ring pointer counter and a ring buffer coupled to the ring pointer counter. The ring pointer counter contains a ring pointer that is changed when either a subroutine call instruction or return instruction enters the computer pipeline. The ring buffer has buffer locations which store a value present at its input into the buffer location pointed to by the ring pointer when a subroutine call instruction enters the pipeline. The ring buffer provides a value from the buffer location pointed to by the ring pointer when a subroutine return instruction enters the computer pipeline, this provided value being the predicted subroutine return address.

48.

发明授权
High bandwidth full-block write commands 有权

公开(公告)号：US10102124B2

公开(公告)日：2018-10-16

申请号：US13993716

申请日：2011-12-28

申请人： Simon C. Steely, Jr. , William C. Hasenplaugh , Joel S. Emer , Samantika Subramaniam

发明人： Simon C. Steely, Jr. , William C. Hasenplaugh , Joel S. Emer , Samantika Subramaniam

IPC分类号： G06F13/00 , G06F12/0808 , G06F13/28

摘要： A micro-architecture may provide a hardware and software of a high bandwidth write command. The micro-architecture may invoke a method to perform the high bandwidth write command. The method may comprise sending a write request from a requester to a record keeping structure. The write request may have a memory address of a memory that stores requested data. The method may further determine copies of the requested data being present in a distributed cache system outside the memory, sending invalidation requests to elements holding copies of the requested data in the distributed cache system, sending a notification to the requester to inform presence of copies of the requested data and sending a write response message after a latest value of the requested data and all invalidation acknowledgements have been received.

49.

发明授权
Method and apparatus for optimizing the usage of cache memories 有权
标题翻译：用于优化高速缓存存储器的使用的方法和装置

公开(公告)号：US09418016B2

公开(公告)日：2016-08-16

申请号：US12974907

申请日：2010-12-21

申请人： Simon C. Steely, Jr. , Joel S. Emer , William C. Hasenplaugh

发明人： Simon C. Steely, Jr. , Joel S. Emer , William C. Hasenplaugh

IPC分类号： G06F12/02 , G06F12/08

CPC分类号： G06F12/0891 , G06F12/0804 , G06F12/0815 , G06F2212/1028 , Y02D10/13

摘要： A method and apparatus to reduce unnecessary write backs of cached data to a main memory and to optimize the usage of a cache memory tag directory. In one embodiment of the invention, the power consumption of a processor can be saved by eliminating write backs of cache memory lines that has information that has reached its end-of-life. In one embodiment of the invention, when a processing unit is required to clear one or more cache memory lines, it uses a write-zero command to clear the one or more cache memory lines. The processing unit does not perform a write operation to move or pass data values of zero to the one or more cache memory lines. By doing so, it reduces the power consumption of the processing unit.

摘要翻译： 一种减少对主存储器的缓存数据的不必要的回写并优化高速缓存存储器标签目录的使用的方法和装置。在本发明的一个实施例中，通过消除具有已经达到其使用寿命的信息的高速缓冲存储器线的写回，可以节省处理器的功耗。在本发明的一个实施例中，当需要处理单元来清除一个或多个高速缓存存储器线时，它使用写入零命令来清除一个或多个高速缓存存储器线。处理单元不执行写入操作以将数据值0移动或传递给一个或多个高速缓存存储器线。通过这样做，它降低了处理单元的功耗。

50.

发明授权
Cache coherency mechanism using arbitration masks 有权
标题翻译：使用仲裁掩码的缓存一致性机制

公开(公告)号：US06961825B2

公开(公告)日：2005-11-01

申请号：US09768418

申请日：2001-01-24

申请人： Simon C. Steely, Jr. , Stephen Van Doren , Madhumitra Sharma

发明人： Simon C. Steely, Jr. , Stephen Van Doren , Madhumitra Sharma

IPC分类号： G06F12/08 , H04L29/08 , G06F12/00

CPC分类号： H04L67/2804 , G06F12/0824 , G06F12/0826 , H04L67/2852 , H04L69/329

摘要： A distributed processing system includes a cache coherency mechanism that essentially encodes network routing information into sectored presence bits. The mechanism organizes the sectored presence bits as one or more arbitration masks that system switches decode and use directly to route invalidate messages through one or more higher levels of the system. The lower level or levels of the system use local routing mechanisms, such as local directories, to direct the invalidate messages to the individual processors that are holding the data of interest.

摘要翻译： 分布式处理系统包括高速缓存一致性机制，其基本上将网络路由信息编码为扇区存在位。该机制将分区存在位组织为一个或多个仲裁掩码，系统交换机直接解码并使用，以通过系统的一个或多个更高级别路由无效消息。系统的较低级别或级别使用本地路由机制（如本地目录）将无效消息引导到保存感兴趣的数据的各个处理器。

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类