1. High bandwidth full-block write commands

    Publication No.: US10102124B2

    Publication Date: 2018-10-16

    Application No.: US13993716

    Filing Date: 2011-12-28

    Abstract: A micro-architecture may provide hardware and software support for a high bandwidth write command. The micro-architecture may invoke a method to perform the high bandwidth write command. The method may comprise sending a write request from a requester to a record keeping structure. The write request may have a memory address of a memory that stores requested data. The method may further comprise determining whether copies of the requested data are present in a distributed cache system outside the memory, sending invalidation requests to elements holding copies of the requested data in the distributed cache system, sending a notification to the requester to report the presence of such copies, and sending a write response message after the latest value of the requested data and all invalidation acknowledgements have been received.
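    As a rough illustration of this flow, here is a minimal Python sketch assuming a toy directory-based model in which caches and memory are plain dictionaries; the Directory class, its full_block_write method, and the core numbering are hypothetical, not the patented hardware interface:

```python
class Directory:
    def __init__(self):
        self.sharers = {}   # address -> set of core ids holding a copy
        self.owner = {}     # address -> core id holding the latest dirty value

    def full_block_write(self, requester, addr, caches, memory, new_block):
        holders = self.sharers.get(addr, set()) - {requester}
        expected_acks = len(holders)   # requester is told how many copies exist
        acks = 0
        latest = memory.get(addr)
        for core in holders:
            # Invalidation request: the holder drops its copy, acknowledges,
            # and forwards its data if it held the latest value.
            data = caches[core].pop(addr, None)
            if data is not None and self.owner.get(addr) == core:
                latest = data
            acks += 1
        # The write response is sent only after the latest value and every
        # invalidation acknowledgement have been received.
        assert acks == expected_acks
        memory[addr] = new_block       # the full block is overwritten
        self.sharers[addr] = {requester}
        self.owner[addr] = requester
        return "write_response"


# Example: cores 0 and 1 share address 0x40; core 2 performs a full-block write.
memory = {0x40: bytes(64)}
caches = {0: {0x40: memory[0x40]}, 1: {0x40: memory[0x40]}, 2: {}}
directory = Directory()
directory.sharers[0x40] = {0, 1}
print(directory.full_block_write(requester=2, addr=0x40, caches=caches,
                                 memory=memory, new_block=b"\xff" * 64))
```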

    2. Short circuit of probes in a chain
    Granted patent, in force

    Publication No.: US09201792B2

    Publication Date: 2015-12-01

    Application No.: US13996012

    Filing Date: 2011-12-29

    IPC Class: G06F12/08

    CPC Class: G06F12/084 G06F12/082

    Abstract: A multi-core processing apparatus may provide a cache probe and data retrieval method. The method may comprise sending a memory request from a requester to a record keeping structure. The memory request may have a memory address of a memory that stores requested data. The method may further comprise determining that a local last accessor of the memory address may hold a copy of the requested data that is up to date with the memory. The local last accessor may be within a local domain to which the requester belongs. The method may further comprise sending a cache probe to the local last accessor and retrieving the latest value of the requested data from the local last accessor to the requester.
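    A minimal Python sketch of the probe path above, under the assumption that a tag directory tracks the last accessor per domain; DOMAIN_OF, TagDirectory, and read are illustrative names rather than the patent's structures:

```python
DOMAIN_OF = {0: "A", 1: "A", 2: "B", 3: "B"}     # core id -> local domain

class TagDirectory:
    def __init__(self):
        # address -> {domain: (last_accessor, copy_up_to_date_with_memory)}
        self.last_access = {}

    def read(self, requester, addr, caches, memory):
        domain = DOMAIN_OF[requester]
        entry = self.last_access.get(addr, {}).get(domain)
        if entry and entry[1] and addr in caches[entry[0]]:
            # The probe stays inside the requester's local domain: the local
            # last accessor forwards its up-to-date copy to the requester.
            data = caches[entry[0]][addr]
        else:
            data = memory[addr]                   # fall back to memory
        caches[requester][addr] = data
        self.last_access.setdefault(addr, {})[domain] = (requester, True)
        return data


# Example: core 0 reads the line, then core 1 (same domain) hits core 0's copy.
memory = {0x80: b"payload"}
caches = {0: {}, 1: {}, 2: {}, 3: {}}
td = TagDirectory()
td.read(0, 0x80, caches, memory)
print(td.read(1, 0x80, caches, memory))          # served from core 0's cache
```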

    3. Domain state
    Granted patent, in force

    Publication No.: US09588889B2

    Publication Date: 2017-03-07

    Application No.: US13995991

    Filing Date: 2011-12-29

    IPC Class: G06F12/08 G06F13/00

    Abstract: Method and apparatus to efficiently maintain cache coherency by reading/writing a domain state field associated with a tag entry within a cache tag directory. A value may be assigned to a domain state field of a tag entry in a cache tag directory. The cache tag directory may belong to a hierarchy of cache tag directories. Each tag entry may be associated with a cache line from a cache belonging to a first domain. The first domain may contain multiple caches. The value of the domain state field may indicate whether its associated cache line can be read or changed.
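    A minimal Python sketch of a tag entry carrying a domain state field, assuming three states loosely analogous to invalid/shared/exclusive; DomainState and TagEntry are illustrative names and the actual encoding may differ:

```python
from enum import Enum

class DomainState(Enum):
    INVALID = 0     # no cache in the domain holds the line
    SHARED = 1      # caches in the domain may read the line, but not change it
    EXCLUSIVE = 2   # caches in the domain may read and change the line

class TagEntry:
    def __init__(self, tag, state=DomainState.INVALID):
        self.tag = tag            # identifies the associated cache line
        self.state = state        # the domain state field

    def can_read(self):
        return self.state in (DomainState.SHARED, DomainState.EXCLUSIVE)

    def can_change(self):
        return self.state is DomainState.EXCLUSIVE


entry = TagEntry(tag=0x1A2B, state=DomainState.SHARED)
print(entry.can_read(), entry.can_change())   # True False
```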

    4. Signature based hit-predicting cache
    Granted patent, in force

    Publication No.: US09262327B2

    Publication Date: 2016-02-16

    Application No.: US13538390

    Filing Date: 2012-06-29

    IPC Class: G06F12/08

    CPC Class: G06F12/0862

    Abstract: An apparatus may comprise a cache file having a plurality of cache lines and a hit predictor. The hit predictor may contain a table of counter values indexed with signatures that are associated with the plurality of cache lines. The apparatus may fill cache lines into the cache file with either low or high priority. Low priority lines may be chosen for replacement by a replacement algorithm before high priority lines. In this way, the cache naturally may contain more high priority lines than low priority ones. This priority filling process may improve the performance of most replacement schemes, including the best-known schemes, which already perform better than LRU.
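    A minimal Python sketch of a signature-indexed hit predictor with priority-aware victim selection; using a hash of a signature as the table index and the names HitPredictor and choose_victim are assumptions for illustration:

```python
class HitPredictor:
    def __init__(self, entries=256, threshold=2):
        self.counters = [0] * entries     # table of saturating counters
        self.threshold = threshold

    def _index(self, signature):
        return hash(signature) % len(self.counters)

    def high_priority(self, signature):
        # Predict a likely hit (reuse) when the counter is high enough.
        return self.counters[self._index(signature)] >= self.threshold

    def train(self, signature, was_reused):
        i = self._index(signature)
        if was_reused:
            self.counters[i] = min(self.counters[i] + 1, 3)
        else:
            self.counters[i] = max(self.counters[i] - 1, 0)


def choose_victim(cache_set):
    """Evict a low-priority line if one exists, else the least recently used."""
    low = [line for line in cache_set if not line["high_priority"]]
    pool = low or cache_set
    return min(pool, key=lambda line: line["last_used"])


# Example: a 4-way set where the line with tag 3 was filled with low priority.
ways = [{"tag": t, "high_priority": t != 3, "last_used": t} for t in range(4)]
print(choose_victim(ways)["tag"])    # 3: the low-priority line is evicted first
```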

    5. Method and apparatus for optimizing the usage of cache memories
    Granted patent, in force

    Publication No.: US09418016B2

    Publication Date: 2016-08-16

    Application No.: US12974907

    Filing Date: 2010-12-21

    IPC Class: G06F12/02 G06F12/08

    Abstract: A method and apparatus to reduce unnecessary write backs of cached data to a main memory and to optimize the usage of a cache memory tag directory. In one embodiment of the invention, the power consumption of a processor can be reduced by eliminating write backs of cache memory lines that hold information that has reached its end-of-life. In one embodiment of the invention, when a processing unit is required to clear one or more cache memory lines, it uses a write-zero command to clear the one or more cache memory lines. The processing unit does not perform a write operation to move or pass data values of zero to the one or more cache memory lines. By doing so, it reduces the power consumption of the processing unit.
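    A minimal Python sketch of the write-zero idea, assuming a toy cache line with dirty and zero state bits; CacheLine, write_zero, and evict are illustrative, and how real hardware exposes the zeroed state to memory is outside this sketch:

```python
class CacheLine:
    def __init__(self, addr):
        self.addr = addr
        self.data = bytearray(64)
        self.dirty = False
        self.is_zero = False       # set by the write-zero command

def write_zero(line):
    # No 64-byte store is issued; the zero contents are implied by a state bit,
    # and there is no longer any dirty data that must reach main memory.
    line.is_zero = True
    line.dirty = False

def evict(line, memory):
    if line.dirty:
        memory[line.addr] = bytes(line.data)   # ordinary write-back
    elif line.is_zero:
        memory[line.addr] = bytes(64)          # zeros materialized lazily
    # clean, non-zero lines are simply dropped with no memory traffic


memory = {}
line = CacheLine(addr=0x100)
line.data[:5] = b"stale"
line.dirty = True
write_zero(line)          # clears the line without moving data values of zero
evict(line, memory)
print(memory[0x100][:8])  # b'\x00\x00\x00\x00\x00\x00\x00\x00'
```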

    6. Retrieval of previously accessed data in a multi-core processor
    Granted patent, in force

    Publication No.: US09146871B2

    Publication Date: 2015-09-29

    Application No.: US13995283

    Filing Date: 2011-12-28

    Abstract: A multi-core processing apparatus may provide a cache probe and data retrieval method. The method may comprise sending a memory request from a requester to a record keeping structure. The memory request may have a memory address of a memory that stores requested data. The method may further comprise determining a last accessor of the memory address, sending a cache probe to the last accessor, determining that the last accessor no longer has a copy of the line, and sending a request for the previously accessed version of the line. The request may bypass the tag directories and obtain the requested data from memory.
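    A minimal Python sketch of this retrieval path, assuming the record keeping structure tracks only the single last accessor per address; RecordKeeper and its read method are illustrative names:

```python
class RecordKeeper:
    def __init__(self):
        self.last_accessor = {}      # address -> core id of the last accessor

    def read(self, requester, addr, caches, memory):
        target = self.last_accessor.get(addr)
        if target is not None and addr in caches[target]:
            data = caches[target][addr]   # probe hits: cache-to-cache transfer
        else:
            # The last accessor has evicted the line, so the request falls back
            # to memory directly rather than walking further tag directories.
            data = memory[addr]
        self.last_accessor[addr] = requester
        caches[requester][addr] = data
        return data


# Example: core 0 touched the line once but has since evicted it.
memory = {0xC0: b"previously accessed version"}
caches = {0: {}, 1: {}}
rk = RecordKeeper()
rk.last_accessor[0xC0] = 0
print(rk.read(1, 0xC0, caches, memory))    # served from memory
```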

    7. Efficient support of sparse data structure access
    Granted patent, in force

    Publication No.: US09037804B2

    Publication Date: 2015-05-19

    Application No.: US13995209

    Filing Date: 2011-12-29

    IPC Class: G06F13/00 G06F12/08

    CPC Class: G06F12/0891 G06F12/0895

    Abstract: Method and apparatus to efficiently organize data in caches by storing/accessing data of varying sizes in cache lines. A value may be assigned to a field indicating the size of usable data stored in a cache line. If the field indicating the size of the usable data in the cache line indicates a size less than the maximum storage size, a value may be assigned to a field in the cache line indicating which subset of the data stored in the line is usable data. A cache request may determine whether the size of the usable data in a cache line is equal to the maximum data storage size. If the size of the usable data in the cache line is equal to the maximum data storage size, the entire stored data in the cache line may be returned.
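    A minimal Python sketch of a cache line that records how much of its storage is usable data and which subset that is; SparseLine and its field names are illustrative, not the patent's layout:

```python
LINE_SIZE = 64

class SparseLine:
    def __init__(self, data, offset=0):
        assert offset + len(data) <= LINE_SIZE
        self.usable_size = len(data)      # size-of-usable-data field
        self.offset = offset              # which subset of the line is usable
        self.storage = bytearray(LINE_SIZE)
        self.storage[offset:offset + len(data)] = data

    def read(self):
        if self.usable_size == LINE_SIZE:
            return bytes(self.storage)    # full line: return everything stored
        return bytes(self.storage[self.offset:self.offset + self.usable_size])


# Example: a lone 4-byte element of a sparse structure, held at offset 16.
line = SparseLine(b"\x2a\x00\x00\x00", offset=16)
print(line.usable_size, line.read())      # 4 b'*\x00\x00\x00'
```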

    8. Instruction prefetching using cache line history
    Granted patent, in force

    Publication No.: US08533422B2

    Publication Date: 2013-09-10

    Application No.: US12895387

    Filing Date: 2010-09-30

    IPC Class: G06F12/06 G06F12/08

    Abstract: An apparatus of an aspect includes a prefetch cache line address predictor to receive a cache line address and to predict a next cache line address to be prefetched. The next cache line address may indicate a cache line having at least 64 bytes of instructions. The prefetch cache line address predictor may have a cache line target history storage to store a cache line target history for each of multiple most recent cache lines. Each cache line target history may indicate whether the corresponding cache line had a sequential cache line target or a non-sequential cache line target. The cache line address predictor may also have a cache line target history predictor. The cache line target history predictor may predict whether the next cache line address is a sequential cache line address or a non-sequential cache line address, based on the cache line target history for the most recent cache lines.
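    A minimal Python sketch of a next-line predictor driven by per-line target history, assuming 64-byte lines and a small history table; LinePrefetcher and its methods are illustrative names:

```python
LINE_BYTES = 64

class LinePrefetcher:
    def __init__(self, capacity=128):
        self.history = {}        # line address -> (was_sequential, target line)
        self.capacity = capacity

    def record(self, line_addr, next_line_addr):
        # Remember whether this line fell through sequentially or jumped away.
        sequential = (next_line_addr == line_addr + LINE_BYTES)
        if len(self.history) >= self.capacity and line_addr not in self.history:
            self.history.pop(next(iter(self.history)))   # drop the oldest entry
        self.history[line_addr] = (sequential, next_line_addr)

    def predict_next(self, line_addr):
        sequential, target = self.history.get(line_addr, (True, None))
        return line_addr + LINE_BYTES if sequential else target


# Example: line 0x1000 previously jumped to 0x4000, so 0x4000 is prefetched.
pf = LinePrefetcher()
pf.record(0x1000, 0x4000)
print(hex(pf.predict_next(0x1000)))    # 0x4000
```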

    9. System for passing an index value with each prediction in forward direction to enable truth predictor to associate truth value with particular branch instruction
    Granted patent, expired

    Publication No.: US6081887A

    Publication Date: 2000-06-27

    Application No.: US191869

    Filing Date: 1998-11-12

    IPC Class: G06F9/38 G06F9/32

    CPC Class: G06F9/3844

    Abstract: A technique for predicting the result of a conditional branch instruction for use with a processor having an instruction pipeline. A stored predictor is connected to the front end of the pipeline and is trained by a truth-based predictor connected to the back end of the pipeline. The stored predictor is accessible in one instruction cycle, and therefore provides minimum predictor latency. Update latency is minimized by storing multiple predictions in the front end stored predictor, indexed by an index counter. The multiple predictions, as provided by the back end, are indexed by the index counter to select a particular one as the current prediction on a given instruction pipeline cycle. The front end stored predictor also passes along to the back end predictor, such as through the instruction pipeline, a position value used to generate the predictions. This further structure accommodates ghost branch instructions that are flushed from the pipeline when it must be backed up. As a result, the front end always provides an accurate prediction with minimum update latency.
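    A minimal Python sketch of the front-end/back-end split described above: the front end steps through a block of stored predictions with an index counter and passes the index down the pipeline so the back-end truth predictor can pair each resolved outcome with the prediction that produced it. FrontEndPredictor, BackEndPredictor, and their methods are illustrative names:

```python
class FrontEndPredictor:
    def __init__(self, predictions):
        self.predictions = list(predictions)   # block supplied by the back end
        self.index = 0                         # index counter

    def predict(self):
        taken = self.predictions[self.index % len(self.predictions)]
        tag = self.index       # index value passed forward with the prediction
        self.index += 1
        return taken, tag

class BackEndPredictor:
    def __init__(self):
        self.outcomes = {}     # tag -> resolved branch outcome

    def resolve(self, tag, taken):
        # The forwarded index identifies which prediction this truth value
        # belongs to, even if younger "ghost" branches were flushed in between.
        self.outcomes[tag] = taken

    def next_block(self, size=4):
        # Train a fresh block of predictions for the front end.
        recent = [self.outcomes[k] for k in sorted(self.outcomes)[-size:]]
        return recent or [True] * size


front = FrontEndPredictor([True, True, False, True])
back = BackEndPredictor()
for actual in (True, False, False, True):       # outcomes seen at the back end
    guess, tag = front.predict()
    back.resolve(tag, actual)
front.predictions = back.next_block()
print(front.predictions)                        # [True, False, False, True]
```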
