Publication number: US09582422B2
Publication date: 2017-02-28
Application number: US14582348
Filing date: 2014-12-24
Applicant: INTEL CORPORATION
Inventor: Xiangyao Yu , Christopher J. Hughes , Nadathur Rajagopalan Satish
CPC classification number: G06F12/0862 , G06F9/30047 , G06F9/3455 , G06F2212/602 , G06F2212/6024 , G06F2212/6026
Abstract: Two techniques address bottlenecking in processors. The first is indirect prefetching, which is especially useful for graph analytics and sparse matrix applications. In such applications, the addresses of most random memory accesses come from an index array B that the application scans sequentially; the random accesses are actually indirect accesses of the form A[B[i]]. A hardware component is introduced to detect this pattern. The hardware can then read B a certain distance ahead and prefetch the corresponding element of A. For example, if the “prefetch distance” is k, then when B[i] is accessed, the hardware reads B[i+k] and prefetches A[B[i+k]]. The second technique is partial cacheline accessing. Indirect accesses usually touch random memory locations and use only a small portion of a cacheline; instead of loading the whole cacheline into the L1 cache, this technique loads only part of the cacheline.
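The abstract describes a hardware mechanism, but the access pattern it targets can be sketched in software. The sketch below emulates indirect prefetching with compiler prefetch hints: while processing A[B[i]], it issues a prefetch for A[B[i+k]] at prefetch distance k. The arrays A and B and the distance k follow the abstract; `indirect_sum` is a hypothetical workload, and `__builtin_prefetch` is a GCC/Clang-specific hint standing in for the patent's dedicated hardware.

```c
#include <stddef.h>

/* Software emulation of the indirect-prefetch pattern A[B[i]].
 * B is scanned sequentially, so B[i+k] is known k iterations early
 * and the corresponding element of A can be prefetched. */
double indirect_sum(const double *A, const int *B, size_t n, size_t k)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + k < n)
            __builtin_prefetch(&A[B[i + k]]); /* hint: fetch A[B[i+k]] */
        sum += A[B[i]];                       /* the actual indirect access */
    }
    return sum;
}
```

The prefetch is only a performance hint; the result is identical with or without it, which is why the hardware variant can be added transparently once the pattern is detected.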