Methods, apparatus, instructions and logic to provide vector packed histogram functionality

    Publication number: US09875213B2

    Publication date: 2018-01-23

    Application number: US14752054

    Application date: 2015-06-26

    Abstract: Instructions and logic provide SIMD vector packed histogram functionality. Some processor embodiments include first and second registers storing, in each of a plurality of data fields of a register lane portion, corresponding elements of a first and of a second data type, respectively. A decode stage decodes an instruction for SIMD vector packed histograms. One or more execution units compare each element of the first data type, in the first register lane portion, with a range specified by the instruction. For any elements of the first register portion in said range, corresponding elements of the second data type, from the second register portion, are added into one of a plurality of data fields of a destination register lane portion, selected according to the value of its corresponding element of the first data type, to generate packed weighted histograms for each destination register lane portion.
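
    The abstract describes a per-lane operation that range-checks one vector of indices and accumulates a matching vector of weights into destination fields chosen by those indices. The scalar C++ sketch below illustrates that behavior for a single lane; the lane width of eight fields, the element types, and the [lo, hi) range encoding are illustrative assumptions, not the patented instruction's actual format.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>

constexpr int kElemsPerLane = 8;  // assumed number of data fields per register lane

// src1 holds bin indices (the first data type); src2 holds weights (the second
// data type). For every index inside [lo, hi), the matching weight is added to
// the destination field selected by that index, producing a weighted histogram.
std::array<uint32_t, kElemsPerLane> packed_weighted_histogram(
    const std::array<uint8_t, kElemsPerLane>& src1,
    const std::array<uint32_t, kElemsPerLane>& src2,
    uint8_t lo, uint8_t hi) {
  std::array<uint32_t, kElemsPerLane> dst{};  // destination lane, zeroed
  for (int i = 0; i < kElemsPerLane; ++i) {
    if (src1[i] >= lo && src1[i] < hi) {
      dst[src1[i] - lo] += src2[i];  // field chosen by the in-range index value
    }
  }
  return dst;
}

int main() {
  std::array<uint8_t, kElemsPerLane> bins     = {1, 3, 3, 7, 2, 1, 9, 3};
  std::array<uint32_t, kElemsPerLane> weights = {4, 1, 2, 5, 6, 1, 8, 3};
  auto hist = packed_weighted_histogram(bins, weights, 0, 8);  // index 9 is out of range
  for (uint32_t v : hist) std::printf("%u ", v);
  std::printf("\n");
  return 0;
}
```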

    Apparatus and method for memory-mapped register caching
    Granted invention patent (in force)

    Publication number: US09189398B2

    Publication date: 2015-11-17

    Application number: US13730030

    Application date: 2012-12-28

    CPC classification number: G06F12/0802 G06F12/0875 G06F12/0897 Y02D10/13

    Abstract: A processor is described comprising: an architectural register file implemented as a combination of a register file cache and an architectural register region within a level 1 (L1) data cache, and a data location table (DLT) to store data indicating a location of each architectural register within the register file cache and/or the architectural register region within the L1 data cache.
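
    The abstract splits the architectural register file between a register file cache and a reserved region of the L1 data cache, with a data location table (DLT) recording where each register currently lives. The following is a minimal software sketch of that lookup structure; the register counts, promotion policy, and type names are assumptions for illustration, not the patented microarchitecture.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>

constexpr int kArchRegs   = 32;  // assumed architectural register count
constexpr int kRegCacheSz = 8;   // assumed register file cache capacity

enum class Location { RegFileCache, L1ArchRegion };

struct DltEntry {
  Location where = Location::L1ArchRegion;  // registers start out in the L1 region
  int slot = -1;                            // valid only inside the register file cache
};

struct RegisterBacking {
  std::array<DltEntry, kArchRegs> dlt{};          // data location table
  std::array<uint64_t, kRegCacheSz> reg_cache{};  // small, fast register file cache
  std::array<uint64_t, kArchRegs> l1_region{};    // architectural register region in L1

  // The DLT tells a read where the current copy of the register lives.
  uint64_t read(int reg) const {
    const DltEntry& e = dlt[reg];
    return e.where == Location::RegFileCache ? reg_cache[e.slot] : l1_region[reg];
  }

  // Promote a register into the register file cache (no eviction handling here).
  void promote(int reg, int slot) {
    reg_cache[slot] = l1_region[reg];
    dlt[reg] = {Location::RegFileCache, slot};
  }
};

int main() {
  RegisterBacking regs;
  regs.l1_region[5] = 42;              // register r5 backed by the L1 region
  std::printf("r5 = %llu\n", (unsigned long long)regs.read(5));
  regs.promote(5, 0);                  // r5 now lives in the register file cache
  std::printf("r5 = %llu\n", (unsigned long long)regs.read(5));
  return 0;
}
```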

    Shared read—using a request tracker as a temporary read cache

    Publication number: US11422939B2

    Publication date: 2022-08-23

    Application number: US16727657

    Application date: 2019-12-26

    Abstract: Disclosed embodiments relate to a shared read request (SRR) using a common request tracker (CRT) as a temporary cache. In one example, a multi-core system includes a memory and a memory controller to receive an SRR from a core when a Leader core is not yet identified, allocate a CRT entry and store the SRR therein, mark it as a Leader, send a read request to a memory address indicated by the SRR, and when read data returns from the memory, store the read data in the CRT entry, send the read data to the Leader core, and await receipt, unless already received, of another SRR from a Follower core, the other SRR having the same address as the SRR, then send the read data to the Follower core, and deallocate the CRT entry.
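
    The abstract describes a Leader/Follower flow in which the first shared read request (SRR) to an address allocates a common request tracker (CRT) entry and issues a single memory read, and a later SRR to the same address is served from that entry before it is deallocated. The C++ sketch below models that flow in software under simplifying assumptions (one Follower per entry, and a Follower that arrives before the data returns simply retries); the class and method names are hypothetical, not the patented controller design.

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>

struct CrtEntry {
  int leader_core;
  bool data_ready = false;
  uint64_t data = 0;
};

class MemoryController {
 public:
  // Returns true when the requesting core got data immediately (Follower hit).
  bool shared_read(int core, uint64_t addr) {
    auto it = crt_.find(addr);
    if (it == crt_.end()) {
      crt_[addr] = CrtEntry{core};      // this core becomes the Leader
      issue_memory_read(addr);          // one read on behalf of all sharers
      return false;                     // data delivered later to the Leader
    }
    if (it->second.data_ready) {        // Follower: reuse the Leader's data
      deliver(core, it->second.data);
      crt_.erase(it);                   // deallocate the CRT entry
      return true;
    }
    return false;                       // data not back yet; Follower retries
  }

  // Called when the memory read for addr completes.
  void on_memory_return(uint64_t addr, uint64_t data) {
    auto& e = crt_.at(addr);
    e.data = data;
    e.data_ready = true;
    deliver(e.leader_core, data);       // the Leader core is served first
  }

 private:
  void issue_memory_read(uint64_t addr) {
    std::printf("read 0x%llx\n", (unsigned long long)addr);
  }
  void deliver(int core, uint64_t data) {
    std::printf("core %d <- %llu\n", core, (unsigned long long)data);
  }
  std::unordered_map<uint64_t, CrtEntry> crt_;  // common request tracker, keyed by address
};

int main() {
  MemoryController mc;
  mc.shared_read(0, 0x1000);            // Leader: allocates CRT entry, issues read
  mc.on_memory_return(0x1000, 99);      // data returns, Leader served
  mc.shared_read(1, 0x1000);            // Follower: served from CRT, entry freed
  return 0;
}
```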

    DYNAMICALLY CONFIGURABLE MULTI-MODE MEMORY ALLOCATION IN AN ACCELERATOR MULTI-CORE SYSTEM ON CHIP

    Publication number: US20220066923A1

    Publication date: 2022-03-03

    Application number: US17523384

    Application date: 2021-11-10

    Abstract: Systems, apparatuses and methods may provide for technology that determines runtime memory requirements of an artificial intelligence (AI) application, defines a remote address range for a plurality of memories based on the runtime memory requirements, wherein each memory in the plurality of memories corresponds to a processor in a plurality of processors, and defines a shared address range for the plurality of memories based on the runtime memory requirements, wherein the shared address range is aliased. In one example, the technology configures memory mapping hardware to access the remote address range in a linear sequence and access the shared address range in a hashed sequence.
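
    The abstract configures two views of the same set of per-processor memories: a remote address range walked in a linear sequence and an aliased shared range walked in a hashed sequence. The sketch below shows one way such memory-mapping behavior could be modeled in software; the range sizes, processor count, interleaving granularity, and hash function are illustrative assumptions, not the patented mapping hardware.

```cpp
#include <cstdint>
#include <cstdio>

constexpr uint64_t kNumMems    = 4;           // one local memory per processor (assumed)
constexpr uint64_t kRemoteBase = 0;           // start of the remote address range
constexpr uint64_t kRemoteSize = 4ull << 20;  // 4 MiB slice per processor (assumed)
constexpr uint64_t kSharedBase = kRemoteBase + kNumMems * kRemoteSize;
constexpr uint64_t kLineBytes  = 64;          // interleaving granularity (assumed)

struct Target { uint64_t mem_id, offset; };

// Remote range: linear sequence, each processor owns one contiguous slice.
// Shared (aliased) range: hashed sequence, lines spread across all memories.
Target map_address(uint64_t addr) {
  if (addr < kSharedBase) {
    uint64_t rel = addr - kRemoteBase;
    return {rel / kRemoteSize, rel % kRemoteSize};
  }
  uint64_t rel  = addr - kSharedBase;
  uint64_t line = rel / kLineBytes;
  uint64_t mem  = (line ^ (line >> 7)) % kNumMems;  // simple illustrative hash
  return {mem, (line / kNumMems) * kLineBytes + rel % kLineBytes};
}

int main() {
  const uint64_t samples[] = {0x100000, kSharedBase + 0x40, kSharedBase + 0x80};
  for (uint64_t a : samples) {
    Target t = map_address(a);
    std::printf("addr 0x%llx -> memory %llu, offset 0x%llx\n",
                (unsigned long long)a, (unsigned long long)t.mem_id,
                (unsigned long long)t.offset);
  }
  return 0;
}
```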
