-
1.
Publication No.: US20230205692A1
Publication date: 2023-06-29
Application No.: US17561571
Filing date: 2021-12-23
Applicant: Intel Corporation
Inventor: ANANT NORI , RAHUL BERA , SHANKAR BALACHANDRAN , JOYDEEP RAKSHIT , Om Ji OMER , SREENIVAS SUBRAMONEY , AVISHAII ABUHATZERA , BELLIAPPA KUTTANNA
IPC: G06F12/0811 , G06F9/48 , G06F9/30 , G06N3/02
CPC classification number: G06F12/0811 , G06F9/4881 , G06F9/3012 , G06N3/02
Abstract: Apparatus and method for leveraging simultaneous multithreading for bulk compute operations. For example, one embodiment of a processor comprises: a plurality of cores including a first core to simultaneously process instructions of a plurality of threads; a cache hierarchy coupled to the first core and the memory, the cache hierarchy comprising a Level 1 (L1) cache, a Level 2 (L2) cache, and a Level 3 (L3) cache; and a plurality of compute units coupled to the first core including a first compute unit associated with the L1 cache, a second compute unit associated with the L2 cache, and a third compute unit associated with the L3 cache, wherein the first core is to offload instructions for execution by the compute units, the first core to offload instructions from a first thread to the first compute unit, instructions from a second thread to the second compute unit, and instructions from a third thread to the third compute unit.
-
2.
Publication No.: US20180285364A1
Publication date: 2018-10-04
Application No.: US15475238
Filing date: 2017-03-31
Applicant: Intel Corporation
Inventor: MAHESH MAMIDIPAKA , SRIVATSAVA JANDHYALA , ANISH N K , NAGADASTAGIRI REDDY C , SREENIVAS SUBRAMONEY
IPC: G06F17/30
CPC classification number: G06F16/24578 , G06F16/24568 , G06F16/248 , G06F16/9535
Abstract: A processor may include a plurality of processing elements and a hardware accelerator for selecting data elements. The hardware accelerator may: access an input data set comprising a set of data elements, each data element having a score value; increment bin counters based on the score values of the set of data elements, each bin counter to count a number of data elements with an associated score value; determine a cumulative sum of count values for a sequence of bin counters, the sequence beginning with a first bin counter of the plurality of bin counters; identify a second bin counter in the sequence of bin counters at which the cumulative sum reaches a selection quantity N; and generate an output data set based on a comparison of the set of data elements to a threshold score associated with the second bin counter.
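The selection steps in this abstract (count scores into bins, accumulate counts until N is reached, then filter by the threshold of that bin) can be sketched in software. This is only an illustrative sketch, not the patented hardware: the function name `select_top_n`, the parameters `num_bins` and `max_score`, the equal-width binning, and the choice to walk the bins from the highest score downward are all assumptions, since the abstract does not specify them.

```python
def select_top_n(scores, n, num_bins=16, max_score=256):
    """Approximate top-n selection using a score histogram.

    Assumes scores lie in [0, max_score) and that higher scores are better.
    """
    bin_width = max_score / num_bins
    counters = [0] * num_bins

    # Step 1: increment the bin counter that covers each score.
    for s in scores:
        idx = min(int(s / bin_width), num_bins - 1)
        counters[idx] += 1

    # Step 2: accumulate counts starting from the highest-score bin and
    # stop at the bin where the running sum first reaches n.
    cumulative = 0
    threshold_bin = 0  # falls back to the lowest bin if n is never reached
    for idx in range(num_bins - 1, -1, -1):
        cumulative += counters[idx]
        if cumulative >= n:
            threshold_bin = idx
            break

    # Step 3: keep every element at or above that bin's lower edge.
    threshold = threshold_bin * bin_width
    return [s for s in scores if s >= threshold]


print(select_top_n([5, 250, 100, 200, 30, 180, 90], n=3))
```

Because the threshold is a bin boundary rather than an exact score, the output can contain more than N elements when several scores share the threshold bin; a finer binning narrows that overshoot at the cost of more counters.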
-
3.
Publication No.: US20240211408A1
Publication date: 2024-06-27
Application No.: US18087887
Filing date: 2022-12-23
Applicant: INTEL CORPORATION
Inventor: JOYDEEP RAKSHIT , ANANT VITHAL NORI , SREENIVAS SUBRAMONEY , HANNA ALAM , JOSEPH NUZMAN
IPC: G06F12/0891 , G06F12/1009
CPC classification number: G06F12/0891 , G06F12/1009 , G06F2212/1016
Abstract: Apparatus and method for probabilistic cacheline replacement for accelerating address translation. For example, one embodiment of a processor comprises: a plurality of cores, each core to process instructions; a cache to be shared by a subset of the plurality of cores, the cache comprising an N-way set associative cache for storing page table entry (PTE) cachelines and non-PTE cachelines; and a cache manager to implement a PTE-aware eviction policy for evicting cachelines from the cache, the PTE-aware eviction policy to cause a reduction of evictions of PTE cachelines during non-PTE cacheline fills.
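One way to picture a PTE-aware eviction policy is as a victim-selection rule for a single cache set. The sketch below is hypothetical: the abstract states only that the policy reduces evictions of PTE cachelines during non-PTE fills, so the LRU ordering, the `protect_prob` parameter, and the function name `choose_victim` are assumptions made for illustration.

```python
import random


def choose_victim(lines, fill_is_pte, protect_prob=0.75, rng=random):
    """Pick the index of the line to evict from one cache set.

    lines: list of (tag, is_pte) tuples ordered from least to most
           recently used (an assumed LRU baseline).
    fill_is_pte: True when the incoming cacheline holds page table entries.
    protect_prob: assumed probability of sparing a PTE line on a non-PTE fill.
    """
    lru_index = 0

    # PTE fills, or a non-PTE LRU victim, use plain LRU.
    if fill_is_pte or not lines[lru_index][1]:
        return lru_index

    # Non-PTE fill and the LRU line is a PTE line: with probability
    # protect_prob, evict the least recently used non-PTE line instead.
    if rng.random() < protect_prob:
        for i, (_, is_pte) in enumerate(lines):
            if not is_pte:
                return i

    # Either the coin flip failed or the set holds only PTE lines.
    return lru_index


cache_set = [("a", True), ("b", False), ("c", True)]
print(choose_victim(cache_set, fill_is_pte=False, protect_prob=1.0))
```

Making the protection probabilistic rather than absolute keeps PTE lines from permanently occupying a set, which is one plausible reading of the "probabilistic" wording in the title.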
-
4.
Publication No.: US20220197657A1
Publication date: 2022-06-23
Application No.: US17130016
Filing date: 2020-12-22
Applicant: Intel Corporation
IPC: G06F9/38
Abstract: In one embodiment, a processor includes a branch predictor to predict whether a branch instruction is to be taken and a branch target buffer (BTB) coupled to the branch predictor. The branch target buffer may be segmented into a first cache portion and a second cache portion, where, in response to an indication that the branch is to be taken, the BTB is to access an entry in one of the first cache portion and the second cache portion based at least in part on a type of the branch instruction, an occurrence frequency of the branch instruction, and spatial information regarding a distance between a target address of a target of the branch instruction and an address of the branch instruction. Other embodiments are described and claimed.
-
5.
Publication No.: US20190310853A1
Publication date: 2019-10-10
Application No.: US16024808
Filing date: 2018-06-30
Applicant: Intel Corporation
Inventor: RAHUL BERA , ANANT VITHAL NORI , SREENIVAS SUBRAMONEY , HONG WANG
IPC: G06F9/38 , G06F12/0875
Abstract: An apparatus and method for adaptive spatial accelerated prefetching. For example, one embodiment of an apparatus comprises: execution circuitry to execute instructions and process data; a Level 2 (L2) cache to store at least a portion of the data; and a prefetcher to prefetch data from a memory subsystem to the L2 cache in anticipation of the data being needed by the execution unit to execute one or more of the instructions, the prefetcher comprising a buffer to store one or more prefetched memory pages or portions thereof, and signature data indicating detected patterns of access to the one or more prefetched memory pages; wherein the prefetcher is to prefetch one or more cache lines based on the signature data.
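The idea of prefetching cache lines from stored signature data can be illustrated with a toy model. Everything specific here is an assumption rather than the claimed design: the class name `SignaturePrefetcher`, the 64-lines-per-page geometry, the use of a per-page access bitmap as the signature, and the choice to index signatures by the first line touched in a page.

```python
LINES_PER_PAGE = 64  # assumption: 4 KiB pages with 64-byte cache lines


class SignaturePrefetcher:
    """Toy model: learn which lines of a page get touched, replay on new pages."""

    def __init__(self):
        self.active = {}      # page number -> bitmap of lines touched so far
        self.trigger = {}     # page number -> first line offset touched
        self.signatures = {}  # trigger offset -> learned access bitmap

    def access(self, page, line):
        """Record an access and return the line offsets to prefetch."""
        if page in self.active:
            # Page is already being tracked: only update its bitmap.
            self.active[page] |= 1 << line
            return []

        # First touch of this page: start tracking and look up a signature.
        self.active[page] = 1 << line
        self.trigger[page] = line
        learned = self.signatures.get(line, 0)
        return [i for i in range(LINES_PER_PAGE)
                if (learned >> i) & 1 and i != line]

    def retire(self, page):
        """Stop tracking a page and store its bitmap as a signature."""
        bitmap = self.active.pop(page)
        self.signatures[self.trigger.pop(page)] = bitmap


pf = SignaturePrefetcher()
for line in (0, 2, 5):
    pf.access(1, line)   # learn the pattern on page 1
pf.retire(1)
print(pf.access(2, 0))   # replay it when page 2 is first touched at line 0
```

In this model a new page whose first access matches a known trigger offset gets the rest of the learned pattern prefetched at once, instead of waiting for each line to miss individually.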