-
Publication number: US10983793B2
Publication date: 2021-04-20
Application number: US16369846
Filing date: 2019-03-29
Applicant: INTEL CORPORATION
Inventor: Joshua Fryman , Ankit More , Jason Howard , Robert Pawlowski , Yigit Demir , Nick Pepperling , Fabrizio Petrini , Sriram Aananthakrishnan , Shaden Smith
Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
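Note: the broadcast and reduction operations described in this abstract can be illustrated with a minimal software sketch. It models system memory as a Python list; the function names (`dma_broadcast`, `dma_broadcast_array`, `dma_reduce`) and the choice of addition as the default reduction are assumptions made for illustration, not names or behavior taken from the patent.

```python
from functools import reduce
import operator

def dma_broadcast(memory, value, dest_addrs):
    """Write a single value to every destination address."""
    for addr in dest_addrs:
        memory[addr] = value

def dma_broadcast_array(memory, data, dest_addrs):
    """Copy a whole data array to each destination base address."""
    for base in dest_addrs:
        memory[base:base + len(data)] = data

def dma_reduce(memory, src_addrs, op=operator.add):
    """Read the values at the source addresses and combine them with op."""
    return reduce(op, (memory[addr] for addr in src_addrs))

# Hypothetical 16-word memory used only for this example.
memory = [0] * 16
dma_broadcast(memory, 7, [1, 4, 9])            # scalar broadcast
dma_broadcast_array(memory, [1, 2], [10, 12])  # array broadcast
total = dma_reduce(memory, [1, 4, 9])          # sum of the three broadcast values
```

In the patent the loops run in the DMA control circuitry, so the processor issues one instruction instead of one store or load per address; the sketch only shows the data movement.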
-
Publication number: US20240256283A1
Publication date: 2024-08-01
Application number: US18566068
Filing date: 2022-03-31
Applicant: Intel Corporation
Inventor: Joshua B. Fryman , Byoungchan Oh , Sai Dheeraj Polagani , Kevin P. Ma , Robert S. Pawlowski , Bharadwaj Coimbatore Krishnamurthy , Shruti Sharma , Smitha P. Vasantha Kumar , Jason Howard , Daniel S. Klowden
CPC classification number: G06F9/3851 , G06F11/3409
Abstract: A system is provided that includes a set of graph processing cores and a set of dense compute cores, where the set of graph processing cores and the set of dense cores are interconnected in a network. The dense compute cores include offload queue circuitry to receive an offload request from the set of graph processing cores to handle dense compute workloads. Memory controllers are also provided in the system for use by the graph processing cores in reading and writing to memory in association with sparse graph applications, the memory controllers being enhanced to efficiently handle memory transactions in sparse graph applications.
-
Publication number: US20220413855A1
Publication date: 2022-12-29
Application number: US17359305
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Robert Pawlowski , Sriram Aananthakrishnan , Jason Howard , Joshua Fryman
IPC: G06F9/30 , G06F9/38 , G06F12/0875
Abstract: Techniques for operating on an indirect memory access instruction, where the instruction accesses a memory location via at least one indirect address. A pipeline processes the instruction and a memory operation engine generates a first access to the at least one indirect address and a second access to a target address determined by the at least one indirect address. A cache memory used with the pipeline and the memory operation engine caches pointers. In response to a cache hit when executing the indirect memory access instruction, operations dereference a pointer to obtain the at least one indirect address, do not set a cache bit, and return data for the instruction without storing the data in the cache memory; and in response to a cache miss, operations set the cache bit, obtain and store a cache line for a missed pointer, and return data without storing the data in the cache memory.
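Note: the hit and miss paths in this abstract can be sketched in software as a cache that holds pointers but never the data they point to. The class and attribute names below are hypothetical, memory is modeled as a dictionary, and the "cache bit" is simplified to the act of filling a pointer entry on a miss; the patent's actual signalling is not modeled.

```python
class PointerCache:
    """Caches pointer values only; target data always comes from memory."""

    def __init__(self, memory):
        self.memory = memory   # backing store: address -> value
        self.pointers = {}     # pointer address -> cached pointer value
        self.hits = 0
        self.misses = 0

    def indirect_load(self, ptr_addr):
        if ptr_addr in self.pointers:
            # Hit: dereference the cached pointer, no fill needed.
            self.hits += 1
            target = self.pointers[ptr_addr]
        else:
            # Miss: first access fetches the pointer and fills the cache.
            self.misses += 1
            target = self.memory[ptr_addr]
            self.pointers[ptr_addr] = target
        # Second access: read the target. The data is returned to the
        # caller and deliberately not stored in the cache.
        return self.memory[target]

# Hypothetical memory: address 0 holds a pointer to address 100.
memory = {0: 100, 100: 42}
cache = PointerCache(memory)
first = cache.indirect_load(0)    # miss, pointer gets cached
second = cache.indirect_load(0)   # hit, pointer reused
```

Keeping the data out of the cache reflects the abstract's point that only pointers are retained, which suits workloads where target data is rarely reused.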
-
Publication number: US20200310795A1
Publication date: 2020-10-01
Application number: US16369846
Filing date: 2019-03-29
Applicant: INTEL CORPORATION
Inventor: Joshua Fryman , Ankit More , Jason Howard , Robert Pawlowski , Yigit Demir , Nick Pepperling , Fabrizio Petrini , Sriram Aananthakrishnan , Shaden Smith
Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
-
Publication number: US10929132B1
Publication date: 2021-02-23
Application number: US16579806
Filing date: 2019-09-23
Applicant: Intel Corporation
Inventor: Robert Pawlowski , Scott Hagan Schmittel , Joshua Fryman , Wim Heirman , Jason Howard , Ankit More , Shaden Smith , Scott Cline
Abstract: Disclosed embodiments relate to systems and methods for performing instructions to access a compressed graph list. In one example, a processor includes fetch and decode circuitry to fetch and decode a single instruction to access the compressed graph list, and execution circuitry to execute the decoded single instruction to cause access to the compressed graph list by: receiving, from a load store queue, at a first op-engine associated with a first data location, an indirection request, computing, via the first op-engine, a second data location associated with a second op-engine, computing, via the second op-engine, a third data location associated with a third op-engine responsive to the indirection request, and providing, via the third op-engine, a data response to the load store queue responsive to receiving data from the third data location.
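Note: the three-step hand-off between op-engines in this abstract amounts to a two-level pointer chase in which each hop is served by the engine that owns the address. The sketch below is a minimal model of that flow; the address-to-engine mapping (`ENGINE_SPAN`, `engine_for`) and the dictionary-based memory are assumptions for illustration, not details from the patent.

```python
ENGINE_SPAN = 4  # hypothetical: each op-engine owns 4 consecutive addresses

def engine_for(addr):
    """Return the index of the op-engine that owns this address."""
    return addr // ENGINE_SPAN

def indirect_load(memory, first_addr):
    """Follow two indirections and return (data, trace of (engine, address))."""
    trace = []
    addr = first_addr
    for _ in range(2):
        # The owning engine reads its location to compute the next one.
        trace.append((engine_for(addr), addr))
        addr = memory[addr]
    # The last engine reads the data and answers the load store queue.
    trace.append((engine_for(addr), addr))
    return memory[addr], trace

# Hypothetical memory: 0 -> 5 -> 9 -> data
memory = {0: 5, 5: 9, 9: "payload"}
data, trace = indirect_load(memory, 0)
```

The trace shows the request visiting three engines in turn, with only the final engine sending a response back to the requester.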
-
Publication number: US20200004587A1
Publication date: 2020-01-02
Application number: US16024343
Filing date: 2018-06-29
Applicant: Intel Corporation
Inventor: Paul Griffin , Joshua Fryman , Jason Howard , Sang Phill Park , Robert Pawlowski , Michael Abbott , Scott Cline , Samkit Jain , Ankit More , Vincent Cave , Fabrizio Petrini , Ivan Ganev
Abstract: Embodiments of apparatuses, methods, and systems for a multithreaded processor core with hardware-assisted task scheduling are described. In an embodiment, a processor includes a first hardware thread, a second hardware thread, and a task manager. The task manager is to issue a task to the first hardware thread. The task manager includes a hardware task queue in which to store a plurality of task descriptors. Each of the task descriptors is to represent one of a single task, a collection of iterative tasks, and a linked list of tasks.
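Note: the three descriptor kinds named in this abstract (a single task, a collection of iterative tasks, a linked list of tasks) can be sketched as data types consumed by a queue-based task manager. All class names are hypothetical, and the round-robin assignment of tasks to hardware threads is an assumption; the abstract does not state a scheduling policy.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

@dataclass
class SingleTask:
    name: str

@dataclass
class IterativeTasks:
    name: str
    count: int  # number of iterations to issue

@dataclass
class TaskNode:
    name: str
    next: Optional["TaskNode"] = None  # next task in the linked list

def expand(descriptor):
    """Yield the individual task names that one descriptor represents."""
    if isinstance(descriptor, SingleTask):
        yield descriptor.name
    elif isinstance(descriptor, IterativeTasks):
        for i in range(descriptor.count):
            yield f"{descriptor.name}[{i}]"
    else:
        node = descriptor
        while node is not None:
            yield node.name
            node = node.next

class TaskManager:
    def __init__(self, num_threads=2):
        self.queue = deque()          # stands in for the hardware task queue
        self.num_threads = num_threads

    def submit(self, descriptor):
        self.queue.append(descriptor)

    def issue_all(self):
        """Drain the queue, assigning tasks to threads round-robin."""
        issued, thread = [], 0
        while self.queue:
            for task in expand(self.queue.popleft()):
                issued.append((thread, task))
                thread = (thread + 1) % self.num_threads
        return issued

manager = TaskManager(num_threads=2)
manager.submit(SingleTask("a"))
manager.submit(IterativeTasks("loop", 2))
manager.submit(TaskNode("x", TaskNode("y")))
issued = manager.issue_all()
```

One descriptor can stand for many tasks, which is why the queue holds descriptors instead of individual work items.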
-
Publication number: US20250133129A1
Publication date: 2025-04-24
Application number: US19002995
Filing date: 2024-12-27
Applicant: Intel Corporation
Inventor: Akhilesh Thyagaturu , Jason Howard , Stanley T. Mo , Nicholas G. Ross , Sanjaya Tayal
Abstract: A cross-domain device includes a first interface to couple to a first device and a second interface to couple to a second device, where the first device is to implement a first component in a radio access network (RAN) system in a first computing domain, and the second device is to implement a second component in the RAN system in a second computing domain. The first component is to interface with the second component in a RAN processing pipeline. The cross-domain device further comprises hardware to implement a communication channel between the first device and the second device to pass data from the first component to the second component, where the communication channel enforces isolation of the first computing domain from the second computing domain.
-
Publication number: US12204901B2
Publication date: 2025-01-21
Application number: US17359305
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Robert Pawlowski , Sriram Aananthakrishnan , Jason Howard , Joshua Fryman
IPC: G06F9/30 , G06F9/38 , G06F12/0875
Abstract: Techniques for operating on an indirect memory access instruction, where the instruction accesses a memory location via at least one indirect address. A pipeline processes the instruction and a memory operation engine generates a first access to the at least one indirect address and a second access to a target address determined by the at least one indirect address. A cache memory used with the pipeline and the memory operation engine caches pointers. In response to a cache hit when executing the indirect memory access instruction, operations dereference a pointer to obtain the at least one indirect address, do not set a cache bit, and return data for the instruction without storing the data in the cache memory; and in response to a cache miss, operations set the cache bit, obtain and store a cache line for a missed pointer, and return data without storing the data in the cache memory.
-
Publication number: US20240020428A1
Publication date: 2024-01-18
Application number: US18476026
Filing date: 2023-09-27
Applicant: Intel Corporation
Inventor: Akhilesh Thyagaturu , Jason Howard , Nicholas Ross , Sanjaya Tayal , Vinodh Gopal
CPC classification number: G06F21/85 , G06F21/71 , G06F21/577 , G06F2221/034
Abstract: Systems, apparatus, articles of manufacture, and methods are disclosed to generate and manage a firewall policy. An example includes interface circuitry, machine readable instructions, and programmable circuitry to at least one of instantiate or execute the machine readable instructions to determine whether an operation is allowed to pass between a first component on a system-on-chip (SoC) and a second component on the SoC, detect an interconnect between the first component on the SoC and the second component on the SoC, cause the interconnect to filter the operation based on the determination of whether the operation is allowed to pass between the first component and the second component, and transmit a request to filter the operation based on the determination of whether the operation is allowed to pass between the first component and the second component.
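Note: the decide, detect, and filter steps in this abstract can be sketched as a policy object that looks up the interconnect between two SoC components and asks it to filter a disallowed operation. Every name here (`Interconnect`, `FirewallPolicy`, `install_filter`, the component labels) is hypothetical, and the allow-list representation is an assumption; the patent does not specify a data structure.

```python
class Interconnect:
    """Link between two SoC components that can filter operations."""

    def __init__(self, a, b):
        self.endpoints = frozenset((a, b))
        self.blocked = set()  # (src, dst, op) triples to drop

    def install_filter(self, src, dst, op):
        # Stands in for the "request to filter" sent to the interconnect.
        self.blocked.add((src, dst, op))

    def forward(self, src, dst, op):
        """Return True if the operation passes through this link."""
        return (src, dst, op) not in self.blocked

class FirewallPolicy:
    def __init__(self, allowed):
        self.allowed = set(allowed)  # allow-list of (src, dst, op)

    def is_allowed(self, src, dst, op):
        return (src, dst, op) in self.allowed

    def enforce(self, interconnects, src, dst, op):
        # Detect the interconnect joining the two components.
        link = next((l for l in interconnects
                     if l.endpoints == frozenset((src, dst))), None)
        if link is None:
            raise LookupError(f"no interconnect between {src} and {dst}")
        # Ask the link to filter anything the policy does not allow.
        if not self.is_allowed(src, dst, op):
            link.install_filter(src, dst, op)
        return link.forward(src, dst, op)

links = [Interconnect("cpu", "dma")]
policy = FirewallPolicy({("cpu", "dma", "read")})
read_ok = policy.enforce(links, "cpu", "dma", "read")
write_ok = policy.enforce(links, "cpu", "dma", "write")
```

Pushing the filter into the interconnect means later disallowed operations are dropped at the link without consulting the policy again.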
-
Publication number: US20220229723A1
Publication date: 2022-07-21
Application number: US17711646
Filing date: 2022-04-01
Applicant: Intel Corporation
Inventor: Joshua B. Fryman , Byoungchan Oh , Jason Howard , Sai Dheeraj Polagani
IPC: G06F11/10
Abstract: Memory requests are protected by encoding memory requests to include error correction codes. A subset of bits in a memory request is compared to a pre-defined pattern to determine whether the subset of bits matches the pre-defined pattern, where a match indicates that a compression can be applied to the memory request. The error correction code is generated for the memory request and the memory request is encoded to remove the subset of bits, add the error correction code, and add at least one metadata bit to the memory request to generate a protected version of the memory request, where the at least one metadata bit identifies whether the compression was applied to the memory request.
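Note: the encode path in this abstract (pattern check, bit removal, check bits, metadata bit) can be sketched as follows. The field widths, the all-zero pattern, and the names are assumptions for illustration. In particular, `toy_check_bits` is a simple XOR fold used as a placeholder: it can detect some errors but cannot correct them, so it is not a real error correction code such as a SECDED Hamming code.

```python
from typing import NamedTuple

REQUEST_BITS = 32   # hypothetical request width
PATTERN_BITS = 8    # hypothetical: the top 8 bits are the compared subset
PATTERN = 0x00      # hypothetical pre-defined pattern
PAYLOAD_BITS = REQUEST_BITS - PATTERN_BITS
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1

class ProtectedRequest(NamedTuple):
    compressed: bool  # metadata bit: was the pattern removed?
    ecc: int          # check bits computed over the original request
    payload: int      # request with the pattern bits removed if compressed

def toy_check_bits(value):
    """Placeholder for a real ECC: XOR-fold the value into 7 bits."""
    ecc = 0
    while value:
        ecc ^= value & 0x7F
        value >>= 7
    return ecc

def protect(request):
    compressed = (request >> PAYLOAD_BITS) == PATTERN
    payload = request & PAYLOAD_MASK if compressed else request
    return ProtectedRequest(compressed, toy_check_bits(request), payload)

def unprotect(protected):
    if protected.compressed:
        # Re-insert the known pattern that was removed during encoding.
        request = (PATTERN << PAYLOAD_BITS) | protected.payload
    else:
        request = protected.payload
    if toy_check_bits(request) != protected.ecc:
        raise ValueError("check bits mismatch")
    return request

compressible = protect(0x00ABCDEF)    # top byte matches the pattern
incompressible = protect(0x12ABCDEF)  # top byte does not match
```

The removed pattern bits free up room for the check bits and the metadata bit, so a compressible request can be protected without growing; how non-matching requests are carried is not stated in the abstract and is left uncompressed here.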