Patent search ap:("INTEL CORPORATION") AND inv:"Samantika S. Sury" Page 2

11.

Invention application
INSTRUCTIONS FOR REMOTE ATOMIC OPERATIONS [Under examination, published]

Publication (announcement) number: US20190004810A1

Publication (announcement) date: 2019-01-03

Application number: US15638120

Application date: 2017-06-29

Applicant: Intel Corporation

Inventor: Doddaballapur N. Jayasimha , Jonas Svennebring , Samantika S. Sury , Christopher J. Hughes , Jong Soo Park , Lingxiang Xiang

IPC: G06F9/38 , G06F12/0893 , G06F9/26 , G06F13/28

Abstract: Disclosed embodiments relate to atomic memory operations. In one example, a method of executing an instruction atomically and with weak order includes: fetching, by fetch circuitry, the instruction from code storage, the instruction including an opcode, a source identifier, and a destination identifier, decoding, by decode circuitry, the fetched instruction, selecting, by a scheduling circuit, an execution circuit among multiple circuits in a system, scheduling, by the scheduling circuit, execution of the decoded instruction out of order with respect to other instructions, with an order selected to optimize at least one of latency, throughput, power, and performance, and executing the decoded instruction, by the execution circuit, to: atomically read a datum from a location identified by the destination identifier, perform an operation on the datum as specified by the opcode, the operation to use a source operand identified by the source identifier, and write a result back to the location.
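
The read-modify-write step in this abstract can be illustrated with a minimal Python sketch. It is not the patented mechanism: a single `threading.Lock` stands in for hardware atomicity, the names `Memory`, `OPCODES`, and `run_demo` are hypothetical and introduced only for illustration, and the sketch does not model fetch, decode, or the weakly ordered scheduling the abstract describes.

```python
import operator
import threading

# Hypothetical opcode table; the abstract does not name specific opcodes.
OPCODES = {
    "add": operator.add,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


class Memory:
    """Toy memory; one lock stands in for hardware atomicity (an assumption)."""

    def __init__(self):
        self._cells = {}
        self._lock = threading.Lock()

    def load(self, addr):
        with self._lock:
            return self._cells.get(addr, 0)

    def remote_atomic(self, opcode, dest_addr, source_value):
        """Read the datum at dest_addr, apply the opcode, write the result back."""
        op = OPCODES[opcode]
        with self._lock:  # read, operate, and write form one indivisible step
            datum = self._cells.get(dest_addr, 0)
            self._cells[dest_addr] = op(datum, source_value)


def run_demo(threads=4, increments=1000):
    """Have several threads issue atomic adds to the same location."""
    mem = Memory()

    def worker():
        for _ in range(increments):
            mem.remote_atomic("add", 0x40, 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return mem.load(0x40)
```

With `run_demo(4, 1000)`, four threads each issue 1000 atomic adds to one location; because every update is indivisible, no increment is lost and the final value is 4000.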

12.

Granted invention patent
Synchronization logic for memory requests [In force]

Publication (announcement) number: US10146690B2

Publication (announcement) date: 2018-12-04

Application number: US15180351

Application date: 2016-06-13

Applicant: Intel Corporation

Inventor: Samantika S. Sury , Robert G. Blankenship , Simon C. Steely, Jr.

IPC: G06F12/0831

Abstract: In an embodiment, a processor includes a plurality of cores and synchronization logic. The synchronization logic includes circuitry to: receive a first memory request and a second memory request; determine whether the second memory request is in contention with the first memory request; and in response to a determination that the second memory request is in contention with the first memory request, process the second memory request using a non-blocking cache coherence protocol. Other embodiments are described and claimed.

13.

Granted invention patent
Hardware apparatuses and methods to control cache line coherency [In force]

Publication (announcement) number: US09934146B2

Publication (announcement) date: 2018-04-03

Application number: US14498946

Application date: 2014-09-26

Applicant: INTEL CORPORATION

Inventor: Simon C. Steely, Jr. , Samantika S. Sury , William C. Hasenplaugh

IPC: G06F12/08 , G06F12/0817 , G06F12/0811

CPC classification number: G06F12/0824 , G06F12/0811 , G06F2212/1024 , G06F2212/1048 , G06F2212/2542

Abstract: Methods and apparatuses to control cache line coherency are described. A processor may include a first core having a cache to store a cache line, a second core to send a request for the cache line from the first core, moving logic to cause a move of the cache line between the first core and a memory and to update a tag directory of the move, and cache line coherency logic to create a chain home in the tag directory from the request to cause the cache line to be sent from the tag directory to the second core. A method to control cache line coherency may include creating a chain home in a tag directory from a request for a cache line in a first processor core from a second processor core to cause the cache line to be sent from the tag directory to the second processor core.

14.

Invention application
METHOD, APPARATUS, AND SYSTEM FOR CACHE COHERENCY USING A COARSE DIRECTORY [Under examination, published]

Publication (announcement) number: US20170351430A1

Publication (announcement) date: 2017-12-07

Application number: US15170050

Application date: 2016-06-01

Applicant: Intel Corporation

Inventor: Robert G. Blankenship , Simon C. Steely, Jr. , Samantika S. Sury

IPC: G06F3/06 , G06F12/0808 , G06F12/0815 , G06F12/0811 , G06F12/0842

CPC classification number: G06F3/0605 , G06F3/0625 , G06F3/0659 , G06F3/0673 , G06F12/0808 , G06F12/0811 , G06F12/0815 , G06F12/0824 , G06F12/0826 , G06F12/0831 , G06F12/0842 , G06F2212/1028 , G06F2212/1048 , Y02D10/13

Abstract: Systems, methods, and apparatuses are directed to requesting access to a memory address; storing an identification of the memory address in a data structure; receiving a first request for access to the memory address, the request comprising a reference to a second processor core; storing the reference to the second processor in the data structure; receiving a second request for access to the memory address, the second request comprising a reference to a third processor core; determining, based on the data structure, that the third processor core is different from the second processor core; and responding to the second request without buffering the second request.

15.

Granted invention patent
Multicast tree-based data distribution in distributed shared cache [In force]

Publication (announcement) number: US09734069B2

Publication (announcement) date: 2017-08-15

Application number: US14567026

Application date: 2014-12-11

Applicant: Intel Corporation

Inventor: Simon C. Steely, Jr. , William C. Hasenplaugh , Samantika S. Sury

IPC: G06F12/08 , G06F12/084 , G06F12/0815 , G06F12/0817

CPC classification number: G06F12/084 , G06F12/0815 , G06F12/0822 , G06F2212/1021 , G06F2212/281 , Y02D10/13

Abstract: Systems and methods for multicast tree-based data distribution in a distributed shared cache. An example processing system comprises: a plurality of processing cores, each processing core communicatively coupled to a cache; a tag directory associated with caches of the plurality of processing cores; a shared cache associated with the tag directory; a processing logic configured, responsive to receiving an invalidate request with respect to a certain cache entry, to: allocate, within the shared cache, a shared cache entry corresponding to the certain cache entry; transmit, to at least one of: a tag directory or a processing core that last accessed the certain entry, an update read request with respect to the certain cache entry; and responsive to receiving an update of the certain cache entry, broadcast the update to at least one of: one or more tag directories or one or more processing cores identified by a tag corresponding to the certain cache entry.

16.

Granted invention patent
Instructions for remote atomic operations [In force]

Publication (announcement) number: US11989555B2

Publication (announcement) date: 2024-05-21

Application number: US15638120

Application date: 2017-06-29

Applicant: Intel Corporation

Inventor: Doddaballapur N. Jayasimha , Jonas Svennebring , Samantika S. Sury , Christopher J. Hughes , Jong Soo Park , Lingxiang Xiang

IPC: G06F9/30 , G06F9/38 , G06F9/46 , G06F13/28

CPC classification number: G06F9/3004 , G06F9/3001 , G06F9/30185 , G06F9/3836 , G06F9/46 , G06F13/28

Abstract: Disclosed embodiments relate to atomic memory operations. In one example, a method of executing an instruction atomically and with weak order includes: fetching, by fetch circuitry, the instruction from code storage, the instruction including an opcode, a source identifier, and a destination identifier, decoding, by decode circuitry, the fetched instruction, selecting, by a scheduling circuit, an execution circuit among multiple circuits in a system, scheduling, by the scheduling circuit, execution of the decoded instruction out of order with respect to other instructions, with an order selected to optimize at least one of latency, throughput, power, and performance, and executing the decoded instruction, by the execution circuit, to: atomically read a datum from a location identified by the destination identifier, perform an operation on the datum as specified by the opcode, the operation to use a source operand identified by the source identifier, and write a result back to the location.

17.

Granted invention patent
Remote atomic operations in multi-socket systems [In force]

Publication (announcement) number: US11537520B2

Publication (announcement) date: 2022-12-27

Application number: US17494651

Application date: 2021-10-05

Applicant: Intel Corporation

Inventor: Doddaballapur N. Jayasimha , Samantika S. Sury , Christopher J. Hughes , Jonas Svennebring , Yen-Cheng Liu , Stephen R. Van Doren , David A. Koufaty

IPC: G06F12/0815 , G06F12/0808 , G06F9/30 , G06F12/0817 , G06F12/0831

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.
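
The flow in this abstract, one request for ownership followed by local execution of later RAOs, can be sketched in Python as below. This is a rough illustration under stated assumptions, not the claimed circuit: `HomeAgent`, `CacheControl`, `request_for_ownership`, and `execute_rao` are hypothetical names, a cache line is reduced to a single integer, and invalidation of other caches, acknowledgements, and eviction are left out.

```python
import operator


class HomeAgent:
    """Stand-in for the home socket's agent; counts ownership requests."""

    def __init__(self, memory):
        self.memory = memory  # maps line address -> value (a line is one int here)
        self.rfo_count = 0

    def request_for_ownership(self, line_addr):
        # The real agent would invalidate other copies or fetch from memory;
        # this toy version only returns the stored value.
        self.rfo_count += 1
        return self.memory.get(line_addr, 0)


class CacheControl:
    """Stand-in for the requester socket's cache control circuit."""

    def __init__(self, home_agent):
        self.home = home_agent
        self.owned = {}  # lines this socket currently owns

    def execute_rao(self, line_addr, op, operand):
        if line_addr not in self.owned:
            # First RAO to this line: obtain ownership from the home agent.
            self.owned[line_addr] = self.home.request_for_ownership(line_addr)
        # Later RAOs to an owned line run locally, without the home agent.
        self.owned[line_addr] = op(self.owned[line_addr], operand)
        return self.owned[line_addr]
```

Issuing five `execute_rao(0x80, operator.add, 1)` calls against a line that starts at 10 leaves the owned copy at 15 while `rfo_count` stays at 1, which is the saving the abstract points to: only the first operation involves the home agent.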

18.

Invention application
REMOTE ATOMIC OPERATIONS IN MULTI-SOCKET SYSTEMS [In force]

Publication (announcement) number: US20220091983A1

Publication (announcement) date: 2022-03-24

Application number: US17494651

Application date: 2021-10-05

Applicant: Intel Corporation

Inventor: Doddaballapur N. Jayasimha , Samantika S. Sury , Christopher J. Hughes , Jonas Svennebring , Yen-Cheng Liu , Stephen R. Van Doren , David A. Koufaty

IPC: G06F12/0815 , G06F12/0808 , G06F9/30 , G06F12/0817

Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.

19.

Invention application
SPATIAL AND TEMPORAL MERGING OF REMOTE ATOMIC OPERATIONS [Under examination, published]

Publication (announcement) number: US20200319886A1

Publication (announcement) date: 2020-10-08

Application number: US16799619

Application date: 2020-02-24

Applicant: Intel Corporation

Inventor: Christopher J. Hughes , Joseph Nuzman , Jonas Svennebring , Doddaballapur N. Jayasimha , Samantika S. Sury , David A. Koufaty , Niall D. McDonnell , Yen-Cheng Liu , Stephen R. Van Doren , Stephen J. Robinson

IPC: G06F9/30 , G06F12/0875

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations. In one example, a system includes an RAO instruction queue stored in a memory and having entries grouped by destination cache line, each entry to enqueue an RAO instruction including an opcode, a destination identifier, and source data, optimization circuitry to receive an incoming RAO instruction, scan the RAO instruction queue to detect a matching enqueued RAO instruction identifying a same destination cache line as the incoming RAO instruction, the optimization circuitry further to, responsive to no matching enqueued RAO instruction being detected, enqueue the incoming RAO instruction; and, responsive to a matching enqueued RAO instruction being detected, determine whether the incoming and matching RAO instructions have a same opcode to non-overlapping cache line elements, and, if so, spatially combine the incoming and matching RAO instructions by enqueuing both RAO instructions in a same group of cache line queue entries at different offsets.
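
The spatial-merging rule in this abstract can be approximated with a short Python sketch. Several details are assumptions rather than anything the abstract states: the 64-byte line and 8-byte element sizes, the names `Rao`, `LineGroup`, and `RaoQueue`, and in particular the fallback of opening a new group when a matching instruction has a different opcode or an overlapping element, a case the abstract leaves unspecified.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64  # assumed cache line size
ELEM_BYTES = 8   # assumed element size within a line


@dataclass
class Rao:
    opcode: str
    dest_addr: int
    source: int

    @property
    def line(self):
        return self.dest_addr // LINE_BYTES

    @property
    def element(self):
        return (self.dest_addr % LINE_BYTES) // ELEM_BYTES


@dataclass
class LineGroup:
    """Queue entries for one destination cache line, keyed by element offset."""
    line: int
    opcode: str
    by_element: dict = field(default_factory=dict)


class RaoQueue:
    def __init__(self):
        self.groups = []

    def submit(self, rao):
        # Scan for an enqueued group with the same destination cache line.
        for group in self.groups:
            same_line = group.line == rao.line
            same_op = group.opcode == rao.opcode
            if same_line and same_op and rao.element not in group.by_element:
                # Same opcode, non-overlapping element: combine spatially by
                # placing the instruction in the same group at its own offset.
                group.by_element[rao.element] = rao
                return "merged"
        # No mergeable match. Opening a new group here is this sketch's
        # assumption for the cases the abstract does not cover.
        self.groups.append(LineGroup(rao.line, rao.opcode, {rao.element: rao}))
        return "enqueued"
```

For example, submitting `Rao("add", 0x100, 1)` and then `Rao("add", 0x108, 2)` places both in one group, since they target different elements of the same line with the same opcode; a second `add` to `0x108` or an `xor` to the same line would each start a separate group in this sketch.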

20.

Granted invention patent
Processors, methods, and systems for a configurable spatial accelerator with memory system performance, power reduction, and atomics support features [In force]

Publication (announcement) number: US10387319B2

Publication (announcement) date: 2019-08-20

Application number: US15640534

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Michael C. Adler , Chiachen Chou , Neal C. Crago , Kermin Fleming , Kent D. Glossop , Aamer Jaleel , Pratik M. Marolia , Simon C. Steely, Jr. , Samantika S. Sury

IPC: G06F12/0802 , G06F15/00 , G06F12/0862 , H03K19/177 , G06F15/78 , G11C8/12 , G06F17/50 , G06F15/80

Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
