Patent search ap:("INTEL CORPORATION") AND inv:"Paul Caprioli" Page 2

11.

Patent application
APPARATUSES, METHODS, AND SYSTEMS FOR SWIZZLE OPERATIONS IN A CONFIGURABLE SPATIAL ACCELERATOR (pending, published)

Publication (announcement) number: US20200310797A1

Publication (announcement) date: 2020-10-01

Application number: US16370915

Application date: 2019-03-30

Applicant: Intel Corporation

Inventor: Jesus Corbal , Rohan Sharma , Simon Steely, JR. , Chinmay Ashok , Kent D. Glossop , Dennis Bradford , Paul Caprioli , Louise Huot , Kermin ChoFleming , Barry Tannenbaum

IPC: G06F9/30 , G06F9/54

Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconnect network between the plurality of processing elements, and a configuration register within each processing element to store a configuration value having a first portion that, when set to a first value that indicates a first mode, causes the processing element to pass an input value to operation circuitry of the processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the processing element, and a second portion that causes the processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry.
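
The two-part configuration value described in this abstract can be illustrated with a small Python sketch. The bit layout (bit 0 as the mode, the remaining bits as the operation), the `OPS` table, and the choice of element reversal as the swizzle primitive are all assumptions made for illustration; the abstract does not specify an encoding or which input is swizzled.

```python
import operator

# Hypothetical encoding: bit 0 is the "first portion" (mode),
# the remaining bits are the "second portion" (operation selector).
PASS_THROUGH, SWIZZLE = 0, 1
OPS = {0: operator.add, 1: operator.mul}


def swizzle(elements):
    """One example swizzle primitive: reverse the packed data elements."""
    return tuple(reversed(elements))


def process(config, a, b):
    """Model one processing element applying its configured operation."""
    mode = config & 1
    op = OPS[config >> 1]
    if mode == SWIZZLE:
        # Second mode: swizzle the input before it reaches the operation.
        a = swizzle(a)
    # First mode falls through with the input unmodified.
    return tuple(op(x, y) for x, y in zip(a, b))


plain = process(0b00, (1, 2, 3, 4), (10, 20, 30, 40))
swizzled = process(0b01, (1, 2, 3, 4), (10, 20, 30, 40))
```

In this sketch the same operation field yields different results depending only on the mode bit, which is the separation of concerns the abstract describes.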

12.

Granted patent
Apparatus and method for architectural performance monitoring in binary translation systems (in force)

Publication (announcement) number: US10387159B2

Publication (announcement) date: 2019-08-20

Application number: US14614264

Application date: 2015-02-04

Applicant: Intel Corporation

Inventor: Jason M Agron , Polychronis Xekalakis , Paul Caprioli , Jiwei Oliver Lu , Koichi Yamada

IPC: G06F8/52 , G06F9/38 , G06F11/34 , G06F11/36 , G06F9/455

Abstract: Methods and apparatuses relate to emulating architectural performance monitoring in a binary translation system. In one embodiment, a processor includes an architectural performance counter to maintain an architectural value associated with instruction execution, a register to store the architectural value of the architectural performance counter, binary translation logic to embed an architectural value from the architectural performance counter into a stream of translated instructions having a transactional code region and to store the architectural value into the register, and an execution unit to execute the transactional code region of the stream of translated instructions. The binary translation logic is configured to add the architectural value from the register to the architectural performance counter upon completion of the transactional code region of the stream of translated instructions. In one embodiment, a binary translation system overcomes software incompatibilities by using microarchitectural support to transparently and accurately emulate architectural performance counter behavior.
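
A rough Python model of the counter update described in this abstract follows. The `Core` class, the `run_region` method, and the treatment of a region that does not complete are hypothetical; the sketch also assumes the embedded value is a per-region count chosen by the translator, which the abstract does not state.

```python
class Core:
    """Toy model of a core running translated transactional regions."""

    def __init__(self):
        self.arch_counter = 0  # architectural performance counter
        self.pending = 0       # register holding the embedded value

    def run_region(self, embedded_value, completes=True):
        # Executing the translated stream stores its embedded value
        # into the register.
        self.pending = embedded_value
        if completes:
            # On completion of the transactional region, the register
            # value is added to the architectural counter.
            self.arch_counter += self.pending
        # Assumption: a region that does not complete contributes nothing.
        self.pending = 0
        return self.arch_counter


core = Core()
after_first = core.run_region(5)
after_abort = core.run_region(3, completes=False)
after_third = core.run_region(2)
```

Deferring the update to region completion keeps the visible counter consistent with what the original, untranslated code would have reported at the same point.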

13.

Granted patent
Technologies for translation cache management in binary translation systems (in force)

Publication (announcement) number: US10282182B2

Publication (announcement) date: 2019-05-07

Application number: US15274624

Application date: 2016-09-23

Applicant: Intel Corporation

Inventor: Paul Caprioli , Jeffrey J. Cook

IPC: G06F8/52 , G06F11/34 , G06F12/02 , G06F9/455 , G06F9/30

Abstract: Technologies for optimized binary translation include a computing device that determines a cost-benefit metric associated with each translated code block of a translation cache. The cost-benefit metric is indicative of translation cost and performance benefit associated with the translated code block. The translation cost may be determined by measuring translation time of the translated code block. The cost-benefit metric may be calculated using a weighted cost-benefit function based on an expected workload of the computing device. In response to determining to free space in the translation cache, the computing device determines whether to discard each translated code block as a function of the cost-benefit metric. In response to determining to free space in the translation cache, the computing device may increment an iteration count and skip each translated code block if the iteration count modulo the corresponding cost-benefit metric is non-zero. Other embodiments are described and claimed.
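
The skip rule in the last embodiment (iteration count modulo the cost-benefit metric) can be sketched in Python as below. The `Block` and `TranslationCache` names are hypothetical, and the sketch assumes the metric is an integer of at least 1 where a larger value marks a more valuable block.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    metric: int  # cost-benefit metric; assumed to be an integer >= 1


@dataclass
class TranslationCache:
    blocks: list = field(default_factory=list)
    iteration: int = 0  # incremented on every request to free space

    def free_space(self):
        """Discard blocks whose metric divides the iteration count."""
        self.iteration += 1
        kept, discarded = [], []
        for block in self.blocks:
            if self.iteration % block.metric != 0:
                kept.append(block)       # non-zero remainder: skip the block
            else:
                discarded.append(block)  # zero remainder: discard the block
        self.blocks = kept
        return [b.name for b in discarded]


cache = TranslationCache([Block("cold", 1), Block("warm", 2), Block("hot", 4)])
first = cache.free_space()   # iteration count is now 1
second = cache.free_space()  # iteration count is now 2
```

Under this reading, a block with metric m is skipped on m - 1 of every m passes, so blocks with a higher metric stay in the cache longer without any sorting step.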

14.

Granted patent
State recovery methods and apparatus for computing platforms (in force)

Publication (announcement) number: US09507575B2

Publication (announcement) date: 2016-11-29

Application number: US14709154

Application date: 2015-05-11

Applicant: Intel Corporation

Inventor: Abhay S. Kanhere , Saurabh Shukla , Suriya Subramanian , Paul Caprioli

IPC: G06F9/45 , G06F9/455

CPC classification number: G06F8/443 , G06F9/45516 , G06F11/1405 , G06F2201/805

Abstract: State recovery methods and apparatus for computing platforms are disclosed. An example method includes inserting a first instruction into optimized code to cause a first portion of a register in a first state to be saved to memory before execution of a region of the optimized code; and maintaining a value indicative of a manner in which a second portion of the register in the first state is to be restored in connection with a state recovery from the optimized code.

15.

Granted patent
Technologies for shadow stack manipulation for binary translation systems (in force)

Publication (announcement) number: US09477453B1

Publication (announcement) date: 2016-10-25

Application number: US14748363

Application date: 2015-06-24

Applicant: Intel Corporation

Inventor: Tugrul Ince , Koichi Yamada , Paul Caprioli , Jiwei Lu

IPC: G06F9/45 , G06F12/08

CPC classification number: G06F8/52 , G06F9/4486 , G06F12/08 , G06F2212/451

Abstract: Technologies for shadow stack management include a computing device that, when executing a translated call routine in a translated binary, pushes a native return address onto a native stack of the computing device, adds a constant offset to a stack pointer of the computing device, executes a native call instruction to a translated call target, and, after executing the native call instruction, subtracts the constant offset from the stack pointer. Executing the native call instruction pushes a translated return address onto a shadow stack of the computing device. The computing device may map two or more virtual memory pages of the shadow stack onto a single physical memory page. The computing device may execute a translated return routine that pops the native return address from the native stack, adds the constant offset to the stack pointer, and executes a native return instruction. Other embodiments are described and claimed.

16.

Granted patent
System, method and apparatus for improving transactional memory (TM) throughput using TM region indicators (in force)

Publication (announcement) number: US09411739B2

Publication (announcement) date: 2016-08-09

Application number: US13691218

Application date: 2012-11-30

Applicant: Intel Corporation

Inventor: Omar M. Shaikh , Ravi Rajwar , Paul Caprioli , Muawya M. Al-Otoom

IPC: G06F12/08 , G06F9/46 , G06F9/38

CPC classification number: G06F9/3855 , G06F9/3004 , G06F9/30043 , G06F9/3016 , G06F9/3802 , G06F9/384 , G06F9/3842 , G06F9/3857 , G06F9/3863 , G06F9/467 , G06F11/1448 , G06F11/1469 , G06F12/0828 , G06F12/084 , G06F12/0842 , G06F12/0875 , G06F2201/84 , G06F2212/1016 , G06F2212/452 , G06F2212/507 , G06F2212/6042 , G06F2212/62 , G06F2212/621 , G06F2213/0026

Abstract: Systems, apparatuses, and methods for improving transactional memory (TM) throughput using a TM region indicator (or color) are described. Through the use of TM region indicators, younger TM regions can have their instructions retired while waiting for older TM regions to commit. A copy-on-write (COW) buffer may be used to maintain a mapping from checkpointed architectural registers to physical registers, wherein the COW buffer maintains a plurality of register checkpoints for a plurality of TM regions by marking separations between TM regions using pointers: a first pointer to identify a position in the COW buffer of the last committed instruction, and a retirement pointer to identify a boundary between a youngest TM region and a currently retiring position.

17.

Granted patent
Systems, methods, and apparatuses for heterogeneous computing (in force)

Publication (announcement) number: US12135981B2

Publication (announcement) date: 2024-11-05

Application number: US18207870

Application date: 2023-06-09

Applicant: Intel Corporation

Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield

IPC: G06F9/48 , G06F9/30 , G06F9/38

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

18.

Granted patent
Systems, methods, and apparatuses for heterogeneous computing (in force)

Publication (announcement) number: US11416281B2

Publication (announcement) date: 2022-08-16

Application number: US16474978

Application date: 2016-12-31

Applicant: Intel Corporation

Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield

IPC: G06F9/48 , G06F9/30 , G06F9/38

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

19.

Granted patent
Technologies for translation cache management in binary translation systems (in force)

Publication (announcement) number: US10983773B2

Publication (announcement) date: 2021-04-20

Application number: US16378641

Application date: 2019-04-09

Applicant: Intel Corporation

Inventor: Paul Caprioli , Jeffrey J. Cook

IPC: G06F9/45 , G06F8/52 , G06F9/455 , G06F12/02 , G06F9/30 , G06F11/34

Abstract: Technologies for optimized binary translation include a computing device that determines a cost-benefit metric associated with each translated code block of a translation cache. The cost-benefit metric is indicative of translation cost and performance benefit associated with the translated code block. The translation cost may be determined by measuring translation time of the translated code block. The cost-benefit metric may be calculated using a weighted cost-benefit function based on an expected workload of the computing device. In response to determining to free space in the translation cache, the computing device determines whether to discard each translated code block as a function of the cost-benefit metric. In response to determining to free space in the translation cache, the computing device may increment an iteration count and skip each translated code block if the iteration count modulo the corresponding cost-benefit metric is non-zero. Other embodiments are described and claimed.

20.

Patent application
HARDWARE FOR MISS HANDLING FROM A TRANSLATION PROTECTION DATA STRUCTURE (pending, published)

Publication (announcement) number: US20180285283A1

Publication (announcement) date: 2018-10-04

Application number: US15475646

Application date: 2017-03-31

Applicant: INTEL CORPORATION

Inventor: Paul Caprioli , Jeffrey J. Cook

IPC: G06F12/128 , G06F12/1045 , G06F12/122 , G06F12/0831 , G06F12/1009

Abstract: A processor includes a memory to store original code and a fingerprint data structure, which stores, in a way thereof, an entry including a physical address for a page and a stored fingerprint generated from the page of the original code. A core includes a translation protection data structure (TPDS) to detect modification to the page, wherein the core is to, upon execution of a translation check instruction included within a translated page code corresponding to the page, transmit, to the TPDS, a modification check request having the physical address of the page in the memory and the way of the fingerprint data structure. A hardware TPDS miss handler is coupled to the core and is to process a miss request received from the TPDS responsive to the physical address not being present in the TPDS.
