APPARATUSES, METHODS, AND SYSTEMS FOR SWIZZLE OPERATIONS IN A CONFIGURABLE SPATIAL ACCELERATOR

    Publication No.: US20200310797A1

    Publication date: 2020-10-01

    Application No.: US16370915

    Filing date: 2019-03-30

    Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconnect network between the plurality of processing elements, and a configuration register within each processing element to store a configuration value having a first portion that, when set to a first value that indicates a first mode, causes the processing element to pass an input value to operation circuitry of the processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the processing element, and a second portion that causes the processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry.
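
The two-portion configuration value described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the mode encodings, the `ConfigValue` type, the `swap_adjacent_pairs` swizzle primitive, and the `pairwise_subtract` operation are all hypothetical names chosen for illustration; the abstract does not specify bit layouts or which swizzle primitives exist.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical encodings for the first portion of the configuration value.
MODE_PASS_THROUGH = 0  # first mode: input goes to the operation unmodified
MODE_SWIZZLE = 1       # second mode: input is swizzled first


@dataclass(frozen=True)
class ConfigValue:
    mode: int        # first portion: selects pass-through or swizzle
    operation: str   # second portion: selects the operation to perform


def swap_adjacent_pairs(elements: List[int]) -> List[int]:
    """Illustrative swizzle primitive: swap each pair of packed elements."""
    out = list(elements)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def pairwise_subtract(elements: List[int]) -> List[int]:
    """Illustrative operation: subtract the odd lane from the even lane."""
    return [elements[i] - elements[i + 1] for i in range(0, len(elements) - 1, 2)]


# Hypothetical table standing in for the operation circuitry.
OPERATIONS: Dict[str, Callable[[List[int]], List[int]]] = {
    "pairwise_subtract": pairwise_subtract,
}


def processing_element(config: ConfigValue, input_value: List[int]) -> List[int]:
    """Apply the optional swizzle, then the configured operation."""
    if config.mode == MODE_SWIZZLE:
        operand = swap_adjacent_pairs(input_value)
    elif config.mode == MODE_PASS_THROUGH:
        operand = list(input_value)
    else:
        raise ValueError(f"unknown mode: {config.mode}")
    return OPERATIONS[config.operation](operand)
```

With the packed input `[5, 1, 7, 2]`, the pass-through mode yields `[4, 5]`, while the swizzle mode feeds `[1, 5, 2, 7]` to the same operation and yields `[-4, -5]`. The operation itself is unchanged; only the first portion of the configuration value differs.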

    Apparatus and method for architectural performance monitoring in binary translation systems

    Publication No.: US10387159B2

    Publication date: 2019-08-20

    Application No.: US14614264

    Filing date: 2015-02-04

    Abstract: Methods and apparatuses relate to emulating architectural performance monitoring in a binary translation system. In one embodiment, a processor includes an architectural performance counter to maintain an architectural value associated with instruction execution, a register to store the architectural value of the architectural performance counter, binary translation logic to embed an architectural value from the architectural performance counter into a stream of translated instructions having a transactional code region and to store the architectural value into the register, and an execution unit to execute the transactional code region of the stream of translated instructions. The binary translation logic is configured to add the architectural value from the register to the architectural performance counter upon completion of the transactional code region of the stream of translated instructions. In one embodiment, a binary translation system overcomes software incompatibilities by using microarchitectural support to transparently and accurately emulate architectural performance counter behavior.
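
The counter-update scheme in this abstract can be pictured with a toy model. It is a sketch, not the patented design: `TranslatedRegion`, `Processor`, `run_region`, and `TransactionAbort` are hypothetical names, and the behavior on abort (leaving the counter untouched) is an assumption inferred from the phrase "upon completion of the transactional code region".

```python
from dataclasses import dataclass
from typing import Callable


class TransactionAbort(Exception):
    """Hypothetical signal that a transactional code region did not complete."""


@dataclass
class TranslatedRegion:
    embedded_value: int          # architectural value embedded by the translator
    body: Callable[[], None]     # stands in for the translated instructions


class Processor:
    def __init__(self) -> None:
        self.arch_perf_counter = 0  # architectural performance counter
        self.value_register = 0     # register holding the embedded value

    def run_region(self, region: TranslatedRegion) -> bool:
        """Execute one transactional region; update the counter only on completion."""
        # The embedded architectural value is stored into the register.
        self.value_register = region.embedded_value
        try:
            region.body()
        except TransactionAbort:
            # Assumption: an aborted region contributes nothing to the counter.
            return False
        # On completion, the register value is added to the counter.
        self.arch_perf_counter += self.value_register
        return True
```

In this model, a region that completes with an embedded value of 7 advances the counter by 7, and a region that raises `TransactionAbort` leaves it unchanged, so the counter only reflects work that became architecturally visible.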

    Technologies for translation cache management in binary translation systems

    Publication No.: US10282182B2

    Publication date: 2019-05-07

    Application No.: US15274624

    Filing date: 2016-09-23

    Abstract: Technologies for optimized binary translation include a computing device that determines a cost-benefit metric associated with each translated code block of a translation cache. The cost-benefit metric is indicative of translation cost and performance benefit associated with the translated code block. The translation cost may be determined by measuring translation time of the translated code block. The cost-benefit metric may be calculated using a weighted cost-benefit function based on an expected workload of the computing device. In response to determining to free space in the translation cache, the computing device determines whether to discard each translated code block as a function of the cost-benefit metric. In response to determining to free space in the translation cache, the computing device may increment an iteration count and skip each translated code block if the iteration count modulo the corresponding cost-benefit metric is non-zero. Other embodiments are described and claimed.
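
The modulo-based skip rule in this abstract can be sketched in a few lines. This is a minimal model under assumptions: the `TranslationCache` class and its method names are hypothetical, the cost-benefit metric is taken to be a positive integer, and how the metric is computed (translation time, weighted function) is left out.

```python
from typing import Dict, List


class TranslationCache:
    """Toy translation cache keyed by block id (hypothetical structure)."""

    def __init__(self) -> None:
        # Maps block id -> cost-benefit metric (assumed to be a positive integer).
        self.blocks: Dict[str, int] = {}
        self.iteration_count = 0

    def add(self, block_id: str, metric: int) -> None:
        if metric < 1:
            raise ValueError("metric must be a positive integer")
        self.blocks[block_id] = metric

    def free_space(self) -> List[str]:
        """Run one eviction pass and return the ids of discarded blocks."""
        self.iteration_count += 1
        # A block is skipped (kept) when iteration_count % metric is non-zero,
        # so a block with metric m is discarded only on every m-th pass.
        discarded = [
            block_id
            for block_id, metric in self.blocks.items()
            if self.iteration_count % metric == 0
        ]
        for block_id in discarded:
            del self.blocks[block_id]
        return discarded
```

With blocks of metric 1, 2, and 3 in the cache, three successive calls to `free_space()` discard them one at a time in that order: a higher metric means the block survives more eviction passes before it is dropped.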

    Technologies for shadow stack manipulation for binary translation systems
    Patent type: granted invention patent (in force)

    Publication No.: US09477453B1

    Publication date: 2016-10-25

    Application No.: US14748363

    Filing date: 2015-06-24

    CPC classification number: G06F8/52 G06F9/4486 G06F12/08 G06F2212/451

    Abstract: Technologies for shadow stack management include a computing device that, when executing a translated call routine in a translated binary, pushes a native return address on to a native stack of the computing device, adds a constant offset to a stack pointer of the computing device, executes a native call instruction to a translated call target, and, after executing the native call instruction, subtracts the constant offset from the stack pointer. Executing the native call instruction pushes a translated return address onto a shadow stack of the computing device. The computing device may map two or more virtual memory pages of the shadow stack onto a single physical memory page. The computing device may execute a translated return routine that pops the native return address from the native stack, adds the constant offset to the stack pointer, and executes a native return instruction. Other embodiments are described and claimed.

    Technologies for translation cache management in binary translation systems

    Publication No.: US10983773B2

    Publication date: 2021-04-20

    Application No.: US16378641

    Filing date: 2019-04-09

    Abstract: Technologies for optimized binary translation include a computing device that determines a cost-benefit metric associated with each translated code block of a translation cache. The cost-benefit metric is indicative of translation cost and performance benefit associated with the translated code block. The translation cost may be determined by measuring translation time of the translated code block. The cost-benefit metric may be calculated using a weighted cost-benefit function based on an expected workload of the computing device. In response to determining to free space in the translation cache, the computing device determines whether to discard each translated code block as a function of the cost-benefit metric. In response to determining to free space in the translation cache, the computing device may increment an iteration count and skip each translated code block if the iteration count modulo the corresponding cost-benefit metric is non-zero. Other embodiments are described and claimed.

    HARDWARE FOR MISS HANDLING FROM A TRANSLATION PROTECTION DATA STRUCTURE

    Publication No.: US20180285283A1

    Publication date: 2018-10-04

    Application No.: US15475646

    Filing date: 2017-03-31

    Abstract: A processor includes a memory to store original code and a fingerprint data structure, which stores, in a way thereof, an entry including a physical address for a page and a stored fingerprint generated from the page of the original code. A core includes a translation protection data structure (TPDS) to detect modification to the page, wherein the core is to, upon execution of a translation check instruction included within a translated page code corresponding to the page, transmit, to the TPDS, a modification check request having the physical address of the page in the memory and the way of the fingerprint data structure. A hardware TPDS miss handler is coupled to the core and is to process a miss request received from the TPDS responsive to the physical address not being present in the TPDS.
