-
31.
Publication number: US20160019038A1
Publication date: 2016-01-21
Application number: US14867950
Filing date: 2015-09-28
Applicant: Intel Corporation
Inventor: Mauricio Breternitz, Jr. , Youfeng Wu , Cheng Wang , Edson Borin , Shiliang Hu , Craig B. Zilles
CPC classification number: G06F8/443 , G06F8/52 , G06F9/3004 , G06F9/30072 , G06F9/30087 , G06F9/30116 , G06F9/3842 , G06F9/3857 , G06F9/466 , G06F11/3672 , G06F11/3688
Abstract: An apparatus and method is described herein for conditionally committing and/or speculative checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. And the conditional commit enables efficient execution of the dynamic optimization code, while attempting to prevent transactions from running out of hardware resources. While the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. And processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
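The conditional-commit idea in this abstract can be illustrated with a small software model. This is only a sketch under assumptions: the class `SpeculativeBuffer`, its `capacity`, and the `expected_stores` hint are hypothetical names, and the resource check shown is one plausible policy, not the mechanism claimed in the patent.

```python
class SpeculativeBuffer:
    """Toy model of a transaction's bounded speculative store buffer (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed number of buffered stores the hardware can hold
        self.pending = {}          # speculative stores, not yet visible in memory
        self.memory = {}           # committed state
        self.commits = 0

    def store(self, address, value):
        # Running out of entries models a transaction abort.
        if address not in self.pending and len(self.pending) >= self.capacity:
            raise RuntimeError("abort: out of speculative resources")
        self.pending[address] = value

    def commit(self):
        # Make all pending stores visible and start a fresh transaction.
        self.memory.update(self.pending)
        self.pending.clear()
        self.commits += 1

    def conditional_commit(self, expected_stores):
        # Commit early only if the code up to the next commit point
        # is expected to need more entries than are currently free.
        free = self.capacity - len(self.pending)
        if free < expected_stores:
            self.commit()
            return True
        return False
```

Calling `conditional_commit(1)` before each store keeps a long run of stores from overflowing the buffer, which in effect resizes the transaction dynamically.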
-
Publication number: US10540178B2
Publication date: 2020-01-21
Application number: US15265587
Filing date: 2016-09-14
Applicant: Intel Corporation
Inventor: Vineeth Mekkat , Youfeng Wu , Sebastian Winkel , Oleg Margulis
IPC: G06F12/0875 , G06F9/30 , G06F9/38 , G06F9/32
Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.
-
Publication number: US10303525B2
Publication date: 2019-05-28
Application number: US14582717
Filing date: 2014-12-24
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall , Christopher J. Hughes , Robert Valentine , Milind B. Girkar , Hideki Ido , Youfeng Wu , Cheng Wang
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode and an operand to store a portion of a fallback address, execution hardware to execute the decoded instruction to initiate a data speculative execution (DSX) region by activating DSX tracking hardware to track speculative memory accesses and detect ordering violations in the DSX region, and storing the fallback address.
-
Publication number: US09940229B2
Publication date: 2018-04-10
Application number: US14496621
Filing date: 2014-09-25
Applicant: Intel Corporation
Inventor: Xipeng Shen , Youfeng Wu , Cheng Wang , Hyunchul Park , Hongbo Rong
IPC: G06F12/02
CPC classification number: G06F12/0238 , G06F2212/7201 , G06F2212/7202 , G06F2212/7207
Abstract: Technologies for persistent memory programming include a computing device having a persistent memory including one or more nonvolatile regions. The computing device may assign a virtual memory address of a target location in persistent memory to a persistent memory pointer using persistent pointer strategy, and may dereference the pointer using the same strategy. Persistent pointer strategies include off-holder, ID-in-value, optimistic rectification, and pessimistic rectification. The computing device may log changes to persistent memory during the execution of a data consistency section, and commit changes to the persistent memory when the last data consistency section ends. Data consistency sections may be grouped by log group identifier. Using type metadata stored in the nonvolatile region, the computing device may identify the type of a root object within the nonvolatile region and then recursively identify the type of all objects referenced by the root object. Other embodiments are described and claimed.
-
Publication number: US20180074827A1
Publication date: 2018-03-15
Application number: US15265587
Filing date: 2016-09-14
Applicant: Intel Corporation
Inventor: Vineeth Mekkat , Youfeng Wu , Sebastian Winkel , Oleg Margulis
IPC: G06F9/38 , G06F9/30 , G06F12/0875
Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.
-
Publication number: US09910650B2
Publication date: 2018-03-06
Application number: US14497157
Filing date: 2014-09-25
Applicant: Intel Corporation
Inventor: Albert Hartono , Nalini Vasudevan , Sara S. Baghsorkhi , Cheng Wang , Youfeng Wu
CPC classification number: G06F8/452 , G06F9/4552
Abstract: A computer-implemented method for managing loop code in a compiler includes using a conflict detection procedure that detects across-iteration dependency for arrays of single memory addresses to determine whether a potential across-iteration dependency exists for arrays of memory addresses for ranges of memory accessed by the loop code.
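The abstract does not say how the single-address procedure is extended to ranges. As one hypothetical illustration, the sketch below reduces a range check to comparisons on single values by quantizing each range to the blocks it touches. The function name `may_conflict`, the equal `length` for every range, and the block scheme are all assumptions, not the patented method.

```python
def may_conflict(starts, length):
    """Conservatively report whether any two per-iteration ranges
    [s, s + length) may overlap.

    Assumes non-negative start addresses and a positive, uniform length.
    A range of `length` bytes touches at most two blocks of size `length`,
    so two overlapping ranges always share a block id. The converse does
    not hold, so the check may report false positives but never misses
    a real overlap.
    """
    seen_blocks = set()
    for s in starts:
        blocks = {s // length, (s + length - 1) // length}
        if blocks & seen_blocks:      # single-value comparison on block ids
            return True
        seen_blocks |= blocks
    return False
```

A `True` result would only mean that the loop needs a more precise check or a scalar path; a `False` result means no two iterations touch the same memory.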
-
Publication number: US20170337137A1
Publication date: 2017-11-23
Application number: US15672425
Filing date: 2017-08-09
Applicant: Intel Corporation
Inventor: Marcelo S. Cintra , Cheng Wang , Youfeng Wu , Alexandre Xavier DuChateau
IPC: G06F12/1009 , G06F12/02
CPC classification number: G06F12/1009 , G06F8/447 , G06F9/3836 , G06F9/44568 , G06F12/0238 , G06F12/0292 , G06F2212/1044
Abstract: Technologies for persistent memory pointer access include a computing device having a persistent memory including one or more nonvolatile regions. The computing device may load a persistent memory pointer having a static region identifier, a segment identifier, and an offset from the persistent memory. The computing device may map the static region identifier to a dynamic region identifier and determine a virtual memory address of the persistent memory pointer target based on the dynamic region identifier, the segment identifier, and the offset. The computing device may load an in-storage representation of a persistent-export pointer from the persistent memory, map the in-storage representation to a runtime representation, and determine a target address of a persistent external data object based on the runtime representation. The computing device may include a compiler to generate output code including persistent memory pointer and/or persistent-export pointer accesses. Other embodiments are described and claimed.
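The pointer translation described in this abstract (static region identifier to dynamic region identifier, then to a virtual address) might look roughly like the following. The names `PersistentPointer` and `RegionTable`, the fixed `SEGMENT_SIZE`, and the base-plus-offset address formula are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass

SEGMENT_SIZE = 1 << 20  # assumed fixed segment size (1 MiB)


@dataclass(frozen=True)
class PersistentPointer:
    static_region_id: int   # stable identifier stored in persistent memory
    segment_id: int
    offset: int


class RegionTable:
    """Maps static region ids to the dynamic ids and base addresses of this run."""

    def __init__(self):
        self._static_to_dynamic = {}
        self._dynamic_base = {}

    def open_region(self, static_id, dynamic_id, base_address):
        # Record where the region happens to be mapped in this process.
        self._static_to_dynamic[static_id] = dynamic_id
        self._dynamic_base[dynamic_id] = base_address

    def resolve(self, ptr):
        # static id -> dynamic id -> base, then add segment and offset.
        dynamic_id = self._static_to_dynamic[ptr.static_region_id]
        base = self._dynamic_base[dynamic_id]
        return base + ptr.segment_id * SEGMENT_SIZE + ptr.offset
```

Because only the static identifier is stored, the same pointer value stays valid even if the region is mapped at a different base address on the next run.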
-
Publication number: US09720667B2
Publication date: 2017-08-01
Application number: US14222040
Filing date: 2014-03-21
Applicant: Intel Corporation
Inventor: Sara S. Baghsorkhi , Albert Hartono , Youfeng Wu , Nalini Vasudevan , Cheng Wang
IPC: G06F9/45
CPC classification number: G06F8/452
Abstract: Technologies for automatic loop vectorization include a computing device with an optimizing compiler. During an optimization pass, the compiler identifies a loop and generates a transactional code segment including a vectorized implementation of the loop body including one or more vector memory read instructions capable of generating an exception. The compiler also generates a non-transactional fallback code segment including a scalar implementation of the loop body that is executed in response to an exception generated within the transactional code segment. The compiler may detect whether the loop contains a memory read dependent on a condition that may be updated in a previous iteration or whether the loop contains a potential data dependence between two iterations. The compiler may generate a dynamic check for an actual data dependence and an explicit transactional abort instruction to be executed when an actual data dependence exists. Other embodiments are described and claimed.
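The structure described here (a vectorized transactional path, a dynamic dependence check with an explicit abort, and a scalar fallback) can be modeled in software. The following is a sketch only: the loop `a[idx[i]] += 1`, the `WIDTH` of 4, and all function names are assumptions chosen for illustration, and a Python exception stands in for a hardware transaction abort.

```python
WIDTH = 4  # assumed vector width (lanes per chunk)


class TransactionAbort(Exception):
    """Stands in for an explicit transactional abort instruction."""


def vector_chunk(a, lanes):
    # Dynamic check for an actual data dependence between lanes.
    if len(set(lanes)) != len(lanes):
        raise TransactionAbort("two lanes update the same element")
    loaded = [a[j] for j in lanes]   # "vector read"; may raise IndexError
    # Writes happen only after every read succeeded, so a failed
    # chunk leaves `a` untouched, like an aborted transaction.
    for j, v in zip(lanes, loaded):
        a[j] = v + 1


def scalar_chunk(a, lanes):
    # Non-transactional fallback: original scalar loop body.
    for j in lanes:
        a[j] += 1


def increment_indexed(a, idx):
    """Compute a[idx[i]] += 1 for every i, chunk by chunk."""
    for start in range(0, len(idx), WIDTH):
        lanes = idx[start:start + WIDTH]
        try:
            vector_chunk(a, lanes)
        except (TransactionAbort, IndexError):
            scalar_chunk(a, lanes)
    return a
```

For example, with `idx = [0, 1, 2, 3, 4, 4, 5, 6]` the first chunk takes the vector path, while the second chunk aborts on the repeated index 4 and is re-executed by the scalar code.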
-
Publication number: US09710279B2
Publication date: 2017-07-18
Application number: US14497833
Filing date: 2014-09-26
Applicant: Intel Corporation
Inventor: Nalini Vasudevan , Cheng Wang , Youfeng Wu , Albert Hartono , Sara S. Baghsorkhi
CPC classification number: G06F9/3842 , G06F9/30032 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/3013 , G06F9/30174 , G06F9/3824 , G06F9/3834 , G06F9/3838 , G06F15/8053
Abstract: An apparatus and method for speculative vectorization. For example, one embodiment of a processor comprises: a queue comprising a set of locations for storing addresses associated with vectorized memory access instructions; and execution logic to execute a first vectorized memory access instruction to access the queue and to compare a new address associated with the first vectorized memory access instruction with existing addresses stored within a specified range of locations within the queue to detect whether a conflict exists, the existing addresses having been previously stored responsive to one or more prior vectorized memory access instructions.
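A software analogue of the address queue in this abstract could be sketched as below. The class name `AddressQueue`, the half-open `[first, last)` window, and the choice to always record the new address are assumptions for illustration and are not taken from the patent.

```python
from collections import deque


class AddressQueue:
    """Bounded queue of addresses from earlier vectorized memory accesses."""

    def __init__(self, capacity):
        # When full, the oldest address is dropped and positions shift down.
        self._slots = deque(maxlen=capacity)

    def check_and_store(self, address, first, last):
        """Compare `address` with the entries in slots [first, last),
        then record it. Returns True if a conflict was found."""
        window = list(self._slots)[first:last]
        conflict = address in window
        self._slots.append(address)
        return conflict
```

Restricting the comparison to a range of slots lets the caller ignore accesses that cannot conflict, for example those already known to belong to the same iteration.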
-
40.
Publication number: US09501135B2
Publication date: 2016-11-22
Application number: US14169955
Filing date: 2014-01-31
Applicant: Intel Corporation
Inventor: Youfeng Wu , Shiliang Hu , Edson Borin , Cheng Wang
CPC classification number: G06F1/329 , G06F1/3287 , G06F9/3851 , G06F9/445 , G06F9/4893 , G06F9/5027 , G06F9/5094 , G06F11/3409 , G06F11/3452 , G06F11/3466 , G06F2201/81 , G06F2201/865 , G06F2201/88 , G06F2209/501 , Y02D10/171 , Y02D10/22 , Y02D10/34 , Y02D50/20
Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
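The decision step in this abstract can be written as a small function. This is a minimal sketch: the function name `choose_core`, the returned strings, and the `higher_is_better` flag are hypothetical, and the abstract does not say which metric is used or how "better" is defined.

```python
def choose_core(first_metric, reference_metric, higher_is_better=True):
    """Decide where to continue execution after the second core has been
    signaled to power up and a metric was collected on the first core.

    Returns a (core_to_run_on, action_for_second_core) pair.
    """
    if higher_is_better:
        better = first_metric > reference_metric
    else:
        better = first_metric < reference_metric

    if better:
        # First core is doing well enough: stay and power the second core down.
        return ("first", "power_down")
    # Otherwise migrate the program to the second core.
    return ("second", "keep_powered")
```

Note that a metric exactly equal to the reference counts as "not better", so execution switches to the second core in that case, matching the wording of the abstract.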