-
Publication No.: US11550632B2
Publication Date: 2023-01-10
Application No.: US15775249
Filing Date: 2015-12-24
Applicant: INTEL CORPORATION
Inventor: Yuanyuan Li , Yong Jiang , Linghyi Kong
Abstract: A mechanism is described for facilitating efficient communication and data processing across clusters of computing machines in a heterogeneous computing environment. A method includes detecting a request for processing of data using a programming framework and a programming model; facilitating interfacing between the programming framework and the programming model, wherein interfacing includes merging the programming model into the programming framework, wherein interfacing further includes integrating the programming framework with a distribution framework hosting the programming model; and calling on the distribution framework to schedule processing of a plurality of jobs based on the request.
-
Publication No.: US20210334127A1
Publication Date: 2021-10-28
Application No.: US17197304
Filing Date: 2021-03-10
Applicant: Intel Corporation
Inventor: Yong JIANG , Yuanyuan Li , Jianghong Du , Kuilin Chen , Thomas A. Tetzlaff
Abstract: Embodiments described herein provide a system, method, and apparatus to accelerate reduce operations in a graphics processor. One embodiment provides an apparatus including one or more processors, the one or more processors including a first logic unit to perform a merged write, barrier, and read operation in response to a barrier synchronization request from a set of threads in a work group, synchronize the set of threads, and report a result of an operation specified in association with the barrier synchronization request.
-
Publication No.: US20200302568A1
Publication Date: 2020-09-24
Application No.: US15779368
Filing Date: 2016-12-06
Applicant: INTEL CORPORATION
Inventor: Yuanyuan Li , Hai Bai , Guizi Li
Abstract: A system and method for distributed computing including a compute node having a graphics processing unit (GPU) to execute tasks of a distributed computing job. A distributed-computing programming framework executes the tasks on the compute node. A GPU-daemon process shares GPU resources between the tasks executing on the GPU of the compute node.
-
Publication No.: US12271319B2
Publication Date: 2025-04-08
Application No.: US17054762
Filing Date: 2018-09-27
Applicant: Intel Corporation
Inventor: Yanjie Pan , Yong Jiang , Yuanyuan Li , Yong Zhang
IPC: G06F12/123 , G06F5/06 , G06F5/12 , G06F12/0893
Abstract: Systems, methods, and computer-readable media are provided for variable precision first in, first out (FIFO) buffers (VPFB) that dynamically change the amount of data to be stored in the VPFB based on a current amount of data stored in the VPFB and/or based on a current amount of available memory space of the VPFB. The currently unavailable memory space (or the current available memory space) is used to select the size of a next data block to be stored in the VPFB. Other embodiments are disclosed and/or claimed.
-
Publication No.: US10949251B2
Publication Date: 2021-03-16
Application No.: US16066652
Filing Date: 2016-04-01
Applicant: Intel Corporation
Inventor: Yong Jiang , Yuanyuan Li , Jianghong Du , Kuilin Chen , Thomas A. Tetzlaff
Abstract: Embodiments described herein provide a system, method, and apparatus to accelerate reduce operations in a graphics processor. One embodiment provides an apparatus including one or more processors, the one or more processors including a first logic unit to perform a merged write, barrier, and read operation in response to a barrier synchronization request from a set of threads in a work group, synchronize the set of threads, and report a result of an operation specified in association with the barrier synchronization request.
-
Publication No.: US20200320196A1
Publication Date: 2020-10-08
Application No.: US16650643
Filing Date: 2017-12-13
Applicant: INTEL CORPORATION
Inventor: Danyu Bi , Salmin Sultana , Yuanyuan Li , Yong Jiang , Pramod Pesara , Selvakumar Panneer , Ravi Sahita
IPC: G06F21/56 , G06F9/30 , G06F9/448 , G06F11/36 , G06F12/1009
Abstract: A system for detecting malware includes a processor to collect processor trace information corresponding to an application being executed by the processor (202). The processor can also detect an invalid indirect branch instruction from the processor trace information (204) and detect at least one malware instruction being executed by the application in response to analyzing modified memory values corresponding to the invalid indirect branch (206). Additionally, the processor can block the application from accessing or modifying memory (208).
-
Publication No.: US20190042412A1
Publication Date: 2019-02-07
Application No.: US15757727
Filing Date: 2015-09-25
Applicant: Intel Corporation
Inventor: Jianghong Du , Yong Jiang , Lei Shen , Yuanyuan Li , Ruijia Li , Lingyi Kong
IPC: G06F12/084
Abstract: Methods and apparatus to improve shared memory efficiency are described. In an embodiment, a first version of a code to access one or more registers as shared local memory is compiled. A second version of the same code is also compiled to access a cache as the shared local memory. The first version of the code is executed in response to comparison of a work group size of the code with a threshold value. Other embodiments are also disclosed and claimed.
-
Publication No.: US20240272933A1
Publication Date: 2024-08-15
Application No.: US18626689
Filing Date: 2024-04-04
Applicant: Intel Corporation
Inventor: Yong Jiang , Yuanyuan Li , Jianghong Du , Kuilin Chen , Thomas A. Tetzlaff
CPC classification number: G06F9/4843 , G06F9/3009 , G06F9/522 , G06T1/20 , G06F8/458 , G06F9/30087
Abstract: Embodiments described herein provide a system, method, and apparatus to accelerate reduce operations in a graphics processor. One embodiment provides an apparatus including one or more processors, the one or more processors including a first logic unit to perform a merged write, barrier, and read operation in response to a barrier synchronization request from a set of threads in a work group, synchronize the set of threads, and broadcast a result of an operation specified in association with the barrier synchronization request.
-
Publication No.: US20240265232A1
Publication Date: 2024-08-08
Application No.: US18565972
Filing Date: 2021-09-24
Applicant: Intel Corporation
Inventor: Darren Crews , Yong Jiang , Yuanyuan Li , Xu Qian , Peiqing Jiang , Haiyun Hong
IPC: G06N3/04
CPC classification number: G06N3/04
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed. An example apparatus includes at least one memory, instructions in the apparatus, and processor circuitry to execute the instructions to detect a pattern of an upsampled input submatrix, generate a transformed input submatrix by selecting four elements of the upsampled input submatrix, select a transformed weight submatrix based on the pattern, and convolve the transformed input submatrix and the transformed weight submatrix.
-
Publication No.: US11068401B2
Publication Date: 2021-07-20
Application No.: US15757727
Filing Date: 2015-09-25
Applicant: Intel Corporation
Inventor: Jianghong Du , Yong Jiang , Lei Shen , Yuanyuan Li , Ruijia Li , Lingyi Kong
IPC: G06F12/084
Abstract: Methods and apparatus to improve shared memory efficiency are described. In an embodiment, a first version of a code to access one or more registers as shared local memory is compiled. A second version of the same code is also compiled to access a cache as the shared local memory. The first version of the code is executed in response to comparison of a work group size of the code with a threshold value. Other embodiments are also disclosed and claimed.
-