-
Publication No.: US20210248087A1
Publication Date: 2021-08-12
Application No.: US17054762
Application Date: 2018-09-27
Applicant: Intel Corporation
Inventor: Yanjie PAN , Yong JIANG , Yuanyuan LI , Yong ZHANG
IPC: G06F12/123 , G06F12/0893
Abstract: Systems, methods, and computer-readable media are provided for variable precision first in, first out (FIFO) buffers (VPFBs) that dynamically change the amount of data to be stored in a VPFB based on a current amount of data stored in the VPFB and/or based on a current amount of available memory space of the VPFB. The currently unavailable memory space (or the currently available memory space) is used to select the size of a next data block to be stored in the VPFB. Other embodiments are disclosed and/or claimed.
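The abstract above describes choosing the size of the next stored block from the buffer's current fill level. A minimal Python sketch of that idea follows. The class name `VariablePrecisionFifo`, the `tiers` table, and the choice to reduce precision by keeping only the leading bytes of each sample are illustrative assumptions, not details taken from the application.

```python
from collections import deque


class VariablePrecisionFifo:
    """Hypothetical FIFO that stores smaller blocks as it fills up."""

    def __init__(self, capacity_bytes, tiers):
        # tiers: (max_fill_fraction, block_bytes) pairs, ascending by fraction.
        # Both the table and its values are assumptions for illustration.
        self.capacity = capacity_bytes
        self.tiers = tiers
        self.blocks = deque()
        self.used = 0

    def next_block_size(self):
        """Select the next block size from the current fill fraction."""
        fill = self.used / self.capacity
        for max_fill, block_bytes in self.tiers:
            if fill <= max_fill:
                return block_bytes
        return self.tiers[-1][1]

    def push(self, sample: bytes) -> bool:
        """Store a (possibly truncated) sample; return False if it does not fit."""
        size = self.next_block_size()
        if self.used + size > self.capacity:
            return False
        # Assumed precision reduction: keep only the leading `size` bytes.
        block = sample[:size]
        self.blocks.append(block)
        self.used += len(block)
        return True

    def pop(self) -> bytes:
        """Remove and return the oldest block (first in, first out)."""
        block = self.blocks.popleft()
        self.used -= len(block)
        return block
```

With a 16-byte buffer and tiers `[(0.5, 4), (1.0, 2)]`, samples are stored at 4 bytes each until the buffer is more than half full, and at 2 bytes each after that.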
-
Publication No.: US20180341526A1
Publication Date: 2018-11-29
Application No.: US15775249
Application Date: 2015-12-24
Applicant: INTEL CORPORATION
Inventor: Yuanyuan LI , Yong JIANG , Linghyi KONG
Abstract: A mechanism is described for facilitating efficient communication and data processing across clusters of computing machines in a heterogeneous computing environment. A method includes detecting a request for processing of data using a programming framework and a programming model; facilitating interfacing between the programming framework and the programming model, wherein interfacing includes merging the programming model into the programming framework, wherein interfacing further includes integrating the programming framework with a distribution framework hosting the programming model; and calling on the distribution framework to schedule processing of a plurality of jobs based on the request.
-
Publication No.: US20240289598A1
Publication Date: 2024-08-29
Application No.: US18566481
Application Date: 2021-11-15
Applicant: Intel Corporation
Inventor: Xu QIAN , Darren CREWS , Yuanyuan LI
IPC: G06N3/0495
CPC classification number: G06N3/0495
Abstract: Provided herein are apparatus and methods for reinforcement learning based post-training sparsification. An apparatus includes: a memory; and processor circuitry coupled with the memory, wherein the processor circuitry is to: obtain a first correction parameter indicating a mean shift of a set of weights after sparsification of a model with respect to that before the sparsification of the model; obtain a second correction parameter indicating a variance shift of the set of weights after the sparsification of the model with respect to that before the sparsification of the model; and correct the set of weights at least partially based on the first correction parameter and the second correction parameter, and wherein the memory is to store the corrected set of weights. Other embodiments may also be disclosed and claimed.
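The correction step in the abstract above (a mean-shift parameter, a variance-shift parameter, and a correction of the weights based on both) can be sketched as follows. This is only an illustration under stated assumptions: simple magnitude pruning stands in for the reinforcement learning based sparsification, the variance shift is expressed as a ratio, statistics are taken over the whole weight set, and only surviving weights are corrected so that the sparsity pattern is kept. None of these choices is confirmed by the application, and all function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance


def magnitude_prune(weights, sparsity):
    """Stand-in sparsifier (assumption): zero the smallest-magnitude weights."""
    n_drop = int(len(weights) * sparsity)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:n_drop])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def correction_parameters(dense, sparse):
    """Return (mean shift, variance ratio) of sparse weights relative to dense.

    Assumes both sets have non-zero variance.
    """
    mean_shift = fmean(sparse) - fmean(dense)
    variance_ratio = pvariance(sparse) / pvariance(dense)
    return mean_shift, variance_ratio


def correct(sparse, mean_shift, variance_ratio):
    """Undo the shifts on surviving weights; pruned entries stay zero."""
    scale = 1.0 / sqrt(variance_ratio)
    return [0.0 if w == 0.0 else (w - mean_shift) * scale for w in sparse]


dense = [0.1, -0.2, 0.9, -1.1, 0.05, 0.7, -0.6, 0.3]
sparse = magnitude_prune(dense, sparsity=0.5)
mean_shift, variance_ratio = correction_parameters(dense, sparse)
corrected = correct(sparse, mean_shift, variance_ratio)
```

Because zeros are left untouched, the corrected set only approximately recovers the dense mean and variance; an exact match would require relaxing the sparsity pattern.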
-
Publication No.: US20190026149A1
Publication Date: 2019-01-24
Application No.: US16066652
Application Date: 2016-04-01
Applicant: INTEL CORPORATION
Inventor: Yong JIANG , Yuanyuan LI , Jianghong DU , Kuilin CHEN , Thomas A. TETZLAFF
Abstract: Embodiments described herein provide a system, method, and apparatus to accelerate reduce operations in a graphics processor. One embodiment provides an apparatus including one or more processors, the one or more processors including a first logic unit to perform a merged write, barrier, and read operation in response to a barrier synchronization request from a set of threads in a work group, synchronize the set of threads, and report a result of an operation specified in association with the barrier synchronization request.
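The merged write, barrier, and read operation in the abstract above is a hardware mechanism, but its observable behaviour can be approximated in software. The sketch below uses Python's `threading.Barrier` as an analogy only: each thread contributes a value, waits at the barrier, and receives the reduced result from a single call. The names `ReduceBarrier` and `run_work_group` are hypothetical and do not come from the application.

```python
import threading


class ReduceBarrier:
    """Software analogy: one call performs write, barrier, and read."""

    def __init__(self, n_threads, op, initial):
        self._op = op
        self._initial = initial
        self._acc = initial
        self._result = initial
        self._lock = threading.Lock()
        # The action runs in exactly one thread just before all are released.
        self._barrier = threading.Barrier(n_threads, action=self._publish)

    def _publish(self):
        self._result = self._acc
        self._acc = self._initial  # reset so the barrier can be reused

    def reduce(self, value):
        with self._lock:                 # "write": contribute this thread's value
            self._acc = self._op(self._acc, value)
        self._barrier.wait()             # "barrier": synchronize the work group
        return self._result              # "read": report the reduced result


def run_work_group(values, op, initial):
    """Run one thread per value and collect what each thread reads back."""
    unit = ReduceBarrier(len(values), op, initial)
    results = [None] * len(values)

    def worker(i):
        results[i] = unit.reduce(values[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `run_work_group([1, 2, 3, 4], lambda a, b: a + b, 0)` gives every thread the same sum, without separate write, barrier, and read steps in the worker code.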
-
Publication No.: US20240256838A1
Publication Date: 2024-08-01
Application No.: US18565996
Application Date: 2021-11-25
Applicant: Intel Corporation
Inventor: Xu QIAN , Haiyun HONG , Peiqing JIANG , Yuanyuan LI , Sijia LOU
IPC: G06N3/0464
CPC classification number: G06N3/0464
Abstract: An apparatus, method, device, and medium for accelerating computation of a process engine are provided. The apparatus includes interface circuitry configured to receive weight data and activation data stored in a batch-height-width-channel (NHWC) memory layout; and processor circuitry configured to, in response to an input channel size not being an integer multiple of a process capacity of a process engine, pad a number of zeroes after a last element of weight data belonging to a filter and after a last element of corresponding activation data respectively, slice all weight data elements belonging to the filter and the padded zeroes into weight data slices, and the corresponding activation data elements and padded zeroes into corresponding activation data slices, each of the size of the process capacity, and feed the process engine with each weight data slice and a corresponding activation data slice sequentially.
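The pad, slice, and feed sequence in the abstract above can be illustrated with a short Python sketch. It assumes the weight elements of one filter and the matching activation elements are already flattened into equal-length sequences in NHWC order, and it models the process engine as a dot product over one slice. The function names and the dot-product model are assumptions for illustration.

```python
def pad_and_slice(values, capacity):
    """Append zeros after the last element, then cut into capacity-sized slices."""
    pad = (-len(values)) % capacity      # zeros needed to reach a multiple
    padded = list(values) + [0] * pad
    return [padded[i:i + capacity] for i in range(0, len(padded), capacity)]


def process_engine(weight_slice, activation_slice):
    """Assumed engine model: multiply-accumulate over one full slice."""
    return sum(w * a for w, a in zip(weight_slice, activation_slice))


def filter_output(weights, activations, capacity):
    """Feed matching weight and activation slices to the engine in order."""
    total = 0
    weight_slices = pad_and_slice(weights, capacity)
    activation_slices = pad_and_slice(activations, capacity)
    for w_slice, a_slice in zip(weight_slices, activation_slices):
        total += process_engine(w_slice, a_slice)
    return total


weights = [1, 2, 3, 4, 5, 6]
activations = [1, 1, 1, 2, 2, 2]
result = filter_output(weights, activations, capacity=4)
```

Here six elements are padded to eight and split into two slices of four. The padded zeros contribute nothing to the sum, so the result equals the unpadded dot product while every slice fills the engine's full width.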
-