System and method for parallelization of data processing in a processor

    Publication Number: US10558466B2

    Publication Date: 2020-02-11

    Application Number: US15191257

    Application Date: 2016-06-23

    Abstract: Systems, apparatuses, and methods for adjusting group sizes to match a processor lane width are described. In early iterations of an algorithm, a processor partitions a dataset into groups of data points whose sizes are integer multiples of the processor's processing lane width. For example, when performing a K-means clustering algorithm, the processor determines that a first plurality of data points belongs to a first group during a given iteration. If the size of the first plurality of data points is not an integer multiple of the number of processing lanes, then the processor reassigns a first number of data points from the first plurality to one or more other groups. The processor then performs the next iteration with this first number of data points assigned to the other groups, even though these data points actually meet the algorithmic criteria for belonging to the first group.
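    Below is a minimal Python sketch of the group-size adjustment described in the abstract, assuming a hypothetical SIMD lane width of 8 and a point-to-centroid distance matrix from a K-means iteration. The helper name pad_groups_to_lane_width, its reassignment policy (donating the points farthest from the centroid to their second-closest cluster), and all parameters are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def pad_groups_to_lane_width(assignments, dist, lane_width=8):
    """Reassign each cluster's 'overflow' points so the cluster size becomes
    an integer multiple of the SIMD lane width (illustrative policy: move the
    points farthest from the centroid to their second-closest cluster)."""
    assignments = assignments.copy()
    n_clusters = dist.shape[1]
    for c in range(n_clusters):
        members = np.flatnonzero(assignments == c)
        overflow = len(members) % lane_width
        if overflow == 0:
            continue
        # weakest members of cluster c: largest distance to centroid c
        weakest = members[np.argsort(dist[members, c])[-overflow:]]
        for p in weakest:
            order = np.argsort(dist[p])            # clusters sorted by distance
            assignments[p] = order[order != c][0]  # nearest cluster other than c
        # receiving clusters may themselves end up as non-multiples; the
        # abstract applies this adjustment only in early iterations.
    return assignments

# toy usage: 20 points, 3 clusters, lane width 8
rng = np.random.default_rng(0)
dist = rng.random((20, 3))
labels = dist.argmin(axis=1)
print(pad_groups_to_lane_width(labels, dist, lane_width=8))
```

    As in the abstract, the reassigned points are processed with the other groups in the next iteration even though they still meet the criteria for their original group.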

    2. ADAPTIVE QUANTIZATION FOR NEURAL NETWORKS
    Invention Publication

    Publication Number: US20240054332A1

    Publication Date: 2024-02-15

    Application Number: US18496411

    Application Date: 2023-10-27

    CPC classification number: G06N3/063 G06N3/08

    Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
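    As a rough illustration of the adaptive-quantization loop described above, the sketch below computes a simple distribution statistic over the ANN information (here, link weights), picks between two hypothetical quantizers, and re-selects when the quantized model's output correlates poorly with a known-correct output. The kurtosis heuristic, the two quantizer choices, and the 0.9 correlation threshold are assumptions made for illustration; the patent's actual selection criteria are not given in the abstract.

```python
import numpy as np

def uniform_quantize(x, bits=8):
    """Uniform quantizer over the observed value range."""
    lo, hi = float(x.min()), float(x.max())
    step = (hi - lo) / (2**bits - 1) or 1.0
    return np.round((x - lo) / step) * step + lo

def log_quantize(x, bits=8):
    """Logarithmic quantizer: finer steps near zero, coarser in the tails."""
    sign, mag = np.sign(x), np.log1p(np.abs(x))
    return sign * np.expm1(uniform_quantize(mag, bits))

def select_quantizer(weights):
    """Select a quantization function based on the weight distribution
    (illustrative rule: heavy-tailed -> logarithmic, otherwise uniform)."""
    z = (weights.ravel() - weights.mean()) / (weights.std() + 1e-12)
    return log_quantize if np.mean(z**4) > 3.0 else uniform_quantize

def quantize_and_check(weights, run_ann, expected, threshold=0.9):
    """Quantize the weights, run the ANN, and re-select the quantizer if the
    output does not sufficiently correlate with the known-correct output."""
    quantizer = select_quantizer(weights)
    output = run_ann(quantizer(weights))
    if np.corrcoef(output.ravel(), expected.ravel())[0, 1] < threshold:
        quantizer = log_quantize if quantizer is uniform_quantize else uniform_quantize
        output = run_ann(quantizer(weights))
    return output

# toy usage: the "ANN" is a single linear layer applied to a fixed input
rng = np.random.default_rng(0)
W, x = rng.standard_normal((4, 8)), rng.standard_normal(8)
print(quantize_and_check(W, run_ann=lambda w: w @ x, expected=W @ x))
```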

    GRAPH MATCHING FOR OPTIMIZED DEEP NETWORK PROCESSING

    Publication Number: US20180314945A1

    Publication Date: 2018-11-01

    Application Number: US15498943

    Application Date: 2017-04-27

    Abstract: Systems, apparatuses, and methods for optimizing deep network processing via graph matching are disclosed. A system is configured to receive a source code representation of a neural network. In one embodiment, the source code representation is a directed acyclic graph (DAG). The system determines whether the source code representation includes any of one or more patterns, with each pattern including two or more adjacent layers. The system also identifies, for each pattern, a combined layer with which to replace the detected pattern. If any occurrences of the one or more patterns are detected in the source code representation, the system replaces each occurrence with the corresponding combined layer. Additionally, the system generates an optimized representation of the neural network, wherein the optimized representation includes the replacements for any detected patterns. The optimized representation can then be used to generate an executable version of the neural network.
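    A minimal sketch of the pattern-replacement pass follows, assuming the network is given as a simple linear sequence of layer-type names rather than a full DAG; the pattern table and the fused-layer names are illustrative, not the patent's taxonomy.

```python
# Illustrative pattern table: each key is a run of adjacent layers, each value
# the combined layer that replaces it.
PATTERNS = {
    ("Conv", "BatchNorm", "ReLU"): "FusedConvBnReLU",
    ("Conv", "ReLU"): "FusedConvReLU",
    ("MatMul", "Add"): "FusedLinear",
}

def fuse_layers(layers):
    """Replace every occurrence of a known pattern with its combined layer,
    preferring longer patterns first."""
    ordered = sorted(PATTERNS, key=len, reverse=True)   # longest match first
    out, i = [], 0
    while i < len(layers):
        for pat in ordered:
            if tuple(layers[i:i + len(pat)]) == pat:
                out.append(PATTERNS[pat])
                i += len(pat)
                break
        else:                       # no pattern starts here; keep the layer
            out.append(layers[i])
            i += 1
    return out

print(fuse_layers(["Conv", "BatchNorm", "ReLU", "MaxPool", "Conv", "ReLU"]))
# -> ['FusedConvBnReLU', 'MaxPool', 'FusedConvReLU']
```

    A real pass would walk the DAG and match patterns across its edges; the list form above is only meant to keep the matching and replacement steps visible.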

    5. METHOD AND APPARATUS FOR PERFORMING A SEARCH OPERATION ON HETEROGENEOUS COMPUTING SYSTEMS
    Invention Application; Status: Pending (Published)

    Publication Number: US20160378791A1

    Publication Date: 2016-12-29

    Application Number: US14749063

    Application Date: 2015-06-24

    Inventor: Mayank Daga

    CPC classification number: G06F17/30519

    Abstract: A method and apparatus for performing a top-down Breadth-First Search (BFS) includes performing a first determination whether to convert to a bottom-up BFS. A second determination whether to convert to the bottom-up BFS is performed, based upon the first determination being positive. The bottom-up BFS is performed, based upon the first determination and the second determination being positive. A conversion from the bottom-up BFS back to the top-down BFS is made, based upon a third determination being positive.

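    The sketch below illustrates a direction-optimizing BFS of the kind the abstract describes: it starts top-down, converts to bottom-up when the frontier grows large, and converts back when the frontier shrinks again. The frontier-size thresholds ALPHA and BETA are hypothetical; the abstract does not state the conversion heuristics.

```python
# Hypothetical conversion thresholds (the abstract does not specify them).
ALPHA = 14   # go bottom-up once the frontier exceeds ~|V| / ALPHA
BETA = 24    # return to top-down once the frontier drops below ~|V| / BETA

def hybrid_bfs(adj, source):
    """BFS over an adjacency-list graph that switches between top-down
    (frontier pushes to neighbors) and bottom-up (unvisited vertices look
    for a parent in the frontier) traversal."""
    n = len(adj)
    parent = [-1] * n
    parent[source] = source
    frontier, top_down = {source}, True
    while frontier:
        if top_down and len(frontier) * ALPHA > n:       # first/second determination
            top_down = False
        elif not top_down and len(frontier) * BETA < n:  # third determination
            top_down = True
        next_frontier = set()
        if top_down:
            for u in frontier:
                for v in adj[u]:
                    if parent[v] == -1:
                        parent[v] = u
                        next_frontier.add(v)
        else:
            for v in range(n):
                if parent[v] == -1:
                    for u in adj[v]:
                        if u in frontier:
                            parent[v] = u
                            next_frontier.add(v)
                            break
        frontier = next_frontier
    return parent

# toy usage: path graph 0-1-2-3-4 (undirected, as adjacency lists)
print(hybrid_bfs([[1], [0, 2], [1, 3], [2, 4], [3]], 0))   # -> [0, 0, 1, 2, 3]
```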

    6. EFFICIENT SPARSE MATRIX-VECTOR MULTIPLICATION ON PARALLEL PROCESSORS
    Invention Application; Status: Granted (In Force)

    Publication Number: US20160140084A1

    Publication Date: 2016-05-19

    Application Number: US14542003

    Application Date: 2014-11-14

    CPC classification number: G06F17/16

    Abstract: A method of multiplying a sparse matrix by a vector to obtain a new vector, and a system for implementing the method, are claimed. Embodiments of the method are intended to optimize the performance of sparse matrix-vector multiplication on highly parallel processors, such as GPUs. The sparse matrix is stored in compressed sparse row (CSR) format.

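    For reference, a serial sparse matrix-vector multiply over the CSR layout named in the abstract looks like the sketch below; a GPU implementation of the kind the abstract targets would map rows (or groups of nonzeros) to threads or wavefronts, but the storage format is the same. The example matrix and function name are illustrative.

```python
import numpy as np

def csr_spmv(row_ptr, col_idx, vals, x):
    """y = A @ x for a sparse matrix A stored in compressed sparse row (CSR)
    format: row_ptr[i]:row_ptr[i+1] delimits row i's nonzeros, whose column
    indices and values live in col_idx and vals."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows, dtype=vals.dtype)
    for i in range(n_rows):
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = np.dot(vals[start:end], x[col_idx[start:end]])
    return y

# 3x3 example:  [[4, 0, 9],
#                [0, 7, 0],
#                [0, 0, 5]]
row_ptr = np.array([0, 2, 3, 4])
col_idx = np.array([0, 2, 1, 2])
vals    = np.array([4.0, 9.0, 7.0, 5.0])
print(csr_spmv(row_ptr, col_idx, vals, np.array([1.0, 2.0, 3.0])))   # -> [31. 14. 15.]
```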

    Adaptive quantization for neural networks

    Publication Number: US11803734B2

    Publication Date: 2023-10-31

    Application Number: US15849617

    Application Date: 2017-12-20

    CPC classification number: G06N3/063 G06N3/08

    Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.

    8. ADAPTIVE QUANTIZATION FOR NEURAL NETWORKS
    Invention Application

    Publication Number: US20190188557A1

    Publication Date: 2019-06-20

    Application Number: US15849617

    Application Date: 2017-12-20

    CPC classification number: G06N3/063 G06N3/08

    Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.

    Method and apparatus for performing a search operation on heterogeneous computing systems

    Publication Number: US10031947B2

    Publication Date: 2018-07-24

    Application Number: US14749063

    Application Date: 2015-06-24

    Inventor: Mayank Daga

    Abstract: A method and apparatus for performing a top-down Breadth-First Search (BFS) includes performing a first determination whether to convert to a bottom-up BFS. A second determination whether to convert to the bottom-up BFS is performed, based upon the first determination being positive. The bottom-up BFS is performed, based upon the first determination and the second determination being positive. A conversion from the bottom-up BFS back to the top-down BFS is made, based upon a third determination being positive.

    SYSTEM AND METHOD FOR PROCESSING DATA IN A COMPUTING SYSTEM

    Publication Number: US20170371665A1

    Publication Date: 2017-12-28

    Application Number: US15191257

    Application Date: 2016-06-23

    Abstract: Systems, apparatuses, and methods for adjusting group sizes to match a processor lane width are described. In early iterations of an algorithm, a processor partitions a dataset into groups of data points whose sizes are integer multiples of the processor's processing lane width. For example, when performing a K-means clustering algorithm, the processor determines that a first plurality of data points belongs to a first group during a given iteration. If the size of the first plurality of data points is not an integer multiple of the number of processing lanes, then the processor reassigns a first number of data points from the first plurality to one or more other groups. The processor then performs the next iteration with this first number of data points assigned to the other groups, even though these data points actually meet the algorithmic criteria for belonging to the first group.
