Neural network training with decreased memory consumption and processor utilization

    Publication Number: US11526761B2

    Publication Date: 2022-12-13

    Application Number: US16550229

    Application Date: 2019-08-24

    Abstract: Bounding box quantization can reduce the quantity of bits utilized to express numerical values prior to the multiplication of matrices composed of such numerical values, thereby reducing both memory consumption and processor utilization. Stochastic rounding can provide sufficient precision to enable the storage of weight values in reduced-precision formats without having to separately store weight values in a full-precision format. Alternatively, other rounding mechanisms, such as round-to-nearest, can be utilized to exchange weight values in reduced-precision formats while also storing weight values in full-precision formats for subsequent updating. To facilitate conversion, reduced-precision formats such as the brain floating-point format (bfloat16) can be utilized.
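
    A minimal sketch of the stochastic-rounding idea, assuming numpy and a software emulation of bfloat16 (the function name and test values are illustrative, not from the patent):

```python
import numpy as np

def to_bfloat16_stochastic(x: np.ndarray) -> np.ndarray:
    """Stochastically round float32 values to bfloat16 precision.

    bfloat16 keeps the top 16 bits of a float32 bit pattern; adding
    uniform random noise below the cut before truncating makes the
    rounding unbiased in expectation. (Sketch only: NaN and overflow
    handling are omitted.)
    """
    bits = x.astype(np.float32).view(np.uint32)
    noise = np.random.randint(0, 1 << 16, size=bits.shape).astype(np.uint32)
    return ((bits + noise) & np.uint32(0xFFFF0000)).view(np.float32)

# A weight update far below bfloat16's precision at 1.0 survives on
# average, which is why no separate full-precision copy is needed.
w = np.full(100000, 1.0 + 2.0**-12, dtype=np.float32)
print(to_bfloat16_stochastic(w).mean())  # ~1.000244, not 1.0
```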

    Adjusting activation compression for neural network training

    Publication Number: US12165038B2

    Publication Date: 2024-12-10

    Application Number: US16276395

    Application Date: 2019-02-14

    Abstract: Apparatus and methods are disclosed for training a neural network accelerator using quantized-precision data formats, and in particular for adjusting the floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, the exponent format, the use of non-uniform mantissas, and/or the use of outlier values to express some of the mantissas.
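
    A rough illustration of metric-driven format selection: try progressively narrower mantissa widths and keep the narrowest one whose quantization error stays acceptable. The mean-squared-error proxy and the threshold are assumptions for illustration, not details from the patent:

```python
import numpy as np

def quantize_mantissa(x: np.ndarray, mantissa_bits: int) -> np.ndarray:
    """Emulate a narrower float format by masking low float32 mantissa bits."""
    mask = np.uint32((0xFFFFFFFF << (23 - mantissa_bits)) & 0xFFFFFFFF)
    return (x.astype(np.float32).view(np.uint32) & mask).view(np.float32)

def select_second_format(activations: np.ndarray, max_mse: float = 1e-4) -> int:
    """Return the narrowest mantissa width whose error metric is acceptable."""
    for bits in (2, 4, 8, 16):             # candidate formats, narrowest first
        compressed = quantize_mantissa(activations, bits)
        if np.mean((compressed - activations) ** 2) <= max_mse:
            return bits
    return 23                              # fall back to full float32 mantissa

acts = np.random.randn(4096).astype(np.float32)
print("selected mantissa bits:", select_second_format(acts))
```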

    NEURAL NETWORK LAYER PROCESSING WITH SCALED QUANTIZATION

    Publication Number: US20200272881A1

    Publication Date: 2020-08-27

    Application Number: US16284407

    Application Date: 2019-02-25

    Inventor: Daniel Lo

    Abstract: Processors and methods for neural network processing are provided. A method includes receiving a subset of data corresponding to a layer of a neural network. The method further includes, prior to performing any matrix operations using the subset of the data, scaling the subset of the data by a scaling factor to generate a scaled subset of data. The method further includes quantizing the scaled subset of the data to generate a scaled and quantized subset of data. The method further includes performing the matrix operations using the scaled and quantized subset of the data to generate a subset of results of the matrix operations. The method further includes descaling the subset of the results of the matrix operations, by multiplying the subset of the results of the matrix operations by an inverse of the scaling factor, to generate a descaled subset of results of the matrix operations.
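
    The scale → quantize → multiply → descale pipeline can be sketched in a few lines of numpy. The int8 target and the max-abs per-tensor scaling factors are illustrative choices, not the patent's specific method:

```python
import numpy as np

def scaled_quantized_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Scale inputs into the int8 range, multiply, then descale the result."""
    # Scaling factors chosen so each operand's largest magnitude maps to 127.
    sa = 127.0 / np.max(np.abs(a))
    sb = 127.0 / np.max(np.abs(b))
    qa = np.round(a * sa).astype(np.int8)             # scaled + quantized
    qb = np.round(b * sb).astype(np.int8)
    acc = qa.astype(np.int32) @ qb.astype(np.int32)   # integer matrix multiply
    # Descale by multiplying with the inverse of the combined scaling factor.
    return acc.astype(np.float32) * (1.0 / (sa * sb))

a = np.random.randn(4, 8).astype(np.float32)
b = np.random.randn(8, 4).astype(np.float32)
print(np.max(np.abs(scaled_quantized_matmul(a, b) - a @ b)))  # small error
```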

    ADJUSTING PRECISION AND TOPOLOGY PARAMETERS FOR NEURAL NETWORK TRAINING BASED ON A PERFORMANCE METRIC

    Publication Number: US20200210840A1

    Publication Date: 2020-07-02

    Application Number: US16237308

    Application Date: 2018-12-31

    Abstract: Apparatus and methods are disclosed for training neural networks based on a performance metric, including adjusting numerical precision and topology as training progresses. In some examples, block floating-point formats having relatively lower accuracy are used during early stages of training. The accuracy of the floating-point format can be increased as training progresses based on a determined performance metric. In some examples, values for the neural network are transformed to normal-precision floating-point formats. The performance metric can be determined based on the entropy of values for the neural network, the accuracy of the neural network, or other suitable techniques. Accelerator hardware, including hardware having direct support for block floating-point formats, can be used for certain implementations.
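
    One reading of the entropy-based metric, sketched below: estimate the Shannon entropy of the network's values and widen the format when the values carry more information than the current mantissa width can represent. The doubling schedule and the 0.9 threshold are hypothetical:

```python
import numpy as np

def value_entropy(values: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy (in bits) of a histogram of the network's values."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def adjust_precision(values: np.ndarray, mantissa_bits: int,
                     max_bits: int = 16) -> int:
    """Widen the block floating-point format as measured entropy grows."""
    if mantissa_bits < max_bits and value_entropy(values) > 0.9 * mantissa_bits:
        return mantissa_bits * 2       # e.g. 4 -> 8 -> 16 as training matures
    return mantissa_bits

weights = np.random.randn(10000).astype(np.float32)
print("mantissa bits:", adjust_precision(weights, mantissa_bits=4))
```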

    NEURAL NETWORK ACTIVATION COMPRESSION WITH OUTLIER BLOCK FLOATING-POINT

    Publication Number: US20200210839A1

    Publication Date: 2020-07-02

    Application Number: US16237202

    Application Date: 2018-12-31

    Abstract: Apparatus and methods are disclosed for training a neural network accelerator using quantized-precision data formats having outlier values, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
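
    A toy version of the compression step: one shared exponent per block, narrow mantissas, and a small ancillary store holding exact copies of the values the narrow format represents worst. The float32 source, the 4-bit mantissas, and the two-outlier budget are assumptions for illustration:

```python
import numpy as np

def compress_block(block: np.ndarray, mantissa_bits: int = 4,
                   n_outliers: int = 2):
    """Block floating-point compression with outliers kept on the side."""
    shared_exp = int(np.floor(np.log2(np.max(np.abs(block)) + 1e-30)))
    scale = 2.0 ** (shared_exp - (mantissa_bits - 2))
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(block / scale), lo, hi)
    # Ancillary storage: indices and exact values of the worst-served entries.
    worst = np.argsort(np.abs(block - mantissas * scale))[-n_outliers:]
    outliers = {int(i): float(block[i]) for i in worst}
    return mantissas.astype(np.int8), shared_exp, outliers

def decompress_block(mantissas, shared_exp, outliers, mantissa_bits=4):
    scale = 2.0 ** (shared_exp - (mantissa_bits - 2))
    block = mantissas.astype(np.float32) * scale
    for i, v in outliers.items():      # patch outlier positions back in
        block[i] = v
    return block

x = np.random.randn(16).astype(np.float32)
x[3] = 40.0                            # one large value skews the exponent
restored = decompress_block(*compress_block(x))
print(np.abs(restored - x).max())      # error from non-outlier entries only
```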

    Block floating point computations using reduced bit-width vectors

    Publication Number: US10691413B2

    Publication Date: 2020-06-23

    Application Number: US15971904

    Application Date: 2018-05-04

    Abstract: A system for block floating point computation in a neural network receives a block floating point number comprising a mantissa portion. A bit-width of the block floating point number is reduced by decomposing the block floating point number into a plurality of numbers, each having a mantissa portion with a bit-width that is smaller than the bit-width of the mantissa portion of the block floating point number. One or more dot product operations are performed separately on each of the plurality of numbers to obtain individual results, which are summed to generate a final dot product value. The final dot product value is used to implement the neural network. The reduced bit-width computations allow higher-precision mathematical operations to be performed on lower-precision processors with improved accuracy.
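
    The decomposition is easy to verify with integer mantissas. Splitting each 8-bit mantissa into 4-bit halves turns one wide dot product into four narrow ones whose shifted sum is exact; the 8-bit/4-bit split here is an assumed example, not the patent's fixed choice:

```python
import numpy as np

# Two vectors of 8-bit mantissas that share a block exponent.
m1 = np.random.randint(0, 256, size=64).astype(np.int64)
m2 = np.random.randint(0, 256, size=64).astype(np.int64)

# Decompose: m = 16*hi + lo, so
# m1.m2 = 256*(hi1.hi2) + 16*(hi1.lo2 + lo1.hi2) + lo1.lo2
hi1, lo1 = m1 >> 4, m1 & 0xF
hi2, lo2 = m2 >> 4, m2 & 0xF

# Four dot products over 4-bit operands, then a shifted sum.
dot = ((hi1 @ hi2) << 8) + ((hi1 @ lo2 + lo1 @ hi2) << 4) + (lo1 @ lo2)
assert dot == m1 @ m2   # the reduced bit-width result is exact
print(dot)
```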

    Reduced Memory Nucleotide Sequence Comparison

    Publication Number: US20180137085A1

    Publication Date: 2018-05-17

    Application Number: US15351372

    Application Date: 2016-11-14

    CPC classification number: G06F17/16 G06F19/22

    Abstract: Comparisons between two nucleotide sequences can be performed by customized integrated circuitry that implements a Smith-Waterman analysis in a reduced memory footprint, storing and referencing only individual portions, or subsections, of the two-dimensional matrix that represents the comparison between the two nucleotide sequences. As the backtracking proceeds, backtracking metadata corresponding to a cell from a subsection that is not currently retained in memory can be required. Such a subsection can be regenerated from previously generated scores associated with checkpoint cells of the two-dimensional matrix that comprise two edges of the subsection being regenerated. Moreover, to further reduce memory consumption, the backtracking metadata stored for each cell can comprise four binary digits: two indicative of a directional assignment, one indicative of whether the corresponding cell is part of a deletion stretching across multiple contiguous cells, and one analogously indicative of insertions stretching across multiple contiguous cells.
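
    The four-bit-per-cell metadata layout can be sketched as a bit-packing scheme; the specific bit assignments below are assumptions for illustration:

```python
# Pack Smith-Waterman backtracking metadata into four bits per cell:
# bits 0-1: direction the score came from; bit 2: cell extends a
# multi-cell deletion; bit 3: cell extends a multi-cell insertion.
DIAGONAL, UP, LEFT = 0, 1, 2           # 2-bit direction codes

def pack_cell(direction: int, in_deletion: bool, in_insertion: bool) -> int:
    return direction | (in_deletion << 2) | (in_insertion << 3)

def unpack_cell(meta: int):
    return meta & 0b11, bool(meta & 0b100), bool(meta & 0b1000)

# Two 4-bit cells fit in one byte, halving the backtracking storage
# relative to one byte per cell.
cells = [pack_cell(DIAGONAL, False, False), pack_cell(UP, True, False)]
packed_byte = cells[0] | (cells[1] << 4)
assert unpack_cell(packed_byte >> 4) == (UP, True, False)
```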
