专利检索 ap:("Microsoft Technology Licensing, LLC") AND inv:"George PETRE" 第 2 页

11.

发明申请
DYNAMIC SEQUENCING OF DATA PARTITIONS FOR OPTIMIZING MEMORY UTILIZATION AND PERFORMANCE OF NEURAL NETWORKS 有权

公开(公告)号：US20220147833A1

公开(公告)日：2022-05-12

申请号：US17583499

申请日：2022-01-25

申请人： Microsoft Technology Licensing, LLC

发明人： Kent D. CEDOLA , Larry Marvin WALL , Boris BOBROV , George PETRE , Chad Balling McBRIDE , Amol Ashok AMBARDEKAR

IPC分类号： G06N3/10 , G06F12/02 , G06F9/48

摘要： Optimized memory usage and management is crucial to the overall performance of a neural network (NN) or deep neural network (DNN) computing environment. Using various characteristics of the input data dimension, an apportionment sequence is calculated for the input data to be processed by the NN or DNN that optimizes the efficient use of the local and external memory components. The apportionment sequence can describe how to parcel the input data (and its associated processing parameters—e.g., processing weights) into one or more portions as well as how such portions of input data (and its associated processing parameters) are passed between the local memory, external memory, and processing unit components of the NN or DNN. Additionally, the apportionment sequence can include instructions to store generated output data in the local and/or external memory components so as to optimize the efficient use of the local and/or external memory components.

12.

发明申请
REDUCING POWER CONSUMPTION IN A NEURAL NETWORK ENVIRONMENT USING DATA MANAGEMENT 有权

公开(公告)号：US20210232205A1

公开(公告)日：2021-07-29

申请号：US17233379

申请日：2021-04-16

申请人： MICROSOFT TECHNOLOGY LICENSING, LLC

发明人： Amol Ashok AMBARDEKAR , Chad Balling MCBRIDE , George PETRE , Kent D. CEDOLA , Larry Marvin WALL

IPC分类号： G06F1/3234 , G06N3/04 , G06N3/063

摘要： Techniques to provide for improved (i.e., reduced) power consumption in an exemplary neural network (NN) and/or Deep Neural Network (DNN) environment using data management. Improved power consumption in the NN/DNN may be achieved by reducing a number of bit flips needed to process operands associated with one or more storages. Reducing the number bit flips associated with the NN/DNN may be achieved by multiplying an operand associated with a first storage with a plurality of individual operands associated with a plurality of kernels of the NN/DNN. The operand associated with the first storage may be neuron input data and the plurality of individual operands associated with the second storage may be weight values for multiplication with the neuron input data. The plurality of kernels may be arranged or sorted and subsequently processed in a manner that improves power consumption in the NN/DNN.

13.

发明申请
QUEUE MANAGEMENT FOR DIRECT MEMORY ACCESS 审中-公开

公开(公告)号：US20180300634A1

公开(公告)日：2018-10-18

申请号：US15702311

申请日：2017-09-12

申请人： Microsoft Technology Licensing, LLC

发明人： Chad Balling McBRIDE , Amol Ashok AMBARDEKAR , Kent D. CEDOLA , George PETRE , Larry Marvin Wall , Boris BOBROV

IPC分类号： G06N3/10 , G06N3/04 , H04L12/715 , G06N3/063

摘要： A direct memory access (DMA) engine may be responsible to enable and control DMA data flow within a computing system. The DMA engine moves blocks of data, associated with descriptors in a plurality of queues, from a source to a destination memory location or address, autonomously from control by a computer system's processor. Based on analysis of the data blocks linked to the descriptors in the queues, the DMA engine and its associated DMA fragmenter ensure that data blocks stored linked to descriptors in the queues do not remain idle for an exorbitant period of time. The DMA fragmenter may divide large data blocks into smaller data blocks to ensure that the processing of large data blocks does not preclude the timely processing of smaller data blocks associated with one or more descriptors in the queues. The data blocks stored may be two-dimensional data blocks.

14.

发明申请
POWER-EFFICIENT DEEP NEURAL NETWORK MODULE CONFIGURED FOR PARALLEL KERNEL AND PARALLEL INPUT PROCESSING 审中-公开

公开(公告)号：US20180300615A1

公开(公告)日：2018-10-18

申请号：US15951690

申请日：2018-04-12

申请人： Microsoft Technology Licensing, LLC

发明人： Amol Ashok AMBARDEKAR , Chad Balling McBRIDE , George PETRE , Larry Marvin WALL , Kent D. CEDOLA , Boris BOBROV

IPC分类号： G06N3/063 , G06F1/32

摘要： A deep neural network (DNN) module utilizes parallel kernel and parallel input processing to decrease bandwidth utilization, reduce power consumption, improve neuron multiplier stability, and provide other technical benefits. Parallel kernel processing enables the DNN module to load input data only once for processing by multiple kernels. Parallel input processing enables the DNN module to load kernel data only once for processing with multiple input data. The DNN module can implement other power-saving techniques like clock-gating (i.e. removing the clock from) and power-gating (i.e. removing the power from) banks of accumulators based upon usage of the accumulators. For example, individual banks of accumulators can be power-gated when all accumulators in a bank are not in use, and do not store data for a future calculation. Banks of accumulators can also be clock-gated when all accumulators in a bank are not in use, but store data for a future calculation.

15.

发明申请
PROCESSING DISCONTIGUOUS MEMORY AS CONTIGUOUS MEMORY TO IMPROVE PERFORMANCE OF A NEURAL NETWORK ENVIRONMENT 审中-公开

公开(公告)号：US20180300613A1

公开(公告)日：2018-10-18

申请号：US15829832

申请日：2017-12-01

申请人： Microsoft Technology Licensing, LLC

发明人： George PETRE , Chad Balling McBRIDE , Amol Ashok AMBARDEKAR , Kent D. CEDOLA , Larry Marvin WALL , Boris BOBROV

IPC分类号： G06N3/063 , G06F12/10 , G06N3/04 , G06F13/16

摘要： The performance of a neural network (NN) can be limited by the number of operations being performed. Using a line buffer that is directed to shift a memory block by a selected shift stride for cooperating neurons, data that is operatively residing memory and which would require multiple write cycles into a cooperating line buffer can be processed as in a single line buffer write cycle thereby enhancing the performance of a NN/DNN. A controller and/or iterator can generate one or more instructions having the memory block shifting values for communication to the line buffer. The shifting values can be calculated using various characteristics of the input data as well as the NN/DNN inclusive of the data dimensions. The line buffer can read data for processing, shift the data of the memory block and write the data in the line buffer for subsequent processing.

16.

发明申请
ENHANCING PROCESSING PERFORMANCE OF A DNN MODULE BY BANDWIDTH CONTROL OF FABRIC INTERFACE 审中-公开

公开(公告)号：US20180299943A1

公开(公告)日：2018-10-18

申请号：US15950644

申请日：2018-04-11

申请人： Microsoft Technology Licensing, LLC

发明人： Chad Balling McBRIDE , Timothy Hume HEIL , Amol Ashok AMBARDEKAR , George PETRE , Kent D. CEDOLA , Larry Marvin WALL , Boris BOBROV

IPC分类号： G06F1/32 , G06N3/08

摘要： An exemplary computing environment having a DNN module can maintain one or more bandwidth throttling mechanisms. Illustratively, a first throttling mechanism can specify the number of cycles to wait between transactions on a cooperating fabric component (e.g., data bus). Illustratively, a second throttling mechanism can be a transaction count limiter that operatively sets a threshold of a number of transactions to be processed during a given transaction sequence and limits the number of transactions such as multiple transactions in flight to not exceed the set threshold. In an illustrative operation, in executing these two exemplary calculated throttling parameters, the average bandwidth usage and the peak bandwidth usage can be limited. Operatively, with this fabric bandwidth control, the processing units of the DNN are optimized to process data across each transaction cycle resulting in enhanced processing and lower power consumption.

17.

发明公开
INCREASED PRECISION NEURAL PROCESSING ELEMENT 审中-公开

公开(公告)号：US20230196086A1

公开(公告)日：2023-06-22

申请号：US18174021

申请日：2023-02-24

申请人： MICROSOFT TECHNOLOGY LICENSING, LLC

发明人： Amol A AMBARDEKAR , Boris BOBROV , Kent D. CEDOLA , Chad Balling MCBRIDE , George PETRE , Larry Marvin WALL

IPC分类号： G06N3/063 , G06F7/499 , G06F7/523 , G06F7/57 , G06F9/30

CPC分类号： G06N3/063 , G06F7/57 , G06F7/523 , G06F7/49994 , G06F9/30029

摘要： Neural processing elements are configured with a hardware AND gate configured to perform a logical AND operation between a sign extend signal and a most significant bit (“MSB”) of an operand. The state of the sign extend signal can be based upon a type of a layer of a deep neural network (“DNN”) that generate the operand. If the sign extend signal is logical FALSE, no sign extension is performed. If the sign extend signal is logical TRUE, a concatenator concatenates the output of the hardware AND gate and the operand, thereby extending the operand from an N-bit unsigned binary value to an N+1 bit signed binary value. The neural processing element can also include another hardware AND gate and another concatenator for processing another operand similarly. The outputs of the concatenators for both operands are provided to a hardware binary multiplier.

18.

发明申请
NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION 有权

公开(公告)号：US20230071352A1

公开(公告)日：2023-03-09

申请号：US18054851

申请日：2022-11-11

申请人： MICROSOFT TECHNOLOGY LICENSING, LLC

发明人： Joseph Leon CORKERY , Benjamin Eliot LUNDELL , Larry Marvin WALL , Chad Balling McBRIDE , Amol Ashok AMBARDEKAR , George PETRE , Kent D. CEDOLA , Boris BOBROV

IPC分类号： H03M7/30 , G06N3/04 , G06N3/063 , G06F12/0862 , G06F9/46 , G06F1/324 , G06F3/06 , G06F9/38 , G06F12/08 , G06F12/10 , G06F15/80 , G06F17/15 , G06N3/06 , G06N3/08 , G06N3/10 , H04L45/02 , H04L67/02 , G06F9/30 , H04L67/1001 , G06F9/48 , G06F12/02 , G06F13/16 , G06F1/3234 , G06F13/28

摘要： A deep neural network (“DNN”) module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit can receive an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit can receive a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption.

19.

发明申请
DATA PROCESSING PERFORMANCE ENHANCEMENT FOR NEURAL NETWORKS USING A VIRTUALIZED DATA ITERATOR 审中-公开

公开(公告)号：US20180300633A1

公开(公告)日：2018-10-18

申请号：US15694663

申请日：2017-09-01

申请人： Microsoft Technology Licensing, LLC

发明人： Chad Balling MCBRIDE , George PETRE , Amol Ashok AMBARDEKAR , Kent D. CEDOLA , Larry Marvin WALL , Boris BOBROV

IPC分类号： G06N3/10

摘要： The performance of a neural network (NN) and/or deep neural network (DNN) can limited by the number of operations being performed as well as management of data among the various memory components of the NN/DNN. Using virtualized hardware iterators, data for processing by the NN/DNN can be traversed and configured to optimize the number of operations as well as memory utilization to enhance the overall performance of a NN/DNN. Operatively, an iterator controller can generate instructions for execution by the NN/DNN representative of one more desired iterator operation types and to perform one or more iterator operations. Data can be iterated according to a selected iterator operation and communicated to one or more neuron processors of the NN/DD for processing and output to a destination memory. The iterator operations can be applied to various volumes of data (e.g., blobs) in parallel or multiple slices of the same volume.

20.

发明申请
NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION 审中-公开

公开(公告)号：US20180300606A1

公开(公告)日：2018-10-18

申请号：US15953356

申请日：2018-04-13

申请人： Microsoft Technology Licensing, LLC

发明人： Joseph Leon CORKERY , Benjamin Eliot LUNDELL , Larry Marvin WALL , Chad Balling McBRIDE , Amol Ashok AMBARDEKAR , George PETRE , Kent D. CEDOLA , Boris BOBROV

IPC分类号： G06N3/04 , G06N3/063 , H03M7/30

摘要： A deep neural network (“DNN”) module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit can receive an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit can receive a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption.

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类