Patent search ap:("Microsoft Technology Licensing Page LLC") AND inv:"Eric Chung"

11.

Granted invention patent
In-line network accelerator (Active)

Publication (announcement) number: US10129153B2

Publication (announcement) date: 2018-11-13

Application number: US15595925

Application date: 2017-05-15

Applicant: Microsoft Technology Licensing, LLC

Inventor: Adrian Caulfield, Eric Chung, Doug Burger, Derek Chiou

IPC: H04L12/801, H04L29/06, H04L12/803, H04L12/26, H04L12/833

Abstract: A smart NIC (Network Interface Card) is provided with features to enable the smart NIC to operate as an in-line NIC between a host's NIC and a network. The smart NIC provides pass-through transmission of network flows for the host. Packets sent to and from the host pass through the smart NIC. As a pass-through point, the smart NIC is able to accelerate the performance of the pass-through network flows by analyzing packets, inserting packets, dropping packets, inserting or recognizing congestion information, and so forth. In addition, the smart NIC provides a lightweight transport protocol (LTP) module that enables it to establish connections with other smart NICs. The LTP connections allow the smart NICs to exchange data without passing network traffic through their respective hosts.

12.

Invention patent application
Customized Integrated Circuit For Serial Performance Of Smith Waterman Analysis (Under examination, published)

Publication (announcement) number: US20180137237A1

Publication (announcement) date: 2018-05-17

Application number: US15349725

Application date: 2016-11-11

Applicant: Microsoft Technology Licensing, LLC

Inventor: Daniel Lo, Eric Chung, Kalin Ovtcharov, Ravindra Pandya, David Heckerman

IPC: G06F19/22, G06F17/16, G06F7/02

CPC classification number: G16B30/00, G06F7/02, G06F17/10, G16B30/10

Abstract: Comparisons between two nucleotide sequences can be performed by customized integrated circuitry that can implement a Smith Waterman analysis in series, as opposed to the parallel implementations known in the art. Series performance enables such customized integrated circuitry to take advantage of optimizations, including enveloping thresholds that demarcate between cells of a two-dimensional matrix for which nucleotide comparisons are to be performed, and cells of the two-dimensional matrix for which no such comparison need be performed, and, instead, a value of zero can simply be entered. Additionally, such customized integrated circuitry facilitates the combination of multiple control units, each directing the comparison of a unique pair of nucleotides, with a single calculation engine that can generate values for individual cells of the two-dimensional matrices by which such pairs of nucleotides are compared.
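As a rough illustration of the idea in this abstract, the sketch below fills a Smith Waterman matrix one cell at a time and skips every cell outside an envelope, leaving it at zero. The envelope shape (a fixed band around the diagonal), the function name, and the scoring values are all assumptions made for the example; the abstract does not specify how the thresholds are chosen.

```python
def smith_waterman_banded(a, b, band, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score, computed serially, one cell at a time.

    Cells with |i - j| > band lie outside the (assumed) envelope and are
    never computed; they keep their initial value of zero.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            if abs(i - j) > band:
                continue  # outside the envelope: the cell stays zero
            score = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,
                h[i - 1][j - 1] + score,  # align a[i-1] with b[j-1]
                h[i - 1][j] + gap,        # gap in b
                h[i][j - 1] + gap,        # gap in a
            )
            best = max(best, h[i][j])
    return best
```

A narrow band saves work but can miss alignments that sit far from the diagonal, so the band width is a trade-off rather than a free optimization.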

13.

Invention patent application
MACHINE LEARNING CLASSIFICATION ON HARDWARE ACCELERATORS WITH STACKED MEMORY (Under examination, published)

Publication (announcement) number: US20160379137A1

Publication (announcement) date: 2016-12-29

Application number: US14754323

Application date: 2015-06-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Douglas C. Burger, Derek Chiou, Eric Chung, Andrew R. Putnam

IPC: G06N99/00

CPC classification number: G06N20/00, G06F9/46, G06F9/50, Y02D10/22

Abstract: A method is provided for processing on an acceleration component a machine learning classification model. The machine learning classification model includes a plurality of decision trees, the decision trees including a first amount of decision tree data. The acceleration component includes an acceleration component die and a memory stack disposed in an integrated circuit package. The memory die includes an acceleration component memory having a second amount of memory less than the first amount of decision tree data. The memory stack includes a memory bandwidth greater than about 50 GB/sec and a power efficiency of greater than about 20 MB/sec/mW. The method includes slicing the model into a plurality of model slices, each of the model slices having a third amount of decision tree data less than or equal to the second amount of memory, storing the plurality of model slices on the memory stack, and for each of the model slices, copying the model slice to the acceleration component memory, and processing the model slice using a set of input data on the acceleration component to produce a slice result.
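The slicing step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the greedy packing rule, the function names, the representation of trees as plain callables, and the summing of slice results are all assumptions.

```python
def slice_model(tree_sizes, capacity):
    """Greedily group tree indices so each slice fits in `capacity`.

    `tree_sizes` holds the (hypothetical) size of each decision tree and
    `capacity` stands in for the accelerator's local memory.
    """
    slices, current, used = [], [], 0
    for idx, size in enumerate(tree_sizes):
        if size > capacity:
            raise ValueError(f"tree {idx} does not fit in accelerator memory")
        if used + size > capacity:
            slices.append(current)  # close the full slice
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        slices.append(current)
    return slices


def classify(trees, tree_sizes, capacity, sample):
    """Process one slice at a time and sum the per-slice results."""
    total = 0.0
    for model_slice in slice_model(tree_sizes, capacity):
        # Stand-in for copying the slice from the memory stack into
        # the accelerator's local memory.
        local = [trees[i] for i in model_slice]
        total += sum(tree(sample) for tree in local)  # slice result
    return total
```

For example, `slice_model([4, 3, 2, 5, 1], 6)` groups the five trees into three slices, each no larger than six units.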

14.

Invention patent application
CONVOLUTIONAL NEURAL NETWORKS ON HARDWARE ACCELERATORS (Under examination, published)

Publication (announcement) number: US20160379109A1

Publication (announcement) date: 2016-12-29

Application number: US14754367

Application date: 2015-06-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Eric Chung, Karin Strauss, Kalin Ovtcharov, Joo-Young Kim, Olatunji Ruwase

IPC: G06N3/063, G06N3/04

CPC classification number: G06N3/063, G06F15/7803, G06N3/04, G06N3/0454

Abstract: A hardware acceleration component is provided for implementing a convolutional neural network. The hardware acceleration component includes an array of N rows and M columns of functional units, an array of N input data buffers configured to store input data, and an array of M weights data buffers configured to store weights data. Each of the N input data buffers is coupled to a corresponding one of the N rows of functional units. Each of the M weights data buffers is coupled to a corresponding one of the M columns of functional units. Each functional unit in a row is configured to receive a same set of input data. Each functional unit in a column is configured to receive a same set of weights data from the weights data buffer coupled to the row. Each of the functional units is configured to perform a convolution of the received input data and the received weights data, and the M columns of functional units are configured to provide M planes of output data.
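The data flow in this abstract can be modelled in a few lines. The sketch below is only a software analogy under stated assumptions: each input buffer holds a 1-D sequence, each weights buffer holds a 1-D kernel, and "convolution" means the sliding dot product commonly used in CNNs. The function names are hypothetical.

```python
def conv1d(signal, kernel):
    """Sliding dot product of `kernel` over `signal` (no padding)."""
    k = len(kernel)
    return [
        sum(signal[i + t] * kernel[t] for t in range(k))
        for i in range(len(signal) - k + 1)
    ]


def run_array(input_buffers, weight_buffers):
    """Model an N x M grid of functional units.

    Unit (n, m) sees the same input as every unit in row n and the same
    weights as every unit in column m. Column m yields output plane m.
    """
    return [
        [conv1d(inp, wts) for inp in input_buffers]  # one row per input buffer
        for wts in weight_buffers                    # one plane per column
    ]
```

With two input buffers and two kernels, `run_array` returns two output planes, one per weights buffer, each containing one result row per input buffer.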

15.

Granted invention patent
Deep neural network processing on hardware accelerators with stacked memory (Active)

Publication (announcement) number: US10540588B2

Publication (announcement) date: 2020-01-21

Application number: US14754344

Application date: 2015-06-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Douglas C. Burger, Derek Chiou, Eric Chung, Andrew R. Putnam

IPC: G06F17/50, G06F15/78, G06N3/08, G06N3/063, G06N5/02

Abstract: A method is provided for processing on an acceleration component a deep neural network. The method includes configuring the acceleration component to perform forward propagation and backpropagation stages of the deep neural network. The acceleration component includes an acceleration component die and a memory stack disposed in an integrated circuit package. The memory stack has a memory bandwidth greater than about 50 GB/sec and a power efficiency of greater than about 20 MB/sec/mW.

16.

Granted invention patent
Machine learning classification on hardware accelerators with stacked memory (Active)

Publication (announcement) number: US10452995B2

Publication (announcement) date: 2019-10-22

Application number: US14754323

Application date: 2015-06-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Douglas C. Burger, Derek Chiou, Eric Chung, Andrew R. Putnam

IPC: G06F15/18, G06N20/00, G06F9/46, G06F9/50

Abstract: A method is provided for processing on an acceleration component a machine learning classification model. The machine learning classification model includes a plurality of decision trees, the decision trees including a first amount of decision tree data. The acceleration component includes an acceleration component die and a memory stack disposed in an integrated circuit package. The memory die includes an acceleration component memory having a second amount of memory less than the first amount of decision tree data. The memory stack includes a memory bandwidth greater than about 50 GB/sec and a power efficiency of greater than about 20 MB/sec/mW. The method includes slicing the model into a plurality of model slices, each of the model slices having a third amount of decision tree data less than or equal to the second amount of memory, storing the plurality of model slices on the memory stack, and for each of the model slices, copying the model slice to the acceleration component memory, and processing the model slice using a set of input data on the acceleration component to produce a slice result.

17.

Granted invention patent
Lightweight transport protocol (Active)

Publication (announcement) number: US09888095B2

Publication (announcement) date: 2018-02-06

Application number: US14752713

Application date: 2015-06-26

Applicant: Microsoft Technology Licensing, LLC

Inventor: Adrian Caulfield, Eric Chung, Doug Burger, Derek Chiou

IPC: H04L29/06, H04L12/741, H04L12/931, H04L12/947, H04L12/935

CPC classification number: H04L69/165, H04L12/4633, H04L49/25, H04L49/30

Abstract: A smart NIC (Network Interface Card) is provided with features to enable the smart NIC to operate as an in-line NIC between a host's NIC and a network. The smart NIC provides pass-through transmission of network flows for the host. Packets sent to and from the host pass through the smart NIC. As a pass-through point, the smart NIC is able to accelerate the performance of the pass-through network flows by analyzing packets, inserting packets, dropping packets, inserting or recognizing congestion information, and so forth. In addition, the smart NIC provides a lightweight transport protocol (LTP) module that enables it to establish connections with other smart NICs. The LTP connections allow the smart NICs to exchange data without passing network traffic through their respective hosts.

18.

Invention patent application
SERVER SYSTEMS WITH HARDWARE ACCELERATORS INCLUDING STACKED MEMORY (Under examination, published)

Publication (announcement) number: US20160379686A1

Publication (announcement) date: 2016-12-29

Application number: US14754295

Application date: 2015-06-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Douglas C. Burger, Andrew R. Putnam, Eric Chung

IPC: G11C5/02, G06F3/06

CPC classification number: G11C5/02, G06F3/0604, G06F3/0631, G06F3/0683, G06F15/7821, G06N3/063, G06N3/084, G06N5/025, Y02D10/12, Y02D10/13

Abstract: A server unit component is provided that includes a host component including a CPU, and an acceleration component coupled to the host component. The acceleration component includes an acceleration component die and a memory stack. The acceleration component die and the memory stack are disposed in an integrated circuit package. The memory stack has a memory bandwidth greater than about 50 GB/sec and a power efficiency of greater than about 20 MB/sec/mW.

19.

Granted invention patent
Convolutional neural networks on hardware accelerators (Active)

Publication (announcement) number: US11200486B2

Publication (announcement) date: 2021-12-14

Application number: US16440948

Application date: 2019-06-13

Applicant: Microsoft Technology Licensing, LLC

Inventor: Eric Chung, Karin Strauss, Kalin Ovtcharov, Joo-Young Kim, Olatunji Ruwase

IPC: G06N3/063, G06F15/78, G06N3/04

Abstract: A hardware acceleration component is provided for implementing a convolutional neural network. The hardware acceleration component includes an array of N rows and M columns of functional units, an array of N input data buffers configured to store input data, and an array of M weights data buffers configured to store weights data. Each of the N input data buffers is coupled to a corresponding one of the N rows of functional units. Each of the M weights data buffers is coupled to a corresponding one of the M columns of functional units. Each functional unit in a row is configured to receive a same set of input data. Each functional unit in a column is configured to receive a same set of weights data from the weights data buffer coupled to the row. Each of the functional units is configured to perform a convolution of the received input data and the received weights data, and the M columns of functional units are configured to provide M planes of output data.

20.

Granted invention patent
Deep neural network partitioning on servers (Active)

Publication (announcement) number: US10452971B2

Publication (announcement) date: 2019-10-22

Application number: US14754384

Application date: 2015-06-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Eric Chung, Karin Strauss, Kalin Ovtcharov, Joo-Young Kim, Olatunji Ruwase

IPC: G06N3/04, G06N3/063

Abstract: A method is provided for implementing a deep neural network on a server component that includes a host component including a CPU and a hardware acceleration component coupled to the host component. The deep neural network includes a plurality of layers. The method includes partitioning the deep neural network into a first segment and a second segment, the first segment including a first subset of the plurality of layers, the second segment including a second subset of the plurality of layers, configuring the host component to implement the first segment, and configuring the hardware acceleration component to implement the second segment.
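The partitioning in this abstract can be pictured with a small sketch. It assumes the layers are plain callables, that the split is a single cut point, and that the two segments run back to back; the abstract does not state which layers fall in which segment, and the function names are hypothetical.

```python
def partition(layers, split):
    """Split an ordered list of layers into two segments at `split`."""
    if not 0 <= split <= len(layers):
        raise ValueError("split must lie within the layer list")
    return layers[:split], layers[split:]


def run_segment(segment, x):
    """Apply each layer of a segment in order."""
    for layer in segment:
        x = layer(x)
    return x


def run_partitioned(layers, split, x):
    """Run the first segment (host stand-in), then the second (accelerator stand-in)."""
    first_segment, second_segment = partition(layers, split)
    intermediate = run_segment(first_segment, x)   # would run on the host CPU
    return run_segment(second_segment, intermediate)  # would run on the accelerator
```

Because the segments compose, the result matches running all layers in one place; only the location of the work changes.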
