Decreasing neural network inference times using softmax approximation

Invention Grant

US10671909B2 Decreasing neural network inference times using softmax approximation (Status: Under examination - Published)

Patent Title: Decreasing neural network inference times using softmax approximation
Application No.: US16586702

Application Date: 2019-09-27
Publication No.: US10671909B2

Publication Date: 2020-06-02
Inventor: Yang Li, Sanjiv Kumar, Pei-Hung Chen, Si Si, Cho-Jui Hsieh
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: Fish & Richardson P.C.
Main IPC: G06N3/04
IPC: G06N3/04; G06F17/16; G06F17/18; G06K9/62

Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for decreasing neural network inference times using softmax approximation. One of the methods includes maintaining data specifying a respective softmax weight vector for each output in a vocabulary of possible neural network outputs; receiving a neural network input; processing the neural network input using one or more initial neural network layers to generate a context vector for the neural network input; and generating an approximate score distribution over the vocabulary of possible neural network outputs for the neural network input, comprising: processing the context vector using a screening model configured to predict a proper subset of the vocabulary for the context input; and generating a respective logit for each output that is in the proper subset, comprising applying the softmax weight vector for the output to the context vector.
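
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the screening model is built, so the cluster-centroid screener below (`screen`, `centroids`, `candidate_sets`) is a hypothetical stand-in, and assigning a score of zero to outputs outside the predicted subset is likewise an assumption made for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def screen(context, centroids, candidate_sets):
    """Hypothetical screening model (an assumption, not taken from the abstract).

    Picks the cluster whose centroid has the largest inner product with the
    context vector and returns that cluster's candidate output indices,
    which form a proper subset of the vocabulary.
    """
    best = max(range(len(centroids)), key=lambda k: dot(centroids[k], context))
    return candidate_sets[best]


def approximate_distribution(context, softmax_weights, centroids, candidate_sets):
    """Approximate score distribution over the whole vocabulary.

    Logits are computed only for outputs in the screened subset by applying
    each output's softmax weight vector to the context vector. Outputs
    outside the subset receive a score of zero (assumption).
    """
    subset = screen(context, centroids, candidate_sets)
    logits = {i: dot(softmax_weights[i], context) for i in subset}

    # Numerically stable softmax restricted to the subset.
    peak = max(logits.values())
    exps = {i: math.exp(z - peak) for i, z in logits.items()}
    total = sum(exps.values())

    probs = [0.0] * len(softmax_weights)
    for i, e in exps.items():
        probs[i] = e / total
    return probs


# Toy data: a 4-output vocabulary with 2-dimensional context vectors.
softmax_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
centroids = [[1.0, 1.0], [-1.0, -1.0]]
candidate_sets = [[0, 1], [2, 3]]

context = [2.0, 1.0]  # would come from the initial neural network layers
probs = approximate_distribution(context, softmax_weights, centroids, candidate_sets)
```

In this toy setup only two of the four inner products are evaluated per input. With a large vocabulary and small candidate sets, that reduction in logit computations is where the inference-time saving would come from.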

Public/Granted literature

US20200104686A1 DECREASING NEURAL NETWORK INFERENCE TIMES USING SOFTMAX APPROXIMATION Public/Granted date: 2020-04-02

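IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	.Using neural network models
G06N3/04	..Architecture, e.g. interconnection topology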