EFFICIENT CALCULATIONS OF NEGATIVE CURVATURE IN A HESSIAN FREE DEEP LEARNING FRAMEWORK
Abstract:
A method for training a deep learning network includes defining a loss function corresponding to the network. Training samples are received and current parameter values are set to initial parameter values. Then, a computing platform is used to perform an optimization method which iteratively minimizes the loss function. Each iteration comprises the following steps. An eigCG solver is applied to determine a descent direction by minimizing a local approximated quadratic model of the loss function with respect to the current parameter values and the training dataset. An approximate leftmost eigenvector and eigenvalue are determined while solving the Newton system. The approximate leftmost eigenvector is used as a negative curvature direction to prevent the optimization method from converging to saddle points. Curvilinear and adaptive line searches are used to guide the optimization method to a local minimum. At the end of the iteration, the current parameter values are updated based on the descent direction.
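The following is a minimal illustrative sketch, not the patented implementation: it shows how a single outer iteration of a Hessian-free scheme can run conjugate gradient on the Newton system H d = -g and, as a byproduct, extract an approximate leftmost eigenpair of the Hessian from the tridiagonal matrix that CG implicitly builds (the eigCG idea). The function and argument names (hessian_free_step, grad, hvp) are assumptions introduced here, and the curvilinear and adaptive line searches described in the abstract are omitted.

```python
import numpy as np

def hessian_free_step(grad, hvp, theta, cg_iters=50, tol=1e-6):
    """Sketch of one outer iteration (assumed names, simplified logic):
    run CG on H d = -g, collect Lanczos-style scalars, and recover an
    approximate leftmost eigenpair of H from the CG tridiagonal."""
    g = grad(theta)
    n = g.size
    d = np.zeros(n)
    r = -g.copy()
    p = r.copy()
    alphas, betas, V = [], [], []

    for i in range(cg_iters):
        rnorm = np.linalg.norm(r)
        if rnorm < tol:
            break
        V.append(((-1.0) ** i) * r / rnorm)      # Lanczos vector from the CG residual
        Hp = hvp(theta, p)                       # Hessian-vector product (no explicit Hessian)
        pHp = p @ Hp
        if abs(pHp) < 1e-12:                     # near-zero curvature along p: stop
            V.pop()
            break
        alpha = (r @ r) / pHp
        d += alpha * p
        r_new = r - alpha * Hp
        beta = (r_new @ r_new) / (r @ r)
        alphas.append(alpha)
        betas.append(beta)
        r, p = r_new, r_new + beta * p

    k = len(alphas)
    if k == 0:
        return d, None, None

    # Tridiagonal matrix T implicitly generated by CG (standard CG/Lanczos relations).
    T = np.zeros((k, k))
    for i in range(k):
        T[i, i] = 1.0 / alphas[i] + (betas[i - 1] / alphas[i - 1] if i > 0 else 0.0)
        if i + 1 < k:
            T[i, i + 1] = T[i + 1, i] = np.sqrt(betas[i]) / alphas[i]

    evals, evecs = np.linalg.eigh(T)
    leftmost_val = evals[0]
    leftmost_vec = np.column_stack(V[:k]) @ evecs[:, 0]   # lift back to parameter space
    return d, leftmost_val, leftmost_vec
```

In this sketch, a negative leftmost_val signals that the quadratic model has negative curvature at the current point; an outer loop would then combine the descent direction d with the negative curvature direction leftmost_vec (for example through a curvilinear search, as the abstract describes) before updating the parameters.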