-
1.
Publication No.: US20230325673A1
Publication date: 2023-10-12
Application No.: US18209337
Filing date: 2023-06-13
Inventors: Eugene Indenbom, Daniil Anastasiev
IPC classification: G06N3/084, G06F40/284
CPC classification: G06N3/084, G06F40/284, G10L17/18
Abstract: Systems and methods for neural network training utilizing loss functions reflecting neighbor token dependencies. An example method comprises: receiving a training dataset comprising a plurality of labeled tokens; determining, by a neural network, a first tag associated with a current token processed by the neural network, a second tag associated with a previous token which has been processed by the neural network before processing the current token, and a third tag associated with a next token to be processed by the neural network after processing the current token; computing, for the training dataset, a value of a loss function reflecting a first loss value, a second loss value, and a third loss value, wherein the first loss value is represented by a first difference of the first tag and a first label associated with the current token by the training dataset, wherein the second loss value is represented by a second difference of the second tag and a second label associated with the previous token by the training dataset, and wherein the third loss value is represented by a third difference of the third tag and a third label associated with the next token by the training dataset; and adjusting a parameter of the neural network based on the value of the loss function.
-
Publication No.: US11715008B2
Publication date: 2023-08-01
Application No.: US16236382
Filing date: 2018-12-29
Inventors: Eugene Indenbom, Daniil Anastasiev
IPC classification: G06F40/205, G06N3/084, G06F40/284, G10L17/18
CPC classification: G06N3/084, G06F40/284, G10L17/18
Abstract: Systems and methods for neural network training utilizing loss functions reflecting neighbor token dependencies. An example method comprises: receiving a training dataset comprising a plurality of labeled tokens; determining, by a neural network, a first tag associated with a current token processed by the neural network, a second tag associated with a previous token which has been processed by the neural network before processing the current token, and a third tag associated with a next token to be processed by the neural network after processing the current token; computing, for the training dataset, a value of a loss function reflecting a first loss value, a second loss value, and a third loss value, wherein the first loss value is represented by a first difference of the first tag and a first label associated with the current token by the training dataset, wherein the second loss value is represented by a second difference of the second tag and a second label associated with the previous token by the training dataset, and wherein the third loss value is represented by a third difference of the third tag and a third label associated with the next token by the training dataset; and adjusting a parameter of the neural network based on the value of the loss function.
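Illustrative note: the three-part loss described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation. It assumes the network emits, at each position, three probability distributions over tags (for the previous, current, and next token), and it uses cross-entropy as the "difference" between a predicted tag and its label; the abstract does not specify either choice. The names `cross_entropy`, `neighbor_aware_loss`, and the `"prev"`/`"cur"`/`"next"` layout are hypothetical, and skipping neighbors that fall outside the sequence is also an assumption.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the reference label (assumed loss)."""
    return -math.log(probs[label])


def neighbor_aware_loss(predictions, labels):
    """Sum the current-, previous-, and next-token loss values over a sequence.

    predictions[i] is a dict with keys "prev", "cur", "next" (hypothetical
    layout), each mapping tag -> probability as predicted at position i.
    labels[i] is the reference tag of token i from the training dataset.
    """
    total = 0.0
    for i, pred in enumerate(predictions):
        # First loss value: tag predicted for the current token.
        total += cross_entropy(pred["cur"], labels[i])
        # Second loss value: tag predicted for the previous token, if any.
        if i > 0:
            total += cross_entropy(pred["prev"], labels[i - 1])
        # Third loss value: tag predicted for the next token, if any.
        if i + 1 < len(labels):
            total += cross_entropy(pred["next"], labels[i + 1])
    return total


# Toy two-token example with uniform predictions over two tags.
uniform = {"DET": 0.5, "NOUN": 0.5}
labels = ["DET", "NOUN"]
predictions = [
    {"prev": uniform, "cur": uniform, "next": uniform},
    {"prev": uniform, "cur": uniform, "next": uniform},
]
loss = neighbor_aware_loss(predictions, labels)
```

In this toy case four terms contribute (current and next at position 0, current and previous at position 1), each equal to ln 2. The final step of the abstract, adjusting a network parameter based on the loss value, would be handled by a gradient-based optimizer and is omitted here.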
-