NEURAL NETWORK TRAINING UTILIZING LOSS FUNCTIONS REFLECTING NEIGHBOR TOKEN DEPENDENCIES

    Publication No.: US20230325673A1

    Publication Date: 2023-10-12

    Application No.: US18209337

    Filing Date: 2023-06-13

    IPC Classification: G06N3/084 G06F40/284

    Abstract: Systems and methods for neural network training utilizing loss functions reflecting neighbor token dependencies. An example method comprises: receiving a training dataset comprising a plurality of labeled tokens; determining, by a neural network, a first tag associated with a current token processed by the neural network, a second tag associated with a previous token which has been processed by the neural network before processing the current token, and a third tag associated with a next token to be processed by the neural network after processing the current token; computing, for the training dataset, a value of a loss function reflecting a first loss value, a second loss value, and a third loss value, wherein the first loss value is represented by a first difference of the first tag and a first label associated with the current token by the training dataset, wherein the second loss value is represented by a second difference of the second tag and a second label associated with the previous token by the training dataset, and wherein the third loss value is represented by a third difference of the third tag and a third label associated with the next token by the training dataset; and adjusting a parameter of the neural network based on the value of the loss function.
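The three-part loss described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the abstract only says each loss value is a "difference" between a predicted tag and its label, so cross-entropy is assumed here as one plausible choice. The names `cross_entropy`, `neighbor_aware_loss`, `w_prev`, and `w_next`, the dictionary layout of the predictions, and the skipping of missing neighbors at sequence boundaries are all hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the correct label index.

    Assumption: cross-entropy stands in for the "difference" between a
    predicted tag and its label; the abstract does not name a measure.
    """
    return -math.log(max(probs[label], 1e-12))


def neighbor_aware_loss(predictions, labels, w_prev=1.0, w_next=1.0):
    """Average per-token loss over current, previous, and next tag predictions.

    predictions[i] is a dict with keys "cur", "prev", "next", each a list of
    tag probabilities emitted while processing token i.
    labels[i] is the integer label index of token i.
    w_prev and w_next are hypothetical weights for the neighbor terms.
    """
    n = len(labels)
    total = 0.0
    for i, pred in enumerate(predictions):
        # First loss value: current token's tag vs. its own label.
        total += cross_entropy(pred["cur"], labels[i])
        # Second loss value: tag predicted for the previous token.
        # Assumption: skipped at the start of the sequence.
        if i > 0:
            total += w_prev * cross_entropy(pred["prev"], labels[i - 1])
        # Third loss value: tag predicted for the next token.
        # Assumption: skipped at the end of the sequence.
        if i < n - 1:
            total += w_next * cross_entropy(pred["next"], labels[i + 1])
    return total / n


if __name__ == "__main__":
    labels = [0, 1]
    predictions = [
        {"cur": [0.9, 0.1], "prev": [0.5, 0.5], "next": [0.2, 0.8]},
        {"cur": [0.3, 0.7], "prev": [0.6, 0.4], "next": [0.5, 0.5]},
    ]
    print(neighbor_aware_loss(predictions, labels))
```

In a training loop, the value returned by such a function would drive the parameter update (for example, through backpropagation in an automatic-differentiation framework); that step is omitted here because it depends on the network architecture, which the abstract does not specify.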

    Neural network training utilizing loss functions reflecting neighbor token dependencies

    Publication No.: US11715008B2

    Publication Date: 2023-08-01

    Application No.: US16236382

    Filing Date: 2018-12-29

    Abstract: Systems and methods for neural network training utilizing loss functions reflecting neighbor token dependencies. An example method comprises: receiving a training dataset comprising a plurality of labeled tokens; determining, by a neural network, a first tag associated with a current token processed by the neural network, a second tag associated with a previous token which has been processed by the neural network before processing the current token, and a third tag associated with a next token to be processed by the neural network after processing the current token; computing, for the training dataset, a value of a loss function reflecting a first loss value, a second loss value, and a third loss value, wherein the first loss value is represented by a first difference of the first tag and a first label associated with the current token by the training dataset, wherein the second loss value is represented by a second difference of the second tag and a second label associated with the previous token by the training dataset, and wherein the third loss value is represented by a third difference of the third tag and a third label associated with the next token by the training dataset; and adjusting a parameter of the neural network based on the value of the loss function.