-
Publication Number: US11829727B2
Publication Date: 2023-11-28
Application Number: US17239297
Filing Date: 2021-04-23
Applicant: salesforce.com, inc.
Inventor: Jasdeep Singh , Nitish Shirish Keskar , Bryan McCann
Abstract: Approaches for cross-lingual regularization for multilingual generalization include a method for training a natural language processing (NLP) deep learning module. The method includes accessing a first dataset having a first training data entry, the first training data entry including one or more natural language input text strings in a first language; translating at least one of the one or more natural language input text strings of the first training data entry from the first language to a second language; creating a second training data entry by starting with the first training data entry and substituting the at least one of the natural language input text strings in the first language with the translation of the at least one of the natural language input text strings in the second language; adding the second training data entry to a second dataset; and training the deep learning module using the second dataset.
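A minimal Python sketch of the substitution step described in this abstract, under illustrative assumptions: training entries are dictionaries, the fields to translate are passed explicitly, and translate is a stand-in translation function (none of these names come from the patent).

from copy import deepcopy

def augment_entry(entry, text_fields, translate, src_lang="en", tgt_lang="de"):
    """Create a second training data entry by substituting the selected
    first-language text fields with their second-language translations."""
    new_entry = deepcopy(entry)
    for field in text_fields:
        # Replace the first-language string with its translation.
        new_entry[field] = translate(entry[field], src_lang, tgt_lang)
    return new_entry

# Each augmented entry would then be added to the second dataset used for training.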
-
Publication Number: US11829721B2
Publication Date: 2023-11-28
Application Number: US17161214
Filing Date: 2021-01-28
Applicant: salesforce.com, inc.
Inventor: Tong Niu , Semih Yavuz , Yingbo Zhou , Nitish Shirish Keskar , Huan Wang , Caiming Xiong
IPC: G10L15/065 , G06N3/0455 , G06F18/20 , G06F40/20 , G06F40/289 , G06F40/45 , G06F40/284 , G06F40/242 , G06F18/22 , G06F18/214 , G06N7/01
CPC classification number: G06F40/284 , G06F18/214 , G06F18/22 , G06F40/242 , G06N7/01
Abstract: Embodiments described herein provide dynamic blocking, a decoding algorithm that enables large-scale pretrained language models to generate high-quality paraphrases in an unsupervised setting. Specifically, in order to obtain an alternative surface form, when the language model emits a token that is present in the source sequence, it is prevented from generating, at the next time step, the token that immediately follows that source token in the source sequence. In this way, the language model is forced to generate a paraphrase of the input source sequence with mostly different wording.
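A minimal sketch of the blocking constraint described in this abstract, assuming token-id sequences and PyTorch-style logits; the function names and the masking-with-negative-infinity scheme are illustrative assumptions, not the patented decoder.

from collections import defaultdict
import torch

def build_block_map(source_ids):
    """Map each source token to the set of tokens that immediately follow it
    in the source sequence."""
    block_map = defaultdict(set)
    for cur, nxt in zip(source_ids, source_ids[1:]):
        block_map[cur].add(nxt)
    return block_map

def apply_dynamic_blocking(logits, last_token, block_map):
    """If the last emitted token appears in the source, block its source
    successors at the next time step by masking their logits."""
    if last_token in block_map:
        logits[..., list(block_map[last_token])] = float("-inf")
    return logits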
-
Publication Number: US11615249B2
Publication Date: 2023-03-28
Application Number: US16996726
Filing Date: 2020-08-18
Applicant: salesforce.com, inc.
Inventor: Bryan McCann , Nitish Shirish Keskar , Caiming Xiong , Richard Socher
IPC: G06F40/30 , G06N3/08 , G06N5/04 , G06N3/04 , G06F40/56 , G06F16/242 , G06F16/33 , G06F16/332 , G06N20/20 , G06N20/10 , G06N20/00 , G10L15/16 , G10L15/18 , G06N3/044 , G06N3/045
Abstract: Approaches for multitask learning as question answering include an input layer for encoding a context and a question, a self-attention based transformer including an encoder and a decoder, a first bi-directional long short-term memory (biLSTM) for further encoding an output of the encoder, a long short-term memory (LSTM) for generating a context-adjusted hidden state from the output of the decoder and a hidden state, an attention network for generating first attention weights based on an output of the first biLSTM and an output of the LSTM, a vocabulary layer for generating a distribution over a vocabulary, a context layer for generating a distribution over the context, and a switch for generating a weighting between the distributions over the vocabulary and the context, generating a composite distribution based on the weighting, and selecting a word of an answer using the composite distribution.
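The vocabulary/context mixture at the output can be sketched briefly. A minimal PyTorch illustration under assumed tensor shapes; the sigmoid switch and scatter-based pointer distribution are common choices used here for clarity, not necessarily the exact claimed formulation.

import torch
import torch.nn.functional as F

def composite_distribution(vocab_logits, context_attention, context_token_ids, switch_logit):
    """Mix a distribution over the vocabulary with a pointer distribution over
    the context, weighted by a learned switch.
    vocab_logits: (batch, vocab_size); context_attention: (batch, ctx_len),
    rows sum to 1; context_token_ids: (batch, ctx_len); switch_logit: (batch, 1)."""
    p_vocab = F.softmax(vocab_logits, dim=-1)
    # Scatter the attention weights onto the vocabulary ids of the context tokens.
    p_context = torch.zeros_like(p_vocab)
    p_context.scatter_add_(1, context_token_ids, context_attention)
    gamma = torch.sigmoid(switch_logit)          # weighting between the two distributions
    return gamma * p_vocab + (1.0 - gamma) * p_context

# The answer word at each step can then be selected from the composite
# distribution, e.g. composite_distribution(...).argmax(dim=-1).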
-
Publication Number: US20220391640A1
Publication Date: 2022-12-08
Application Number: US17532851
Filing Date: 2021-11-22
Applicant: salesforce.com, inc.
Inventor: Chen Xing , Wenhao Liu , Chu Hong Hoi , Nitish Shirish Keskar , Caiming Xiong
Abstract: Embodiments are directed to pre-training a transformer model using more parameters for sophisticated patterns (PSP++). The transformer model is divided into a held-out model and a main model. A forward pass and a backward pass are performed on the held-out model, where the forward pass determines the self-attention hidden states of the held-out model and the backward pass determines the loss of the held-out model. A forward pass on the main model is performed to determine the self-attention hidden states of the main model. The self-attention hidden states of the main model are concatenated with the self-attention hidden states of the held-out model. A backward pass is performed on the main model to determine a loss of the main model. The parameters of the held-out model are updated to reflect the loss of the held-out model, and the parameters of the main model are updated to reflect the loss of the main model.
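One training step of the held-out/main split can be sketched as follows. A minimal PyTorch sketch under assumed interfaces: each model returns its self-attention hidden states and a loss when called, and loss_from_hidden is a hypothetical helper that computes the main model's loss from the concatenated states.

import torch

def psp_training_step(held_out_model, main_model, held_out_opt, main_opt, batch):
    # Forward and backward pass on the held-out model.
    held_out_hidden, held_out_loss = held_out_model(batch)
    held_out_opt.zero_grad()
    held_out_loss.backward()

    # Forward pass on the main model to obtain its self-attention hidden states.
    main_hidden, _ = main_model(batch)

    # Concatenate the main model's hidden states with the held-out model's.
    combined = torch.cat([main_hidden, held_out_hidden.detach()], dim=-1)

    # Backward pass on the main model from its own loss.
    main_loss = main_model.loss_from_hidden(combined, batch)
    main_opt.zero_grad()
    main_loss.backward()

    # Update each model's parameters to reflect its own loss.
    held_out_opt.step()
    main_opt.step()
    return held_out_loss.item(), main_loss.item()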
-
Publication Number: US20220129629A1
Publication Date: 2022-04-28
Application Number: US17161214
Filing Date: 2021-01-28
Applicant: salesforce.com, inc.
Inventor: Tong Niu , Semih Yavuz , Yingbo Zhou , Nitish Shirish Keskar , Huan Wang , Caiming Xiong
IPC: G06F40/284 , G06F40/242 , G06K9/62 , G06N7/00
Abstract: Embodiments described herein provide dynamic blocking, a decoding algorithm that enables large-scale pretrained language models to generate high-quality paraphrases in an unsupervised setting. Specifically, in order to obtain an alternative surface form, when the language model emits a token that is present in the source sequence, it is prevented from generating, at the next time step, the token that immediately follows that source token in the source sequence. In this way, the language model is forced to generate a paraphrase of the input source sequence with mostly different wording.
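This publication is the application corresponding to US11829721B2 above; as a companion to the helpers sketched there, a self-contained toy greedy decode shows where the constraint is applied at each step (the scoring function is a stand-in, not the patent's language model).

from collections import defaultdict

def decode_with_blocking(source_ids, score_next, max_len=20, bos=0, eos=1):
    """Greedy decoding over token ids: a token that follows the last emitted
    token in the source is excluded from the candidates at the next step."""
    block_map = defaultdict(set)
    for cur, nxt in zip(source_ids, source_ids[1:]):
        block_map[cur].add(nxt)

    output = [bos]
    for _ in range(max_len):
        scores = score_next(output)                       # dict: token id -> score
        blocked = block_map.get(output[-1], set())
        candidates = {t: s for t, s in scores.items() if t not in blocked}
        if not candidates:                                # fall back if everything is blocked
            candidates = scores
        nxt = max(candidates, key=candidates.get)
        output.append(nxt)
        if nxt == eos:
            break
    return output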
-
Publication Number: US20200285706A1
Publication Date: 2020-09-10
Application Number: US16399429
Filing Date: 2019-04-30
Applicant: salesforce.com, inc.
Inventor: Jasdeep Singh , Nitish Shirish Keskar , Bryan McCann
Abstract: Approaches for cross-lingual regularization for multilingual generalization include a method for training a natural language processing (NLP) deep learning module. The method includes accessing a first dataset having a first training data entry, the first training data entry including one or more natural language input text strings in a first language; translating at least one of the one or more natural language input text strings of the first training data entry from the first language to a second language; creating a second training data entry by starting with the first training data entry and substituting the at least one of the natural language input text strings in the first language with the translation of the at least one of the natural language input text strings in the second language; adding the second training data entry to a second dataset; and training the deep learning module using the second dataset.
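This publication is in the same family as US11829727B2 above; a short companion sketch shows the dataset-level loop implied by the abstract, with a trivial stand-in for the translation step (the "text" field and the lambda are illustrative assumptions).

def make_second_dataset(first_dataset, translate, src_lang="en", tgt_lang="es"):
    """Build the second dataset by adding, for each first-dataset entry, a copy
    whose text has been translated into the second language."""
    second_dataset = []
    for entry in first_dataset:
        new_entry = dict(entry)
        new_entry["text"] = translate(entry["text"], src_lang, tgt_lang)
        second_dataset.append(new_entry)
    return second_dataset

# Example with a trivial stand-in translation:
# make_second_dataset([{"text": "What is the capital of France?"}],
#                     lambda s, a, b: "[" + b + "] " + s)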
-
Publication Number: US20190130273A1
Publication Date: 2019-05-02
Application Number: US15884125
Filing Date: 2018-01-30
Applicant: salesforce.com, inc.
Inventor: Nitish Shirish Keskar , Karim Ahmed , Richard Socher
Abstract: A method for sequence-to-sequence prediction using a neural network model includes generating an encoded representation based on an input sequence using an encoder of the neural network model and predicting an output sequence based on the encoded representation using a decoder of the neural network model. The neural network model includes a plurality of model parameters learned according to a machine learning process. At least one of the encoder or the decoder includes a branched attention layer. Each branch of the branched attention layer includes an interdependent scaling node configured to scale an intermediate representation of the branch by a learned scaling parameter. The learned scaling parameter depends on one or more other learned scaling parameters of one or more other interdependent scaling nodes of one or more other branches of the branched attention layer.
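A minimal sketch of a branched attention layer with interdependent scaling, assuming PyTorch's nn.MultiheadAttention for each branch and a softmax over the per-branch scaling parameters so that each branch's weight depends on the others; the dimensions and the softmax tie are illustrative assumptions, not the claimed architecture.

import torch
import torch.nn as nn

class BranchedAttention(nn.Module):
    def __init__(self, d_model, num_branches, num_heads=1):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.MultiheadAttention(d_model, num_heads, batch_first=True)
            for _ in range(num_branches)
        )
        # One learned scaling parameter per branch.
        self.scale_logits = nn.Parameter(torch.zeros(num_branches))

    def forward(self, x):
        # The softmax ties the scaling parameters together: each branch's
        # effective scale depends on the parameters of every other branch.
        scales = torch.softmax(self.scale_logits, dim=0)
        branch_outputs = [attn(x, x, x)[0] for attn in self.branches]
        return sum(s * out for s, out in zip(scales, branch_outputs))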