-
Publication No.: US10565493B2
Publication Date: 2020-02-18
Application No.: US15421016
Filing Date: 2017-01-31
Applicant: salesforce.com, inc.
Inventor: Stephen Joseph Merity , Caiming Xiong , James Bradbury , Richard Socher
Abstract: The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state of the art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.
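A minimal sketch of the mixture computation the abstract describes: a sentinel score competes with pointer scores over the recent context, and the probability mass the sentinel wins gates over to the vocabulary softmax. Variable names and the toy dimensions are illustrative, not taken from the patent.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def pointer_sentinel_mix(vocab_logits, ptr_scores, sentinel_score):
    """Mix a vocabulary softmax with a pointer distribution over the recent
    context. The sentinel score is normalized together with the pointer
    scores; the mass it wins (the gate g) is routed to the vocabulary
    distribution, and the remaining mass stays on context positions."""
    p_vocab = softmax(vocab_logits)
    joint = softmax(ptr_scores + [sentinel_score])
    g = joint[-1]          # gate: probability mass given to the vocabulary
    p_ptr = joint[:-1]     # probability mass over recent context positions
    # Final probability of word w: g * p_vocab[w] plus the pointer mass at
    # every context position that holds w.
    return g, p_vocab, p_ptr
```

By construction the gate and the pointer masses sum to one, so the final per-word probabilities form a valid distribution.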
-
Publication No.: US20190251431A1
Publication Date: 2019-08-15
Application No.: US15974075
Filing Date: 2018-05-08
Applicant: salesforce.com, inc.
Inventor: Nitish Shirish Keskar , Bryan McCann , Caiming Xiong , Richard Socher
CPC classification number: G06N3/08 , G06F17/2785 , G06F17/2881 , G06N3/0445 , G06N3/0454 , G06N5/04
Abstract: Approaches for multitask learning as question answering include a method for training that includes receiving a plurality of training samples including training samples from a plurality of task types, presenting the training samples to a neural model to generate an answer, determining an error between the generated answer and the natural language ground truth answer for each training sample presented, and adjusting parameters of the neural model based on the error. Each of the training samples includes a natural language context, question, and ground truth answer. An order in which the training samples are presented to the neural model includes initially selecting the training samples according to a first training strategy and switching to selecting the training samples according to a second training strategy. In some embodiments the first training strategy is a sequential training strategy and the second training strategy is a joint training strategy.
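The switch between the two training strategies can be sketched as an ordering over (task, sample) pairs: sequential at first, joint afterwards. The step-based switch point and uniform task mixing below are illustrative assumptions; the patent describes the strategies more generally.

```python
import random

def training_order(task_samples, switch_step, total_steps, seed=0):
    """Yield (task, sample) pairs. Before switch_step, use a sequential
    strategy: tasks get contiguous blocks of steps, one task at a time.
    From switch_step onward, use a joint strategy: pick a task uniformly
    at random for every sample."""
    rng = random.Random(seed)
    tasks = list(task_samples)
    for step in range(total_steps):
        if step < switch_step:
            # Sequential: each task owns a contiguous share of the early steps.
            task = tasks[(step * len(tasks)) // switch_step]
        else:
            # Joint: all tasks are mixed for the remainder of training.
            task = rng.choice(tasks)
        yield task, rng.choice(task_samples[task])
```

Each yielded sample would be presented to the neural model as a (context, question, ground truth answer) triple, with parameters adjusted from the answer error.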
-
Publication No.: US10282663B2
Publication Date: 2019-05-07
Application No.: US15237575
Filing Date: 2016-08-15
Applicant: salesforce.com, inc.
Inventor: Richard Socher , Caiming Xiong , Kai Sheng Tai
Abstract: The technology disclosed uses a 3D deep convolutional neural network architecture (DCNNA) equipped with so-called subnetwork modules which perform dimensionality reduction operations on a 3D radiological volume before the volume is subjected to computationally expensive operations. Also, the subnetworks convolve 3D data at multiple scales by subjecting the 3D data to parallel processing by different 3D convolutional layer paths. Such multi-scale operations are computationally cheaper than traditional CNNs that perform serial convolutions. In addition, performance of the subnetworks is further improved through 3D batch normalization (BN), which normalizes the 3D input fed to the subnetworks and in turn increases the learning rates of the 3D DCNNA. After several layers of 3D convolution and 3D sub-sampling across a series of subnetwork modules, a feature map with reduced vertical dimensionality is generated from the 3D radiological volume and fed into one or more fully connected layers.
-
Publication No.: US20180373682A1
Publication Date: 2018-12-27
Application No.: US15982841
Filing Date: 2018-05-17
Applicant: salesforce.com, inc.
Inventor: Bryan McCann , Caiming Xiong , Richard Socher
Abstract: A system is provided for natural language processing. In some embodiments, the system includes an encoder for generating context-specific word vectors for at least one input sequence of words. The encoder is pre-trained using training data for performing a first natural language processing task. A neural network performs a second natural language processing task on the at least one input sequence of words using the context-specific word vectors. The first natural language processing task is different from the second natural language processing task, and the neural network is trained separately from the encoder. In some embodiments, the first natural language processing task can be machine translation, and the second natural language processing task can be one of sentiment analysis, question classification, entailment classification, and question answering.
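The reuse described in the abstract can be sketched as feature construction for the downstream network: fixed word vectors are concatenated with context-specific vectors from the frozen, task-1 encoder. The function and table names here are illustrative assumptions.

```python
def downstream_features(words, pretrain_encoder, word_vecs):
    """Sketch of reusing a pretrained encoder: concatenate fixed per-word
    vectors with context-specific vectors produced by an encoder that was
    pre-trained on a different task (e.g. translation). The result is fed
    to a separately trained downstream network."""
    context_vecs = pretrain_encoder(words)       # frozen, task-1 encoder
    return [word_vecs[w] + c                     # list concatenation
            for w, c in zip(words, context_vecs)]
```

The downstream classifier (sentiment, entailment, question answering, etc.) never updates the encoder; it only consumes these richer inputs.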
-
Publication No.: US20180336453A1
Publication Date: 2018-11-22
Application No.: US15953265
Filing Date: 2018-04-13
Applicant: salesforce.com, inc.
Inventor: Stephen Joseph Merity , Richard Socher , James Bradbury , Caiming Xiong
Abstract: A system automatically generates recurrent neural network (RNN) architectures for performing specific tasks, for example, machine translation. The system represents RNN architectures using a domain-specific language (DSL). The system generates candidate RNN architectures. The system predicts performance of the generated candidate RNN architectures, for example, using a neural network. The system filters the candidate RNN architectures based on their predicted performance. The system generates code for a selected candidate architecture. The generated code represents an RNN that is configured to perform the specific task. The system executes the generated code, for example, to evaluate the RNN or to use the RNN in an application.
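The filtering step can be sketched as ranking DSL candidates by a cheap predicted score and keeping only the best few for actual training. The DSL strings and the scoring table below are illustrative, not the patent's grammar.

```python
def select_architectures(candidates, predict_score, top_k=2):
    """Sketch of the search loop's filtering step: rank candidate RNN
    architectures (represented here as DSL strings) by a predicted
    performance score and keep the top few for code generation and
    real evaluation."""
    ranked = sorted(candidates, key=predict_score, reverse=True)
    return ranked[:top_k]
```

In the full system the predictor itself would be a neural network trained on (architecture, measured performance) pairs from earlier search rounds.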
-
Publication No.: US20180096219A1
Publication Date: 2018-04-05
Application No.: US15835261
Filing Date: 2017-12-07
Applicant: salesforce.com, inc.
Inventor: Richard Socher
CPC classification number: G06F17/2715 , G06K9/00677 , G06K9/6219 , G06K9/6292 , G06N3/0445 , G06N3/0454
Abstract: Deep learning is applied to combined image and text analysis of messages that include images and text. A convolutional neural network is trained against the images and a recurrent neural network against the text. A classifier predicts human response to the message, including classifying reactions to the image, to the text, and overall to the message. Visualizations are provided of neural network analytic emphasis on parts of the images and text. Other types of media in messages can also be analyzed by a combination of specialized neural networks.
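A late-fusion sketch of the final prediction step: image features from the convolutional network and text features from the recurrent network are combined by a linear classifier. The fusion form and all weights are illustrative assumptions, not the patent's trained model.

```python
import math

def predict_reaction(image_feats, text_feats, w_img, w_txt, bias):
    """Combine a CNN's image features and an RNN's text features with a
    linear layer over the concatenation, then squash to a probability of
    a positive human reaction to the message."""
    score = bias
    score += sum(w * f for w, f in zip(w_img, image_feats))
    score += sum(w * f for w, f in zip(w_txt, text_feats))
    return 1.0 / (1.0 + math.exp(-score))
```

Per-modality scores (image-only, text-only) computed the same way would give the separate image/text reaction classifications the abstract mentions.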
-
Publication No.: US20160350653A1
Publication Date: 2016-12-01
Application No.: US15170884
Filing Date: 2016-06-01
Applicant: salesforce.com, inc.
Inventor: Richard Socher , Ankit Kumar , Ozan Irsoy , Mohit Iyyer , Caiming Xiong , Stephen Merity , Romain Paulus
CPC classification number: G06N5/04 , G06N3/0445
Abstract: A novel unified neural network framework, the dynamic memory network, is disclosed. This unified framework reduces every task in natural language processing to a question answering problem over an input sequence. Inputs and questions are used to create and connect deep memory sequences. Answers are then generated based on dynamically retrieved memories.
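A toy sketch of the dynamic retrieval loop: the memory starts from the question, and each hop attends over the input facts and folds the retrieved summary back into the memory. The dot-product attention and the averaging update are simplifying assumptions; the patent's model uses learned, gated updates.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def episodic_memory(fact_vecs, question_vec, n_hops=2):
    """Toy episodic retrieval: score each fact against the current memory,
    form an attention-weighted summary, and merge it into the memory.
    The answer module would generate from the final memory."""
    memory = list(question_vec)
    for _ in range(n_hops):
        weights = softmax([dot(f, memory) for f in fact_vecs])
        summary = [sum(w * f[i] for w, f in zip(weights, fact_vecs))
                   for i in range(len(memory))]
        memory = [(m + s) / 2 for m, s in zip(memory, summary)]
    return memory
```

Multiple hops let the memory chain through facts that are only transitively related to the question.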
-
Publication No.: US11822897B2
Publication Date: 2023-11-21
Application No.: US17463227
Filing Date: 2021-08-31
Applicant: salesforce.com, inc.
Inventor: Kazuma Hashimoto , Raffaella Buschiazzo , James Bradbury , Teresa Anna Marshall , Caiming Xiong , Richard Socher
Abstract: Approaches for the translation of structured text include an embedding module for encoding and embedding source text in a first language, an encoder for encoding output of the embedding module, a decoder for iteratively decoding output of the encoder based on generated tokens in translated text from previous iterations, a beam module for constraining output of the decoder with respect to possible embedded tags to include in the translated text for a current iteration using a beam search, and a layer for selecting a token to be included in the translated text for the current iteration. The translated text is in a second language different from the first language. In some embodiments, the approach further includes scoring and pointer modules for selecting the token based on the output of the beam module or copied from the source text or reference text from a training pair best matching the source text.
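The tag constraint applied during beam search can be sketched as a per-step allowed set: the decoder may only emit a tag that appears in the source and that keeps the tag sequence well-formed. The stack-based formulation and tag syntax here are illustrative assumptions.

```python
def allowed_tags(open_stack, source_tags):
    """Sketch of the beam constraint for embedded tags: at each decoding
    step, permit any opening tag present in the source text, plus the one
    closing tag that matches the most recently opened (still unclosed) tag."""
    allowed = {t for t in source_tags if not t.startswith("</")}
    if open_stack:
        allowed.add("</" + open_stack[-1].strip("<>") + ">")
    return allowed
```

Intersecting each beam hypothesis's candidate tokens with this set guarantees the translated text carries well-nested markup copied from the source.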
-
Publication No.: US11783164B2
Publication Date: 2023-10-10
Application No.: US17080656
Filing Date: 2020-10-26
Applicant: salesforce.com, inc.
Inventor: Kazuma Hashimoto , Caiming Xiong , Richard Socher
IPC: G06N3/04 , G06N3/084 , G06F40/30 , G06F40/205 , G06F40/216 , G06F40/253 , G06F40/284 , G06N3/044 , G06N3/045 , G06N3/047 , G06N3/063 , G06N3/08 , G10L15/18 , G10L25/30 , G10L15/16 , G06F40/00
CPC classification number: G06N3/04 , G06F40/205 , G06F40/216 , G06F40/253 , G06F40/284 , G06F40/30 , G06N3/044 , G06N3/045 , G06N3/047 , G06N3/063 , G06N3/08 , G06N3/084 , G06F40/00 , G10L15/16 , G10L15/18 , G10L25/30
Abstract: The technology disclosed provides a so-called “joint many-task neural network model” to solve a variety of increasingly complex natural language processing (NLP) tasks using growing depth of layers in a single end-to-end model. The model is successively trained by considering linguistic hierarchies, directly connecting word representations to all model layers, explicitly using predictions in lower tasks, and applying a so-called “successive regularization” technique to prevent catastrophic forgetting. Three examples of lower-level model layers are a part-of-speech (POS) tagging layer, a chunking layer, and a dependency parsing layer. Two examples of higher-level model layers are a semantic relatedness layer and a textual entailment layer. The model achieves state-of-the-art results on chunking, dependency parsing, semantic relatedness, and textual entailment.
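The "successive regularization" idea can be sketched as an extra loss term: while training on a new task, shared parameters are pulled toward the values they held after the previous task. The plain L2 form and the delta value are illustrative choices.

```python
def successive_reg_loss(task_loss, params, prev_params, delta=1e-2):
    """Sketch of successive regularization: penalize movement of shared
    parameters away from their post-previous-task values, limiting
    catastrophic forgetting while the higher-level task is trained."""
    penalty = sum((p - q) ** 2 for p, q in zip(params, prev_params))
    return task_loss + delta * penalty
```

With delta at zero this reduces to ordinary task training; larger delta trades new-task fit for retention of earlier tasks.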
-
Publication No.: US11669712B2
Publication Date: 2023-06-06
Application No.: US16559196
Filing Date: 2019-09-03
Applicant: salesforce.com, inc.
Inventor: Lichao Sun , Kazuma Hashimoto , Jia Li , Richard Socher , Caiming Xiong
IPC: G06N3/08 , G06F40/232 , G06N3/045 , G06N3/008 , G06N3/044
CPC classification number: G06N3/008 , G06F40/232 , G06N3/044 , G06N3/045 , G06N3/08
Abstract: A method for evaluating robustness of one or more target neural network models using natural typos. The method includes receiving one or more natural typo generation rules associated with a first task associated with a first input document type, receiving a first target neural network model, and receiving a first document and its corresponding ground truth labels. The method further includes generating one or more natural typos for the first document based on the one or more natural typo generation rules, and providing, to the first target neural network model, a test document generated based on the first document and the one or more natural typos as an input document to generate a first output. A robustness evaluation result of the first target neural network model is generated based on a comparison between the first output and the ground truth labels.
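The test-document generation step can be sketched as applying rule-based perturbations to a bounded number of words. The (pattern, replacement) rule format and word-level application are illustrative assumptions about what a "natural typo generation rule" looks like.

```python
import random

def apply_typo_rules(text, rules, seed=0, max_typos=2):
    """Generate a noisy test document by applying natural-typo rules, each
    given as a (pattern, replacement) pair, to at most max_typos words.
    The model's output on the result is compared to the clean document's
    ground truth labels to score robustness."""
    rng = random.Random(seed)
    words = text.split()
    candidates = [i for i, w in enumerate(words)
                  if any(pat in w for pat, _ in rules)]
    for i in rng.sample(candidates, min(max_typos, len(candidates))):
        for pat, rep in rules:
            if pat in words[i]:
                words[i] = words[i].replace(pat, rep, 1)
                break
    return " ".join(words)
```

Seeding the generator makes the perturbed test set reproducible across evaluation runs of different target models.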
-