Systems and methods for unifying question answering and text classification via span extraction

    公开(公告)号:US11657233B2

    公开(公告)日:2023-05-23

    申请号:US17673709

    申请日:2022-02-16

    摘要: Systems and methods for unifying question answering and text classification via span extraction include a preprocessor for preparing a source text and an auxiliary text based on a task type of a natural language processing task, an encoder for receiving the source text and the auxiliary text from the preprocessor and generating an encoded representation of a combination of the source text and the auxiliary text, and a span-extractive decoder for receiving the encoded representation and identifying a span of text within the source text that is a result of the NLP task. The task type is one of entailment, classification, or regression. In some embodiments, the source text includes one or more of text received as input when the task type is entailment, a list of classifications when the task type is entailment or classification, or a list of similarity options when the task type is regression.

    Multitask learning as question answering

    公开(公告)号:US11600194B2

    公开(公告)日:2023-03-07

    申请号:US16006691

    申请日:2018-06-12

    摘要: Approaches for natural language processing include a multi-layer encoder for encoding words from a context and words from a question in parallel, a multi-layer decoder for decoding the encoded context and the encoded question, a pointer generator for generating distributions over the words from the context, the words from the question, and words in a vocabulary based on an output from the decoder, and a switch. The switch generates a weighting of the distributions over the words from the context, the words from the question, and the words in the vocabulary, generates a composite distribution based on the weighting of the distribution over the first words from the context, the distribution over the second words from the question, and the distribution over the words in the vocabulary, and selects words for inclusion in an answer using the composite distribution.

    Hierarchical and interpretable skill acquisition in multi-task reinforcement learning

    公开(公告)号:US11562287B2

    公开(公告)日:2023-01-24

    申请号:US15885727

    申请日:2018-01-31

    摘要: The disclosed technology reveals a hierarchical policy network, for use by a software agent, to accomplish an objective that requires execution of multiple tasks. A terminal policy learned by training the agent on a terminal task set, serves as a base task set of the intermediate task set. An intermediate policy learned by training the agent on an intermediate task set serves as a base policy of the top policy. A top policy learned by training the agent on a top task set serves as a base task set of the top task set. The agent is configurable to accomplish the objective by traversal of the hierarchical policy network. A current task in a current task set is executed by executing a previously-learned task selected from a corresponding base task set governed by a corresponding base policy, or performing a primitive action selected from a library of primitive actions.

    Probability-based guider
    5.
    发明授权

    公开(公告)号:US11354565B2

    公开(公告)日:2022-06-07

    申请号:US15853530

    申请日:2017-12-22

    摘要: The technology disclosed proposes using a combination of computationally cheap, less-accurate bag of words (BoW) model and computationally expensive, more-accurate long short-term memory (LSTM) model to perform natural processing tasks such as sentiment analysis. The use of cheap, less-accurate BoW model is referred to herein as “skimming”. The use of expensive, more-accurate LSTM model is referred to herein as “reading”. The technology disclosed presents a probability-based guider (PBG). PBG combines the use of BoW model and the LSTM model. PBG uses a probability thresholding strategy to determine, based on the results of the BoW model, whether to invoke the LSTM model for reliably classifying a sentence as positive or negative. The technology disclosed also presents a deep neural network-based decision network (DDN) that is trained to learn the relationship between the BoW model and the LSTM model and to invoke only one of the two models.

    SYSTEM AND METHODS FOR TRAINING TASK-ORIENTED DIALOGUE (TOD) LANGUAGE MODELS

    公开(公告)号:US20220139384A1

    公开(公告)日:2022-05-05

    申请号:US17088206

    申请日:2020-11-03

    IPC分类号: G10L15/18 G10L15/06

    摘要: Embodiments described herein provide methods and systems for training task-oriented dialogue (TOD) language models. In some embodiments, a TOD language model may receive a TOD dataset including a plurality of dialogues and a model input sequence may be generated from the dialogues using a first token prefixed to each user utterance and a second token prefixed to each system response of the dialogues. In some embodiments, the first token or the second token may be randomly replaced with a mask token to generate a masked training sequence and a masked language modeling (MLM) loss may be computed using the masked training sequence. In some embodiments, the TOD language model may be updated based on the MLM loss.

    Interpretable counting in visual question answering

    公开(公告)号:US11270145B2

    公开(公告)日:2022-03-08

    申请号:US16781179

    申请日:2020-02-04

    摘要: Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates bounding boxes for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score determines how well a corresponding one of the identified objects is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.

    Dynamic coattention network for question answering

    公开(公告)号:US10963782B2

    公开(公告)日:2021-03-30

    申请号:US15421193

    申请日:2017-01-31

    摘要: The technology disclosed relates to an end-to-end neural network for question answering, referred to herein as “dynamic coattention network (DCN)”. Roughly described, the DCN includes an encoder neural network and a coattentive encoder that capture the interactions between a question and a document in a so-called “coattention encoding”. The DCN also includes a decoder neural network and highway maxout networks that process the coattention encoding to estimate start and end positions of a phrase in the document that responds to the question.

    Dense video captioning
    10.
    发明授权

    公开(公告)号:US10958925B2

    公开(公告)日:2021-03-23

    申请号:US16687405

    申请日:2019-11-18

    摘要: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. Generating the dense captioning based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back propagated to the decoder stack, the encoder stack, and the proposal decoder.