Multi-stage image querying
    11.
    Granted patent

    Publication number: US10997233B2

    Publication date: 2021-05-04

    Application number: US15097086

    Filing date: 2016-04-12

    Abstract: In some examples, a computing device refines feature information of query text. The device repeatedly determines attention information based at least in part on feature information of the image and the feature information of the query text, and modifies the feature information of the query text based at least in part on the attention information. The device selects at least one of a predetermined plurality of outputs based at least in part on the refined feature information of the query text. In some examples, the device operates a convolutional computational model to determine feature information of the image. The device operates network computational models (NCMs) to determine feature information of the query and to determine attention information based at least in part on the feature information of the image and the feature information of the query. Examples include a microphone to detect audio corresponding to the query text.
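The refinement loop in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the patented method: the function names, dot-product scoring, softmax normalization, and the additive query update are all hypothetical choices.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def refine_query(image_regions, query, num_stages=2):
    """Repeatedly attend over image-region features and fold the result into the query."""
    for _ in range(num_stages):
        # Attention information: one weight per image region (dot-product scoring is assumed).
        weights = softmax([dot(region, query) for region in image_regions])
        # Attention-weighted summary of the image features.
        attended = [
            sum(w * region[d] for w, region in zip(weights, image_regions))
            for d in range(len(query))
        ]
        # Modify the query features with the attention result (additive update is assumed).
        query = [q + a for q, a in zip(query, attended)]
    return query


def select_output(refined_query, output_vectors):
    """Return the index of the predetermined output that best matches the refined query."""
    scores = [dot(vec, refined_query) for vec in output_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


regions = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical image-region features
refined = refine_query(regions, [2.0, 0.0])
choice = select_output(refined, [[1.0, 0.0], [0.0, 1.0]])
```

Each pass sharpens the query toward the regions it already matches, which is why more than one stage can change the selected output.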

    Multiple-action computational model training and operation

    Publication number: US10909450B2

    Publication date: 2021-02-02

    Application number: US15084113

    Filing date: 2016-03-29

    Abstract: A processing unit can determine a first feature value corresponding to a session by operating a first network computational model (NCM) based at least in part on information of the session. The processing unit can determine respective second feature values corresponding to individual actions of a plurality of actions by operating a second NCM. The second NCM can use a common set of parameters in determining the second feature values. The processing unit can determine respective expectation values of some of the actions of the plurality of actions based on the first feature value and the respective second feature values. The processing unit can select a first action of the plurality of actions based on at least one of the expectation values. In some examples, the processing unit can operate an NCM to determine expectation values based on information of a session and information of respective actions.
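A minimal sketch of the selection step described in this abstract follows. It assumes each NCM is a single linear map and that an expectation value is the inner product of the session feature and an action feature; both choices, and all names, are hypothetical and are not stated in the abstract.

```python
def embed(features, weights):
    """Apply a linear map (a stand-in for an NCM) to a feature vector."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def select_action(session_info, actions, session_weights, action_weights):
    """Score every candidate action against the session and return the best index."""
    session_feature = embed(session_info, session_weights)
    expectations = []
    for action_info in actions:
        # The same action_weights are reused for every action (common parameter set).
        action_feature = embed(action_info, action_weights)
        # Inner-product combination is an assumption of this sketch.
        expectations.append(
            sum(s * a for s, a in zip(session_feature, action_feature))
        )
    best = max(range(len(actions)), key=expectations.__getitem__)
    return best, expectations


identity = [[1.0, 0.0], [0.0, 1.0]]
best, expectations = select_action(
    [1.0, 0.0],                      # hypothetical session information
    [[0.0, 1.0], [1.0, 0.0]],        # two hypothetical candidate actions
    identity,
    identity,
)
```

Because the action parameters are shared, the number of candidate actions can vary from one session to the next without changing the model.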

    End-to-end learning of dialogue agents for information access

    Publication number: US10546066B2

    Publication date: 2020-01-28

    Application number: US15406425

    Filing date: 2017-01-13

    Abstract: Described herein are systems, methods, and techniques by which a processing unit can build an end-to-end dialogue agent model for end-to-end learning of dialogue agents for information access and apply the end-to-end dialogue agent model with soft attention over knowledge base entries to make the dialogue system differentiable. In various examples, the processing unit can apply the end-to-end dialogue agent model to a source of input, fill slots for output from the knowledge base entries, induce a posterior distribution over the entities in a knowledge base or induce a posterior distribution of a target of the requesting user over entities from a knowledge base, develop an end-to-end differentiable model of a dialogue agent, use supervised and/or imitation learning to initialize network parameters, and/or calculate a modified version of an episodic algorithm, e.g., the REINFORCE algorithm, for training an end-to-end differentiable model based on user feedback.
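The idea of a posterior distribution over knowledge base entities can be sketched as a soft lookup. This is a simplified illustration under stated assumptions: the function name, the dictionary layout of rows and beliefs, the independence of slots, and the uniform fallback are all hypothetical, and missing values are ignored.

```python
def entity_posterior(kb_rows, slot_beliefs):
    """Soft lookup: weight each row by the believed probability of its slot values."""
    scores = []
    for row in kb_rows:
        p = 1.0
        for slot, value in row.items():
            # Slots are treated as independent (an assumption of this sketch).
            p *= slot_beliefs[slot].get(value, 0.0)
        scores.append(p)
    total = sum(scores)
    if total == 0.0:
        # No row is supported by the beliefs; fall back to a uniform distribution.
        return [1.0 / len(kb_rows)] * len(kb_rows)
    return [s / total for s in scores]


kb = [
    {"genre": "comedy", "year": "2010"},
    {"genre": "drama", "year": "2010"},
]
beliefs = {
    "genre": {"comedy": 0.8, "drama": 0.2},
    "year": {"2010": 1.0},
}
posterior = entity_posterior(kb, beliefs)
```

Unlike a hard database query, every row keeps a nonzero weight whenever the beliefs allow it, so the result varies smoothly with the beliefs and gradients can flow through the lookup.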

    Training and operating multi-layer computational models

    Publication number: US10445650B2

    Publication date: 2019-10-15

    Application number: US14949156

    Filing date: 2015-11-23

    Abstract: A processing unit can successively operate layers of a multilayer computational graph (MCG) according to a forward computational order to determine a topic value associated with a document based at least in part on content values associated with the document. The processing unit can successively determine, according to a reverse computational order, layer-specific deviation values associated with the layers based at least in part on the topic value, the content values, and a characteristic value associated with the document. The processing unit can determine a model adjustment value based at least in part on the layer-specific deviation values. The processing unit can modify at least one parameter associated with the MCG based at least in part on the model adjustment value. The MCG can be operated to provide a result characteristic value associated with test content values of a test document.

    KNOWLEDGE-GUIDED STRUCTURAL ATTENTION PROCESSING

    Publication number: US20190303440A1

    Publication date: 2019-10-03

    Application number: US16444616

    Filing date: 2019-06-18

    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.

    Discriminative pretraining of deep neural networks

    Publication number: US10325200B2

    Publication date: 2019-06-18

    Application number: US14873166

    Filing date: 2015-10-01

    Abstract: Discriminative pretraining technique embodiments are presented that pretrain the hidden layers of a Deep Neural Network (DNN). In general, a one-hidden-layer neural network is trained first using labels discriminatively with error back-propagation (BP). Then, after discarding an output layer in the previous one-hidden-layer neural network, another randomly initialized hidden layer is added on top of the previously trained hidden layer along with a new output layer that represents the targets for classification or recognition. The resulting multiple-hidden-layer DNN is then discriminatively trained using the same strategy, and so on until the desired number of hidden layers is reached. This produces a pretrained DNN. The discriminative pretraining technique embodiments have the advantage of bringing the DNN layer weights close to a good local optimum, while still leaving them in a range with a high gradient so that they can be fine-tuned effectively.
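The layer-growing schedule in this abstract can be outlined as a structural sketch. It shows only the order of operations; the actual back-propagation step is delegated to a caller-supplied `train_fn`, which is a hypothetical placeholder, as are the other names and the initialization range.

```python
import random


def random_layer(n_in, n_out, rng):
    """Randomly initialized weight matrix: n_out rows of n_in weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def discriminative_pretrain(n_inputs, hidden_sizes, n_classes, train_fn, seed=0):
    """Grow the network one hidden layer at a time, training with labels after each addition."""
    if not hidden_sizes:
        raise ValueError("at least one hidden layer is required")
    rng = random.Random(seed)
    hidden = []
    output = None
    prev = n_inputs
    for size in hidden_sizes:
        # Add a randomly initialized hidden layer on top of the layers trained so far.
        hidden.append(random_layer(prev, size, rng))
        # Create a fresh output layer; the previous one is discarded.
        output = random_layer(size, n_classes, rng)
        # Discriminatively train the whole stack (train_fn is expected to update in place).
        train_fn(hidden, output)
        prev = size
    return hidden, output


calls = []


def fake_train(hidden, output):
    """Stand-in for labeled back-propagation; records the shape it was given."""
    calls.append((len(hidden), len(output)))


hidden, output = discriminative_pretrain(4, [3, 2], 5, fake_train)
```

In a real setting, `train_fn` would run error back-propagation with labels over all current layers, and the returned weights would serve as the starting point for fine-tuning.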

    JOINT LANGUAGE UNDERSTANDING AND DIALOGUE MANAGEMENT

    Publication number: US20180157638A1

    Publication date: 2018-06-07

    Application number: US15368380

    Filing date: 2016-12-02

    Abstract: A processing unit can operate an end-to-end recurrent neural network (RNN) with limited contextual dialogue memory that can be jointly trained by supervised signals: user slot tagging, intent prediction, and/or system action prediction. The end-to-end RNN, or joint model, has shown advantages over separate models for natural language understanding (NLU) and dialogue management and can capture expressive feature representations beyond conventional aggregation of slot tags and intents, to mitigate effects of noisy output from NLU. The joint model can apply a supervised signal from system actions to refine the NLU model. By back-propagating errors associated with system action prediction to the NLU model, the joint model can use machine learning to predict user intent, perform slot tagging, and make system action predictions based on user input, e.g., utterances across a number of domains.

    MULTI-DOMAIN JOINT SEMANTIC FRAME PARSING
    19.
    Patent application

    Publication number: US20170372199A1

    Publication date: 2017-12-28

    Application number: US15228990

    Filing date: 2016-08-04

    CPC classification number: G06N3/08 G06N3/0445 G10L15/16 G10L15/1822 G10L15/22

    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long short-term memory (RNN-LSTM), for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The JRNN can leverage semantic intents (such as finding or identifying, e.g., a domain-specific goal) and slots (such as dates, times, locations, subjects, etc.) across multiple domains.

    MULTI-STAGE IMAGE QUERYING
    20.
    Patent application

    Publication number: US20170293638A1

    Publication date: 2017-10-12

    Application number: US15097086

    Filing date: 2016-04-12

    Abstract: In some examples, a computing device refines feature information of query text. The device repeatedly determines attention information based at least in part on feature information of the image and the feature information of the query text, and modifies the feature information of the query text based at least in part on the attention information. The device selects at least one of a predetermined plurality of outputs based at least in part on the refined feature information of the query text. In some examples, the device operates a convolutional computational model to determine feature information of the image. The device operates network computational models (NCMs) to determine feature information of the query and to determine attention information based at least in part on the feature information of the image and the feature information of the query. Examples include a microphone to detect audio corresponding to the query text.
