-
Publication No.: US10997233B2
Publication Date: 2021-05-04
Application No.: US15097086
Filing Date: 2016-04-12
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xiaodong He , Li Deng , Jianfeng Gao , Alex Smola , Zichao Yang
IPC: G06F16/583 , G06F16/683 , G06F16/33 , G06F16/335 , G06N3/08 , G06N3/04
Abstract: In some examples, a computing device refines feature information of query text. The device repeatedly determines attention information based at least in part on feature information of the image and the feature information of the query text, and modifies the feature information of the query text based at least in part on the attention information. The device selects at least one of a predetermined plurality of outputs based at least in part on the refined feature information of the query text. In some examples, the device operates a convolutional computational model to determine feature information of the image. The device operates network computational models (NCMs) to determine feature information of the query and to determine attention information based at least in part on the feature information of the image and the feature information of the query. Examples include a microphone to detect audio corresponding to the query text.
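The refinement loop this abstract describes (attend over image features, fold the attended context back into the query features, repeat) can be sketched in a few lines of numpy. This is a toy illustration only: the dimensions, the random projection matrices `W_i` and `W_q`, and the function names are all assumptions, standing in for what a trained model would learn.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def refine_query(img_feats, q, hops=2, seed=0):
    """Repeatedly attend over image regions and fold the attended
    context back into the query feature vector (one hop per pass)."""
    rng = np.random.default_rng(seed)
    d = q.shape[0]
    # Hypothetical projections; a trained model would learn these.
    W_i = rng.standard_normal((d, img_feats.shape[1])) * 0.1
    W_q = rng.standard_normal((d, d)) * 0.1
    for _ in range(hops):
        # Attention score of each image region given the current query state.
        scores = np.array([q @ (W_i @ r) for r in img_feats])
        attn = softmax(scores)            # attention information
        context = attn @ img_feats        # attended image summary
        q = np.tanh(W_q @ q + context)    # modify the query feature information
    return q, attn

regions = np.random.default_rng(1).standard_normal((9, 16))  # 9 regions, 16-dim
query = np.random.default_rng(2).standard_normal(16)         # query text features
refined, attn = refine_query(regions, query)
```

The refined query vector would then be scored against the predetermined outputs; selecting the highest-scoring one corresponds to the "selects at least one of a predetermined plurality of outputs" step.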
-
Publication No.: US10909450B2
Publication Date: 2021-02-02
Application No.: US15084113
Filing Date: 2016-03-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jianshu Chen , Li Deng , Jianfeng Gao , Xiaodong He , Lihong Li , Ji He , Mari Ostendorf
Abstract: A processing unit can determine a first feature value corresponding to a session by operating a first network computational model (NCM) based at least in part on information of the session. The processing unit can determine respective second feature values corresponding to individual actions of a plurality of actions by operating a second NCM. The second NCM can use a common set of parameters in determining the second feature values. The processing unit can determine respective expectation values of some of the actions of the plurality of actions based on the first feature value and the respective second feature values. The processing unit can select a first action of the plurality of actions based on at least one of the expectation values. In some examples, the processing unit can operate an NCM to determine expectation values based on information of a session and information of respective actions.
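The two-network arrangement above (one NCM embeds the session, a second NCM with shared parameters embeds every candidate action, and the two embeddings combine into per-action expectation values) can be sketched as follows. All sizes, weights, and names here are illustrative assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: one NCM for the session (state), a second NCM
# whose single weight matrix is shared across all actions.
W_state = rng.standard_normal((8, 20)) * 0.1    # first NCM parameters
W_action = rng.standard_normal((8, 12)) * 0.1   # common parameters for all actions

def expectation_values(session_info, action_infos):
    s = np.tanh(W_state @ session_info)         # first feature value
    a = np.tanh(action_infos @ W_action.T)      # second feature values (one row per action)
    return a @ s                                # one expectation value per action

session = rng.standard_normal(20)               # information of the session
actions = rng.standard_normal((5, 12))          # information of 5 candidate actions
q = expectation_values(session, actions)
best = int(np.argmax(q))                        # select a first action
```

Because `W_action` is shared, the model handles an arbitrary, varying number of candidate actions with a fixed parameter count, which is the point of using a common set of parameters in the second NCM.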
-
Publication No.: US10546066B2
Publication Date: 2020-01-28
Application No.: US15406425
Filing Date: 2017-01-13
Applicant: Microsoft Technology Licensing, LLC
Inventor: Lihong Li , Bhuwan Dhingra , Jianfeng Gao , Xiujun Li , Yun-Nung Chen , Li Deng , Faisal Ahmed
Abstract: Described herein are systems, methods, and techniques by which a processing unit can build an end-to-end dialogue agent model for end-to-end learning of dialogue agents for information access and apply the end-to-end dialogue agent model with soft attention over knowledge base entries to make the dialogue system differentiable. In various examples, the processing unit can apply the end-to-end dialogue agent model to a source of input, fill slots for output from the knowledge base entries, induce a posterior distribution over the entities in a knowledge base or induce a posterior distribution of a target of the requesting user over entities from a knowledge base, develop an end-to-end differentiable model of a dialogue agent, use supervised and/or imitation learning to initialize network parameters, and calculate a modified version of an episodic algorithm, e.g., the REINFORCE algorithm, for training an end-to-end differentiable model based on user feedback.
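The "soft attention over knowledge base entries" idea can be made concrete with a small sketch: score each KB entity by the probability the agent assigns to its slot values, then normalize with a softmax so the posterior is a smooth (differentiable) function of those probabilities. The toy KB, slot names, and probabilities below are invented for illustration.

```python
import numpy as np

def kb_posterior(slot_logprobs, kb):
    """Induce a posterior over KB entities: sum, per entity, the
    log-probabilities assigned to that entity's slot values, then
    normalize with a softmax (soft attention over entries)."""
    scores = np.array([sum(slot_logprobs[slot][val] for slot, val in row.items())
                       for row in kb])
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Toy knowledge base of movies with two slots (purely illustrative).
kb = [{"genre": "sci-fi", "year": "1999"},
      {"genre": "drama",  "year": "1999"},
      {"genre": "sci-fi", "year": "2010"}]
# Per-slot probabilities, e.g. from the dialogue agent's belief tracker.
slot_logprobs = {"genre": {"sci-fi": np.log(0.8), "drama": np.log(0.2)},
                 "year":  {"1999": np.log(0.7),  "2010": np.log(0.3)}}
posterior = kb_posterior(slot_logprobs, kb)
```

A hard database lookup would return a discrete set of matches and block gradients; the softmax posterior instead lets training signals (e.g. from a REINFORCE-style update) flow back through the retrieval step.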
-
Publication No.: US10474950B2
Publication Date: 2019-11-12
Application No.: US14754474
Filing Date: 2015-06-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xiaodong He , Jianshu Chen , Brendan W L Clement , Li Deng , Jianfeng Gao , Bochen Jin , Prabhdeep Singh , Sandeep P. Solanki , LuMing Wang , Hanjun Xian , Yilei Zhang , Mingyang Zhao , Zijian Zheng
Abstract: A processing unit can acquire datasets from respective data sources, each having a respective unique data domain. The processing unit can determine values of a plurality of features based on the plurality of datasets. The processing unit can modify input-specific parameters or history parameters of a computational model based on the values of the features. In some examples, the processing unit can determine an estimated value of a target feature based at least in part on the modified computational model and values of one or more reference features. In some examples, the computational model can include neural networks for several input sets. An output layer of at least one of the neural networks can be connected to the respective hidden layer(s) of one or more other(s) of the neural networks. In some examples, the neural networks can be operated to provide transformed feature value(s) for respective times.
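The cross-connection described in the last sentences (the output layer of one per-input-set network feeding into the hidden layer of another) can be sketched minimally. The layer sizes, weights, and the decision to connect via concatenation are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w):
    return np.tanh(w @ x)

# Two per-input-set networks over distinct data domains (hypothetical sizes).
W1_h, W1_o = rng.standard_normal((6, 4)) * 0.1, rng.standard_normal((3, 6)) * 0.1
W2_h, W2_o = rng.standard_normal((6, 5)) * 0.1, rng.standard_normal((3, 9)) * 0.1

def forward(x1, x2):
    h1 = layer(x1, W1_h)
    out1 = layer(h1, W1_o)               # output layer of network 1
    h2 = layer(x2, W2_h)
    # Network 1's output is connected into network 2's hidden stage.
    h2_aug = np.concatenate([h2, out1])  # 6 + 3 = 9 units
    return layer(h2_aug, W2_o)           # transformed feature values

x1 = rng.standard_normal(4)   # features from domain 1's dataset
x2 = rng.standard_normal(5)   # features from domain 2's dataset
feat = forward(x1, x2)
```

Run per time step, this yields the transformed feature values for respective times mentioned in the abstract, with one network's learned summary informing the other's representation.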
-
Publication No.: US10445650B2
Publication Date: 2019-10-15
Application No.: US14949156
Filing Date: 2015-11-23
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jianfeng Gao , Li Deng , Xiaodong He , Lin Xiao , Xinying Song , Yelong Shen , Ji He , Jianshu Chen
IPC: G06N7/00
Abstract: A processing unit can successively operate layers of a multilayer computational graph (MCG) according to a forward computational order to determine a topic value associated with a document based at least in part on content values associated with the document. The processing unit can successively determine, according to a reverse computational order, layer-specific deviation values associated with the layers based at least in part on the topic value, the content values, and a characteristic value associated with the document. The processing unit can determine a model adjustment value based at least in part on the layer-specific deviation values. The processing unit can modify at least one parameter associated with the MCG based at least in part on the model adjustment value. The MCG can be operated to provide a result characteristic value associated with test content values of a test document.
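The forward/reverse procedure in this abstract is, in essence, a forward pass through stacked layers followed by reverse-order computation of layer-specific deviation (delta) values and a gradient-style parameter adjustment. A minimal numpy sketch, assuming a three-layer tanh graph with a squared-error characteristic value; all sizes and the learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Three layers of a multilayer computational graph (hypothetical sizes).
Ws = [rng.standard_normal((8, 10)) * 0.1,
      rng.standard_normal((4, 8)) * 0.1,
      rng.standard_normal((1, 4)) * 0.1]

content = rng.standard_normal(10)      # content values of a document
target = np.array([1.0])               # characteristic value of the document

# Forward computational order: successively operate the layers.
acts = [content]
for W in Ws:
    acts.append(np.tanh(W @ acts[-1]))
topic = acts[-1]                       # topic value

# Reverse computational order: layer-specific deviation values.
delta = (topic - target) * (1 - topic ** 2)
grads = []
for W, a in zip(reversed(Ws), reversed(acts[:-1])):
    grads.append(np.outer(delta, a))             # model adjustment value per layer
    delta = (W.T @ delta) * (1 - a ** 2)         # propagate deviation downward
grads.reverse()

# Modify the MCG parameters based on the adjustment values.
lr = 0.1
Ws = [W - lr * g for W, g in zip(Ws, grads)]
```

At test time, only the forward portion is run on the test document's content values to produce the result characteristic value.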
-
Publication No.: US20190303440A1
Publication Date: 2019-10-03
Application No.: US16444616
Filing Date: 2019-06-18
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yun-Nung Vivian Chen , Dilek Z. Hakkani-Tur , Gokhan Tur , Asli Celikyilmaz , Jianfeng Gao , Li Deng
IPC: G06F17/27
Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
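The last two sentences (encode sub-structures as vectors, attention-weight them, and combine into one guidance vector) reduce to a weighted sum. A toy sketch, assuming the parse sub-structures have already been encoded as fixed-size vectors; the dimensions and names are invented:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def knowledge_guided_vector(substructure_vecs, phrase_vec):
    """Weight each encoded sub-structure by its attention score against
    the phrase representation, then sum into one guidance vector."""
    weights = softmax(substructure_vecs @ phrase_vec)
    return weights @ substructure_vecs, weights

rng = np.random.default_rng(0)
# Hypothetical encodings of discrete sub-structures from a parse.
subs = rng.standard_normal((4, 12))    # e.g. 4 sub-trees, 12-dim each
phrase = rng.standard_normal(12)       # encoding of the full input phrase
kg_vec, w = knowledge_guided_vector(subs, phrase)
# kg_vec would be provided alongside the input phrase to the RNN tagger.
```

The attention weights `w` play the role described in the abstract: they identify which sub-structures matter most for the phrase's semantic meaning.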
-
Publication No.: US10325200B2
Publication Date: 2019-06-18
Application No.: US14873166
Filing Date: 2015-10-01
Applicant: Microsoft Technology Licensing, LLC
Inventor: Dong Yu , Li Deng , Frank Torsten Bernd Seide , Gang Li
Abstract: Discriminative pretraining technique embodiments are presented that pretrain the hidden layers of a Deep Neural Network (DNN). In general, a one-hidden-layer neural network is trained first using labels discriminatively with error back-propagation (BP). Then, after discarding an output layer in the previous one-hidden-layer neural network, another randomly initialized hidden layer is added on top of the previously trained hidden layer along with a new output layer that represents the targets for classification or recognition. The resulting multiple-hidden-layer DNN is then discriminatively trained using the same strategy, and so on until the desired number of hidden layers is reached. This produces a pretrained DNN. The discriminative pretraining technique embodiments have the advantage of bringing the DNN layer weights close to a good local optimum, while still leaving them in a range with a high gradient so that they can be fine-tuned effectively.
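The greedy recipe above (train a one-hidden-layer net with BP, discard its output layer, stack a fresh random hidden layer plus a new output layer, train again) can be sketched in numpy. This is a deliberately simplified toy: it uses an invented binary task, and each training phase updates only the top hidden layer and the output layer, whereas the described technique back-propagates through the whole stack.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 10))
y = (X[:, 0] > 0).astype(float).reshape(-1, 1)   # toy discriminative labels

def train(hidden_Ws, W_out, epochs=50, lr=0.5):
    """A few epochs of error back-propagation (simplified: only the top
    hidden layer and the output layer are updated)."""
    for _ in range(epochs):
        hs = [X]
        for W in hidden_Ws:
            hs.append(np.tanh(hs[-1] @ W))
        p = 1 / (1 + np.exp(-(hs[-1] @ W_out)))   # sigmoid output layer
        d_out = (p - y) / len(X)
        d_h = (d_out @ W_out.T) * (1 - hs[-1] ** 2)
        hidden_Ws[-1] -= lr * (hs[-2].T @ d_h)    # top hidden layer
        W_out -= lr * (hs[-1].T @ d_out)          # output layer
    return hidden_Ws, W_out, p

# 1) Train a one-hidden-layer network discriminatively with BP.
Ws = [rng.standard_normal((10, 8)) * 0.1]
W_out = rng.standard_normal((8, 1)) * 0.1
Ws, W_out, _ = train(Ws, W_out)

# 2) Discard the output layer, stack a new randomly initialized hidden
#    layer plus a fresh output layer, and train discriminatively again.
Ws.append(rng.standard_normal((8, 8)) * 0.1)
W_out = rng.standard_normal((8, 1)) * 0.1
Ws, W_out, p = train(Ws, W_out)                   # pretrained 2-hidden-layer DNN

acc = float(((p > 0.5) == (y > 0.5)).mean())
```

Repeating step 2 until the desired depth is reached yields the pretrained DNN, whose weights are then fine-tuned end to end.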
-
Publication No.: US20180157638A1
Publication Date: 2018-06-07
Application No.: US15368380
Filing Date: 2016-12-02
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xiujun Li , Paul Anthony Crook , Li Deng , Jianfeng Gao , Yun-Nung Chen , Xuesong Yang
CPC classification number: G06F17/279 , G06N3/08 , G10L15/063 , G10L15/18 , G10L15/1822 , G10L15/22 , G10L25/30
Abstract: A processing unit can operate an end-to-end recurrent neural network (RNN) with limited contextual dialogue memory that can be jointly trained by supervised signals: user slot tagging, intent prediction, and/or system action prediction. The end-to-end RNN, or joint model, has shown advantages over separate models for natural language understanding (NLU) and dialogue management, and can capture expressive feature representations beyond conventional aggregation of slot tags and intents to mitigate effects of noisy output from NLU. The joint model can apply a supervised signal from system actions to refine the NLU model. By back-propagating errors associated with system action prediction to the NLU model, the joint model can use machine learning to predict user intent, perform slot tagging, and make system action predictions based on user input, e.g., utterances across a number of domains.
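Structurally, the joint model is one shared recurrent encoder with three supervised heads (per-token slot tags, utterance-level intent, next system action); because the heads share the encoder, errors from action prediction back-propagate into the NLU representation. A forward-pass sketch with invented sizes and random (untrained) weights:

```python
import numpy as np

rng = np.random.default_rng(0)
V, H, S, I, A = 20, 16, 5, 3, 4    # vocab, hidden, slot/intent/action label counts

# Shared recurrent encoder parameters plus one head per supervised signal.
E = rng.standard_normal((V, H)) * 0.1        # token embeddings
W_h = rng.standard_normal((H, H)) * 0.1      # recurrence
W_slot = rng.standard_normal((H, S)) * 0.1   # slot-tagging head
W_intent = rng.standard_normal((H, I)) * 0.1 # intent head
W_action = rng.standard_normal((H, A)) * 0.1 # system-action head

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def joint_forward(utterance_ids):
    h = np.zeros(H)
    slot_probs = []
    for t in utterance_ids:                  # shared RNN over the utterance
        h = np.tanh(E[t] + W_h @ h)
        slot_probs.append(softmax(h @ W_slot))   # per-token slot tag
    intent = softmax(h @ W_intent)               # utterance-level intent
    action = softmax(h @ W_action)               # next system action
    return np.array(slot_probs), intent, action

slots, intent, action = joint_forward([3, 7, 1, 12])
```

Training would sum the three heads' losses, so a single backward pass delivers the system-action supervision to the shared encoder, which is what refines the NLU portion.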
-
Publication No.: US20170372199A1
Publication Date: 2017-12-28
Application No.: US15228990
Filing Date: 2016-08-04
Applicant: Microsoft Technology Licensing, LLC
Inventor: Dilek Z Hakkani-Tur , Asli Celikyilmaz , Yun-Nung Chen , Li Deng , Jianfeng Gao , Gokhan Tur , Ye-Yi Wang
CPC classification number: G06N3/08 , G06N3/0445 , G10L15/16 , G10L15/1822 , G10L15/22
Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long short-term memory (RNN-LSTM), for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain-specific goal) and slots (such as dates, times, locations, subjects, etc.) across multiple domains.
-
Publication No.: US20170293638A1
Publication Date: 2017-10-12
Application No.: US15097086
Filing Date: 2016-04-12
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xiaodong He , Li Deng , Jianfeng Gao , Alex Smola , Zichao Yang
CPC classification number: G06F16/5846 , G06F16/334 , G06F16/335 , G06F16/583 , G06F16/683 , G06N3/0445 , G06N3/0454 , G06N3/084
Abstract: In some examples, a computing device refines feature information of query text. The device repeatedly determines attention information based at least in part on feature information of the image and the feature information of the query text, and modifies the feature information of the query text based at least in part on the attention information. The device selects at least one of a predetermined plurality of outputs based at least in part on the refined feature information of the query text. In some examples, the device operates a convolutional computational model to determine feature information of the image. The device operates network computational models (NCMs) to determine feature information of the query and to determine attention information based at least in part on the feature information of the image and the feature information of the query. Examples include a microphone to detect audio corresponding to the query text.
-