-
Publication No.: US12032627B2
Publication Date: 2024-07-09
Application No.: US17526806
Filing Date: 2021-11-15
Inventors: Jinchao Li , Lars H. Liden , Baolin Peng , Thomas Park , Swadheen Kumar Shukla , Jianfeng Gao
Abstract: Systems and methods are provided for determining a response to a query in a dialog. An entity extractor extracts rules and conditions associated with the query and determines a particular task. The disclosed technology generates a transformer-based dialog embedding by pre-training a transformer using dialog corpora including a plurality of tasks. A task-specific classifier generates a first set of candidate responses based on rules and conditions associated with the task. The transformer-based dialog embedding generates a second set of candidate responses to the query. The classifier accommodates changes made to a task by an interactive dialog editor as machine teaching. A response generator generates a response based on the first and second sets of candidate responses using an optimization function. The disclosed technology leverages both a data-driven, generative model (a transformer) based on dialog corpora and a user-driven, task-specific rule-based classifier that accommodates updates to the rules and conditions associated with a particular task.
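The combination step described above can be sketched in miniature: a weighted scoring function merges the rule-based candidate set with model-scored candidates and picks the best response. The function name, weights, and example strings are all illustrative, not from the patent.

```python
def select_response(rule_candidates, model_candidates, alpha=0.6):
    """Score each candidate response: rule-based hits receive a fixed
    bonus, model candidates carry a generative score; return the argmax."""
    scores = {}
    for resp in rule_candidates:
        scores[resp] = scores.get(resp, 0.0) + alpha          # rule-based weight
    for resp, model_score in model_candidates:
        scores[resp] = scores.get(resp, 0.0) + (1 - alpha) * model_score
    return max(scores, key=scores.get)

best = select_response(
    ["Your order ships tomorrow."],
    [("Your order ships tomorrow.", 0.9), ("I don't know.", 0.2)],
)
```

A candidate endorsed by both sources accumulates both scores, which is one simple way an optimization function can favor agreement between the rule-based and generative branches.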
-
Publication No.: US12008459B2
Publication Date: 2024-06-11
Application No.: US16443440
Filing Date: 2019-06-17
Inventors: Weizhu Chen , Pengcheng He , Xiaodong Liu , Jianfeng Gao
Abstract: This document relates to architectures and training procedures for multi-task machine learning models, such as neural networks. One example method involves providing a multi-task machine learning model having one or more shared layers and two or more task-specific layers. The method can also involve performing a pretraining stage on the one or more shared layers using one or more unsupervised prediction tasks. The method can also involve performing a tuning stage on the one or more shared layers and the two or more task-specific layers using respective task-specific objectives.
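The shared-plus-task-specific layout can be sketched with stand-in callables in place of real neural layers; every input passes through one shared encoder before reaching its task's own head. Class and task names here are hypothetical.

```python
class MultiTaskModel:
    """Toy stand-in for a shared-encoder, per-task-head architecture."""

    def __init__(self, tasks):
        self.shared = lambda x: [v * 2 for v in x]        # stand-in shared encoder
        self.heads = {t: (lambda x, b=i: sum(x) + b)      # one head per task
                      for i, t in enumerate(tasks)}

    def forward(self, task, features):
        # All tasks reuse the shared layers; only the head differs.
        return self.heads[task](self.shared(features))

model = MultiTaskModel(["nli", "sentiment"])
out = model.forward("sentiment", [1, 2])  # shared then task-specific head
```

In the real procedure, the shared layers would first be pretrained on unsupervised objectives, then tuned jointly with the heads on each task's labeled objective.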
-
Publication No.: US20240013055A1
Publication Date: 2024-01-11
Application No.: US18373051
Filing Date: 2023-09-26
Inventors: Xiaodong Liu , Hao Cheng , Yu Wang , Jianfeng Gao , Weizhu Chen , Pengcheng He , Hoifung Poon
CPC Classification: G06N3/084 , G06N20/00 , G06N3/08 , G06N3/088 , G06V10/82 , G06F18/24 , G06V10/7784 , G06F40/284
Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations of the pretraining examples and the noise-adjusted first representations of the pretraining examples.
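The noise-adjustment step can be illustrated with a tiny perturbation function: each component of a representation vector is shifted by a bounded random amount, producing the noise-adjusted copy used alongside the clean one during self-supervised pretraining. The scale and seed are illustrative assumptions.

```python
import random

def add_noise(representation, scale=0.1, seed=0):
    """Return a noise-adjusted copy of a representation vector,
    perturbing each component by at most `scale`."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in representation]

clean = [0.5, -0.2, 1.0]
noisy = add_noise(clean)  # same shape, bounded perturbation
```

Training on both the clean and perturbed representations encourages the mapping layers to produce outputs that are stable under small embedding-space perturbations.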
-
Publication No.: US11783173B2
Publication Date: 2023-10-10
Application No.: US15228990
Filing Date: 2016-08-04
Inventors: Dilek Z Hakkani-Tur , Asli Celikyilmaz , Yun-Nung Chen , Li Deng , Jianfeng Gao , Gokhan Tur , Ye-Yi Wang
CPC Classification: G06N3/08 , G06N3/044 , G10L15/16 , G10L15/1822 , G10L15/22
Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long-short term memory (RNN-LSTM), for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and enables multi-task deep learning that leverages data from multiple domains. The JRNN can leverage semantic intents (such as finding or identifying, e.g., a domain-specific goal) and slots (such as dates, times, locations, subjects, etc.) across multiple domains.
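The "complete semantic frame per query" output can be sketched as a single function returning domain, intent, and per-token slot labels together. Keyword rules stand in for the trained JRNN; all labels and vocabulary are illustrative.

```python
def joint_predict(tokens):
    """Toy stand-in for a joint SLU model: one call yields the whole
    semantic frame (domain + intent + IOB slot tags)."""
    domain = "calendar" if "meeting" in tokens else "weather"
    intent = "create_event" if domain == "calendar" else "get_forecast"
    slots = ["B-date" if t == "tomorrow" else "O" for t in tokens]
    return {"domain": domain, "intent": intent, "slots": slots}

frame = joint_predict(["schedule", "meeting", "tomorrow"])
```

Producing all three predictions from one shared model is what makes the multi-task setup able to pool training data across domains.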
-
Publication No.: US11526679B2
Publication Date: 2022-12-13
Application No.: US16910508
Filing Date: 2020-06-24
Inventors: Pengcheng He , Xiaodong Liu , Jianfeng Gao , Weizhu Chen
Abstract: Systems and methods are provided for facilitating the building and use of natural language understanding models. The systems and methods identify a plurality of tokens and use them to generate one or more pre-trained natural language models using a transformer. The transformer disentangles the content embedding and positional embedding in the computation of its attention matrix. Systems and methods are also provided to facilitate self-training of the pre-trained natural language model by utilizing multi-step decoding to better reconstruct masked tokens and improve pre-training convergence.
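The disentangling of content and position in the attention computation can be illustrated with scalar stand-ins: instead of one score from fused embeddings, a content-to-content term, a content-to-position term, and a position-to-content term are computed separately and summed. The projection matrices of a real transformer are reduced to plain multiplication here.

```python
def disentangled_score(c_q, c_k, p_q, p_k):
    """Sum of separately computed attention terms:
    content-to-content + content-to-position + position-to-content."""
    c2c = c_q * c_k
    c2p = c_q * p_k
    p2c = p_q * c_k
    return c2c + c2p + p2c

score = disentangled_score(c_q=1.0, c_k=2.0, p_q=0.5, p_k=0.25)
```

Keeping the terms separate lets position information influence attention relative to content rather than being baked into a single summed input embedding.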
-
Publication No.: US10768908B1
Publication Date: 2020-09-08
Application No.: US16285180
Filing Date: 2019-02-25
Inventors: Yu Wang , Yu Hu , Haiyuan Cao , Hui Su , Jinchao Li , Xinying Song , Jianfeng Gao
IPC Classification: G06F8/35 , G06F16/901 , G06F8/70
Abstract: A workflow engine tool is disclosed that enables scientists and engineers to programmatically author workflows (e.g., a directed acyclic graph, "DAG") with nearly no overhead, using a simple script that needs almost no modifications for portability among multiple different workflow engines. This permits users to focus on the business logic of the project, avoiding the distracting, tedious overhead related to workflow management (such as uploading modules, drawing edges, setting parameters, and other tasks). The workflow engine tool provides an abstraction layer on top of workflow engines, introducing a binding function that converts a programming language function (e.g., a normal Python function) into a workflow module definition. The workflow engine tool infers module instances and induces edge dependencies automatically by inferring from a programming language script to build a DAG.
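A minimal sketch of the binding idea: a decorator registers each plain Python function as a workflow module, and an edge is induced whenever one module's output is passed as another's input. The `Dag`/`Node` names and the decorator shape are hypothetical, not the patent's actual API.

```python
class Node:
    """Wraps a module's output and remembers which module produced it."""
    def __init__(self, producer, value):
        self.producer, self.value = producer, value

class Dag:
    def __init__(self):
        self.edges = []

    def module(self, fn):
        """Binding function: turn a normal Python function into a
        workflow module; infer edges from Node-valued arguments."""
        def wrapper(*args):
            for a in args:
                if isinstance(a, Node):  # upstream output -> DAG edge
                    self.edges.append((a.producer, fn.__name__))
            inputs = [a.value if isinstance(a, Node) else a for a in args]
            return Node(fn.__name__, fn(*inputs))
        return wrapper

dag = Dag()

@dag.module
def load():
    return [1, 2, 3]

@dag.module
def total(xs):
    return sum(xs)

result = total(load())  # ordinary call syntax; DAG is built as a side effect
```

The script reads like plain Python, yet after it runs, `dag.edges` holds the inferred dependency graph that a backend engine could execute.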
-
Publication No.: US10592519B2
Publication Date: 2020-03-17
Application No.: US15084366
Filing Date: 2016-03-29
Inventors: Xiaodong He , Li Deng , Jianfeng Gao , Wen-tau Yih , Moontae Lee , Paul Smolensky
IPC Classification: G06F16/2458 , G06F16/2453
Abstract: A processing unit can determine multiple representations associated with a statement, e.g., subject or predicate representations. In some examples, the representations can lack representation of the semantics of the statement. The computing device can determine a computational model of the statement based at least in part on the representations. The computing device can receive a query, e.g., via a communications interface, and determine at least one query representation, e.g., a subject, predicate, or entity representation. The computing device can then operate the model using the query representation to provide a model output representing a relationship between the query representations and information in the model. The computing device can, e.g., transmit an indication of the model output via the communications interface. The computing device can determine mathematical relationships between subject representations and attribute representations for multiple statements, and determine the model using those relationships.
-
Publication No.: US10536402B2
Publication Date: 2020-01-14
Application No.: US16112611
Filing Date: 2018-08-24
Inventors: Michel Galley , Alessandro Sordoni , Christopher John Brockett , Jianfeng Gao , William Brennan Dolan , Yangfeng Ji , Michael Auli , Margaret Ann Mitchell , Jian-Yun Nie
Abstract: Examples are generally directed towards context-sensitive generation of conversational responses. Context-message-response n-tuples are extracted from at least one source of conversational data to generate a set of training context-message-response n-tuples. A response generation engine is trained on the set of training context-message-response n-tuples. The trained response generation engine automatically generates a context-sensitive response based on a user-generated input message and conversational context data. A digital assistant utilizes the trained response generation engine to generate context-sensitive, natural language responses that are pertinent to user queries.
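The extraction of context-message-response n-tuples from conversational data can be sketched as a sliding window over a linear conversation log: each window of three consecutive turns yields one training triple. The windowing scheme is an illustrative assumption, not the patent's specific extraction procedure.

```python
def extract_triples(turns):
    """Slide over a turn sequence; each window of three consecutive
    turns becomes one (context, message, response) training triple."""
    return [(turns[i], turns[i + 1], turns[i + 2])
            for i in range(len(turns) - 2)]

triples = extract_triples(
    ["hi", "hi, how are you?", "fine, thanks", "great"]
)
```

Training on such triples, rather than message-response pairs alone, is what lets the generator condition its response on the preceding conversational context.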
-
Publication No.: US10264081B2
Publication Date: 2019-04-16
Application No.: US14806281
Filing Date: 2015-07-22
Inventors: Chenlei Guo , Jianfeng Gao , Xinying Song , Byungki Byun , Yelong Shen , Ye-Yi Wang , Brian D. Remick , Edward Thiele , Mohammed Aatif Ali , Marcus Gois , Xiaodong He , Jianshu Chen , Divya Jetley , Stephen Friesen
Abstract: Techniques for providing a people recommendation system for predicting and recommending relevant people (or other entities) to include in a conversation based on contextual indicators. In an exemplary embodiment, email recipient recommendations may be suggested based on contextual signals, e.g., project names, body text, existing recipients, current date and time, etc. In an aspect, a plurality of properties including ranked key phrases are associated with profiles corresponding to personal entities. Aggregated profiles are analyzed using first- and second-layer processing techniques. The recommendations may be provided to the user reactively, e.g., in response to a specific query by the user to the people recommendation system, or proactively, e.g., based on the context of what the user is currently working on, in the absence of a specific query by the user.
-
Publication No.: US20180253637A1
Publication Date: 2018-09-06
Application No.: US15446870
Filing Date: 2017-03-01
Inventors: Feng Zhu , Xinying Song , Chao Zhong , Shijing Fang , Ryan Bouchard , Valentine N. Fontama , Prabhdeep Singh , Jianfeng Gao , Li Deng
CPC Classification: G06N3/0445 , G06N3/0454 , G06N3/08 , H04L67/22
Abstract: A method to predict churn includes obtaining static features representative of a customer of a service, obtaining time series features representative of the customer's interaction with the service, using a deep neural network to process the static features, using a recurrent neural network to process the time series features, and combining outputs from the deep neural network and the recurrent neural network to predict the likelihood of customer churn.
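The two-branch structure can be sketched as a toy: one stand-in scores the static features, another scores the interaction time series, and a logistic combination of the two yields a churn probability. The stand-in scoring rules and weights are illustrative, not the patent's networks.

```python
import math

def churn_probability(static_features, time_series):
    """Toy two-branch churn predictor: combine a static-feature score
    (DNN stand-in) with a recent-activity score (RNN stand-in)."""
    static_score = sum(static_features)        # stand-in for the DNN branch
    recency_score = -sum(time_series[-3:])     # stand-in for the RNN branch
    z = 0.5 * static_score + 0.5 * recency_score
    return 1 / (1 + math.exp(-z))              # combine and squash to (0, 1)

p = churn_probability([0.2, 0.1], [1, 1, 0, 0, 0])
```

Splitting the model this way lets each branch use an architecture suited to its input: a feed-forward network for fixed-length static features, a recurrent network for variable-length interaction histories.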