-
Publication No.: US20240153499A1
Publication date: 2024-05-09
Application No.: US18532969
Filing date: 2023-12-07
Applicant: Amazon Technologies, Inc.
Inventor: Angeliki Metallinou, Rahul Goel, Vishal Ishwar
CPC classification number: G10L15/16, G06F3/167, G10L15/02, G10L15/144, G10L15/197, G10L15/26, G10L2015/025
Abstract: Multi-modal natural language processing systems are provided. Some systems are context-aware systems that use multi-modal data to improve the accuracy of natural language understanding as it is applied to spoken language input. Machine learning architectures are provided that jointly model spoken language input (“utterances”) and information displayed on a visual display (“on-screen information”). Such machine learning architectures can improve upon, and solve problems inherent in, existing spoken language understanding systems that operate in multi-modal contexts.
-
Publication No.: US11194973B1
Publication date: 2021-12-07
Application No.: US16363363
Filing date: 2019-03-25
Applicant: Amazon Technologies, Inc.
Inventor: Rahul Goel, Chandra Prakash Khatri, Tagyoung Chung, Raefer Christopher Gabriel, Anushree Venkatesh, Behnam Hedayatnia, Sanghyun Yi
IPC: G06F40/35, G10L15/26, G06F40/289, H04L12/58, G06N20/00
Abstract: A system that can engage in a dialog with a user may select a system response to a user input based on how the system estimates a user may respond to a potential system response. Models may be trained to evaluate a potential system response in view of various available data including dialog history, entity data, etc. Each model may score the potential system response for various qualitative aspects such as whether the response is likely to be comprehensible, on-topic, interesting, likely to lead to the dialog continuing, etc. Such scores may be combined into other scores such as whether the potential response is coherent or engaging. The models may be trained using previous dialog/chatbot evaluation data. At runtime the scores may be used to select a system response to a user input as part of the dialog.
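As a rough illustration of the selection step this abstract describes, the sketch below scores each candidate response on several qualitative aspects, combines the scores, and picks the highest-scoring candidate. Everything here is an assumption made for illustration: the function name `select_response`, the two toy scorers, and the use of a plain average are hypothetical stand-ins for the trained models and combination method, which the abstract does not specify.

```python
from statistics import mean
from typing import Callable, List

# A scorer maps a candidate response to a score in [0, 1].
# In the abstract these are trained models; here they are toy stand-ins.
Scorer = Callable[[str], float]


def select_response(candidates: List[str], scorers: List[Scorer]) -> str:
    """Return the candidate with the highest combined (averaged) score."""
    def combined(response: str) -> float:
        return mean(scorer(response) for scorer in scorers)
    return max(candidates, key=combined)


# Hypothetical scorers: short responses count as comprehensible,
# and responses mentioning "movie" count as on-topic.
def comprehensible(response: str) -> float:
    return 1.0 if len(response.split()) <= 12 else 0.5


def on_topic(response: str) -> float:
    return 1.0 if "movie" in response.lower() else 0.0


candidates = ["I like turtles.", "Which movie did you watch last?"]
best = select_response(candidates, [comprehensible, on_topic])
print(best)
```

Averaging is only one possible way to combine the per-aspect scores; a weighted sum or a further learned model would fit the same interface.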
-
Publication No.: US20200251098A1
Publication date: 2020-08-06
Application No.: US16723762
Filing date: 2019-12-20
Applicant: Amazon Technologies, Inc.
Inventor: Angeliki Metallinou, Rahul Goel, Vishal Ishwar
Abstract: Multi-modal natural language processing systems are provided. Some systems are context-aware systems that use multi-modal data to improve the accuracy of natural language understanding as it is applied to spoken language input. Machine learning architectures are provided that jointly model spoken language input (“utterances”) and information displayed on a visual display (“on-screen information”). Such machine learning architectures can improve upon, and solve problems inherent in, existing spoken language understanding systems that operate in multi-modal contexts.
-
Publication No.: US10515625B1
Publication date: 2019-12-24
Application No.: US15828174
Filing date: 2017-11-30
Applicant: Amazon Technologies, Inc.
Inventor: Angeliki Metallinou, Rahul Goel, Vishal Ishwar
Abstract: Multi-modal natural language processing systems are provided. Some systems are context-aware systems that use multi-modal data to improve the accuracy of natural language understanding as it is applied to spoken language input. Machine learning architectures are provided that jointly model spoken language input (“utterances”) and information displayed on a visual display (“on-screen information”). Such machine learning architectures can improve upon, and solve problems inherent in, existing spoken language understanding systems that operate in multi-modal contexts.
-
Publication No.: US20220246139A1
Publication date: 2022-08-04
Application No.: US17659612
Filing date: 2022-04-18
Applicant: Amazon Technologies, Inc.
Inventor: Angeliki Metallinou, Rahul Goel, Vishal Ishwar
Abstract: Multi-modal natural language processing systems are provided. Some systems are context-aware systems that use multi-modal data to improve the accuracy of natural language understanding as it is applied to spoken language input. Machine learning architectures are provided that jointly model spoken language input (“utterances”) and information displayed on a visual display (“on-screen information”). Such machine learning architectures can improve upon, and solve problems inherent in, existing spoken language understanding systems that operate in multi-modal contexts.
-
Publication No.: US11227585B2
Publication date: 2022-01-18
Application No.: US16815188
Filing date: 2020-03-11
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel
Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having similar attributes as an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.
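The re-ranking idea in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the `IntentHypothesis` class, the `rerank` function, the slot names, and the fixed per-match `boost` are all hypothetical, since the abstract does not say how matching slots change the ranking.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class IntentHypothesis:
    intent: str
    score: float  # initial NLU confidence (hypothetical scale)
    declared_slots: Set[str] = field(default_factory=set)


def rerank(hypotheses: List[IntentHypothesis],
           screen_entity_slots: Set[str],
           boost: float = 0.1) -> List[IntentHypothesis]:
    """Re-rank hypotheses, adding a fixed boost for each declared slot
    that also appears in the on-screen entity data (assumed scheme)."""
    def adjusted(h: IntentHypothesis) -> float:
        matches = len(h.declared_slots & screen_entity_slots)
        return h.score + boost * matches
    return sorted(hypotheses, key=adjusted, reverse=True)


# Initial NLU ranking slightly favors PlayMusic.
hypotheses = [
    IntentHypothesis("PlayMusic", 0.55, {"song_name", "artist_name"}),
    IntentHypothesis("PlayVideo", 0.50, {"video_name"}),
]

# The screen currently shows a list of videos, so the entity data
# carries a video_name attribute that matches PlayVideo's declared slot.
screen = {"video_name"}
top = rerank(hypotheses, screen)[0]
print(top.intent)
```

With no on-screen entity data the initial ranking is left unchanged, so the contextual signal only breaks near-ties in favor of what the user can see.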
-
Publication No.: US11842727B2
Publication date: 2023-12-12
Application No.: US17659612
Filing date: 2022-04-18
Applicant: Amazon Technologies, Inc.
Inventor: Angeliki Metallinou, Rahul Goel, Vishal Ishwar
IPC: G10L15/16, G10L15/183, G10L15/14, G10L15/197, G06F3/16, G10L15/02, G10L15/26
CPC classification number: G10L15/16, G06F3/167, G10L15/02, G10L15/144, G10L15/197, G10L15/26, G10L2015/025
Abstract: Multi-modal natural language processing systems are provided. Some systems are context-aware systems that use multi-modal data to improve the accuracy of natural language understanding as it is applied to spoken language input. Machine learning architectures are provided that jointly model spoken language input (“utterances”) and information displayed on a visual display (“on-screen information”). Such machine learning architectures can improve upon, and solve problems inherent in, existing spoken language understanding systems that operate in multi-modal contexts.
-
Publication No.: US20200279555A1
Publication date: 2020-09-03
Application No.: US16815188
Filing date: 2020-03-11
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel
Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having similar attributes as an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.
-
Publication No.: US10600406B1
Publication date: 2020-03-24
Application No.: US15463339
Filing date: 2017-03-20
Applicant: Amazon Technologies, Inc.
Inventor: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel
Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having similar attributes as an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.