-
Publication No.: US20240161737A1
Publication date: 2024-05-16
Application No.: US18055821
Filing date: 2022-11-15
Applicant: SoundHound, Inc.
Inventors: Jon GROSSMANN, Robert MACRAE, Scott HALSTVEDT, Keyvan MOHAJER
CPC classification: G10L15/1815, G06F3/167, G06F40/30, G10L15/22
Abstract: A system and method of real-time feedback confirmation solicit a virtual assistant response from an evolving semantic state of at least a portion of an utterance. A user accesses a virtual assistant on an electronic device on which the system and/or method is configured to capture a command, a question, and/or a fulfillment request from audio, such as speech from the user. The speech may be intercepted by a speech engine configured to transcribe the speech into text that is matched against a fragment pattern's regular expression to generate a fragment, and/or the speech may be processed with a machine learning model to identify fragments. The fragments are identified by a domain handler configured to update a data structure holding the current semantic state of the utterance in real time on an interface of the electronic device.
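Illustrative sketch: the abstract above describes matching transcribed text against fragment patterns (regular expressions) and updating a semantic-state data structure as the utterance evolves. The Python below is a minimal sketch of that idea only. The slot names, the patterns, and the functions `update_semantic_state` and `run_incremental` are hypothetical and are not taken from the patent.

```python
import re

# Hypothetical fragment patterns: each slot name maps to a regular expression.
FRAGMENT_PATTERNS = {
    "item": re.compile(r"\b(pizza|burger|salad)\b"),
    "size": re.compile(r"\b(small|medium|large)\b"),
    "quantity": re.compile(r"\b(one|two|three)\b"),
}


def update_semantic_state(state, partial_transcript):
    """Return a copy of `state` updated with fragments found in the text."""
    new_state = dict(state)
    text = partial_transcript.lower()
    for slot, pattern in FRAGMENT_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            # The most recent mention of a slot overrides earlier ones.
            new_state[slot] = matches[-1]
    return new_state


def run_incremental(partials):
    """Feed successive partial transcripts and record the state after each."""
    state = {}
    history = []
    for partial in partials:
        state = update_semantic_state(state, partial)
        history.append(dict(state))
    return history


if __name__ == "__main__":
    partial_transcripts = [
        "I want two",
        "I want two large",
        "I want two large pizza",
    ]
    for snapshot in run_incremental(partial_transcripts):
        print(snapshot)
```

Each snapshot stands in for what an interface could display while the user is still speaking. A real system would receive partial transcripts from a speech engine rather than a fixed list.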
-
Publication No.: US20240153525A1
Publication date: 2024-05-09
Application No.: US18507715
Filing date: 2023-11-13
CPC classification: G10L25/51, G10L15/1822, G10L15/22, G10L15/26, G10L2015/223, G10L2015/225
Abstract: Techniques for performing conversation recovery of a system/user exchange are described. In response to determining that an action responsive to a user input cannot be performed, a system may determine a topic to recommend to a user. The topic may be unrelated to the original substance of the user input. The system may have access to various data representing the context in which a user provides an input to the system. The system may use these inputs and data at runtime to determine whether a topic should be recommended to the user, as well as what that topic should be. The system may cause a question to be output to the user, with the question asking the user about the topic, for example whether the user would like a song played, whether the user would like to hear information about a particular individual (e.g., an artist), whether the user would like to know about a particular skill (e.g., a skill having a significantly high popularity among users of the system), or whether the user would like to know about some other topic. If the user responds affirmatively to the recommended topic, the system may pass the user experience off to an appropriate component of the system (e.g., one that is configured to perform an action related to the topic). If the user responds negatively, does not respond at all, or the system is unsure whether the user's response was affirmative or negative, the system may cease interaction with the user, thereby enabling the user to interact with the system as the user desires.
-
Publication No.: US20240153505A1
Publication date: 2024-05-09
Application No.: US18490029
Filing date: 2023-10-19
Inventors: Anjishnu Kumar, Xing Fan, Arpit Gupta, Ruhi Sarikaya
CPC classification: G10L15/22, G06F40/30, G06N5/022, G10L13/00, G10L15/14, G10L15/1815, G10L17/00, G06F40/295, G10L2015/223
Abstract: Techniques for determining a command or intent likely to be subsequently invoked by a user of a system are described. A user inputs a command (either via a spoken utterance or textual input) to a system. The system determines content responsive to the command. The system also determines a second command or corresponding intent likely to be invoked by the user subsequent to the first command. Such determination may involve analyzing pairs of intents, with each pair being associated with a probability that one intent of the pair will be invoked by a user subsequent to the other intent of the pair. The system then outputs first content responsive to the first command and second content asking the user whether the system should execute the second command.
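Illustrative sketch: the abstract above describes intent pairs, each associated with a probability that one intent follows the other. A minimal Python sketch of such a lookup is shown below. The intent names, the probability values, the threshold, and the function `predict_next_intent` are all hypothetical and serve only to illustrate the pairing idea.

```python
# Hypothetical table: (previous intent, next intent) -> probability that the
# next intent is invoked after the previous one. The values are made up.
FOLLOW_UP_PROBABILITY = {
    ("PlayMusic", "SetVolume"): 0.42,
    ("PlayMusic", "GetWeather"): 0.05,
    ("GetWeather", "SetAlarm"): 0.31,
}


def predict_next_intent(current_intent, threshold=0.3):
    """Return the most likely follow-up intent, or None if none is likely enough."""
    candidates = {
        nxt: prob
        for (prev, nxt), prob in FOLLOW_UP_PROBABILITY.items()
        if prev == current_intent
    }
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] >= threshold else None


if __name__ == "__main__":
    suggestion = predict_next_intent("PlayMusic")
    if suggestion is not None:
        print(f"Offer follow-up: {suggestion}")
```

The threshold keeps the system from soliciting the user when no follow-up is sufficiently probable. How the probabilities are estimated is outside the scope of this sketch.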
-
Publication No.: US20240153397A1
Publication date: 2024-05-09
Application No.: US18104122
Filing date: 2023-01-31
IPC classification: G09B7/02, G06V40/16, G06V40/20, G10L13/02, G10L15/05, G10L15/18, G10L15/22, G10L15/30, G10L25/57, G10L25/78
CPC classification: G09B7/02, G06V40/176, G06V40/20, G10L13/02, G10L15/05, G10L15/1815, G10L15/22, G10L15/30, G10L25/57, G10L25/78
Abstract: Methods and systems provide for virtual meeting coaching with content-based evaluation. In one embodiment, the system receives a set of coaching items including a number of questions, each associated with an expected answer; connects to a coaching session including one or more participants and a virtual coaching agent; for each question and for at least a subset of the participants, transmits the question, by the virtual coaching agent, to the client device used by the participant, receives an answer to the question from the participant, the answer including media of the participant, receives text of utterances spoken by the participant during the answer, and generates one or more evaluation scores for the answer based on evaluating at least the content of the answer to the question; and transmits an overall evaluation score for each of the subset of participants based on the generated evaluation scores for that participant.
-
Publication No.: US20240152704A1
Publication date: 2024-05-09
Application No.: US18405860
Filing date: 2024-01-05
Applicant: Moonbeam, Inc.
CPC classification: G06F40/35, G06F40/289, G06T13/40, G06T15/00, G10L15/1815, G10L15/22
Abstract: Introduced here is a computer program that is representative of a software-implemented collaboration platform designed to facilitate conversations in virtual environments, document those conversations, and analyze those conversations, all in real time. The collaboration platform can include or integrate tools for turning ideas expressed through voice into templatized, metadata-rich data structures called “knowledge objects.” Discourse throughout a conversation can be converted into a transcription (or simply “transcript”), parsed to identify topical shifts, and then segmented based on the topical shifts. Separately documenting each topic in the form of its own “knowledge object” allows the collaboration platform not only to better catalogue what was discussed in a single ideation session, but also to monitor discussion of the same topic over multiple ideation sessions.
-
Publication No.: US11979360B2
Publication date: 2024-05-07
Application No.: US17278632
Filing date: 2018-10-25
Inventors: Li Zhou
Abstract: The present disclosure provides a method and apparatus for responding in a voice conversation by an electronic conversational agent. A voice input may be received in an audio upstream. In response to the voice input, a primary response and at least one supplementary response may be generated. A primary voice output may be generated based on the primary response. At least one supplementary voice output may be generated based on the at least one supplementary response. The primary voice output and the at least one supplementary voice output may be provided in an audio downstream, wherein the at least one supplementary voice output is provided during a time period adjacent to the primary voice output in the audio downstream.
-
Publication No.: US11978437B1
Publication date: 2024-05-07
Application No.: US17119099
Filing date: 2020-12-11
Inventors: Govindarajan Sundaram Thattai, Qing Ping, Feiyang Niu, Joel Joseph Chengottusseriyil, Prashanth Rajagopal, Qiaozi Gao, Aishwarya Naresh Reganti, Gokhan Tur, Dilek Hakkani-Tur, Rohit Prasad, Premkumar Natarajan
IPC classification: G10L15/00, G06F16/22, G06F21/62, G10L15/18, G10L15/22, G10L15/30, G10L15/183, G10L15/19
CPC classification: G10L15/1815, G06F16/22, G06F21/6218, G10L15/22, G10L15/30, G10L15/1822, G10L15/183, G10L15/19, G10L2015/223
Abstract: Devices and techniques are generally described for learning personalized concepts for natural language processing. In various examples, a first natural language input may be received. In some examples, a determination may be made that the first natural language input comprises non-actionable slot data. A dialog session may be initiated with the user. In some examples, first slot data that is indicated by the user during the dialog session may be determined. In various examples, data representing the first slot data may be stored in a database in association with the first natural language input.
-
Publication No.: US20240146878A1
Publication date: 2024-05-02
Application No.: US18384219
Filing date: 2023-10-26
Inventors: Ji Seon CHOI
Abstract: Provided is a method for providing a speech bubble in a video conference. The method is performed by a user terminal and includes: receiving a first speech text obtained by converting a voice signal of a first conference participant in a video conference into text; determining whether to activate a cartoon mode; displaying, based on determining to activate the cartoon mode, a conference screen including a first participant object and a first speech bubble, wherein the first participant object indicates the first conference participant and the first speech bubble is generated using the first speech text; and displaying, in response to a user input selecting the first speech bubble, a sequence of speech texts of the video conference, the sequence including a speech text corresponding to the first speech bubble.
-
Publication No.: US20240146842A1
Publication date: 2024-05-02
Application No.: US17975833
Filing date: 2022-10-28
CPC classification: H04M3/4936, G10L15/1822, H04M3/5166, H04M3/5175
Abstract: At a contact center that provides customer service to callers by receiving calls from the callers and directing the callers to desired endpoints that deliver information to the callers based upon an expressed intent of each caller associated with each call, a current call is processed by the contact center utilizing an Interactive Voice Response (IVR) model or a Conversational Menu (CM) model. The IVR model processes the current call by directing the current caller to a selected endpoint via navigation through a specific navigational path of nodes in a hierarchical IVR tree of nodes. The CM model processes the call by analyzing a verbal utterance of the current caller at an initiation of the current call and directing the caller to the selected endpoint so as to bypass nodes within the specific navigational path of nodes of the hierarchical IVR tree of nodes.
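Illustrative sketch: the abstract above contrasts stepwise navigation through a hierarchical IVR tree with a conversational menu that routes directly from the caller's opening utterance. The Python below is a minimal sketch of that contrast. The tree contents, the keyword lists, and the functions `route_ivr` and `route_cm` are hypothetical, and the keyword matching merely stands in for whatever utterance analysis the patent actually uses.

```python
# Hypothetical IVR tree: each node maps a key press to a child node.
IVR_TREE = {
    "root": {"1": "billing", "2": "support"},
    "billing": {"1": "pay_bill", "2": "dispute_charge"},
    "support": {"1": "reset_password", "2": "report_outage"},
}

# Hypothetical keywords used to guess an endpoint from an utterance.
INTENT_KEYWORDS = {
    "pay_bill": ("pay",),
    "dispute_charge": ("dispute",),
    "reset_password": ("password",),
    "report_outage": ("outage",),
}


def route_ivr(key_presses):
    """Walk the tree one key press at a time and return the path of nodes."""
    node = "root"
    path = [node]
    for key in key_presses:
        node = IVR_TREE[node][key]
        path.append(node)
    return path


def route_cm(utterance):
    """Pick an endpoint directly from the utterance, skipping intermediate nodes."""
    words = utterance.lower().split()
    for endpoint, keywords in INTENT_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return endpoint
    return None  # no match: a caller could fall back to the IVR tree


if __name__ == "__main__":
    print(route_ivr(["1", "2"]))
    print(route_cm("I want to dispute a charge"))
```

Both calls end at the same endpoint, but the conversational route never visits the intermediate `billing` node.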
-
Publication No.: US11972751B2
Publication date: 2024-04-30
Application No.: US17441809
Filing date: 2020-06-29
Inventors: Joon-Hyuk Chang, Inyoung Hwang
CPC classification: G10L15/02, G06N3/045, G06N3/063, G10L15/01, G10L15/1815, G10L15/187, G10L15/22, G10L2015/025
Abstract: Disclosed are a method and an apparatus for detecting a voice end point by using acoustic and language modeling information to accomplish strong voice recognition. A voice end point detection method according to an embodiment may comprise the steps of: inputting an acoustic feature vector sequence extracted from a microphone input signal into an acoustic embedding extraction unit, a phonemic embedding extraction unit, and a decoder embedding extraction unit, which are based on a recurrent neural network (RNN); combining the acoustic embedding, phonemic embedding, and decoder embedding produced by the acoustic embedding extraction unit, the phonemic embedding extraction unit, and the decoder embedding extraction unit to configure a feature vector; and inputting the combined feature vector into a deep neural network (DNN)-based classifier to detect a voice end point.