-
Publication No.: US11172001B1
Publication Date: 2021-11-09
Application No.: US16365476
Filing Date: 2019-03-26
Applicant: Amazon Technologies, Inc.
Inventor: Vaidyanathan Puthucode Krishnamoorthy , Tony Roy Hardie , Rohit Lohani , Roopali Vasant Kaujalgi
Abstract: Techniques for announcing a communications session after the communications session is established between multiple user devices are described. In an example, a computer system may instruct a first user device to establish a communications session with a second user device. The computer system may receive, from the second user device, data indicating a request of the first user device for the communications session. Based at least in part on the data, the computer system may generate content associated with the first user device. The computer system may also instruct the second user device to accept the request and present the content after the communications session is established between the first user device and the second user device.
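The ordering described in this abstract (instruct, receive the request, generate content, accept, present after establishment) can be sketched as a small simulation. This is only an illustration under stated assumptions: `Device`, `Coordinator`, and the log strings are hypothetical names invented here, and no real signaling protocol is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical user device that records the instructions it receives."""
    name: str
    log: list = field(default_factory=list)


class Coordinator:
    """Hypothetical stand-in for the 'computer system' in the abstract."""

    def instruct_call(self, caller: Device, callee: Device) -> str:
        # Step 1: instruct the first device to establish a session with the second.
        caller.log.append(f"dial {callee.name}")
        # Step 2: the second device reports the incoming request to the system.
        request = {"from": caller.name, "to": callee.name}
        return self.on_incoming_request(request, callee)

    def on_incoming_request(self, request: dict, callee: Device) -> str:
        # Step 3: generate content associated with the first device.
        content = f"Call from {request['from']}"
        # Step 4: instruct the second device to accept the request, and to
        # present the content only after the session is established.
        callee.log.append("accept")
        callee.log.append("session established")
        callee.log.append(f"present: {content}")
        return content


kitchen = Device("kitchen")
office = Device("office")
announcement = Coordinator().instruct_call(kitchen, office)
```

The point of the sketch is the ordering in the callee's log: the announcement is presented after the session exists, not before it is accepted.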
-
Publication No.: US11823659B2
Publication Date: 2023-11-21
Application No.: US16711046
Filing Date: 2019-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Julia Reinspach , Oleg Rokhlenko , Ramakanthachary Gottumukkala , Giovanni Clemente , Ankit Agrawal , Swayam Bhardwaj , Guy Michaeli , Vaidyanathan Puthucode Krishnamoorthy , Costantino Vlachos , Nalledath P. Vinodkrishnan , Shaun M. Vickers , Sethuraman Ramachandran , Charles C. Moore
IPC: G10L15/06 , G10L15/01 , G10L15/02 , G10L15/187 , G10L15/22
CPC classification number: G10L15/063 , G10L15/01 , G10L15/02 , G10L15/187 , G10L15/22 , G10L2015/025 , G10L2015/0635 , G10L2015/223
Abstract: A request including audio data is received from a voice-enabled device. A string of phonemes present in the utterance is determined through speech recognition. At a later time, a subsequent user input corresponding to the request may be received, in which the user input is associated with one or more text keywords. The subsequent user input may be obtained in response to an active request. Alternatively, feedback may not be actively elicited, but rather collected passively. However it is obtained, the one or more keywords associated with the subsequent user input may be associated with the string of phonemes to indicate that the user is saying or means those words when they produce that string of phonemes. A user-specific speech recognition key for the user account is then updated to associate the string of phonemes with these words. A general speech recognition model can also be trained using the association.
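A minimal sketch of the association step in this abstract, assuming the user-specific key can be modeled as a per-user table from phoneme strings to keyword counts. The class name `UserSpeechKey`, its methods, and the example phonemes are hypothetical; the patent abstract does not specify a data structure.

```python
from collections import defaultdict


class UserSpeechKey:
    """Hypothetical per-user table mapping phoneme strings to keywords."""

    def __init__(self):
        # phoneme tuple -> keyword -> number of times the pairing was observed
        self._counts = defaultdict(lambda: defaultdict(int))

    def associate(self, phonemes, keywords):
        """Record that later user input (keywords) followed this phoneme string."""
        key = tuple(phonemes)
        for word in keywords:
            self._counts[key][word.lower()] += 1

    def lookup(self, phonemes):
        """Return the most frequently associated keyword, or None if unseen."""
        counts = self._counts.get(tuple(phonemes))
        if not counts:
            return None
        return max(counts, key=counts.get)


speech_key = UserSpeechKey()
heard = ["N", "AY", "K", "IY"]
speech_key.associate(heard, ["Nike"])    # e.g. the user later searched by text
speech_key.associate(heard, ["Nike"])
speech_key.associate(heard, ["Nikki"])
best_guess = speech_key.lookup(heard)
```

The same (phonemes, keywords) pairs could also serve as training examples for a general model, as the abstract notes; that step is not sketched here.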
-
Publication No.: US11600260B1
Publication Date: 2023-03-07
Application No.: US17093270
Filing Date: 2020-11-09
Applicant: Amazon Technologies, Inc.
Inventor: Vaidyanathan Puthucode Krishnamoorthy , Deepak Babu P R , Ashwin Gopinath , Sethuraman Ramachandran , Ankit Tiwari
Abstract: Devices and techniques are generally described for generating and evaluating utterances. In some examples, an utterance generation and evaluation system can receive intent data and target data. The utterance generation and evaluation system can determine related target names and related intent names and, based on the related target names and related intent names, can generate an utterance phrase. The utterance generation and evaluation system can determine a confidence score associated with the utterance phrase and, based on the confidence score, determine the utterance phrase as a recommended utterance phrase.
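The generate-then-score flow in this abstract can be illustrated with a toy example. Everything below is an assumption made for illustration: the function names, the word-overlap confidence score, and the threshold are hypothetical and stand in for whatever models the actual system uses.

```python
from itertools import product


def generate_utterances(intent_names, target_names):
    """Combine related intent names and target names into candidate phrases."""
    return [f"{intent} {target}" for intent, target in product(intent_names, target_names)]


def confidence(phrase, known_phrases):
    """Toy score: fraction of the phrase's words seen in known phrases."""
    vocab = {word for known in known_phrases for word in known.split()}
    words = phrase.split()
    if not words:
        return 0.0
    return sum(word in vocab for word in words) / len(words)


def recommend(intent_names, target_names, known_phrases, threshold=0.5):
    """Keep candidate phrases whose confidence meets the threshold."""
    candidates = generate_utterances(intent_names, target_names)
    return [p for p in candidates if confidence(p, known_phrases) >= threshold]


strict = recommend(["play", "start"], ["jazz", "radio"], ["play some jazz"], threshold=1.0)
loose = recommend(["play", "start"], ["jazz", "radio"], ["play some jazz"], threshold=0.5)
```

With the strict threshold only "play jazz" survives; lowering it to 0.5 also admits "play radio" and "start jazz".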
-
Publication No.: US11694682B1
Publication Date: 2023-07-04
Application No.: US16711043
Filing Date: 2019-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Julia Reinspach , Oleg Rokhlenko , Ramakanthachary Gottumukkala , Giovanni Clemente , Ankit Agrawal , Swayam Bhardwaj , Guy Michaeli , Vaidyanathan Puthucode Krishnamoorthy , Costantino Vlachos , Nalledath P. Vinodkrishnan , Shaun M. Vickers , Sethuraman Ramachandran , Charles C. Moore
IPC: G10L15/22 , G10L15/30 , G06F3/16 , G06F3/0484 , G06Q30/06 , G06F3/0481 , G10L15/19 , G10L15/18 , G06Q30/0601
CPC classification number: G10L15/22 , G06F3/0481 , G06F3/0484 , G06F3/167 , G06Q30/0625 , G10L15/1815 , G10L15/19 , G10L15/222 , G10L15/30 , G10L2015/223
Abstract: In various embodiments, a voice command is associated with a plurality of processing steps to be performed. The plurality of processing steps may include analysis of audio data using automatic speech recognition, generating and selecting a search query from the utterance text, and conducting a search of a database of items using a search query. The plurality of processing steps may include additional or different steps, depending on the type of the request. In performing one or more of these processing steps, an error or ambiguity may be detected. An error or ambiguity may either halt the processing step or create more than one path of actions. A model may be used to determine if and how to request additional user input to attempt to resolve the error or ambiguity. The voice-enabled device or a second client device is then caused to output a request for the additional user input.
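A rough sketch of the pipeline in this abstract, assuming speech recognition has already produced text. The catalog, the function names, and the rule in `should_ask_user` are hypothetical; in particular, the fixed rule is a placeholder for the model the abstract mentions, not a description of it.

```python
# Hypothetical item database: query -> matching items.
CATALOG = {
    "batteries": ["AA batteries", "AAA batteries"],
    "kettle": ["electric kettle"],
}


def build_query(utterance_text):
    """Derive a search query from utterance text (toy rule: drop 'order ')."""
    text = utterance_text.lower().strip()
    if text.startswith("order "):
        text = text[len("order "):]
    return text.strip()


def search(query):
    """Search the item database; an empty list signals an error path."""
    return CATALOG.get(query, [])


def should_ask_user(matches):
    """Placeholder policy: ask whenever the result is not exactly one item."""
    return len(matches) != 1


def handle_voice_command(utterance_text):
    query = build_query(utterance_text)
    matches = search(query)
    if should_ask_user(matches):
        if matches:
            # Ambiguity: more than one possible path of action.
            prompt = "Which one: " + ", ".join(matches) + "?"
        else:
            # Error: the search step could not proceed.
            prompt = f"I could not find '{query}'. What would you like?"
        return {"status": "needs_input", "prompt": prompt}
    return {"status": "ok", "item": matches[0]}


clear = handle_voice_command("Order kettle")
ambiguous = handle_voice_command("order batteries")
missing = handle_voice_command("order unicorn")
```

The returned prompt is what the voice-enabled device, or a second client device, would be caused to output.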
-
Publication No.: US20210183366A1
Publication Date: 2021-06-17
Application No.: US16711046
Filing Date: 2019-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Julia Reinspach , Oleg Rokhlenko , Ramakanthachary Gottumukkala , Giovanni Clemente , Ankit Agrawal , Swayam Bhardwaj , Guy Michaeli , Vaidyanathan Puthucode Krishnamoorthy , Costantino Vlachos , Nalledath P. Vinodkrishnan , Shaun M. Vickers , Sethuraman Ramachandran , Charles C. Moore
IPC: G10L15/06 , G10L15/22 , G10L15/02 , G10L15/01 , G10L15/187
Abstract: A request including audio data is received from a voice-enabled device. A string of phonemes present in the utterance is determined through speech recognition. At a later time, a subsequent user input corresponding to the request may be received, in which the user input is associated with one or more text keywords. The subsequent user input may be obtained in response to an active request. Alternatively, feedback may not be actively elicited, but rather collected passively. However it is obtained, the one or more keywords associated with the subsequent user input may be associated with the string of phonemes to indicate that the user is saying or means those words when they produce that string of phonemes. A user-specific speech recognition key for the user account is then updated to associate the string of phonemes with these words. A general speech recognition model can also be trained using the association.