-
Publication number: US11756574B2
Publication date: 2023-09-12
Application number: US17330862
Filing date: 2021-05-26
Applicant: Apple Inc.
Inventor: Sreeneel Maddika, Ahmed Serag El Din Hussen Abdelaziz, Chaitanya Mannemala, Srikanth Vishnubhotla, Garrett L. Weinberg
IPC: G10L25/78, G10L25/51, G10L21/0208, G06V40/16
CPC classification number: G10L25/78, G06V40/171, G10L21/0208, G10L25/51, G10L2021/02082
Abstract: Systems and processes for operating an intelligent automated assistant are provided. For example, a first speech input is received from a user. In response to receiving the first speech input, a response is provided. A first output is provided corresponding to a digital assistant in a first state, and a second speech input is received from the user. A first plurality of values is obtained. Based on the first plurality of values, a first confidence level corresponding to the second speech input is obtained. In accordance with a determination that the first confidence level exceeds a first threshold confidence level, a second output is provided corresponding to the digital assistant in a second state. The second speech input continues to be received.
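The threshold test in this abstract can be illustrated with a minimal sketch. The abstract does not say how the "first plurality of values" is turned into a confidence level, so the averaging step, the state names, and the default threshold below are all hypothetical choices made only for illustration.

```python
from statistics import mean


def confidence_from_values(values):
    """Hypothetical reduction: average the values into one confidence level.

    The abstract does not specify this step; a mean is assumed here.
    """
    return mean(values)


def next_state(current_state, values, threshold=0.5):
    """Return the assistant state after scoring the second speech input.

    The state changes only when the confidence level exceeds the threshold
    (strictly greater than), mirroring the wording of the abstract.
    The threshold value 0.5 is an assumption.
    """
    confidence = confidence_from_values(values)
    if confidence > threshold:
        return "second_state"
    return current_state
```

Under these assumptions, `next_state("first_state", [0.9, 0.8])` moves to the second state, while low values leave the assistant in its current state.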
-
Publication number: US12266380B2
Publication date: 2025-04-01
Application number: US18237362
Filing date: 2023-08-23
Applicant: Apple Inc.
Inventor: Sreeneel Maddika, Ahmed Serag El Din Hussen Abdelaziz, Chaitanya Mannemala, Srikanth Vishnubhotla, Garrett L. Weinberg
IPC: G10L25/78, G06V40/16, G10L21/0208, G10L25/51
Abstract: Systems and processes for operating an intelligent automated assistant are provided. For example, a first speech input is received from a user. In response to receiving the first speech input, a response is provided. A first output is provided corresponding to a digital assistant in a first state, and a second speech input is received from the user. A first plurality of values is obtained. Based on the first plurality of values, a first confidence level corresponding to the second speech input is obtained. In accordance with a determination that the first confidence level exceeds a first threshold confidence level, a second output is provided corresponding to the digital assistant in a second state. The second speech input continues to be received.
-
Publication number: US12190873B2
Publication date: 2025-01-07
Application number: US17952005
Filing date: 2022-09-23
Applicant: Apple Inc.
Inventor: Ahmed S. Hussen Abdelaziz, Saurabh Adya, Alexander W. Churchill, Pranay Dighe, Sachin S. Kajarekar, Chaitanya Mannemala, Erik Marchi, Seyedmahdad Mirsamadi, Ognjen Rudovic, Ahmed H. Tewfik, Barry-John Theobald, Srikanth Vishnubhotla
Abstract: An example process includes: receiving a speech input representing a user utterance; determining, based on a textual representation of the speech input, a first score corresponding to a type of the user utterance; determining, based on the textual representation of the speech input, a second score representing a correspondence between the user utterance and a domain recognized by a digital assistant; determining, based on the first score and the second score, whether the speech input is intended for the digital assistant; in accordance with a determination that the speech input is intended for the digital assistant: initiating, by the digital assistant, a task based on the speech input; and providing an output indicative of the initiated task.
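The two-score decision in this abstract can be sketched as follows. The abstract states only that the determination is based on both scores; the weighted sum, the weight, and the threshold used here are assumptions, not the claimed method.

```python
def is_intended_for_assistant(utterance_type_score, domain_score,
                              type_weight=0.5, threshold=0.6):
    """Decide whether a speech input is intended for the assistant.

    utterance_type_score: score for the type of the utterance (0..1).
    domain_score: score for how well the utterance matches a domain
        the assistant recognizes (0..1).
    The linear combination and both default parameters are hypothetical.
    """
    combined = (type_weight * utterance_type_score
                + (1.0 - type_weight) * domain_score)
    return combined >= threshold
```

With these assumed defaults, an input scoring 0.9 and 0.8 would be treated as intended for the assistant, and one scoring 0.2 and 0.3 would not.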
-
Publication number: US11620999B2
Publication date: 2023-04-04
Application number: US17123428
Filing date: 2020-12-16
Applicant: Apple Inc.
Inventor: Pranay Dighe, Erik Marchi, Srikanth Vishnubhotla, Sachin Kajarekar, Devang K. Naik
Abstract: An example process includes: receiving an audio stream; determining a plurality of acoustic representations of the audio stream, where each acoustic representation of the plurality of acoustic representations corresponds to a respective frame of the audio stream; obtaining a respective plurality of scores indicating whether each respective frame of the audio stream is directed to an electronic device, where the obtaining includes: determining, using a triggering model operating on the electronic device, for each acoustic representation, a score indicating whether the respective frame of the audio stream is directed to the electronic device; determining, based on the respective plurality of scores, a likelihood that the audio stream is directed to the electronic device; determining whether the likelihood is above or below a threshold; and in response to determining that the likelihood is below the threshold, ceasing to process the audio stream.
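The frame-level flow in this abstract (one score per frame, one likelihood per stream, then a threshold check) can be sketched as below. The abstract does not specify how frame scores are aggregated, so the mean, the function name, and the threshold value are assumptions; the per-frame scores stand in for the output of the triggering model, which is not reproduced here.

```python
def should_continue_processing(frame_scores, threshold=0.5):
    """Return False when processing of the audio stream should cease.

    frame_scores: per-frame scores (0..1) indicating whether each frame
        is directed to the device; assumed to come from a triggering model.
    The stream-level likelihood is taken as the mean of the frame scores,
    which is a hypothetical aggregation.
    """
    if not frame_scores:
        # No frames scored: nothing to suggest device-directed audio.
        return False
    likelihood = sum(frame_scores) / len(frame_scores)
    # Below the threshold, the process ceases to handle the stream.
    return likelihood >= threshold
```

A stream whose frames score mostly near zero would be dropped early under this sketch, which is the point of the threshold check described in the abstract.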
-
Publication number: US12073831B1
Publication date: 2024-08-27
Application number: US17576419
Filing date: 2022-01-14
Applicant: Apple Inc.
Inventor: Saurabh Adya, Sameer Badaskar, Akanksha Bindal, Ahmed S. Hussen Abdelaziz, Xiaochuan Niu, Alkeshkumar M. Patel, Srikanth Vishnubhotla
CPC classification number: G10L15/22, G06F18/214, G06V10/82, G06V20/50, G10L15/063, G10L15/16, G10L15/18, G10L15/24
Abstract: Systems and processes for operating a digital assistant are provided. An example method for processing an image includes receiving an image, generating, based on the image, a question corresponding to a first object in the image, generating, based on the image, a caption corresponding to a second object of the image, receiving an utterance from a user, and determining a plurality of speech recognition results from the utterance based on the question and the caption.
-
Publication number: US20200327887A1
Publication date: 2020-10-15
Application number: US16380504
Filing date: 2019-04-10
Applicant: Apple Inc.
Inventor: Sarmad Aziz Malik, Charles P. Clark, Devang K. Naik, Srikanth Vishnubhotla
Abstract: Audio signals produced by microphones can be processed to remove echo and reverberation. The processed signals can be mapped to each other with adaptively estimated impulse responses. One or more of the processed signals, one or more of the mapped signals, and one or more of the impulse responses can be fed to an automatic speech recognizer (ASR) having a deep neural network (DNN), to train the DNN or recognize speech in the input audio signals. Other aspects are described and claimed.