-
公开(公告)号:US12190873B2
公开(公告)日:2025-01-07
申请号:US17952005
申请日:2022-09-23
Applicant: Apple Inc.
Inventor: Ahmed S. Hussen Abdelaziz , Saurabh Adya , Alexander W. Churchill , Pranay Dighe , Sachin S. Kajarekar , Chaitanya Mannemala , Erik Marchi , Seyedmahdad Mirsamadi , Ognjen Rudovic , Ahmed H. Tewfik , Barry-John Theobald , Srikanth Vishnubhotla
Abstract: An example process includes: receiving a speech input representing a user utterance; determining, based on a textual representation of the speech input, a first score corresponding to a type of the user utterance; determining, based on the textual representation of the speech input, a second score representing a correspondence between the user utterance and a domain recognized by a digital assistant; determining, based on the first score and the second score, whether the speech input is intended for the digital assistant; in accordance with a determination that the speech input is intended for the digital assistant: initiating, by the digital assistant, a task based on the speech input; and providing an output indicative of the initiated task.
-
公开(公告)号:US11620999B2
公开(公告)日:2023-04-04
申请号:US17123428
申请日:2020-12-16
Applicant: Apple Inc.
Inventor: Pranay Dighe , Erik Marchi , Srikanth Vishnubhotla , Sachin Kajarekar , Devang K. Naik
Abstract: An example process includes: receiving an audio stream; determining a plurality of acoustic representations of the audio stream, where each acoustic representation of the plurality of acoustic representations corresponds to a respective frame of the audio stream; obtaining a respective plurality of scores indicating whether each respective frame of the audio stream is directed to an electronic device, where the obtaining includes: determining, using a triggering model operating on the electronic device, for each acoustic representation, a score indicating whether the respective frame of the audio stream is directed to the electronic device; determining, based on the respective plurality of scores, a likelihood that the audio stream is directed to the electronic device; determining whether the likelihood is above or below a threshold; and in response to determining that the likelihood is below the threshold, ceasing to process the audio stream.
-
公开(公告)号:US11423898B2
公开(公告)日:2022-08-23
申请号:US16815984
申请日:2020-03-11
Applicant: Apple Inc.
Inventor: Stephen H. Shum , Corey J. Peterson , Sachin S. Kajarekar , Benjamin S. Phipps , Erik Marchi , Jessica Peck , Anumita Biswas , Chaitanya Mannemala
Abstract: Systems and processes for operating an intelligent automated assistant are provided. An example method includes receiving, from one or more external electronic devices, a plurality of speaker profiles for a plurality of users; receiving a natural language speech input; determining, based on comparing the natural language speech input to the plurality of speaker profiles: a first likelihood that the natural language speech input corresponds to a first user of the plurality of users; and a second likelihood that the natural language speech input corresponds to a second user of the plurality of users; determining whether the first likelihood and the second likelihood are within a first threshold; and in accordance with determining that the first likelihood and the second likelihood are not within the first threshold: providing a response to the natural language speech input, the response being personalized for the first user.
-
公开(公告)号:US20200312315A1
公开(公告)日:2020-10-01
申请号:US16368403
申请日:2019-03-28
Applicant: Apple Inc.
Inventor: Feipeng Li , Mehrez Souden , Joshua D. Atkins , John Bridle , Charles P. Clark , Stephen H. Shum , Sachin S. Kajarekar , Haiying Xia , Erik Marchi
IPC: G10L15/20
Abstract: An acoustic environment aware method for selecting a high quality audio stream during multi-stream speech recognition. A number of input audio streams are processed to determine if a voice trigger is detected, and if so a voice trigger score is calculated for each stream. An acoustic environment measurement is also calculated for each audio stream. The trigger score and acoustic environment measurement are combined for each audio stream, to select as a preferred audio stream the audio stream with the highest combined score. The preferred audio stream is output to an automatic speech recognizer. Other aspects are also described and claimed.
-
-
-