-
Publication No.: US11810578B2
Publication Date: 2023-11-07
Application No.: US17073092
Filing Date: 2020-10-16
Applicant: Apple Inc.
Inventor: Benjamin S. Phipps, Sachin Kajarekar, Eugene Ray, Mahesh Ramaray Shanbhag, Kisun You, Patrick L. Coffman
Abstract: Systems and processes for operating an intercom system via a digital assistant are provided. The intercom system is trigger-free, in that users communicate, in real-time, via devices without employing a trigger to speak. Acoustic fingerprints are employed to associate users with devices. Acoustic fingerprints include vector embeddings of speech input in an acoustic-feature vector space. Speech heard at multiple devices, as embedded in a fingerprint, may be clustered in the vector space, and the structure of the clusters is employed to associate users and devices. Based on the fingerprints, a device is mapped to a user, and the user employs that device to participate in a conversation, via the intercom service.
-
Publication No.: US11620999B2
Publication Date: 2023-04-04
Application No.: US17123428
Filing Date: 2020-12-16
Applicant: Apple Inc.
Inventor: Pranay Dighe, Erik Marchi, Srikanth Vishnubhotla, Sachin Kajarekar, Devang K. Naik
Abstract: An example process includes: receiving an audio stream; determining a plurality of acoustic representations of the audio stream, where each acoustic representation of the plurality of acoustic representations corresponds to a respective frame of the audio stream; obtaining a respective plurality of scores indicating whether each respective frame of the audio stream is directed to an electronic device, where the obtaining includes: determining, using a triggering model operating on the electronic device, for each acoustic representation, a score indicating whether the respective frame of the audio stream is directed to the electronic device; determining, based on the respective plurality of scores, a likelihood that the audio stream is directed to the electronic device; determining whether the likelihood is above or below a threshold; and in response to determining that the likelihood is below the threshold, ceasing to process the audio stream.
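The per-frame scoring and thresholding described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how frame scores are combined into a likelihood, so a plain average is assumed here, and the names `is_device_directed`, `toy_score`, and the default threshold of 0.5 are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def toy_score(frame: Sequence[float]) -> float:
    """Hypothetical stand-in for the on-device triggering model.

    Returns a score in [0, 1]; here simply the clipped mean magnitude.
    """
    if not frame:
        return 0.0
    return min(1.0, sum(abs(x) for x in frame) / len(frame))


def is_device_directed(
    frames: Sequence[Sequence[float]],
    score_frame: Callable[[Sequence[float]], float],
    threshold: float = 0.5,  # assumed value, not from the abstract
) -> Tuple[bool, float]:
    """Score each frame, aggregate the scores, and compare to a threshold.

    Returns (keep_processing, likelihood). Averaging is an assumption;
    the abstract only says the likelihood is based on the frame scores.
    """
    scores: List[float] = [score_frame(f) for f in frames]
    if not scores:
        return False, 0.0
    likelihood = sum(scores) / len(scores)
    # Below the threshold, the caller should cease processing the stream.
    return likelihood >= threshold, likelihood


if __name__ == "__main__":
    keep, p = is_device_directed([[0.9, 0.9], [0.8, 0.8]], toy_score)
    print(keep, round(p, 2))
```

A caller would stop handling the audio stream whenever the first returned value is `False`, which corresponds to the "ceasing to process" step in the abstract.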
-
Publication No.: US11593984B2
Publication Date: 2023-02-28
Application No.: US17153728
Filing Date: 2021-01-20
Applicant: Apple Inc.
Inventor: Ahmed Serag El Din Hussen Abdelaziz, Justin Binder, Sachin Kajarekar, Anushree Prasanna Kumar, Chloé Ann Seivwright
Abstract: Systems and processes for animating an avatar are provided. An example process of animating an avatar includes at an electronic device having one or more processors and memory, receiving text, determining an emotional state, and generating, using a neural network, a speech data set representing the received text and a set of parameters representing one or more movements of an avatar based on the received text and the determined emotional state.
-
Publication No.: US10186282B2
Publication Date: 2019-01-22
Application No.: US14701147
Filing Date: 2015-04-30
Applicant: Apple Inc.
Inventor: Devang K. Naik, Sachin Kajarekar
Abstract: Systems and processes for robust end-pointing of speech signals using speaker recognition are provided. In one example process, a stream of audio having a spoken user request can be received. A first likelihood that the stream of audio includes user speech can be determined. A second likelihood that the stream of audio includes user speech spoken by an authorized user can be determined. A start-point or an end-point of the spoken user request can be determined based at least in part on the first likelihood and the second likelihood.
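One way to picture how two likelihoods could jointly determine a start-point and end-point is the sketch below. It is an illustration under stated assumptions, not the method claimed in the patent: the abstract does not specify how the likelihoods are combined, so a per-frame product with a fixed threshold is assumed, and `find_endpoints` and its parameters are hypothetical names.

```python
from typing import Optional, Sequence, Tuple


def find_endpoints(
    speech_likelihoods: Sequence[float],
    speaker_likelihoods: Sequence[float],
    threshold: float = 0.5,  # assumed value, not from the abstract
) -> Optional[Tuple[int, int]]:
    """Return (start_frame, end_frame) of the authorized user's request.

    speech_likelihoods[i]  : likelihood that frame i contains user speech
    speaker_likelihoods[i] : likelihood that frame i is speech from the
                             authorized user
    Multiplying the two values is an assumption made for this sketch.
    """
    if len(speech_likelihoods) != len(speaker_likelihoods):
        raise ValueError("likelihood sequences must have the same length")
    combined = [s * a for s, a in zip(speech_likelihoods, speaker_likelihoods)]
    active = [i for i, c in enumerate(combined) if c >= threshold]
    if not active:
        return None  # no frame qualifies as the authorized user's speech
    return active[0], active[-1]


if __name__ == "__main__":
    speech = [0.1, 0.9, 0.9, 0.9, 0.2]
    speaker = [0.9, 0.9, 0.8, 0.1, 0.9]
    print(find_endpoints(speech, speaker))
```

In the example data, frame 3 contains speech but is unlikely to come from the authorized user, so it falls outside the detected request. This is the kind of robustness to background talkers that the abstract attributes to using speaker recognition.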
-