Psychoacoustic adaptive notch filtering
    21.
    Type: Granted invention patent
    Status: In force

    Publication No.: US09386371B2

    Publication date: 2016-07-05

    Application No.: US13790666

    Filing date: 2013-03-08

    Applicant: Apple Inc.

    CPC classification number: H04R3/002 G10L21/0324 H04R3/00 H04R5/04

    Abstract: Improved systems and methods for psychoacoustic adaptive notch filtering are provided. By accounting for psychoacoustic properties of an audio signal as well as finer characteristics of noise which may be present in the audio signal (e.g., the shape of the spectral density of the noise), more effective strategies for dealing with undesirable components of the audio signal may be realized.


    SYSTEM AND METHOD OF DOUBLE TALK DETECTION WITH ACOUSTIC ECHO AND NOISE CONTROL
    22.
    Type: Invention application
    Status: In force

    Publication No.: US20160127535A1

    Publication date: 2016-05-05

    Application No.: US14880824

    Filing date: 2015-10-12

    Applicant: Apple Inc.

    Abstract: System of improving sound quality includes loudspeaker, microphone, accelerometer, acoustic-echo-cancellers (AEC), and double-talk detector (DTD). Loudspeaker outputs loudspeaker signal including downlink audio signal from far-end speaker. Microphone generates microphone uplink signal and receives at least one of: near-end speaker, ambient noise, and loudspeaker signals. Accelerometer generates accelerometer-uplink signal and receives at least one of: near-end speaker, ambient noise, and loudspeaker signals. First AEC receives downlink audio, microphone-uplink and double talk control signals, and generates AEC-microphone linear echo estimate and corrected AEC-microphone uplink signal. Second AEC receives downlink audio, accelerometer uplink and double talk control signals, and generates AEC-accelerometer linear echo estimate and corrected AEC-accelerometer uplink signal. DTD receives downlink audio signal, uplink signals, corrected uplink signals, linear echo estimates, and generates double-talk control signal. Uplink audio signal including at least one of corrected microphone-uplink signal and corrected accelerometer-uplink signal is generated. Other embodiments are described.
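    The abstract does not name the adaptive algorithm or the detection rule, so the following is only a minimal sketch under stated assumptions: a normalized LMS (NLMS) filter stands in for one AEC, and a Geigel-style level comparison stands in for the DTD. The double-talk flag freezes adaptation so that near-end speech does not corrupt the echo-path estimate. All function names, parameters, and the `simulate_echo` helper are hypothetical; the same canceller could be instantiated a second time for the accelerometer uplink.

```python
import random  # used only to build test signals for the sketch


def simulate_echo(downlink, echo_path):
    """Hypothetical loudspeaker-to-microphone path modelled as a short FIR filter."""
    return [
        sum(h * downlink[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(downlink))
    ]


def is_double_talk(mic_sample, recent_downlink, threshold=0.5):
    """Geigel-style rule (an assumption, not the patent's detector): flag double
    talk when the uplink sample is large relative to the recent downlink peak."""
    peak = max(abs(x) for x in recent_downlink)
    return abs(mic_sample) > threshold * peak


def nlms_echo_canceller(downlink, mic, taps=8, mu=0.5, eps=1e-8, threshold=0.5):
    """Return (corrected uplink, linear echo estimate, double-talk flags, weights)."""
    w = [0.0] * taps    # adaptive estimate of the echo path
    buf = [0.0] * taps  # most recent downlink samples, newest first
    corrected, echo_estimate, flags = [], [], []
    for x, d in zip(downlink, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # linear echo estimate
        e = d - y                                   # corrected uplink sample
        double_talk = is_double_talk(d, buf, threshold)
        if not double_talk:
            # Adapt only when the near-end talker is judged silent.
            norm = sum(b * b for b in buf) + eps
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        echo_estimate.append(y)
        corrected.append(e)
        flags.append(double_talk)
    return corrected, echo_estimate, flags, w
```

    Freezing the weights during double talk is one common design choice. The abstract only states that a double-talk control signal is fed to each AEC, not how the AEC uses it.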


    Spatial Audio Controller
    23.
    Type: Invention application

    Publication No.: US20250080933A1

    Publication date: 2025-03-06

    Application No.: US18949726

    Filing date: 2024-11-15

    Applicant: Apple Inc.

    Abstract: A method performed by a local device that is communicatively coupled with several remote devices, the method includes: receiving, from each remote device with which the local device is engaged in a communication session, an input audio stream; receiving, for each remote device, a set of parameters; determining, for each input audio stream, whether the input audio stream is to be 1) rendered individually or 2) rendered as a mix of input audio streams based on the set of parameters; for each input audio stream that is determined to be rendered individually, spatially rendering the input audio stream as an individual virtual sound source that contains only that input audio stream; and for input audio streams that are determined to be rendered as the mix of input audio streams, spatially rendering the mix of input audio streams as a single virtual sound source that contains the mix of input audio streams.
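    The routing decision described in the abstract can be sketched as follows. This is an illustration under assumptions, not the claimed method: the abstract does not say which parameters drive the decision, so a hypothetical `speaking` flag is used, and the spatial rendering itself (panning, HRTF filtering) is left out. The sketch only shows how streams are partitioned into individual virtual sources and one mixed source.

```python
from dataclasses import dataclass


@dataclass
class RemoteStream:
    device_id: str
    samples: list  # one mono audio frame
    params: dict   # per-device parameters (contents are hypothetical)


@dataclass
class VirtualSource:
    device_ids: list  # remote devices contributing to this source
    samples: list     # audio to be spatially rendered at one position


def wants_individual_source(stream):
    """Hypothetical policy: active talkers get their own virtual source."""
    return bool(stream.params.get("speaking", False))


def mix(frames):
    """Sum equal-length mono frames sample by sample."""
    return [sum(values) for values in zip(*frames)]


def build_virtual_sources(streams):
    """Partition streams into individual sources plus at most one mixed source."""
    individual = [s for s in streams if wants_individual_source(s)]
    mixed = [s for s in streams if not wants_individual_source(s)]
    sources = [VirtualSource([s.device_id], list(s.samples)) for s in individual]
    if mixed:
        sources.append(
            VirtualSource([s.device_id for s in mixed], mix([s.samples for s in mixed]))
        )
    return sources
```

    Each returned `VirtualSource` would then be handed to a spatial renderer, which is outside the scope of this sketch.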

    Multi-microphone speech recognition systems and related techniques

    Publication No.: US10304462B2

    Publication date: 2019-05-28

    Application No.: US15871836

    Filing date: 2018-01-15

    Applicant: Apple Inc.

    Abstract: A speech recognition system for resolving impaired utterances can have a speech recognition engine configured to receive a plurality of representations of an utterance and concurrently to determine a plurality of highest-likelihood transcription candidates corresponding to each respective representation of the utterance. The recognition system can also have a selector configured to determine a most-likely accurate transcription from among the transcription candidates. As but one example, the plurality of representations of the utterance can be acquired by a microphone array, and beamforming techniques can generate independent streams of the utterance across various look directions using output from the microphone array.
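    As a rough illustration of the selector stage, the sketch below takes the candidate lists produced for each beamformed stream and picks one transcription. The abstract does not disclose the selection rule, so the pooling used here (combining the likelihoods of identical texts across look directions with a log-sum-exp) is an assumed heuristic, and the function name and data layout are hypothetical.

```python
import math
from collections import defaultdict


def select_transcription(candidates_per_beam):
    """Pick one transcription from per-beam candidate lists.

    candidates_per_beam: one list per look direction, each holding
    (text, log_likelihood) pairs from the recognition engine.
    """
    pooled = defaultdict(list)
    for candidates in candidates_per_beam:
        for text, log_likelihood in candidates:
            pooled[text].append(log_likelihood)
    if not pooled:
        raise ValueError("no transcription candidates supplied")

    def log_sum_exp(values):
        # Numerically stable log of the summed likelihoods.
        m = max(values)
        return m + math.log(sum(math.exp(v - m) for v in values))

    # Assumed heuristic: a text supported by several beams accumulates evidence.
    return max(pooled, key=lambda text: log_sum_exp(pooled[text]))
```

    With this rule, a candidate that appears in several beams can outrank a single higher-scoring candidate from one beam. A simpler selector could just take the overall maximum.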

    VOICE EFFECTS BASED ON FACIAL EXPRESSIONS
    29.
    Type: Invention application

    Publication No.: US20180336716A1

    Publication date: 2018-11-22

    Application No.: US15908603

    Filing date: 2018-02-28

    Applicant: Apple Inc.

    Abstract: Embodiments of the present disclosure can provide systems, methods, and computer-readable medium for adjusting audio and/or video information of a video clip based at least in part on facial feature and/or voice feature characteristics extracted from hardware components. For example, in response to detecting a request to generate an avatar video clip of a virtual avatar, a video signal associated with a face in a field of view of a camera and an audio signal may be captured. Voice feature characteristics and facial feature characteristics may be extracted from the audio signal and the video signal, respectively. In some examples, in response to detecting a request to preview the avatar video clip, an adjusted audio signal may be generated based at least in part on the facial feature characteristics and the voice feature characteristics, and a preview of the video clip of the virtual avatar using the adjusted audio signal may be displayed.

    TECHNIQUES FOR PROVIDING AUDIO AND VIDEO EFFECTS

    Publication No.: US20180336713A1

    Publication date: 2018-11-22

    Application No.: US16033111

    Filing date: 2018-07-11

    Applicant: Apple Inc.

    Abstract: Embodiments of the present disclosure can provide systems, methods, and computer-readable medium for providing audio and/or video effects based at least in part on facial features and/or voice feature characteristics of the user. For example, video and/or an audio signal of the user may be recorded by a device. Voice audio features and facial feature characteristics may be extracted from the voice audio signal and the video, respectively. The facial features of the user may be used to modify features of a virtual avatar to emulate the facial feature characteristics of the user. The extracted voice audio features may be modified to generate an adjusted audio signal or an audio signal may be composed from the voice audio features. The adjusted/composed audio signal may simulate the voice of the virtual avatar. A preview of the modified video/audio may be provided at the user's device.
