Microphone array based deep learning for time-domain speech signal extraction

    Publication No.: US11508388B1

    Publication date: 2022-11-22

    Application No.: US17100802

    Filing date: 2020-11-20

    Applicant: Apple Inc.

    Abstract: A device for processing audio signals in the time domain includes a processor configured to receive multiple audio signals corresponding to respective microphones of two or more microphones of the device, at least one of the multiple audio signals comprising speech of a user of the device. The processor is configured to provide the multiple audio signals to a machine learning model, the machine learning model having been trained based at least in part on an expected position of the user of the device and expected positions of the respective microphones on the device. The processor is configured to provide an audio signal that is enhanced with respect to the speech of the user relative to the multiple audio signals, wherein the audio signal is a waveform output from the machine learning model.

    Visual content presentation with viewer position-based audio

    Publication No.: US12284508B2

    Publication date: 2025-04-22

    Application No.: US18079669

    Filing date: 2022-12-12

    Applicant: Apple Inc.

    Abstract: Various implementations disclosed herein include devices, systems, and methods that display visual content as part of a 3D environment and add audio corresponding to the visual content. The audio may be spatialized to be from one or more audio source locations within the 3D environment. For example, a video may be presented on a virtual surface within an extended reality (XR) environment while audio associated with the video is spatialized to sound as if it is produced from an audio source location corresponding to that virtual surface. How the audio is provided may be determined based on the position of the viewer (e.g., the user or his/her device) relative to the presented visual content.

    Method and System for Selective Audio Playback on a Loudspeaker and a Headset

    Publication No.: US20250080911A1

    Publication date: 2025-03-06

    Application No.: US18803081

    Filing date: 2024-08-13

    Applicant: Apple Inc.

    Abstract: A method includes driving a first speaker of a first electronic device using a mix of a first audio signal and a second audio signal. The method determines that the first electronic device is within a threshold distance of a second electronic device within an environment in which the first electronic device is located, where the second electronic device includes a second speaker. Responsive to determining that the first electronic device is within the threshold distance, the method causes the second electronic device to play back the second audio signal through the second speaker and drives the first speaker using the first audio signal instead of the mix.
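
The routing decision described in this abstract can be sketched as a small Python function. This is only an illustration under stated assumptions: the names `Routing` and `route_audio`, the signal labels `"first"` and `"second"`, the use of metres, and the choice to treat a distance equal to the threshold as "within" are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Routing:
    """Which signals each speaker plays (hypothetical structure)."""
    first_speaker: tuple   # signals driven on the first device's speaker
    second_speaker: tuple  # signals played back on the second device's speaker


def route_audio(distance_m: float, threshold_m: float) -> Routing:
    """Pick a routing from the distance between the two devices.

    Assumption: a distance equal to the threshold counts as "within".
    """
    if distance_m <= threshold_m:
        # Devices are close: hand the second signal to the second device
        # and drive the first speaker with the first signal only.
        return Routing(first_speaker=("first",), second_speaker=("second",))
    # Devices are far apart: the first speaker keeps playing the mix.
    return Routing(first_speaker=("first", "second"), second_speaker=())
```

For example, `route_audio(0.5, 1.0)` splits the two signals across the devices, while `route_audio(3.0, 1.0)` keeps the mix on the first speaker.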

    Spatial Blending of Audio
    Published patent application

    Publication No.: US20240098442A1

    Publication date: 2024-03-21

    Application No.: US18458077

    Filing date: 2023-08-29

    Applicant: Apple Inc.

    CPC classification number: H04S7/302 H04S2400/11

    Abstract: An audio processing system may obtain a size of a visual object to present to a display. The audio processing system may determine a virtual placement for each of a plurality of virtual speakers at least based on the size of the visual object. Each of the plurality of virtual speakers may be spatially rendered at each virtual placement through binaural audio, for playback through head-worn speakers. Other aspects are also described and claimed.
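
The abstract does not say how the object's size maps to speaker placement. As one possible reading, the sketch below spreads virtual speakers evenly across the object's width. The function name `place_virtual_speakers`, the even spacing, and the use of width alone (ignoring height and depth) are assumptions made for illustration, not the claimed method.

```python
def place_virtual_speakers(center, width, count=2):
    """Spread `count` virtual speakers evenly across a visual object's width.

    center: (x, y, z) position of the visual object (hypothetical convention).
    width:  horizontal size of the object, in the same units as `center`.
    Returns a list of (x, y, z) placements, ordered left to right.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    cx, cy, cz = center
    if count == 1:
        # A single speaker sits at the object's center.
        return [(cx, cy, cz)]
    left = cx - width / 2          # left edge of the object
    step = width / (count - 1)     # spacing between neighbouring speakers
    return [(left + i * step, cy, cz) for i in range(count)]
```

With these assumptions, a wider object pushes the outer virtual speakers further apart, so the rendered sound stage scales with the picture. The binaural rendering step itself is not shown.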

    Auditory origin synthesis
    Granted patent

    Publication No.: US11758348B1

    Publication date: 2023-09-12

    Application No.: US17570251

    Filing date: 2022-01-06

    Applicant: Apple Inc.

    CPC classification number: H04S7/303 H04S5/00 H04S2420/11

    Abstract: Each of a plurality of virtual loudspeaker arrays, together with its channels, is produced based on a corresponding microphone array and the microphone signals thereof. Channels of a hallucinated loudspeaker array are determined based on the channels of the plurality of virtual loudspeaker arrays. The plurality of virtual loudspeaker arrays and the hallucinated loudspeaker array share a common geometry and orientation. Spatial audio is rendered based on the channels of the hallucinated loudspeaker array.
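
The abstract does not state how the hallucinated array's channels are derived. Because all arrays share a common geometry and orientation, channel k of every array corresponds to the same direction, so one simple possibility is a channel-by-channel weighted blend. The sketch below uses inverse-distance weights relative to a listener position. That weighting scheme, the listener position input, and the name `blend_arrays` are assumptions for illustration and are not taken from the patent.

```python
import math


def blend_arrays(arrays, positions, listener):
    """Blend virtual loudspeaker arrays into one "hallucinated" array.

    arrays:    list of arrays; each array is a list of channels; each channel
               is a list of samples. All arrays must have the same shape.
    positions: position of each array's origin, e.g. (x, y) tuples.
    listener:  position at which the hallucinated array is synthesized.
    Assumption: inverse-distance weighting, normalized to sum to 1.
    """
    weights = []
    for array, pos in zip(arrays, positions):
        d = math.dist(pos, listener)
        if d == 0.0:
            # Listener sits exactly on an array origin: use that array as is.
            return [list(channel) for channel in array]
        weights.append(1.0 / d)
    total = sum(weights)
    weights = [w / total for w in weights]

    n_channels = len(arrays[0])
    n_samples = len(arrays[0][0])
    # Same geometry and orientation, so matching channel indices are mixed.
    return [
        [sum(w * a[c][t] for w, a in zip(weights, arrays)) for t in range(n_samples)]
        for c in range(n_channels)
    ]
```

A listener halfway between two arrays receives an equal mix of their channels, and the result approaches a single array's channels as the listener moves toward that array's origin.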
