Spatial characteristics of multi-channel source audio

    Publication number: US11586411B2

    Publication date: 2023-02-21

    Application number: US17047333

    Filing date: 2018-08-30

    Inventor: Sunil Bharitkar

    Abstract: In some examples, an audio control system can include a first set of resources, a second set of resources and a controller. The first set of resources can generate a frequency energy band representation of a multi-channel source audio input. Additionally, the second set of resources can determine at least a first value representing a strength of correlation between multiple channels of the multi-channel source audio input. Moreover, the controller can determine a set of control parameters for tuning sound creation from an audio signal generator to reflect a set of spatial characteristics of the source audio input, based on the frequency energy band representation and the first value.
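The two quantities named in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the equal-width band split, the naive DFT, the zero-lag Pearson coefficient, and the function names `band_energies` and `channel_correlation` are all assumptions made for illustration.

```python
import cmath
import math


def band_energies(samples, num_bands):
    """Sum the one-sided DFT power spectrum into equal-width bands (assumed layout)."""
    n = len(samples)
    half = n // 2
    power = []
    for k in range(half):
        # Naive DFT bin k; fine for short illustrative frames.
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    width = half // num_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(num_bands)]


def channel_correlation(left, right):
    """Zero-lag Pearson correlation between two channels, in [-1, 1]."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    num = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    den = math.sqrt(
        sum((l - mean_l) ** 2 for l in left) * sum((r - mean_r) ** 2 for r in right)
    )
    return num / den if den else 0.0
```

A correlation near 1 suggests largely shared (center-panned) content, while a value near 0 suggests diffuse or widely separated content. How such values map to control parameters is not specified in the abstract.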

    Immersive audio rendering
    Granted invention patent

    Publication number: US11457329B2

    Publication date: 2022-09-27

    Application number: US17084319

    Filing date: 2020-10-29

    Inventor: Sunil Bharitkar

    Abstract: In some examples, immersive audio rendering may include determining whether an audio signal includes a first content format including stereo content, or a second content format including multichannel or object-based content. In response to a determination that the audio signal includes the first content format, the audio signal may be routed to a first block that includes a low-frequency extension and a stereo to multichannel upmix to generate a resulting audio signal. Alternatively, the audio signal may be routed to another low-frequency extension to generate the resulting audio signal. The audio signal may be further processed by performing spatial synthesis on the resulting audio signal, and crosstalk cancellation on the spatial synthesized audio signal. Further, multiband-range compression may be performed on the crosstalk cancelled audio signal, and an output stereo signal may be generated based on the multiband-range compressed audio signal.

    AUDIO SIGNAL DEREVERBERATION
    Invention application

    Publication number: US20220114995A1

    Publication date: 2022-04-14

    Application number: US17419057

    Filing date: 2019-07-03

    Abstract: Audio signal dereverberation can be carried out in accordance with instructions on a machine readable storage medium, using a processor. In an example, a location of a person in a room can be determined. An audio signal received from the location of the person can be captured using beamforming. Room properties can be determined based in part on a signal sweep of the room. A dereverberation parameter can be determined based in part on the location of the person and the room properties. The dereverberation parameter can be applied to the audio signal.

    ENCODED FEATURES AND RATE-BASED AUGMENTATION BASED SPEECH AUTHENTICATION

    Publication number: US20210166715A1

    Publication date: 2021-06-03

    Application number: US16770724

    Filing date: 2018-02-16

    Inventor: Sunil Bharitkar

    Abstract: In some examples, with respect to encoded features and rate-based augmentation based speech authentication, a plurality of features of a registration speech signal for a user that is to be registered may be extracted. A speech rate of the registration speech signal may be modified to generate a rate-adjusted speech signal, and a plurality of features of the rate-adjusted speech signal may be extracted. The user may be registered by training, based on the plurality of extracted features of the registration speech signal and the plurality of extracted features of the rate-adjusted speech signal, a machine learning model. Further, based on the trained machine learning model, a determination may be made as to whether an authentication speech signal is authentic to authenticate the registered user.
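The rate-based augmentation step described above can be sketched roughly as follows. This is a simplified illustration, not the method claimed in the application: linear-interpolation resampling changes pitch along with rate, per-frame energy stands in for whatever features the application actually uses, and the names `change_rate`, `frame_energies`, and `augmented_features` are hypothetical. Model training is omitted.

```python
def change_rate(samples, rate):
    """Resample by linear interpolation; rate > 1 shortens (speeds up) the signal."""
    n_out = max(1, int(len(samples) / rate))
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out


def frame_energies(samples, frame_len):
    """Mean energy of each full, non-overlapping frame (placeholder feature)."""
    count = len(samples) // frame_len
    return [
        sum(x * x for x in samples[f * frame_len:(f + 1) * frame_len]) / frame_len
        for f in range(count)
    ]


def augmented_features(samples, rates, frame_len):
    """Features of the original signal followed by features of each rate-adjusted copy."""
    signals = [samples] + [change_rate(samples, r) for r in rates]
    return [frame_energies(s, frame_len) for s in signals]
```

The combined feature sets would then be passed to a classifier of the reader's choice during registration.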

    Crosstalk cancellation for speaker-based spatial rendering

    Publication number: US10771896B2

    Publication date: 2020-09-08

    Application number: US16471893

    Filing date: 2017-04-14

    Inventor: Sunil Bharitkar

    Abstract: In some examples, crosstalk cancellation for speaker-based spatial rendering may include perceptually smoothing head-related transfer functions (HRTFs) corresponding to ipsilateral and contralateral transfer paths of sound emitted from first and second speakers to corresponding first and second destinations. The crosstalk cancellation may further include inserting an inter-aural time difference in the perceptually smoothed HRTFs corresponding to the contralateral transfer paths. A crosstalk canceller may be generated by inverting the perceptually smoothed HRTFs corresponding to the ipsilateral transfer paths and the perceptually smoothed HRTFs corresponding to the contralateral transfer paths including the inserted inter-aural time difference.
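The inversion step in this abstract can be illustrated for a single frequency bin. The sketch below assumes the HRTFs have already been perceptually smoothed (that step is not shown), models the inter-aural time difference as a pure delay, and uses hypothetical names (`insert_itd`, `crosstalk_canceller`, `h_xy` for the path from speaker x to ear y). A practical design would typically also regularize the inversion.

```python
import cmath
import math


def insert_itd(h_contra, freq_hz, itd_seconds):
    """Apply a pure delay of itd_seconds to a contralateral response at freq_hz."""
    return h_contra * cmath.exp(-2j * math.pi * freq_hz * itd_seconds)


def crosstalk_canceller(h_ll, h_rl, h_lr, h_rr, eps=1e-12):
    """Invert the 2x2 matrix [[h_ll, h_rl], [h_lr, h_rr]] for one frequency bin.

    h_xy is the response from speaker x to ear y, so h_ll and h_rr are the
    ipsilateral paths and h_rl and h_lr are the contralateral paths.
    """
    det = h_ll * h_rr - h_rl * h_lr
    if abs(det) < eps:
        raise ValueError("acoustic matrix is singular at this frequency")
    return [[h_rr / det, -h_rl / det],
            [-h_lr / det, h_ll / det]]
```

Multiplying the acoustic matrix by the returned canceller yields the identity at that bin, meaning each ear receives only its intended channel under the stated assumptions.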

    IMMERSIVE AUDIO RENDERING
    Invention application

    Publication number: US20200236488A1

    Publication date: 2020-07-23

    Application number: US16487882

    Filing date: 2017-04-28

    Inventor: Sunil Bharitkar

    Abstract: In some examples, immersive audio rendering may include determining whether an audio signal includes a first content format including stereo content, or a second content format including multichannel or object-based content. In response to a determination that the audio signal includes the first content format, the audio signal may be routed to a first block that includes a low-frequency extension and a stereo to multichannel upmix to generate a resulting audio signal. Alternatively, the audio signal may be routed to another low-frequency extension to generate the resulting audio signal. The audio signal may be further processed by performing spatial synthesis on the resulting audio signal, and crosstalk cancellation on the spatial synthesized audio signal. Further, multiband-range compression may be performed on the crosstalk cancelled audio signal, and an output stereo signal may be generated based on the multiband-range compressed audio signal.

    DOUBLE TALK DETECTORS
    Invention publication

    Publication number: US20230171346A1

    Publication date: 2023-06-01

    Application number: US17919059

    Filing date: 2020-04-15

    CPC classification number: H04M9/082 H04B3/234

    Abstract: In example implementations, an apparatus is provided. The apparatus includes an adaptive filter and a double talk detector in communication with the adaptive filter. The adaptive filter is to calculate a transfer function with coefficients for a particular time that is applied to an output signal of a microphone to cancel echoes caused by a reference signal in the output signal of the microphone. The double talk detector is to determine a peak of the coefficients, detect double talk based on a location of the peak of the coefficients, and transmit a pause signal to the adaptive filter in response to detection of the double talk, wherein the pause signal is to pause a calculation of updates to the coefficients by the adaptive filter.
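The interaction between the detector and the adaptive filter can be sketched as below. The abstract does not state how the peak location is evaluated, so this sketch assumes a simple rule: double talk is flagged when the largest-magnitude coefficient moves more than `tolerance` taps away from an expected echo-path delay. The NLMS update, the parameter values, and all function names are illustrative assumptions.

```python
def peak_index(coeffs):
    """Index of the largest-magnitude filter coefficient."""
    return max(range(len(coeffs)), key=lambda i: abs(coeffs[i]))


def is_double_talk(coeffs, expected_peak, tolerance=2):
    """Assumed rule: a peak far from the expected echo delay signals double talk."""
    return abs(peak_index(coeffs) - expected_peak) > tolerance


def nlms_update(coeffs, ref_window, mic_sample, paused, mu=0.5, eps=1e-8):
    """One NLMS step; returns (coefficients, error). Coefficients freeze when paused."""
    estimate = sum(c * x for c, x in zip(coeffs, ref_window))
    error = mic_sample - estimate  # echo-cancelled output sample
    if paused:
        return list(coeffs), error
    norm = sum(x * x for x in ref_window) + eps
    return [c + mu * error * x / norm for c, x in zip(coeffs, ref_window)], error
```

In a processing loop, the result of `is_double_talk` for the current coefficients would be passed as `paused` to the next `nlms_update` call, which corresponds to the pause signal described in the abstract.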

    APPLYING DIRECTIONALITY TO AUDIO
    Invention application

    Publication number: US20220101126A1

    Publication date: 2022-03-31

    Application number: US17426678

    Filing date: 2019-02-14

    Inventor: Sunil Bharitkar

    Abstract: The present disclosure describes techniques for adding a perception of directionality to audio. The method includes receiving a set of head related transfer functions (HRTFs). The method also includes training an artificial neural network based on the HRTFs to generate a trained artificial neural network, wherein the trained artificial neural network represents a subspace reconstruction model for generating interpolated HRTFs. The trained artificial neural network is generated using Bayesian optimization to determine a number of layers and a number of neurons per layer of the trained artificial neural network. The method also includes storing the trained artificial neural network, wherein the trained artificial neural network is used to reconstruct a new head related transfer function for a specified direction. The new head related transfer function is used to process an audio signal to produce a perception of directionality.
