Auditory selection method and device based on memory and attention model

    Publication (Announcement) Number: US10818311B2

    Publication (Announcement) Date: 2020-10-27

    Application Number: US16632373

    Application Date: 2018-11-14

    Abstract: An auditory selection method based on a memory and attention model, including: step S1, encoding an original speech signal into a time-frequency matrix; step S2, encoding and transforming the time-frequency matrix into a speech vector; step S3, using a long-term memory unit to store a speaker and the speech vector corresponding to that speaker; step S4, obtaining the speech vector corresponding to a target speaker and separating the target speech from the original speech signal through an attention selection model. A storage device stores a plurality of programs configured to be loaded by a processor to execute the auditory selection method based on the memory and attention model. A processing unit includes the processor and the storage device.
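
    The pipeline of steps S1 and S4 can be sketched in a few lines of numpy. This is a minimal illustration, not the patented implementation: it assumes a magnitude STFT as the time-frequency encoding and a simple dot-product softmax as the attention selection model, neither of which is specified in the abstract.

    ```python
    import numpy as np

    def stft_matrix(signal, frame_len=256, hop=128):
        """Step S1 (sketch): encode a 1-D speech signal as a
        time-frequency magnitude matrix via a windowed FFT."""
        window = np.hanning(frame_len)
        n_frames = 1 + (len(signal) - frame_len) // hop
        frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                           for i in range(n_frames)])
        return np.abs(np.fft.rfft(frames, axis=1))   # shape: (time, freq)

    def attention_select(tf_matrix, speaker_vec):
        """Step S4 (sketch): weight each time frame by its similarity to the
        target speaker's stored vector, a crude stand-in for the attention
        selection model."""
        scores = tf_matrix @ speaker_vec             # one score per frame
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                     # softmax over frames
        return tf_matrix * weights[:, None]          # re-weighted T-F matrix

    rng = np.random.default_rng(0)
    mix = rng.standard_normal(4096)                  # toy mixture signal
    tf = stft_matrix(mix)                            # (31, 129) for these sizes
    speaker = rng.standard_normal(tf.shape[1]) * 0.01  # hypothetical stored vector
    selected = attention_select(tf, speaker)
    ```

    In the actual method the speaker vector would come from the long-term memory unit of step S3 rather than being random, and the selected time-frequency representation would be inverted back to a waveform.
    
    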

    Target speaker separation system, device and storage medium

    Publication (Announcement) Number: US11978470B2

    Publication (Announcement) Date: 2024-05-07

    Application Number: US17980473

    Application Date: 2022-11-03

    CPC classification number: G10L21/028 G10L17/02 G10L17/04 G10L17/06 H04S1/007

    Abstract: Disclosed are a target speaker separation system, an electronic device and a storage medium. The system includes: first, performing joint unified modeling on a plurality of cues based on a masked pre-training strategy, to boost the inference capability of a model for missing cues and enhance the representation accuracy of disturbed cues; and second, constructing a hierarchical cue modulation module. A spatial cue is introduced into a primary cue modulation module for directional enhancement of a speaker's speech; in an intermediate cue modulation module, the speech of the speaker is enhanced on the basis of temporal coherence between a dynamic cue and an auditory signal component; a steady-state cue is introduced into an advanced cue modulation module for selective filtering; and finally, the supervised learning capability on simulated data and the unsupervised learning effect on real mixed data are fully utilized.
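
    The two ideas above, masking cues during pre-training and applying cue-derived gains in three stages, can be sketched as follows. This is an illustrative simplification, assuming cues are fixed-length vectors and each modulation stage is a multiplicative mask on a time-frequency representation; the patent's actual modules are learned networks.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def mask_cues(cues, mask_prob=0.3):
        """Masked pre-training input (sketch): randomly zero whole cue
        vectors so a model must infer the missing cues from those remaining."""
        dropped = rng.random(cues.shape[0]) < mask_prob
        masked = cues.copy()
        masked[dropped] = 0.0
        return masked, dropped

    def hierarchical_modulation(tf, spatial_mask, dynamic_mask, steady_mask):
        """Three cue modulation stages applied in order (sketch):
        primary      - spatial cue for directional enhancement,
        intermediate - dynamic cue enforcing temporal coherence,
        advanced     - steady-state cue for selective filtering.
        Each mask holds gains in [0, 1] with the same shape as tf."""
        tf = tf * spatial_mask
        tf = tf * dynamic_mask
        tf = tf * steady_mask
        return tf

    cues = rng.standard_normal((5, 16))          # 5 hypothetical cue vectors
    masked, dropped = mask_cues(cues)

    tf = np.abs(rng.standard_normal((31, 129)))  # mixture T-F magnitudes
    masks = [rng.random(tf.shape) for _ in range(3)]
    enhanced = hierarchical_modulation(tf, *masks)
    ```

    Because every stage only attenuates, the staged design lets each cue veto energy the earlier cues admitted, which is the selective-filtering behavior the abstract attributes to the advanced module.
    
    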

    Speech extraction method, system, and device based on supervised learning auditory attention

    Publication (Announcement) Number: US10923136B2

    Publication (Announcement) Date: 2021-02-16

    Application Number: US16645447

    Application Date: 2019-04-19

    Abstract: A speech extraction method based on supervised learning auditory attention includes: converting an original overlapping speech signal into a two-dimensional time-frequency representation by a short-time Fourier transform to obtain a first overlapping speech signal; performing a first sparsification on the first overlapping speech signal, mapping intensity information of each time-frequency unit to preset D intensity levels, and performing a second sparsification based on the information of the preset D intensity levels to obtain a second overlapping speech signal; converting the second overlapping speech signal into a pulse signal by a time coding method; extracting a target pulse from the pulse signal by a trained target pulse extraction network; and converting the target pulse into a time-frequency representation of the target speech, then recovering the target speech by an inverse short-time Fourier transform.
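
    The intensity-level sparsification and time-coding steps can be sketched in numpy. This is a minimal illustration under stated assumptions: uniform quantization into D levels and a "stronger fires earlier" time code, both plausible but not specified by the abstract.

    ```python
    import numpy as np

    def sparsify_to_levels(tf, D=8):
        """First sparsification (sketch): map each time-frequency unit's
        intensity onto D discrete levels via uniform quantization."""
        lo, hi = tf.min(), tf.max()
        levels = np.floor((tf - lo) / (hi - lo + 1e-12) * D).astype(int)
        return np.clip(levels, 0, D - 1)

    def time_encode(levels, D=8):
        """Time coding (sketch): convert each unit's level to a pulse time;
        the strongest level D-1 fires at t=0, level 0 fires last."""
        return (D - 1 - levels).astype(float)

    rng = np.random.default_rng(2)
    tf = np.abs(rng.standard_normal((10, 12)))   # toy T-F magnitudes
    levels = sparsify_to_levels(tf)              # integers in [0, D-1]
    pulses = time_encode(levels)                 # pulse times in [0, D-1]
    ```

    In the patented method the resulting pulse signal would then be passed to the trained target pulse extraction network; the sketch stops at the encoding.
    
    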
