Speech recognition and codec method and apparatus, electronic device and storage medium

    Publication number: US12183324B2

    Publication date: 2024-12-31

    Application number: US17738651

    Application date: 2022-05-06

    Abstract: The present disclosure provides speech recognition and codec methods and apparatuses, an electronic device, and a storage medium, and relates to the field of artificial intelligence, such as intelligent speech, deep learning, and natural language processing. The speech recognition method may include: acquiring an audio feature of to-be-recognized speech; encoding the audio feature to obtain an encoding feature; truncating the encoding feature to obtain N continuous feature segments, N being a positive integer greater than one; and acquiring, for any one of the feature segments, corresponding historical feature abstraction information, encoding the feature segment in combination with the historical feature abstraction information, and decoding the encoding result to obtain a recognition result corresponding to the feature segment, wherein the historical feature abstraction information is obtained by feature abstraction of previously recognized feature segments.
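The segment-by-segment flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the fixed segment length, the mean-pooling used as "feature abstraction", and the function names `truncate`, `abstract_segment`, and `recognize_streaming` are all assumptions made for illustration, and the decoder is replaced by a placeholder that only reports how much context was visible.

```python
from typing import Dict, List

Vector = List[float]


def truncate(features: List[Vector], segment_len: int) -> List[List[Vector]]:
    """Split frame-level encoding features into consecutive segments.

    Assumption: fixed-length truncation; the abstract does not say how
    the truncation points are chosen.
    """
    return [features[i:i + segment_len] for i in range(0, len(features), segment_len)]


def abstract_segment(segment: List[Vector]) -> Vector:
    """Hypothetical feature abstraction: mean-pool a segment into one vector."""
    dim = len(segment[0])
    return [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]


def recognize_streaming(features: List[Vector], segment_len: int) -> List[Dict[str, int]]:
    """Process segments in order, carrying summaries of earlier segments forward."""
    history: List[Vector] = []  # one abstraction vector per recognized segment
    results = []
    for segment in truncate(features, segment_len):
        # Encode the current segment together with the historical summaries.
        context = history + segment
        # Placeholder for encode + decode: record the context sizes only.
        results.append({"history_len": len(history), "context_len": len(context)})
        # After recognition, abstract this segment and add it to the history.
        history.append(abstract_segment(segment))
    return results
```

The point of the design is that each segment sees a compact summary of everything before it rather than the full earlier feature sequence, so the context grows by one vector per past segment instead of by one vector per past frame.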

    Method of processing audio data, electronic device and storage medium

    Publication number: US11984134B2

    Publication date: 2024-05-14

    Application number: US18071187

    Application date: 2022-11-29

    CPC classification number: G10L25/18 G10L25/30

    Abstract: A method of processing audio data, an electronic device, and a storage medium are provided, relating to the field of artificial intelligence, in particular to speech processing technology. The method includes: processing spectral data of the audio data to obtain first feature information; obtaining fundamental frequency indication information according to the first feature information, wherein the fundamental frequency indication information indicates valid audio data of the first feature information and invalid audio data of the first feature information; obtaining fundamental frequency information and spectral energy information according to the first feature information and the fundamental frequency indication information; and obtaining harmonic structure information of the audio data according to the fundamental frequency information and the spectral energy information.
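The last two steps of this abstract can be sketched as follows. This is only an illustration under stated assumptions: the per-frame boolean `voiced` stands in for the fundamental frequency indication information, harmonics are placed at integer multiples of the fundamental up to the Nyquist frequency, and the `energy / k` roll-off and the name `harmonic_structure` are hypothetical choices that the abstract does not specify.

```python
from typing import List, Tuple


def harmonic_structure(
    f0_hz: List[float],
    voiced: List[bool],
    energy: List[float],
    sample_rate: int = 16000,
    max_harmonics: int = 5,
) -> List[List[Tuple[float, float]]]:
    """Return (frequency, amplitude) pairs per frame.

    Frames flagged as invalid (unvoiced) yield an empty list.
    Assumption: amplitude of the k-th harmonic is energy / k.
    """
    nyquist = sample_rate / 2
    frames = []
    for f0, is_valid, e in zip(f0_hz, voiced, energy):
        if not is_valid or f0 <= 0:
            frames.append([])  # no harmonic structure for invalid frames
            continue
        # Keep only harmonics that do not exceed the Nyquist frequency.
        count = min(max_harmonics, int(nyquist // f0))
        frames.append([(k * f0, e / k) for k in range(1, count + 1)])
    return frames
```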

    AUDIO RECOGNIZING METHOD, APPARATUS, DEVICE, MEDIUM AND PRODUCT

    Publication number: US20230206943A1

    Publication date: 2023-06-29

    Application number: US17891596

    Application date: 2022-08-19

    CPC classification number: G10L25/93 G10L25/03

    Abstract: An audio recognizing method, including: performing acoustic feature prediction on audio to be recognized to obtain a first audio prediction result and an acoustic feature reference quantity for predicting an audio recognition result; obtaining a second audio prediction result based on the acoustic feature reference quantity; and determining the audio recognition result of the audio to be recognized based on the first audio prediction result and the second audio prediction result, the audio recognition result including unvoiced sound or voiced sound. When determining whether the audio is unvoiced sound or voiced sound, the first audio prediction result obtained by performing acoustic feature prediction on the audio to be recognized is used, and the second audio prediction result is obtained in combination with other acoustic feature reference quantities, thereby making the unvoiced or voiced determination more accurate and improving the audio quality in speech processing.
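The two-prediction decision described here can be sketched in a few lines. The abstract does not say what the reference quantity is or how the two results are combined, so everything specific below is an assumption: the first prediction is taken to be a voiced probability, the reference quantity is treated as a single number compared against a threshold, and the two results are merged by a weighted average. The name `classify_voicing` and all parameter defaults are hypothetical.

```python
def classify_voicing(
    first_prob: float,
    reference: float,
    ref_threshold: float = 0.1,
    weight: float = 0.5,
    decision_threshold: float = 0.5,
) -> str:
    """Combine two voicing predictions into 'voiced' or 'unvoiced'.

    first_prob: voiced probability from acoustic feature prediction (assumed form).
    reference:  acoustic feature reference quantity (assumed to be a scalar).
    """
    # Second prediction, derived only from the reference quantity.
    second_prob = 1.0 if reference >= ref_threshold else 0.0
    # Assumed combination rule: weighted average of the two predictions.
    combined = weight * first_prob + (1.0 - weight) * second_prob
    return "voiced" if combined >= decision_threshold else "unvoiced"
```

With these defaults, a frame the first predictor rates at 0.9 is still labeled unvoiced when the reference quantity falls below its threshold, which shows how the second prediction can override a confident but mistaken first one.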
