Speech recognition and codec method and apparatus, electronic device and storage medium

    Publication No.: US12183324B2

    Publication date: 2024-12-31

    Application No.: US17738651

    Filing date: 2022-05-06

    Abstract: The present disclosure provides speech recognition and codec methods and apparatuses, an electronic device and a storage medium, and relates to fields of artificial intelligence such as intelligent speech, deep learning and natural language processing. The speech recognition method may include: acquiring an audio feature of to-be-recognized speech; encoding the audio feature to obtain an encoding feature; truncating the encoding feature to obtain N consecutive feature segments, N being a positive integer greater than one; and, for any one of the feature segments, acquiring corresponding historical feature abstraction information, encoding the feature segment in combination with the historical feature abstraction information, and decoding the encoding result to obtain a recognition result corresponding to the feature segment, wherein the historical feature abstraction information is obtained by feature abstraction of previously recognized feature segments.
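
The segment-by-segment flow described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how truncation or feature abstraction is performed, so fixed-length truncation and a per-segment mean vector are stand-ins, and the names `truncate`, `abstract_segment`, `recognize`, `encode_segment` and `decode` are hypothetical.

```python
from typing import Callable, List

Frame = List[float]


def truncate(encoded: List[Frame], segment_len: int) -> List[List[Frame]]:
    """Cut the encoding feature into consecutive segments.

    Assumption: fixed-length truncation; the abstract does not specify the rule.
    """
    return [encoded[i:i + segment_len] for i in range(0, len(encoded), segment_len)]


def abstract_segment(segment: List[Frame]) -> Frame:
    """Stand-in feature abstraction: the mean frame of a segment (assumption)."""
    dim = len(segment[0])
    return [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]


def recognize(
    encoded: List[Frame],
    segment_len: int,
    encode_segment: Callable[[List[Frame], List[Frame]], object],
    decode: Callable[[object], str],
) -> List[str]:
    """Encode each segment together with the history summaries, then decode it."""
    history: List[Frame] = []  # one summary per already-recognized segment
    results: List[str] = []
    for segment in truncate(encoded, segment_len):
        # Pass a copy so the encoder sees only summaries of earlier segments.
        encoding = encode_segment(segment, list(history))
        results.append(decode(encoding))
        history.append(abstract_segment(segment))
    return results


# Toy usage: the "encoder" only reports how much history it saw.
frames = [[1.0], [3.0], [5.0], [7.0]]
out = recognize(
    frames,
    segment_len=2,
    encode_segment=lambda seg, hist: (len(seg), hist),
    decode=lambda enc: f"{enc[0]} frames, {len(enc[1])} history vectors",
)
```

The point of the sketch is the data flow: each segment is decoded with a compact summary of everything recognized before it, rather than with the full earlier frames.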

    Method for training a linguistic model and electronic device

    Publication No.: US11900918B2

    Publication date: 2024-02-13

    Application No.: US17451380

    Filing date: 2021-10-19

    CPC classification number: G10L15/063 G06F40/253 G06F40/30

    Abstract: The present disclosure provides a method for training a linguistic model, related to the fields of speech, natural language processing, and deep learning technologies. The method includes: obtaining grammars corresponding to a plurality of sample texts and a slot value of a slot in each grammar by using semantic analysis; generating a grammar graph corresponding to each grammar based on the corresponding grammar and the slot value of the slot in the corresponding grammar; obtaining a weight of each grammar, a weight of each slot, and a weight of each slot value in each grammar graph based on the sample texts; determining at least one grammar frequency of each order based on the weight of each grammar, the weight of each slot, and the weight of each slot value in each grammar graph; and training the linguistic model based on the at least one grammar frequency of each order.
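
One way to picture how weights on grammars, slots and slot values could turn into frequencies "of each order" is the sketch below. It rests on several assumptions that the abstract does not confirm: that a "grammar frequency of order n" behaves like a weighted n-gram count, that a grammar graph can be represented as a token template with `{slot}` placeholders, and that the weight of an expanded sentence is the product of the grammar, slot and slot-value weights. The dictionary layout and the function name `ngram_frequencies` are hypothetical.

```python
from collections import Counter
from itertools import product


def ngram_frequencies(grammars, max_order=2):
    """Accumulate weighted n-gram counts over every expansion of each grammar.

    Assumed (hypothetical) layout of one grammar:
      {"template": ["play", "{song}"], "weight": 2.0,
       "slots": {"song": {"weight": 1.0, "values": {"yesterday": 0.75}}}}
    """
    freqs = {n: Counter() for n in range(1, max_order + 1)}
    for grammar in grammars:
        slot_names = [t[1:-1] for t in grammar["template"] if t.startswith("{")]
        choices = [grammar["slots"][name]["values"].items() for name in slot_names]
        for combo in product(*choices):
            # Path weight = grammar weight * slot weight * slot-value weight (assumption).
            weight = grammar["weight"]
            filled = {}
            for name, (value, value_weight) in zip(slot_names, combo):
                weight *= grammar["slots"][name]["weight"] * value_weight
                filled[name] = value
            tokens = []
            for t in grammar["template"]:
                if t.startswith("{"):
                    tokens.extend(filled[t[1:-1]].split())
                else:
                    tokens.append(t)
            for n, counter in freqs.items():
                for i in range(len(tokens) - n + 1):
                    counter[tuple(tokens[i:i + n])] += weight
    return freqs


grammars = [{
    "template": ["play", "{song}"],
    "weight": 2.0,
    "slots": {"song": {"weight": 1.0, "values": {"yesterday": 0.75, "imagine": 0.25}}},
}]
freqs = ngram_frequencies(grammars)
```

Counts of this kind could then be fed to an n-gram language model trainer in place of counts taken from raw text; how the patent actually trains the model is not stated in the abstract.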

    Method for training a linguistic model and electronic device

    Publication No.: US20220036880A1

    Publication date: 2022-02-03

    Application No.: US17451380

    Filing date: 2021-10-19

    Abstract: The present disclosure provides a method for training a linguistic model, related to the fields of speech, natural language processing, and deep learning technologies. The method includes: obtaining grammars corresponding to a plurality of sample texts and a slot value of a slot in each grammar by using semantic analysis; generating a grammar graph corresponding to each grammar based on the corresponding grammar and the slot value of the slot in the corresponding grammar; obtaining a weight of each grammar, a weight of each slot, and a weight of each slot value in each grammar graph based on the sample texts; determining at least one grammar frequency of each order based on the weight of each grammar, the weight of each slot, and the weight of each slot value in each grammar graph; and training the linguistic model based on the at least one grammar frequency of each order.

    Speech recognition method and apparatus

    Publication No.: US12067977B2

    Publication date: 2024-08-20

    Application No.: US17684681

    Filing date: 2022-03-02

    CPC classification number: G10L15/183 G06N5/048

    Abstract: The present disclosure discloses a speech recognition method and apparatus, and relates to the field of speech and deep learning technologies. A specific implementation scheme involves: acquiring the candidate recognition results with the top N recognition scores outputted by a speech recognition model for to-be-recognized speech, N being a positive integer greater than 1; scoring the N candidate recognition results based on pronunciation similarities between the candidate recognition results and pre-collected popular entities, to obtain similarity scores of the candidate recognition results; and integrating the recognition scores and the similarity scores of the candidate recognition results to determine a recognition result corresponding to the to-be-recognized speech from the N candidate recognition results. The present disclosure can improve recognition accuracy.
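
The N-best rescoring idea in this abstract can be illustrated with a minimal sketch. Two things here are assumptions rather than the patented method: pronunciation similarity is approximated by character-level similarity from Python's standard `difflib.SequenceMatcher` (a real system would compare phoneme or pinyin sequences), and the "integration" of the two scores is a simple weighted sum. The names `pronunciation_similarity`, `similarity_score`, `rescore` and the `weight` parameter are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def pronunciation_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] based on spelling, not real phonetics."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_score(candidate: str, popular_entities: List[str]) -> float:
    """Best similarity between a candidate and any pre-collected popular entity."""
    return max(
        (pronunciation_similarity(candidate, entity) for entity in popular_entities),
        default=0.0,
    )


def rescore(
    candidates: List[Tuple[str, float]],
    popular_entities: List[str],
    weight: float = 0.5,
) -> str:
    """Pick the candidate with the best combined score.

    candidates: (text, recognition_score) pairs for the top-N hypotheses, N > 1.
    weight: share given to the similarity score (assumed weighted-sum integration).
    """
    if len(candidates) < 2:
        raise ValueError("expected at least two candidate recognition results")

    def combined(item: Tuple[str, float]) -> float:
        text, recognition_score = item
        return (1 - weight) * recognition_score + weight * similarity_score(
            text, popular_entities
        )

    return max(candidates, key=combined)[0]


n_best = [("play the beetles", 0.52), ("play the beatles", 0.48)]
chosen = rescore(n_best, popular_entities=["the beatles"])
baseline = rescore(n_best, popular_entities=[])
```

In this toy case the recognizer slightly prefers the first hypothesis, while the popular-entity list tips the combined score toward the second one; with an empty entity list the ranking falls back to the recognition scores alone.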
