METHOD FOR DISPLAYING STREAMING SPEECH RECOGNITION RESULT, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication No.: US20220068265A1

    Publication Date: 2022-03-03

    Application No.: US17521473

    Filing Date: 2021-11-08

    Abstract: The disclosure provides a method for displaying a streaming speech recognition result, and relates to the fields of speech technologies, deep learning technologies and natural language processing technologies. The method includes: obtaining a plurality of continuous speech segments of an input audio stream, and simulating an end of a target speech segment in the plurality of continuous speech segments as a sentence ending; performing feature extraction on a current speech segment to be recognized based on a first feature extraction mode when the current speech segment is the target speech segment; performing feature extraction on the current speech segment based on a second feature extraction mode when the current speech segment is not the target speech segment; and obtaining a real-time recognition result by inputting a feature sequence extracted from the current speech segment into a streaming multi-layer truncated attention model, and displaying the real-time recognition result.
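
The per-segment choice between the two feature extraction modes can be sketched roughly as follows. This is only an illustration of the control flow described in the abstract: the function names, the placeholder end-of-sentence flag, and the `model` callable are all hypothetical, and the real extraction modes and the streaming multi-layer truncated attention model are not specified here.

```python
from typing import Callable, List, Sequence

Features = List[float]


def extract_as_sentence_end(segment: Sequence[float]) -> Features:
    """Hypothetical 'first mode': the segment end is treated as a sentence ending."""
    return list(segment) + [1.0]  # 1.0 is a placeholder end-of-sentence flag


def extract_streaming(segment: Sequence[float]) -> Features:
    """Hypothetical 'second mode': ordinary mid-stream extraction."""
    return list(segment) + [0.0]


def recognize_stream(
    segments: Sequence[Sequence[float]],
    is_target: Callable[[int], bool],
    model: Callable[[Features], str],
) -> List[str]:
    """Return one real-time result per segment, choosing the mode per segment."""
    results = []
    for index, segment in enumerate(segments):
        if is_target(index):
            features = extract_as_sentence_end(segment)
        else:
            features = extract_streaming(segment)
        # `model` stands in for the recognition model; it is an assumption.
        results.append(model(features))
    return results
```

In a real system each result would be displayed as soon as it is produced rather than collected in a list.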

    METHOD FOR TRAINING SPEECH RECOGNITION MODEL, DEVICE AND STORAGE MEDIUM

    Publication No.: US20220310064A1

    Publication Date: 2022-09-29

    Application No.: US17571805

    Filing Date: 2022-01-10

    Abstract: A method for training a speech recognition model, a device and a storage medium, which relate to the field of computer technologies, and particularly to the fields of speech recognition technologies, deep learning technologies, or the like, are disclosed. The method for training a speech recognition model includes: obtaining a fusion probability of each of at least one candidate text corresponding to a speech based on an acoustic decoding model and a language model; selecting a preset number of one or more candidate texts based on the fusion probability of each of the at least one candidate text, and determining a predicted text based on the preset number of one or more candidate texts; and obtaining a loss function based on the predicted text and a standard text corresponding to the speech, and training the speech recognition model based on the loss function.
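
The candidate selection step can be illustrated with a minimal sketch. It assumes that the fusion probability is a weighted sum of acoustic and language-model log-probabilities, that the predicted text is the top-ranked selected candidate, and that edit distance serves as a stand-in loss. None of these choices is stated in the abstract, and every name and the `lm_weight` value are hypothetical.

```python
from typing import Dict, List, Tuple


def fuse(acoustic_logp: float, lm_logp: float, lm_weight: float = 0.3) -> float:
    """Assumed fusion rule: acoustic log-prob plus weighted LM log-prob."""
    return acoustic_logp + lm_weight * lm_logp


def select_candidates(
    candidates: Dict[str, Tuple[float, float]], n: int, lm_weight: float = 0.3
) -> List[Tuple[str, float]]:
    """Keep the n candidate texts with the highest fused score."""
    scored = [(text, fuse(a, l, lm_weight)) for text, (a, l) in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here only as a stand-in loss."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def predict_and_score(
    candidates: Dict[str, Tuple[float, float]], standard_text: str, n: int
) -> Tuple[str, int]:
    """Pick the best of the top-n candidates and compare it to the standard text."""
    predicted = select_candidates(candidates, n)[0][0]
    return predicted, edit_distance(predicted, standard_text)
```

A trainable system would need a differentiable loss; edit distance is used here only to make the comparison between predicted and standard text concrete.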

    METHOD FOR RECOGNIZING CHINESE-ENGLISH MIXED SPEECH, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication No.: US20220139369A1

    Publication Date: 2022-05-05

    Application No.: US17530276

    Filing Date: 2021-11-18

    Abstract: A method for recognizing Chinese-English mixed speech includes: in response to receiving speech information, determining pronunciation information and language model scores of the speech information; determining whether an English word exists in content of the speech information based on the pronunciation information; in response to the English word existing in the content of the speech information, determining a Chinese word corresponding to the English word based on a preset Chinese-English mapping table, in which the Chinese-English mapping table includes a mapping relationship of at least one pair of an English word and a Chinese word; determining a score of the Chinese word corresponding to the English word; replacing a score of the English word in the scores of the language model with the score of the Chinese word; and obtaining a speech recognition result for the speech information based on the replaced scores of the language model.
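
The score replacement step can be sketched as a dictionary lookup. This is a simplified illustration: English words are detected here by their presence in the mapping table rather than from pronunciation information, and the function name, the table contents, and the `score_of` callable are hypothetical.

```python
from typing import Callable, Dict


def replace_english_scores(
    word_scores: Dict[str, float],
    en_to_zh: Dict[str, str],
    score_of: Callable[[str], float],
) -> Dict[str, float]:
    """Return a copy of the language-model scores in which each English word
    found in the mapping table takes the score of its mapped Chinese word."""
    replaced = {}
    for word, score in word_scores.items():
        zh_word = en_to_zh.get(word)
        if zh_word is not None:
            # English word with a known Chinese counterpart: use that score.
            replaced[word] = score_of(zh_word)
        else:
            # Chinese word, or English word missing from the table: unchanged.
            replaced[word] = score
    return replaced
```

The recognition result would then be decoded from the replaced scores, which keeps a rare English word from being penalized by a language model trained mostly on Chinese text.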

    METHOD AND APPARATUS OF PERFORMING VOICE INTERACTION, ELECTRONIC DEVICE, AND READABLE STORAGE MEDIUM

    Publication No.: US20220068277A1

    Publication Date: 2022-03-03

    Application No.: US17522985

    Filing Date: 2021-11-10

    Abstract: The present disclosure provides a method and apparatus of performing a voice interaction, an electronic device and a readable storage medium, which relate to the technical fields of voice processing and deep learning. The method of performing the voice interaction includes: acquiring audio to be recognized; obtaining a recognition result for the audio to be recognized by using an audio recognition model, and extracting an input of an output layer of the audio recognition model in a recognition process as a recognition feature; obtaining a response confidence level according to the recognition feature; and responding to the audio to be recognized in response to determining that the response confidence level meets a preset response condition.
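
The final gating step can be sketched as below. It assumes the response confidence is produced by a simple logistic scorer over the recognition feature and that the preset response condition is a threshold. Both are illustrative assumptions, and the names, weights, and threshold value are hypothetical.

```python
import math
from typing import Sequence


def response_confidence(
    recognition_feature: Sequence[float], weights: Sequence[float], bias: float = 0.0
) -> float:
    """Map a recognition feature to a confidence in (0, 1) with a logistic scorer.

    The logistic scorer is a stand-in for whatever model the method actually uses.
    """
    z = sum(w * x for w, x in zip(weights, recognition_feature)) + bias
    # Numerically stable sigmoid for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def should_respond(confidence: float, threshold: float = 0.5) -> bool:
    """Assumed response condition: confidence at or above a preset threshold."""
    return confidence >= threshold
```

Here `recognition_feature` plays the role of the vector fed into the output layer of the recognition model, so the confidence estimate reuses work the recognizer has already done.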
