-
Publication No.: US20220108684A1
Publication date: 2022-04-07
Application No.: US17644749
Filing date: 2021-12-16
Inventor: Xiaoyin FU, Mingxin LIANG, Zhijie CHEN, Qiguang ZANG, Zhengxiang JIANG, Liao ZHANG, Qi ZHANG, Lei JIA
IPC: G10L15/02, G10L15/16, G10L19/032
Abstract: The present disclosure provides a method of recognizing speech offline, an electronic device, and a storage medium, relating to fields of artificial intelligence such as speech recognition, natural language processing, and deep learning. The method may include: decoding speech data to be recognized into a syllable recognition result; and transforming the syllable recognition result into a corresponding text as a speech recognition result of the speech data.
-
Publication No.: US20230195998A1
Publication date: 2023-06-22
Application No.: US17952556
Filing date: 2022-09-26
Inventor: Yunze GAO, Xiaoping WANG, Penghao RAO, Fenfen SHENG, Mingxin LIANG
IPC: G06F40/129, G06F40/123, G06N5/02
CPC classification number: G06F40/129, G06F40/123, G06N5/022
Abstract: Disclosed are a sample generation method, a model training method, a trajectory recognition method, a device, and a medium. The sample generation method includes: determining a code result of a training Chinese character according to a preset code library, where the preset code library is generated based on code characters in a five-stroke code corpus; taking the code result as a training label of the training Chinese character; and generating a training sample according to both a writing trajectory and the training label of the training Chinese character. The amount of information carried in the training sample is thereby enriched.
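Note: the three steps in this abstract (look up a code result, use it as the label, pair it with the writing trajectory) can be sketched as below. This is only one possible reading, not the patented implementation. The function names, the dictionary layout, and the codes in `corpus` are hypothetical placeholders; the codes are not taken from any real five-stroke table.

```python
def build_code_library(corpus):
    """Collect the distinct code characters in the corpus and index them."""
    symbols = sorted({symbol for code in corpus.values() for symbol in code})
    return {symbol: index for index, symbol in enumerate(symbols)}


def encode_character(char, corpus, library):
    """Code result: the character's code expressed as library indices."""
    return [library[symbol] for symbol in corpus[char]]


def make_training_sample(char, trajectory, corpus, library):
    """Pair the writing trajectory with the code result used as the label."""
    return {
        "trajectory": trajectory,
        "label": encode_character(char, corpus, library),
    }


# Placeholder codes for illustration only (not real five-stroke codes).
corpus = {"甲": "ab", "乙": "bca"}
library = build_code_library(corpus)

# A trajectory is assumed here to be a list of (x, y) pen positions.
sample = make_training_sample("乙", [(0, 0), (3, 1), (5, 4)], corpus, library)
```

Under these assumptions the label is a sequence of code-character indices rather than a single class id, which is one way a sample could carry more information per character.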
-
Publication No.: US20230090590A1
Publication date: 2023-03-23
Application No.: US17738651
Filing date: 2022-05-06
Inventor: Xiaoyin FU, Zhijie CHEN, Mingxin LIANG, Mingshun YANG, Lei JIA, Haifeng WANG
IPC: G10L15/02, G10L15/26, G10L15/187, G06F16/683
Abstract: The present disclosure provides speech recognition and codec methods and apparatuses, an electronic device and a storage medium, and relates to fields of artificial intelligence such as intelligent speech, deep learning and natural language processing. The speech recognition method may include: acquiring an audio feature of to-be-recognized speech; encoding the audio feature to obtain an encoding feature; truncating the encoding feature to obtain N continuous feature segments, N being a positive integer greater than one; and acquiring, for any one of the feature segments, corresponding historical feature abstraction information, encoding the feature segment in combination with the historical feature abstraction information, and decoding an encoding result to obtain a recognition result corresponding to the feature segment, wherein the historical feature abstraction information is information obtained by feature abstraction of recognized historical feature segments.
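Note: the segment-by-segment flow described in this abstract can be sketched as below. The abstract does not say how the encoding feature is truncated or how history is abstracted, so the fixed segment length, the mean-vector summary, and the `decode_segment` callback are all assumptions made for illustration.

```python
def truncate(encoded, segment_len):
    """Cut the encoded frames into consecutive segments of segment_len."""
    return [encoded[i:i + segment_len]
            for i in range(0, len(encoded), segment_len)]


def abstract_segment(segment):
    """Placeholder feature abstraction: the mean vector of the frames."""
    dim = len(segment[0])
    return [sum(frame[d] for frame in segment) / len(segment)
            for d in range(dim)]


def recognize_streaming(encoded, segment_len, decode_segment):
    """Decode each segment together with summaries of earlier segments."""
    history = []   # one summary vector per already-recognized segment
    results = []
    for segment in truncate(encoded, segment_len):
        # decode_segment is a hypothetical stand-in for the step that
        # encodes the segment with its history and decodes the result.
        results.append(decode_segment(segment, list(history)))
        history.append(abstract_segment(segment))
    return results


# Toy input: six one-dimensional frames, two frames per segment.
encoded = [[float(i)] for i in range(6)]
results = recognize_streaming(
    encoded, 2, lambda segment, history: (len(segment), len(history)))
```

In this toy run the callback only reports how many frames and how many history summaries it received, which shows that each segment sees one more summary than the previous one.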
-
Publication No.: US20220310064A1
Publication date: 2022-09-29
Application No.: US17571805
Filing date: 2022-01-10
Inventor: Junyao SHAO, Xiaoyin FU, Qiguang ZANG, Zhijie CHEN, Mingxin LIANG, Huanxin ZHENG, Sheng QIAN
IPC: G10L15/06, G10L15/183, G10L15/16, G10L15/28
Abstract: Disclosed are a method for training a speech recognition model, a device, and a storage medium, which relate to the field of computer technologies, and particularly to fields such as speech recognition and deep learning. The method for training a speech recognition model includes: obtaining a fusion probability of each of at least one candidate text corresponding to a speech based on an acoustic decoding model and a language model; selecting a preset number of one or more candidate texts based on the fusion probability of each of the at least one candidate text, and determining a predicted text based on the preset number of one or more candidate texts; and obtaining a loss function based on the predicted text and a standard text corresponding to the speech, and training the speech recognition model based on the loss function.
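Note: the first two steps of this abstract (fusing acoustic and language model scores, then keeping a preset number of candidates) can be sketched as below. The abstract does not state how the fusion probability is computed; the log-linear interpolation, the `lm_weight` value, the softmax renormalization, and the example scores are assumptions. The sketch stops before the loss computation.

```python
import math


def fuse_and_select(candidates, lm_weight=0.3, top_k=2):
    """Rank candidates by a fused score and keep the best top_k.

    candidates: list of (text, acoustic_logprob, lm_logprob) tuples.
    Returns (text, probability) pairs renormalized over the kept set.
    """
    # Assumed fusion rule: acoustic log-prob plus weighted LM log-prob.
    scored = [(text, am + lm_weight * lm) for text, am, lm in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    kept = scored[:top_k]

    # Softmax over the kept scores, shifted by the peak for stability.
    peak = max(score for _, score in kept)
    weights = [math.exp(score - peak) for _, score in kept]
    total = sum(weights)
    return [(text, w / total) for (text, _), w in zip(kept, weights)]


# Made-up scores for illustration only.
candidates = [
    ("recognize speech", -1.0, -2.0),
    ("wreck a nice beach", -0.9, -6.0),
    ("recognise peach", -2.5, -5.0),
]
selected = fuse_and_select(candidates)
```

With these made-up scores the second candidate has the best acoustic score, but the language model term moves the first candidate to the top of the kept list.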
-