SPEECH RECOGNITION METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM

    Publication No.: US20220172707A1

    Publication date: 2022-06-02

    Application No.: US17671548

    Filing date: 2022-02-14

    Abstract: A speech recognition method includes: obtaining first sample speech data corresponding to a target user and a first reference speech recognition result corresponding to the first sample speech data; obtaining a pre-update target model; inputting the first sample speech data into the pre-update target model, and performing speech recognition by using a target speech extraction model, a target feature extraction model, and a target speech recognition model, to obtain a first model output result; obtaining a target model loss value corresponding to the target feature extraction model according to the first model output result and the first reference speech recognition result; and updating a model parameter of the target feature extraction model in the pre-update target model according to the target model loss value, to obtain a post-update target model.
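The selective update described in this abstract can be illustrated with a toy example. This is only a sketch under strong assumptions: each of the three sub-models is reduced to a single scalar weight, the loss is a squared error against the reference result, and the names `forward` and `update_feature_model` are hypothetical. None of these details come from the application itself.

```python
def forward(x, params):
    """Chain the three sub-models; each is a single scalar weight here."""
    extracted = params["extraction"] * x        # target speech extraction model
    features = params["feature"] * extracted    # target feature extraction model
    return params["recognition"] * features     # target speech recognition model


def update_feature_model(x, reference, params, lr=0.1):
    """Take one gradient step on the feature-extraction weight only."""
    output = forward(x, params)
    loss = (output - reference) ** 2  # assumed loss: squared error
    # Chain rule with the other two weights treated as constants.
    grad = 2 * (output - reference) * params["recognition"] * params["extraction"] * x
    updated = dict(params)
    updated["feature"] = params["feature"] - lr * grad
    return updated, loss


pre_update = {"extraction": 1.0, "feature": 0.5, "recognition": 2.0}
post_update, loss_before = update_feature_model(1.0, 2.0, pre_update)
_, loss_after = update_feature_model(1.0, 2.0, post_update)
```

Only the middle weight changes between `pre_update` and `post_update`, which mirrors the idea of adapting the feature extraction model to a target user while leaving the extraction and recognition models as they were.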

    METHOD AND APPARATUS FOR DISPLAYING GEOGRAPHIC LOCATION
    Invention application (status: in force)

    Publication No.: US20150134235A1

    Publication date: 2015-05-14

    Application No.: US14600987

    Filing date: 2015-01-20

    CPC classification number: G01C21/26 G01C21/3679 G06F17/30241

    Abstract: The present disclosure relates to a method and an apparatus for displaying a geographic location. The method comprises providing a terminal device to a user, wherein the terminal device includes a processor and a screen. The method further comprises, through the processor of the terminal device: receiving a positioning instruction from the user; acquiring a first location based on the positioning instruction; acquiring information of at least one point of interest (POI) associated with the first location; displaying the first location on a map displayed in a first display area on the screen; and displaying a first POI list in a second display area on the screen, wherein the first POI list includes at least one entry displayed in a first order, and each entry includes the information of one POI of the at least one POI.
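As a rough illustration of building the first POI list, the sketch below orders entries by great-circle distance from the first location. The abstract does not say what the "first order" is, so distance ordering is an assumption, and the coordinates, POI names, and function names are all hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def first_poi_list(first_location, pois):
    """Return POI entries ordered by distance from the first location (assumed order)."""
    return sorted(pois, key=lambda poi: haversine_km(first_location, poi["location"]))


# Hypothetical data for illustration only.
first_location = (22.5431, 114.0579)
pois = [
    {"name": "Cafe", "location": (22.5500, 114.0600)},
    {"name": "Station", "location": (22.5435, 114.0580)},
    {"name": "Park", "location": (22.6000, 114.2000)},
]
poi_list = first_poi_list(first_location, pois)
```

In a real terminal the map would go to the first display area and `poi_list` to the second; the sketch covers only the ordering step.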

    VOICE IDENTITY FEATURE EXTRACTOR AND CLASSIFIER TRAINING

    Publication No.: US20220238117A1

    Publication date: 2022-07-28

    Application No.: US17720876

    Filing date: 2022-04-14

    Inventors: Na LI, Jun WANG

    Abstract: A voice identity feature extractor training method includes extracting a voice feature vector of training voice. The method may include determining a corresponding I-vector according to the voice feature vector of the training voice. The method may include adjusting a weight of a neural network model by using the I-vector as a first target output of the neural network model, to obtain a first neural network model. The method may include obtaining a voice feature vector of target detecting voice and determining an output result of the first neural network model for the voice feature vector of the target detecting voice. The method may include determining an I-vector latent variable. The method may include estimating a posterior mean of the I-vector latent variable, and adjusting a weight of the first neural network model using the posterior mean as a second target output, to obtain a voice identity feature extractor.

    SPEECH SEPARATION MODEL TRAINING METHOD AND APPARATUS, STORAGE MEDIUM AND COMPUTER DEVICE

    Publication No.: US20220172708A1

    Publication date: 2022-06-02

    Application No.: US17672565

    Filing date: 2022-02-15

    Abstract: A speech separation model training method and apparatus, a computer-readable storage medium, and a computer device are provided, the method including: obtaining first audio and second audio, the first audio including target audio and having corresponding labeled audio, and the second audio including noise audio; obtaining an encoding model, an extraction model, and an initial estimation model; performing unsupervised training on the encoding model, the extraction model, and the estimation model according to the second audio, and adjusting model parameters of the extraction model and the estimation model; performing supervised training on the encoding model and the extraction model according to the first audio and the labeled audio corresponding to the first audio, and adjusting a model parameter of the encoding model; and continuously performing the unsupervised training and the supervised training, so that the two overlap, with training ending only when a training stop condition is met.
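The alternation between the two training phases can be sketched as a simple schedule. The mapping of phases to adjusted models follows the abstract above; everything else (the `alternate_training` function, its callbacks, strict one-to-one alternation, and the stop condition) is a hypothetical simplification, and no actual loss or model is implemented.

```python
from itertools import cycle

# Which models each phase adjusts, as stated in the abstract.
PHASES = {
    "unsupervised": ("extraction", "estimation"),  # driven by the second (noise) audio
    "supervised": ("encoding",),                   # driven by the first audio and its labels
}


def alternate_training(update, should_stop, max_rounds=1000):
    """Alternate the two phases until should_stop(round_index) returns True."""
    log = []
    for round_index, phase in zip(range(max_rounds), cycle(PHASES)):
        if should_stop(round_index):
            break
        update(phase, PHASES[phase])  # caller performs the real parameter update
        log.append(phase)
    return log


updated = []
log = alternate_training(
    update=lambda phase, names: updated.append((phase, names)),
    should_stop=lambda i: i >= 4,  # placeholder stop condition
)
```

In practice the `update` callback would run one optimization step over only the named models, and `should_stop` would check a convergence or iteration criterion.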

    SPEECH RECOGNITION METHOD AND APPARATUS, AND NEURAL NETWORK TRAINING METHOD AND APPARATUS

    Publication No.: US20220004870A1

    Publication date: 2022-01-06

    Application No.: US17476345

    Filing date: 2021-09-15

    Abstract: This application provides a speech recognition method and apparatus and a neural network training method and apparatus, and relates to the field of Artificial Intelligence (AI) technologies. The neural network training method is performed by an electronic device and includes: obtaining sample data, the sample data including a mixed speech spectrum and a labeled phoneme thereof; extracting a target speech spectrum from the mixed speech spectrum by using a first subnetwork; adaptively transforming the target speech spectrum by using a second subnetwork, to obtain an intermediate transition representation; performing phoneme recognition based on the intermediate transition representation by using a third subnetwork; and updating parameters of the first subnetwork, the second subnetwork, and the third subnetwork according to a result of the phoneme recognition and the labeled phoneme.

    AUDIO RECOGNITION METHOD AND SYSTEM AND MACHINE DEVICE

    Publication No.: US20210233513A1

    Publication date: 2021-07-29

    Application No.: US17230515

    Filing date: 2021-04-14

    Abstract: A neural network training method is provided. The method includes obtaining an audio data stream, performing, for different audio data of each time frame in the audio data stream, feature extraction in each layer of a neural network, to obtain a depth feature outputted by a corresponding time frame, fusing, for a given label in labeling data, an inter-class confusion measurement index and an intra-class distance penalty value relative to the given label in a set loss function for the audio data stream through the depth feature, and updating a parameter in the neural network by using a loss function value obtained through fusion.
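One way to picture the fused loss is as a classification term plus a weighted distance-to-center term. The sketch below is an assumption-laden stand-in: softmax cross-entropy plays the role of the inter-class confusion measurement index, a squared distance to a per-label center plays the role of the intra-class distance penalty, and `fused_loss`, `centers`, and `penalty_weight` are hypothetical names. The application may define both terms differently.

```python
import math


def fused_loss(depth_feature, logits, label, centers, penalty_weight=0.5):
    """Cross-entropy plus a weighted squared distance to the label's center."""
    # Inter-class term (assumed): numerically stable softmax cross-entropy.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    confusion = log_sum - logits[label]
    # Intra-class term (assumed): squared distance from the feature to its class center.
    distance = sum((f - c) ** 2 for f, c in zip(depth_feature, centers[label]))
    return confusion + penalty_weight * distance


centers = {0: [0.0, 0.0], 1: [1.0, 1.0]}
near = fused_loss([1.0, 1.0], logits=[0.0, 2.0], label=1, centers=centers)
far = fused_loss([3.0, 1.0], logits=[0.0, 2.0], label=1, centers=centers)
```

With identical logits, the feature that sits farther from its class center yields the larger loss, which is the effect an intra-class penalty is meant to have.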

    MIXED SPEECH RECOGNITION METHOD AND APPARATUS, AND COMPUTER-READABLE STORAGE MEDIUM

    Publication No.: US20200372905A1

    Publication date: 2020-11-26

    Application No.: US16989844

    Filing date: 2020-08-10

    Abstract: A mixed speech recognition method, a mixed speech recognition apparatus, and a computer-readable storage medium are provided. The mixed speech recognition method includes: monitoring speech input and detecting an enrollment speech and a mixed speech; acquiring speech features of a target speaker based on the enrollment speech; and determining speech belonging to the target speaker in the mixed speech based on the speech features of the target speaker. The enrollment speech includes preset speech information, and the mixed speech is non-enrollment speech inputted after the enrollment speech.
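The enrollment-then-selection flow can be sketched with plain vectors. This is a deliberately simplified illustration: frames are short hypothetical feature vectors, the speaker representation is a plain average of enrollment frames, and frames of the mixed speech are attributed to the target speaker by cosine similarity against an arbitrary threshold. The abstract does not specify any of these choices.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def speaker_features(enrollment_frames):
    """Average the enrollment frames into one target-speaker vector."""
    count = len(enrollment_frames)
    return [sum(column) / count for column in zip(*enrollment_frames)]


def target_frames(mixed_frames, speaker, threshold=0.8):
    """Indices of mixed-speech frames that resemble the target speaker."""
    return [i for i, frame in enumerate(mixed_frames)
            if cosine(frame, speaker) >= threshold]


# Hypothetical feature frames for illustration only.
enrollment = [[1.0, 0.1], [0.9, 0.0]]
mixed = [[0.8, 0.1], [0.0, 1.0], [1.0, 0.2]]
speaker = speaker_features(enrollment)
selected = target_frames(mixed, speaker)
```

A production system would more likely estimate a time-frequency mask from a learned speaker embedding; the frame-level threshold here only conveys the idea of conditioning the selection on enrollment features.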
