Abstract:
An electronic device includes one or more processors and a memory storing instructions configured to, when executed by the one or more processors, cause the one or more processors to: implement a machine learning-based conditional generative model configured to reconstruct target data from latent vectors, the conditional generative model trained based on an existing dataset for a target task; determine an extrapolation weight; generate an augmented latent vector and augmented condition data by extrapolating, based on the extrapolation weight, from a latent vector corresponding to the existing dataset and from existing condition data corresponding to the existing dataset; and generate a new dataset comprising augmented target data generated by the conditional generative model based on the augmented condition data and based on the augmented latent vector.
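
The extrapolation step might be sketched as follows. The abstract gives no formula, so this sketch assumes a linear extrapolation from one existing sample away from a second one, a + w * (a - b), with the same weight w applied to the latent vector and to the condition data. The names `extrapolate` and `augment` and the two-sample form are illustrative assumptions, and the conditional generative model itself is not shown.

```python
def extrapolate(a, b, weight):
    """Linearly extrapolate from vector `a` away from vector `b`.

    Assumed form (not stated in the abstract): a + weight * (a - b).
    weight = 0 returns `a` unchanged; larger weights move further
    beyond the existing data.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + weight * (x - y) for x, y in zip(a, b)]


def augment(latent_a, latent_b, cond_a, cond_b, weight):
    """Apply one extrapolation weight to both latent vectors and condition data."""
    augmented_latent = extrapolate(latent_a, latent_b, weight)
    augmented_cond = extrapolate(cond_a, cond_b, weight)
    return augmented_latent, augmented_cond
```

The augmented pair would then be passed to the trained generative model to produce the augmented target data.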
Abstract:
A processor-implemented model training method and apparatus are provided. The method calculates an entropy of each of a plurality of previously trained models based on training data, selects a previously trained model from the plurality of previously trained models based on the calculated entropy, and trains a target model, distinguished from the plurality of previously trained models, based on the training data and the selected previously trained model.
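
The selection step might look like the sketch below. The abstract does not say whether a low or a high entropy is preferred; this sketch assumes the model with the lowest mean prediction entropy on the training data is selected. The function names and the dictionary layout are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(predictions):
    """Average entropy over a model's predictions on the training data."""
    return sum(entropy(p) for p in predictions) / len(predictions)


def select_teacher(model_predictions):
    """Return the name of the model with the lowest mean entropy.

    Choosing the lowest entropy is an assumption of this sketch.
    `model_predictions` maps a model name to its list of predicted
    probability distributions, one per training example.
    """
    return min(model_predictions,
               key=lambda name: mean_entropy(model_predictions[name]))
```

The selected model would then be used together with the training data to train the target model.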
Abstract:
A processor-implemented operating method of a mobile device includes: verifying whether the mobile device is docked with a docking device while the mobile device is performing a personal assistance service (PAS); and continuously providing the PAS being performed using the docking device in response to the verifying indicating that the mobile device is docked with the docking device while the mobile device is performing the PAS.
Abstract:
A speech recognizing method and apparatus are provided. The speech recognizing method, which implements a speech recognizing model neural network for recognition of speech, includes determining an attention weight based on an output value output by at least one layer of the speech recognizing model neural network at a previous time of the recognition of the speech, applying the determined attention weight to a speech signal corresponding to a current time of the recognition of the speech, and recognizing the speech signal to which the attention weight is applied, using the speech recognizing model neural network.
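
A minimal sketch of the attention step, under stated assumptions: the previous time step's layer output is mapped to one score per input feature through a hypothetical projection matrix, the scores are turned into weights with a softmax, and the weights scale the current frame element-wise. None of these specifics come from the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(prev_hidden, projection):
    """Map the previous time step's layer output to one weight per input feature.

    `projection` is a hypothetical matrix with one row per input feature.
    """
    scores = [sum(w * h for w, h in zip(row, prev_hidden)) for row in projection]
    return softmax(scores)


def apply_attention(frame, weights):
    """Scale each feature of the current frame by its attention weight."""
    return [x * w for x, w in zip(frame, weights)]
```

The weighted frame would then be fed to the recognition network in place of the raw frame.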
Abstract:
A decoding method includes: receiving an input sequence corresponding to an input speech at a current time; and in a neural network (NN) for speech recognition, generating an encoded vector sequence by encoding the input sequence, determining reuse tokens from candidate beams of two or more previous times by comparing the candidate beams of the previous times, and decoding one or more tokens subsequent to the reuse tokens based on the reuse tokens and the encoded vector sequence.
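
One way to picture the reuse-token step is sketched below. The abstract says only that candidate beams of previous times are compared; treating that comparison as a search for the longest token prefix shared by all beams is an assumption of this sketch, as is the function name.

```python
def reuse_tokens(beams_by_time):
    """Return the longest token prefix shared by every candidate beam
    from every previous time step.

    `beams_by_time` is a list with one entry per previous time; each
    entry is a list of beams, and each beam is a list of tokens.
    Using a longest-common-prefix comparison is an assumption.
    """
    all_beams = [beam for beams in beams_by_time for beam in beams]
    if not all_beams:
        return []
    prefix = []
    for tokens in zip(*all_beams):  # walk the beams position by position
        if len(set(tokens)) != 1:   # beams disagree here: stop reusing
            break
        prefix.append(tokens[0])
    return prefix
```

Decoding at the current time would then start after the returned prefix instead of from the first token.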
Abstract:
A method and computing device with classification verification are provided. A processor-implemented method includes implementing a classification neural network to generate a classification result of data input to the classification neural network by generating, with respect to the input data, intermediate hidden values of one or more hidden layers of the classification neural network, generating the classification result of the input data based on the generated intermediate hidden values, and generating a determination of a reliability of the classification result by implementing a verification neural network, to which the intermediate hidden values are input, to generate the determination of the reliability.
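
The data flow between the two networks can be sketched as follows. The layer sizes, the ReLU activation, the single-layer verifier, and the sigmoid output are all assumptions made for illustration; the abstract specifies only that the verification network receives the intermediate hidden values.

```python
import math


def dense(x, weights, bias):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def classify(x, params):
    """Return (predicted class index, intermediate hidden values)."""
    hidden = [max(0.0, h) for h in dense(x, params["w1"], params["b1"])]  # ReLU
    logits = dense(hidden, params["w2"], params["b2"])
    return logits.index(max(logits)), hidden


def reliability(hidden, verifier):
    """Hypothetical verification network: hidden values in, score in (0, 1) out."""
    (score,) = dense(hidden, verifier["w"], verifier["b"])
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid
```

A caller would run `classify` first and pass the returned hidden values, not the raw input, to `reliability`.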
Abstract:
A speech recognition method includes generating pieces of candidate text data from a speech signal of a user, determining a decoding condition corresponding to an utterance type of the user, and determining target text data among the pieces of candidate text data by performing decoding based on the determined decoding condition.
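
As an illustration only, a decoding condition could be a language-model weight that depends on the utterance type. The utterance types, the weight values, and the scoring rule below are hypothetical; the abstract does not define them.

```python
# Hypothetical decoding conditions: how much weight the language-model
# score receives for each utterance type.
DECODING_CONDITIONS = {
    "command": {"lm_weight": 0.2},
    "dictation": {"lm_weight": 1.0},
}


def decode(candidates, utterance_type):
    """Pick the candidate text with the best combined score.

    `candidates` is a list of (text, acoustic_score, lm_score) tuples,
    where higher (less negative) log-scores are better.
    """
    lm_weight = DECODING_CONDITIONS[utterance_type]["lm_weight"]
    best = max(candidates, key=lambda c: c[1] + lm_weight * c[2])
    return best[0]
```

With the same candidates, a different utterance type can therefore yield different target text.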
Abstract:
A method performed by a speech recognizing apparatus to recognize speech includes: obtaining a distance from the speech recognizing apparatus to a user generating a speech signal; determining a normalization value for the speech signal based on the distance; normalizing a feature vector extracted from the speech signal based on the normalization value; and performing speech recognition based on the normalized feature vector.
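
A sketch of distance-based normalization, assuming a simple model in which signal amplitude falls off in proportion to 1/distance, so that the normalization value is the gain needed to match a reference distance. The model, the reference distance of 1 m, and the function names are assumptions, not details from the abstract.

```python
def normalization_value(distance_m, reference_m=1.0):
    """Gain that compensates for distance under an assumed 1/distance falloff."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return distance_m / reference_m


def normalize(features, value):
    """Scale every element of the feature vector by the normalization value."""
    return [f * value for f in features]
```

The normalized feature vector would then be passed to the recognizer.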
Abstract:
A method and apparatus for training a language model include generating a first training feature vector sequence and a second training feature vector sequence from training data. The method is configured to perform forward estimation of a neural network based on the first training feature vector sequence, and perform backward estimation of the neural network based on the second training feature vector sequence. The method is further configured to train a language model based on a result of the forward estimation and a result of the backward estimation.
Abstract:
A processor-implemented electronic device input method includes: identifying input items configured to receive information on a displayed screen, by performing either one or both of a layout analysis and an image analysis with respect to the screen; tagging one of the input items with a text token extracted from a speech recognition result of a speech signal; and inputting the tagged text token into the one of the input items.
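
The tagging step might be sketched as below. It assumes that screen analysis has already produced a list of input-item labels and that the speech recognition result has already been split into (spoken label, value) pairs; matching them by case-insensitive label equality is a simplification chosen for this sketch.

```python
def tag_input_items(input_items, tokens):
    """Assign each recognized value to the input item whose label matches.

    input_items: labels identified on the screen, e.g. ["Name", "Phone"].
    tokens: (spoken_label, value) pairs extracted from the transcript.
    Returns a dict mapping input-item label to the value to enter.
    """
    by_label = {item.lower(): item for item in input_items}
    filled = {}
    for spoken_label, value in tokens:
        item = by_label.get(spoken_label.lower())
        if item is not None:  # ignore tokens with no matching input item
            filled[item] = value
    return filled
```

Each entry of the returned mapping corresponds to one text token being input into its tagged input item.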