Abstract:
A computer-implemented technique can include receiving, at a server, labeled training data including a plurality of groups of words, each group of words having a predicate word, each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model being configured to identify a specific semantic frame for an input.
Abstract:
A computer-implemented method can include obtaining (i) an aligned bi-text for a source language and a target language, and (ii) a supervised sequence model for the source language. The method can include labeling a source side of the aligned bi-text using the supervised sequence model and projecting labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text. The method can include filtering the labeled target side based on a task of a natural language processing (NLP) system configured to utilize a sequence model for the target language to obtain a filtered target side of the aligned bi-text. The method can also include training the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to obtain a trained sequence model for the target language.
Abstract:
A computer-implemented method can include obtaining (i) an aligned bi-text for a source language and a target language, and (ii) a supervised sequence model for the source language. The method can include labeling a source side of the aligned bi-text using the supervised sequence model and projecting labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text. The method can include filtering the labeled target side based on a task of a natural language processing (NLP) system configured to utilize a sequence model for the target language to obtain a filtered target side of the aligned bi-text. The method can also include training the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to obtain a trained sequence model for the target language.
Abstract:
A computer-implemented technique can include receiving, at a server, labeled training data including a plurality of groups of words, each group of words having a predicate word, each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model being configured to identify a specific semantic frame for an input.