Abstract:
Methods and systems, including computer programs encoded on computer storage media, for generating compressed representations from a co-occurrence matrix. A method includes obtaining a set of sub matrices of a co-occurrence matrix, where each row of the co-occurrence matrix corresponds to a feature from a first feature vocabulary and each column of the co-occurrence matrix corresponds to a feature from a second feature vocabulary; selecting a sub matrix, wherein the sub matrix is associated with a particular row block and a particular column block of the co-occurrence matrix; assigning respective d-dimensional initial row and column embedding vectors to each row and column from the particular row and column blocks, respectively; and determining a final row embedding vector and a final column embedding vector by iteratively adjusting the initial row embedding vectors and the initial column embedding vectors using the co-occurrence matrix.
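The block-wise procedure described in this abstract can be illustrated with a minimal sketch. The abstract does not state an objective function, so the squared error between the embedding dot product and log(1 + count) used here is an assumption, as are the function names, the block size, the learning rate, and the plain stochastic gradient update.

```python
import math
import random


def make_blocks(n, block_size):
    """Split indices 0..n-1 into consecutive blocks of at most block_size."""
    return [list(range(s, min(s + block_size, n))) for s in range(0, n, block_size)]


def squared_error(cooc, rows, cols):
    """Sum of squared differences between dot products and log(1 + count)."""
    total = 0.0
    for i, row in enumerate(cooc):
        for j, count in enumerate(row):
            pred = sum(a * b for a, b in zip(rows[i], cols[j]))
            total += (pred - math.log1p(count)) ** 2
    return total


def train_embeddings(cooc, d=4, block_size=2, epochs=200, lr=0.05, seed=0):
    """Hypothetical helper: fit row/column embeddings one sub matrix at a time."""
    rng = random.Random(seed)
    n_rows, n_cols = len(cooc), len(cooc[0])

    # Assign a d-dimensional initial embedding to every row and every column.
    rows = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_rows)]
    cols = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_cols)]

    # Each sub matrix is identified by a (row block, column block) pair.
    sub_matrices = [(rb, cb)
                    for rb in make_blocks(n_rows, block_size)
                    for cb in make_blocks(n_cols, block_size)]

    for _ in range(epochs):
        rng.shuffle(sub_matrices)
        for row_block, col_block in sub_matrices:  # select one sub matrix
            for i in row_block:
                for j in col_block:
                    # Assumed target: log-scaled co-occurrence count.
                    target = math.log1p(cooc[i][j])
                    pred = sum(a * b for a, b in zip(rows[i], cols[j]))
                    err = pred - target
                    # Gradient step on both vectors, using the old values.
                    for k in range(d):
                        r, c = rows[i][k], cols[j][k]
                        rows[i][k] -= lr * err * c
                        cols[j][k] -= lr * err * r
    return rows, cols


if __name__ == "__main__":
    cooc = [[10, 2, 0, 1],
            [3, 8, 1, 0],
            [0, 1, 9, 4],
            [1, 0, 5, 7]]
    rows, cols = train_embeddings(cooc)
    print("squared error after training:", squared_error(cooc, rows, cols))
```

Because each update touches only the rows and columns of one block pair, a sketch like this could in principle hand different sub matrices to different workers, although nothing in the code above is parallel.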
Abstract:
Methods and apparatus related to word sense disambiguation utilizing hypernyms. In some implementations, one or more senses of a word are determined based on hypernyms for the word and an association of the word to the one or more senses is stored. In some implementations, a target word in a textual segment is identified and a word sense to assign to the target word is determined based on hypernyms that are associated with the target word.
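The two stages described in this abstract can be sketched as follows. The abstract does not say how hypernyms are scored, so this example makes several assumptions: the toy HYPERNYMS table is hypothetical data, each hypernym of a word is treated as one sense, and a sense is chosen by counting context words in the segment that share that hypernym.

```python
import re

# Hypothetical toy data: word -> hypernyms. A real system would draw these
# from a lexical resource; none is named in the abstract.
HYPERNYMS = {
    "bass": ["fish", "musical instrument"],
    "trout": ["fish"],
    "salmon": ["fish"],
    "guitar": ["musical instrument"],
    "drum": ["musical instrument"],
}


def build_sense_index(hypernyms):
    """Stage 1: determine senses per word from its hypernyms and store them.

    Assumption: every distinct hypernym of a word marks one sense.
    """
    return {word: list(dict.fromkeys(hypers)) for word, hypers in hypernyms.items()}


def disambiguate(target, segment, sense_index, hypernyms):
    """Stage 2: pick a sense for `target` in `segment`.

    Scores each candidate sense by how many other words in the segment
    have the same hypernym. Returns None if there is no supporting context.
    Only single-word targets are handled in this sketch.
    """
    senses = sense_index.get(target, [])
    if not senses:
        return None
    tokens = re.findall(r"[a-z]+", segment.lower())
    context = [t for t in tokens if t != target]
    scores = {
        sense: sum(1 for t in context if sense in hypernyms.get(t, []))
        for sense in senses
    }
    best = max(senses, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    index = build_sense_index(HYPERNYMS)
    print(disambiguate("bass", "He caught a bass and a trout.", index, HYPERNYMS))
    print(disambiguate("bass", "She played bass and guitar.", index, HYPERNYMS))
```

With the toy table above, the first call selects the "fish" sense and the second selects "musical instrument", because "trout" and "guitar" are the only context words that share a hypernym with the target.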