Abstract:
Methods and apparatus are described for performing coreference resolution using distributed word representations. Distributed word representations, indicative of syntactic and semantic features, may be identified for one or more noun phrases. For each of the one or more noun phrases, a referring feature representation and an antecedent feature representation may be determined, where the referring feature representation includes the distributed word representation, and the antecedent feature representation includes the distributed word representation augmented by one or more antecedent features. In some implementations, the referring feature representation may be augmented by one or more referring features. Coreference embeddings of the referring and antecedent feature representations of the one or more noun phrases may be learned. Distance measures between two noun phrases may be determined based on the coreference embeddings.
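The flow described above, from word representation to feature representations to embeddings to a distance measure, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: augmentation is modeled as vector concatenation, the learned coreference embeddings are stood in for by two fixed projection matrices (`ref_matrix`, `ant_matrix`), and the distance measure is Euclidean. All function names are hypothetical.

```python
import math


def referring_representation(word_vec, referring_features=()):
    """Word representation, optionally augmented (here: concatenated)
    with referring features. Concatenation is an assumption."""
    return list(word_vec) + list(referring_features)


def antecedent_representation(word_vec, antecedent_features):
    """Word representation augmented (here: concatenated) with
    antecedent features. Concatenation is an assumption."""
    return list(word_vec) + list(antecedent_features)


def embed(matrix, vec):
    """Project a feature representation into the embedding space.
    The matrix stands in for a learned coreference embedding."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def coreference_distance(ref_matrix, ant_matrix, ref_repr, ant_repr):
    """Euclidean distance between the embedded referring phrase and the
    embedded candidate antecedent (the choice of metric is an assumption)."""
    r = embed(ref_matrix, ref_repr)
    a = embed(ant_matrix, ant_repr)
    return math.sqrt(sum((ri - ai) ** 2 for ri, ai in zip(r, a)))


# Toy, hand-picked values; a real system would learn the matrices.
ref_matrix = [[1.0, 0.0], [0.0, 1.0]]            # 2-d referring repr -> 2-d
ant_matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 3-d antecedent repr -> 2-d

she = referring_representation([1.0, 0.0])
mary = antecedent_representation([1.0, 0.0], [1.0])
table = antecedent_representation([0.0, 1.0], [0.0])

d_mary = coreference_distance(ref_matrix, ant_matrix, she, mary)
d_table = coreference_distance(ref_matrix, ant_matrix, she, table)
```

In this toy setup the smaller distance marks the more plausible antecedent, so a resolver under these assumptions would link the referring phrase to the candidate with the lowest `coreference_distance`.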
Abstract:
Systems and methods are disclosed for using an additive context model for entity disambiguation. An example method may include receiving a span of text from a document and a phrase vector for the span. The phrase vector may have a quantity of features and represent a context for the span. The method also includes determining a quantity of candidate entities from a knowledge base that have been referred to by the span. For each of the candidate entities, the method may include determining a support score for the candidate entity for each feature in the phrase vector, combining the support scores additively, and computing a probability that the span resolves to the candidate entity given the context. The method may also include resolving the span to the candidate entity with the highest probability.
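The per-candidate scoring steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the per-feature support score is modeled as a hypothetical per-entity weight multiplied by the feature value, and the probability is obtained by a softmax over the additively combined scores. The abstract does not specify either choice. The names `resolve_span` and `support_weights`, and the example entities, are invented for illustration.

```python
import math


def resolve_span(phrase_vector, support_weights):
    """Resolve a span to the most probable candidate entity.

    phrase_vector: feature values describing the span's context.
    support_weights: maps each candidate entity to one weight per feature
        (hypothetical; stands in for however support scores are derived).
    Returns (best_entity, probabilities).
    """
    scores = {}
    for entity, weights in support_weights.items():
        # Support score per feature (assumed: weight * feature value).
        supports = [w * x for w, x in zip(weights, phrase_vector)]
        # Combine the support scores additively.
        scores[entity] = sum(supports)

    # Turn scores into probabilities with a softmax (an assumption);
    # subtracting the max keeps the exponentials numerically stable.
    top = max(scores.values())
    exps = {e: math.exp(s - top) for e, s in scores.items()}
    total = sum(exps.values())
    probs = {e: v / total for e, v in exps.items()}

    # Resolve to the candidate with the highest probability.
    best = max(probs, key=probs.get)
    return best, probs


# Toy example: three context features and two candidate entities.
phrase_vector = [1.0, 0.0, 1.0]
support_weights = {
    "Jaguar_(animal)": [0.2, 1.5, -0.3],
    "Jaguar_Cars": [1.0, -0.5, 0.8],
}
best, probs = resolve_span(phrase_vector, support_weights)
```

Because the scores are combined by addition, each feature's contribution to a candidate can be inspected on its own, which is one practical appeal of an additive model.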