Abstract:
Systems and techniques are provided for training a natural language processing model with information retrieval model annotations. A natural language processing model may be trained, through machine learning, using training examples that include part-of-speech tagging and annotations added by an information retrieval model. The natural language processing model may generate part-of-speech, parse-tree, beginning-inside-outside (BIO) label, mention-chunking, and named-entity-recognition predictions, each with a confidence score, for text in the training examples. The information retrieval model annotations and part-of-speech tagging in the training examples may be used to determine the accuracy of the predictions, and the natural language processing model may be adjusted accordingly. After training, the natural language processing model may be used to make predictions for novel input, such as search queries and potential search results. The search queries and potential search results may have information retrieval model annotations.
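The accuracy check described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that predictions are token-level BIO labels paired with confidence scores and that the information retrieval model annotations supply one reference label per token. The function name `score_predictions` and the confidence-weighted penalty are hypothetical; the abstract does not specify how accuracy is measured or how the model is adjusted.

```python
from typing import List, Tuple

# Hypothetical representation: one (BIO label, confidence in [0, 1]) pair per token.
Prediction = Tuple[str, float]


def score_predictions(
    predictions: List[Prediction], annotations: List[str]
) -> Tuple[float, float]:
    """Compare predicted labels against annotation-derived reference labels.

    Returns (accuracy, mean_penalty). The penalty is an assumed training
    signal: a correct label costs (1 - confidence), a wrong label costs
    its confidence, so confident mistakes are penalized most.
    """
    if not annotations or len(predictions) != len(annotations):
        raise ValueError("predictions and annotations must be non-empty and aligned")

    correct = 0
    penalty = 0.0
    for (label, confidence), reference in zip(predictions, annotations):
        if label == reference:
            correct += 1
            penalty += 1.0 - confidence
        else:
            penalty += confidence

    n = len(annotations)
    return correct / n, penalty / n


if __name__ == "__main__":
    preds = [("B", 0.9), ("I", 0.8), ("O", 0.6)]
    refs = ["B", "I", "B"]  # reference labels derived from the annotations
    print(score_predictions(preds, refs))
```

A training loop could use the returned penalty as the quantity to reduce when adjusting the model, though any such loop is outside what the abstract describes.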
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for fake skip evaluation of synonyms. In one aspect, a method includes determining, using query log data, that a particular search result selected by a user includes a query term included in an initial search query and a particular synonym that was generated for the query term using a particular synonym rule. The particular search result is selected by the user from among search results that were generated using the initial search query and one or more revised search queries that include the particular synonym. The method further includes determining, using the query log data, that a first search result is ranked above the particular search result and includes the particular synonym for the query term. In response to these determinations, a fake skip count is incremented for the synonym rule that corresponds to the particular synonym.
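The two determinations and the resulting increment can be sketched as follows. This is an illustrative reading of the abstract, not the claimed implementation: the `SynonymRule` and `LogEntry` types, the whitespace-based `contains` check, and the assumption that each log entry records the ranked result texts and the clicked rank are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SynonymRule:
    term: str     # query term from the initial search query
    synonym: str  # synonym the rule generated for that term


@dataclass
class LogEntry:
    rule: SynonymRule
    results: List[str]  # result texts in ranked order, best first
    selected_rank: int  # index into `results` of the result the user selected


def contains(text: str, word: str) -> bool:
    """Assumed matching: case-insensitive whole-word match on whitespace tokens."""
    return word.lower() in text.lower().split()


def count_fake_skips(entries: Iterable[LogEntry]) -> Counter:
    """Count fake skips per synonym rule from query log entries."""
    counts: Counter = Counter()
    for entry in entries:
        selected = entry.results[entry.selected_rank]
        # Determination 1: the selected result includes both the original
        # query term and the synonym generated for it.
        if not (contains(selected, entry.rule.term)
                and contains(selected, entry.rule.synonym)):
            continue
        # Determination 2: some result ranked above the selected one
        # includes the synonym.
        higher_ranked = entry.results[:entry.selected_rank]
        if any(contains(text, entry.rule.synonym) for text in higher_ranked):
            counts[entry.rule] += 1
    return counts


if __name__ == "__main__":
    rule = SynonymRule(term="car", synonym="auto")
    log = [
        LogEntry(rule, ["auto parts store", "car and auto repair"], 1),
        LogEntry(rule, ["cheap flights", "car and auto repair"], 1),
    ]
    print(count_fake_skips(log)[rule])
```

Keying the counter on the rule rather than on the synonym string keeps counts separate when different rules produce the same synonym for different query terms.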