-
Publication No.: US20100179929A1
Publication date: 2010-07-15
Application No.: US12351013
Filing date: 2009-01-09
Applicants: Xiaoxin Yin, Vijay Ravindran Nair, Ryan Frederick Stewart, Fang Liu, Junhua Wang, Tiffany Kumi Dohzen, Yi-Min Wang
Inventors: Xiaoxin Yin, Vijay Ravindran Nair, Ryan Frederick Stewart, Fang Liu, Junhua Wang, Tiffany Kumi Dohzen, Yi-Min Wang
CPC classification: G06F17/30864
Abstract: Systems and methodologies for improved query classification and processing are provided herein. As described herein, a query prediction model can be constructed from a set of training data (e.g., diagnostic data obtained from an automatic diagnostic system and/or other suitable data) using a machine learning-based technique. Subsequently upon receiving a query, a set of features corresponding to the query, such as the length and/or frequency of the query, unigram probabilities of respective words and/or groups of words in the query, presence of pre-designated words or phrases in the query, or the like, can be generated. The generated features can then be analyzed in combination with the query prediction model to classify the query by predicting whether the query is aimed at a head Uniform Resource Locator (URL) or a tail URL. Based on this prediction, an appropriate index or combination of indexes can be assigned to answer the query.
-
Publication No.: US08145622B2
Publication date: 2012-03-27
Application No.: US12351013
Filing date: 2009-01-09
Applicants: Xiaoxin Yin, Vijay Ravindran Nair, Ryan Frederick Stewart, Fang Liu, Junhua Wang, Tiffany Kumi Dohzen, Yi-Min Wang
Inventors: Xiaoxin Yin, Vijay Ravindran Nair, Ryan Frederick Stewart, Fang Liu, Junhua Wang, Tiffany Kumi Dohzen, Yi-Min Wang
IPC classification: G06F7/00
CPC classification: G06F17/30864
Abstract: Systems and methodologies for improved query classification and processing are provided herein. As described herein, a query prediction model can be constructed from a set of training data (e.g., diagnostic data obtained from an automatic diagnostic system and/or other suitable data) using a machine learning-based technique. Subsequently upon receiving a query, a set of features corresponding to the query, such as the length and/or frequency of the query, unigram probabilities of respective words and/or groups of words in the query, presence of pre-designated words or phrases in the query, or the like, can be generated. The generated features can then be analyzed in combination with the query prediction model to classify the query by predicting whether the query is aimed at a head Uniform Resource Locator (URL) or a tail URL. Based on this prediction, an appropriate index or combination of indexes can be assigned to answer the query.
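Illustrative sketch: the pipeline the abstract outlines (generate query features, score them with a learned model, pick an index) might look roughly like the Python below. This is not the patented implementation. The abstract does not name a learning algorithm, so plain logistic regression trained by gradient descent is swapped in as a stand-in, and every function name, the add-one smoothing, the 0.5 threshold, and the index names `head_index` and `tail_index` are assumptions made for illustration.

```python
import math


def extract_features(query, query_counts, unigram_counts, trigger_words):
    """Build [length, log frequency, mean unigram log-prob, trigger flag].

    query_counts is assumed to be keyed by lowercased, space-joined queries.
    """
    words = query.lower().split()
    total = sum(unigram_counts.values())
    vocab = len(unigram_counts) + 1  # one extra bucket for unseen words
    # Add-one smoothing (an assumption) keeps unseen words from giving log(0).
    log_probs = [
        math.log((unigram_counts.get(w, 0) + 1) / (total + vocab)) for w in words
    ]
    return [
        float(len(words)),
        math.log1p(query_counts.get(" ".join(words), 0)),
        sum(log_probs) / len(log_probs) if log_probs else 0.0,
        1.0 if any(w in trigger_words for w in words) else 0.0,
    ]


def predict_tail_probability(weights, bias, features):
    """Logistic model: estimated probability that the query targets a tail URL."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, epochs=200, lr=0.05):
    """Full-batch gradient descent on logistic loss (1 = tail URL, 0 = head URL)."""
    n = len(examples)
    n_features = len(examples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(examples, labels):
            err = predict_tail_probability(weights, bias, x) - y
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def select_indexes(tail_probability, threshold=0.5):
    """Map the prediction to hypothetical index names."""
    if tail_probability >= threshold:
        return ["head_index", "tail_index"]
    return ["head_index"]
```

In use, the training examples would be feature vectors built by `extract_features` from labeled queries, and the output of `predict_tail_probability` for a new query would be passed to `select_indexes`. Consulting both indexes for a predicted tail query is only one possible reading of "combination of indexes".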
-