Systems and Methods to Facilitate Local Searches via Location Disambiguation
    1.
    Patent application
    Legal status: In force

    Publication No.: US20120117007A1

    Publication date: 2012-05-10

    Application No.: US12939898

    Filing date: 2010-11-04

    IPC classification: G06F15/18 G06F17/30

    Abstract: Systems and methods use machine learning techniques to resolve location ambiguity in search queries. In one aspect, a dataset generator generates a training dataset using query logs of a search engine. A training engine applies a machine learning technique to the training dataset to generate a location disambiguation model. A location disambiguation engine uses the location disambiguation model to resolve location ambiguity in subsequent search queries.

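The three components named in the abstract (dataset generator, training engine, location disambiguation engine) can be illustrated with a minimal sketch. The abstract does not say which machine learning technique is used, so the count-based scorer below is only a stand-in; the log rows, the field layout, and every function name are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical query-log rows: (query text, ambiguous place name,
# location the user eventually selected). Not taken from the patent.
QUERY_LOG = [
    ("pizza springfield", "springfield", "Springfield, IL"),
    ("pizza springfield", "springfield", "Springfield, IL"),
    ("museum springfield", "springfield", "Springfield, MA"),
    ("basketball hall of fame springfield", "springfield", "Springfield, MA"),
    ("hotel portland", "portland", "Portland, OR"),
]

def build_dataset(log):
    """Dataset generator: turn log rows into ((place, context terms), label)."""
    dataset = []
    for query, place, chosen in log:
        place_tokens = set(place.split())
        context = tuple(t for t in query.split() if t not in place_tokens)
        dataset.append(((place, context), chosen))
    return dataset

def train(dataset):
    """Training engine: count locations per place name and per context term."""
    prior = defaultdict(Counter)    # place -> location counts
    by_term = defaultdict(Counter)  # (place, term) -> location counts
    for (place, context), label in dataset:
        prior[place][label] += 1
        for term in context:
            by_term[(place, term)][label] += 1
    return prior, by_term

def disambiguate(model, query, place):
    """Disambiguation engine: pick the best-scoring location, or None."""
    prior, by_term = model
    scores = Counter(prior.get(place, {}))
    for term in query.split():
        for location, n in by_term.get((place, term), {}).items():
            scores[location] += 2 * n  # context evidence outweighs the prior
    return scores.most_common(1)[0][0] if scores else None

model = train(build_dataset(QUERY_LOG))
print(disambiguate(model, "art museum springfield", "springfield"))
```

A production system would replace the counting model with whatever learner the training engine actually applies; the sketch only shows how data flows from logs to a model to a resolved location.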

    Systems and methods to facilitate local searches via location disambiguation
    2.
    Granted patent
    Legal status: In force

    Publication No.: US08473433B2

    Publication date: 2013-06-25

    Application No.: US12939898

    Filing date: 2010-11-04

    IPC classification: G06F15/18

    Abstract: Systems and methods use machine learning techniques to resolve location ambiguity in search queries. In one aspect, a dataset generator generates a training dataset using query logs of a search engine. A training engine applies a machine learning technique to the training dataset to generate a location disambiguation model. A location disambiguation engine uses the location disambiguation model to resolve location ambiguity in subsequent search queries.


    Query parser derivation computing device and method for making a query parser for parsing unstructured search queries
    3.
    Granted patent
    Legal status: In force

    Publication No.: US09218390B2

    Publication date: 2015-12-22

    Application No.: US13194887

    Filing date: 2011-07-29

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30401 G06F17/3087

    Abstract: A system and method is provided which may comprise parsing an unstructured geographic web-search query into a field-based format, by utilizing conditional random fields, learned by semi-supervised automated learning, to parse structured information from the unstructured geographic web-search query. The system and method may also comprise establishing semi-supervised conditional random fields utilizing one of a rule-based finite state machine model and a statistics-based conditional random field model. Systematic geographic parsing may be used with the one of the rule-based finite state machine model and the statistics-based conditional random field model. Parsing an unstructured local geographical web-based query in a local domain may be done by applying a learned model parser to the query, using at least one class-based query log from a form-based query system. The learned model parser may comprise at least one class-level n-gram language model-based feature harvested from a structured query log.

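The idea of harvesting class-level n-gram statistics from a structured (form-based) query log and using them to split an unstructured query into fields can be sketched as follows. This is not a conditional random field: it is a much simpler per-class unigram scorer, chosen only to show how class-level language-model evidence can drive the parse. The "what"/"where" field names, the log rows, and the function names are all assumptions.

```python
import math
from collections import Counter

# Hypothetical form-based log: each row already has separate fields.
STRUCTURED_LOG = [
    {"what": "pizza", "where": "boston ma"},
    {"what": "car repair", "where": "san jose ca"},
    {"what": "thai restaurant", "where": "new york ny"},
    {"what": "pizza delivery", "where": "san francisco ca"},
]

def train_class_unigrams(log):
    """Collect one unigram count table per field class."""
    counts = {"what": Counter(), "where": Counter()}
    for row in log:
        for field in counts:
            counts[field].update(row[field].split())
    return counts

def log_prob(counts, field, token):
    """Add-one smoothed log-probability of a token under a field class."""
    vocab = len(set(counts["what"]) | set(counts["where"])) + 1
    table = counts[field]
    return math.log((table[token] + 1) / (sum(table.values()) + vocab))

def parse(counts, query):
    """Try every 'what | where' split point and keep the most likely one."""
    tokens = query.lower().split()
    best_score, best_split = None, 0
    for split in range(len(tokens) + 1):
        score = sum(log_prob(counts, "what", t) for t in tokens[:split])
        score += sum(log_prob(counts, "where", t) for t in tokens[split:])
        if best_score is None or score > best_score:
            best_score, best_split = score, split
    return {
        "what": " ".join(tokens[:best_split]),
        "where": " ".join(tokens[best_split:]),
    }

counts = train_class_unigrams(STRUCTURED_LOG)
print(parse(counts, "pizza san jose ca"))
```

In a CRF-based parser, scores like these would enter as features alongside token and transition features rather than being used directly to pick the split.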

    Query Parser Derivation Computing Device and Method for Making a Query Parser for Parsing Unstructured Search Queries
    4.
    Patent application
    Legal status: In force

    Publication No.: US20130031113A1

    Publication date: 2013-01-31

    Application No.: US13194887

    Filing date: 2011-07-29

    IPC classification: G06F17/30

    CPC classification: G06F17/30401 G06F17/3087

    Abstract: A system and method is provided which may comprise parsing an unstructured geographic web-search query into a field-based format, by utilizing conditional random fields, learned by semi-supervised automated learning, to parse structured information from the unstructured geographic web-search query. The system and method may also comprise establishing semi-supervised conditional random fields utilizing one of a rule-based finite state machine model and a statistics-based conditional random field model. Systematic geographic parsing may be used with the one of the rule-based finite state machine model and the statistics-based conditional random field model. Parsing an unstructured local geographical web-based query in a local domain may be done by applying a learned model parser to the query, using at least one class-based query log from a form-based query system. The learned model parser may comprise at least one class-level n-gram language model-based feature harvested from a structured query log.


    Fuzzy text categorizer
    5.
    Granted patent
    Legal status: In force

    Publication No.: US06868411B2

    Publication date: 2005-03-15

    Application No.: US09928619

    Filing date: 2001-08-13

    Applicant: James G. Shanahan

    Inventor: James G. Shanahan

    CPC classification: G06F17/30707

    Abstract: A text categorizer classifies a text object into one or more classes. The text categorizer includes a pre-processing module, a knowledge base, and an approximate reasoning module. The pre-processing module performs feature extraction, feature reduction, and fuzzy set generation to represent an unlabelled text object in terms of one or more fuzzy sets. The approximate reasoning module uses a measured degree of match between the one or more fuzzy sets and categories represented by fuzzy rules in the knowledge base to assign labels of those categories that satisfy a selected decision making rule.

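The pipeline in this abstract (fuzzy set generation, degree of match against class fuzzy sets, a decision rule) can be sketched in a few lines. The membership function, the match measure (normalized fuzzy intersection), the threshold decision rule, and the tiny knowledge base are all illustrative assumptions; the abstract does not fix any of them.

```python
from collections import Counter

# Hypothetical knowledge base: one fuzzy set of term memberships per class.
KNOWLEDGE_BASE = {
    "sports": {"match": 1.0, "team": 0.8, "score": 0.6},
    "finance": {"market": 1.0, "stock": 0.9, "score": 0.2},
}

def fuzzy_set(text, vocabulary):
    """Pre-processing: keep known terms (feature reduction) and map
    frequencies to memberships in [0, 1] by dividing by the peak count."""
    counts = Counter(t for t in text.lower().split() if t in vocabulary)
    peak = max(counts.values(), default=0)
    return {t: c / peak for t, c in counts.items()} if peak else {}

def degree_of_match(doc_set, class_set):
    """Fuzzy intersection (min) normalized by the document set's size."""
    size = sum(doc_set.values())
    if not size:
        return 0.0
    overlap = sum(min(m, class_set.get(t, 0.0)) for t, m in doc_set.items())
    return overlap / size

def categorize(text, threshold=0.5):
    """Decision rule (assumed): assign every class whose match >= threshold."""
    vocabulary = {t for terms in KNOWLEDGE_BASE.values() for t in terms}
    doc_set = fuzzy_set(text, vocabulary)
    return sorted(
        label
        for label, class_set in KNOWLEDGE_BASE.items()
        if degree_of_match(doc_set, class_set) >= threshold
    )

print(categorize("the team won the match and the team score was high"))
```

Because the decision rule is a threshold rather than an argmax, a text object can receive several labels or none, which is consistent with classifying "into one or more classes".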

    System for automatically generating queries
    7.
    Granted patent
    Legal status: Invalid

    Publication No.: US06778979B2

    Publication date: 2004-08-17

    Application No.: US09683235

    Filing date: 2001-12-05

    IPC classification: G06F17/30

    Abstract: A system generates a query using an entity extractor, a categorizer, a query generator, and a short run aspect vector. The entity extractor identifies a set of entities in selected document content for searching information related thereto using an information retrieval system. The categorizer defines an organized classification of document content with each class in the organization of content having associated therewith a classification label that corresponds to a category of information in the information retrieval system. The categorizer assigns the selected document content a classification label from the organized classification of content. A query generator formulates a query that restricts a search at the information retrieval system to the category of information in the information retrieval system identified by the assigned classification label. The short length aspect vector generator generates terms for further refining the query using context information surrounding the set of entities in the selected document content.

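How the four parts of this abstract fit together can be shown with a toy pipeline. Everything concrete here is an assumption made for illustration: the gazetteer-based entity extractor, the keyword-overlap categorizer, the three-token context window, and the dictionary used as the query format.

```python
import re

# Hypothetical resources; a real system would use trained components.
KNOWN_ENTITIES = ["Acme Corp", "Globex"]
CATEGORY_KEYWORDS = {
    "medicine": {"patient", "clinical", "drug"},
    "finance": {"shares", "market", "earnings"},
}
STOPWORDS = {"the", "a", "of", "after", "and", "in"}

def extract_entities(text):
    """Entity extractor: look up known entity names in the content."""
    return [name for name in KNOWN_ENTITIES if name in text]

def categorize(text):
    """Categorizer: choose the label whose keywords overlap the text most."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return max(CATEGORY_KEYWORDS, key=lambda c: len(tokens & CATEGORY_KEYWORDS[c]))

def aspect_terms(text, entities, window=3):
    """Aspect vector (simplified): non-stopword terms near any entity token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    entity_tokens = {t for name in entities for t in name.lower().split()}
    hits = [i for i, t in enumerate(tokens) if t in entity_tokens]
    terms = []
    for i, token in enumerate(tokens):
        near = any(abs(i - h) <= window for h in hits)
        if near and token not in entity_tokens | STOPWORDS and token not in terms:
            terms.append(token)
    return terms

def generate_query(text):
    """Query generator: restrict the search to the assigned category and
    refine it with context terms around the entities."""
    entities = extract_entities(text)
    return {
        "category": categorize(text),
        "entities": entities,
        "refine": aspect_terms(text, entities),
    }

print(generate_query(
    "Acme Corp shares rose after strong quarterly earnings beat market forecasts."
))
```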

    Meta-document management system with user definable personalities
    8.
    Granted patent
    Legal status: In force

    Publication No.: US06732090B2

    Publication date: 2004-05-04

    Application No.: US09683236

    Filing date: 2001-12-05

    IPC classification: G06F17/30

    Abstract: A system operates using meta-documents which include document content associated with one or more personalities. Each personality is associated with a set of document service requests. Users are provided different techniques for creating personalities and modifying existing personalities. These techniques include: the use of an algebra to tailor existing personalities, the use of a list of links or documents to create a personality, the use of predefined personalities and knowledge levels in a field to create new personalities, the use of question answering techniques, and the use of learning personalities. Specified personalities are then used to enrich document content by integrating into corresponding meta-documents the results received from their document service requests.


    Method and apparatus for constructing a compact similarity structure and for using the same in analyzing document relevance
    9.
    Granted patent
    Legal status: Lapsed

    Publication No.: US07949644B2

    Publication date: 2011-05-24

    Application No.: US12152522

    Filing date: 2008-05-15

    IPC classification: G06F7/00

    Abstract: A computer-readable medium comprises a data structure for providing information about levels of similarity between pairs of N documents. The data structure comprises a plurality of entries of similarity values representing levels of similarity for a plurality of pairs of the documents. Each of the similarity values represents a level of similarity of one document of a given pair relative to the other document of the given pair. The similarity value of each entry is greater than a threshold similarity value that is greater than zero. The plurality of similarity-value entries are fewer than N^2 − N in number if the similarity values are asymmetric with regard to document pairing, and the plurality of similarity-value entries are fewer than (N^2 − N)/2 in number if the similarity values are symmetric with regard to document pairing. A method and apparatus for generating the data structure are described.

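The symmetric case of this structure can be sketched as a sparse table that stores one entry per unordered document pair, and only when the similarity exceeds a positive threshold. Jaccard similarity over word sets is an assumption made for the example (the abstract does not name a measure), as are the function names.

```python
from itertools import combinations

def jaccard(a, b):
    """Symmetric similarity between two documents' word sets (assumed measure)."""
    a, b = set(a.split()), set(b.split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_compact_similarity(docs, threshold):
    """Store only pairs (i, j), i < j, whose similarity exceeds the threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be greater than zero")
    table = {}
    for i, j in combinations(range(len(docs)), 2):
        value = jaccard(docs[i], docs[j])
        if value > threshold:
            table[(i, j)] = value
    return table

def lookup(table, i, j):
    """Return the stored value; 0.0 means 'not above the threshold'."""
    return table.get((min(i, j), max(i, j)), 0.0)

docs = [
    "red apple pie",
    "green apple pie",
    "quantum field theory",
    "string field theory",
]
table = build_compact_similarity(docs, threshold=0.3)
n = len(docs)
print(len(table), "entries stored; full symmetric table would need", (n * n - n) // 2)
```

With four documents the full symmetric table has (16 − 4)/2 = 6 entries, while only the two pairs that share vocabulary are stored here. An asymmetric measure would need ordered pairs as keys, with N^2 − N as the corresponding upper bound.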

    Method and apparatus for document filtering using ensemble filters
    10.
    Granted patent
    Legal status: Lapsed

    Publication No.: US07398269B2

    Publication date: 2008-07-08

    Application No.: US10713592

    Filing date: 2003-11-14

    IPC classification: G06F7/00

    Abstract: A technique for representing an information need and employing one or more filters to select documents that satisfy the represented information need, including a technique of creating filters that involves (a) dividing a set of documents into one or more subsets such that each subset can be used as the source of features for creating a filtering profile or used to set or validate the score threshold for the profile and (b) determining whether multiple profiles are required and how to combine them to create an effective filter. Multiple profiles can be incorporated into an individual filter and the individual filters combined to create an ensemble filter. Ensemble filters can then be further combined to create meta filters.

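Step (a) of this abstract, using one document subset as the feature source for a profile and another subset to set its score threshold, and the combination of individual filters into an ensemble, can be sketched as follows. The term-overlap score, the "lowest validation score" threshold rule, and the vote-counting combiner are illustrative assumptions, not details from the patent.

```python
from collections import Counter

def build_profile(feature_docs, top_k=3):
    """Filtering profile: the most frequent terms in the feature subset."""
    counts = Counter(t for d in feature_docs for t in d.lower().split())
    return {term for term, _ in counts.most_common(top_k)}

def score(profile, doc):
    """Fraction of profile terms that occur in the document."""
    tokens = set(doc.lower().split())
    return len(profile & tokens) / len(profile) if profile else 0.0

def make_filter(feature_docs, threshold_docs):
    """Individual filter: the threshold is the lowest score among the
    held-out relevant documents (an assumed threshold-setting rule)."""
    profile = build_profile(feature_docs)
    threshold = min(score(profile, d) for d in threshold_docs)
    return lambda doc: score(profile, doc) >= threshold

def make_ensemble(filters, min_votes):
    """Ensemble filter: accept when enough member filters accept."""
    return lambda doc: sum(f(doc) for f in filters) >= min_votes

# Hypothetical relevant documents, divided into two subsets that swap roles.
relevant = [
    "solar panel efficiency",
    "solar panel cost",
    "solar energy storage",
    "solar panel installation",
]
filter_a = make_filter(relevant[:2], relevant[2:])
filter_b = make_filter(relevant[2:], relevant[:2])
ensemble = make_ensemble([filter_a, filter_b], min_votes=2)

print(ensemble("solar panel prices fall"), ensemble("football match results"))
```

Since an ensemble is itself a function from document to accept/reject, `make_ensemble` can be applied to a list of ensembles, which is one way to read the abstract's "meta filters".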